Freeform

Warnings

Warning

freeform -calc conf takes over 6 Gigabytes of memory by default!

The conformer generation step of freeform -calc conf uses a default limit of 40000 conformers, which leads to memory usage of over 6 GB. Reducing the limit with the -maxconfs flag significantly reduces memory usage, although it may reduce the accuracy of the conformer free energies, especially for molecules containing many rotatable bonds. If memory usage is an issue, we recommend running the calculation with -maxconfs 20000; if this maximum is not reached in the initial conformer generation step (check the line containing “generating conformers... ” in the log file), there is no loss in accuracy compared to the default of 40000.

Warning

freeform is not distributed for 32-bit architectures (due to the memory requirements)

Warning

The graph in the pdf output of freeform -calc conf may not show all minima

The “dG vs Erel” graph in the pdf output of freeform -calc conf uses default axes spanning only 0 to 7 kcal/mol for conformer dG and 0 to 5 kcal/mol for Erel; generally the whole unbound ensemble contains minima above these ranges (sometimes the majority of minima). The data for all the minima are contained in both the output log and csv files; the graph is intended only to show minima that could potentially contribute significantly to the room-temperature ensemble, hence the limited range of the axes. The axes are extended somewhat to show tracked minima, but beyond 30 kcal/mol in either dG or Erel not even these data are shown in the graph.

Warning

It is possible to get a negative Local Strain Energy with -solvent PB

Ordinarily, Local Strain Energy can never be negative because the energy can only decrease once the restraints are removed and the ligand is allowed to freely minimize. With -solvent PB, PB single-point solvation energies are taken of the restrained minimum and the corresponding freely minimized minimum. There are two reasons why this could result in a negative Local Strain Energy. The first is due to the precision of the PB calculation being around 0.2 kcal/mol: if the true restrained and free minima are within 0.2 kcal/mol the restrained minimum could end up with a slightly higher PB solvation energy simply due to the imprecision. The second reason is that the Sheffield solvation energy used in minimizing from the restrained to the free minimum may simply differ in the opposite direction from the PB solvation energy for the same minima.

Warning

AM1BCC ELF10 charging may use fewer than 10 conformers for charging

AM1BCC ELF10 charging is designed to average the AM1BCC charges from 10 conformers chosen from the 2% of the conformer population having the Electrostatically Least-interacting Functional (ELF) groups. Taking 10 conformers from 2% means there must be at least 500 conformers to start with; ligands with fewer rotatable bonds may not have this many. In such cases, the AM1BCC ELF10 method is designed to take all the conformers in the 2% ELF population. For example, if the starting conformer set has 327 conformers, all 6 conformers in the 2% ELF population will be used. With 50 or fewer conformers, a single ELF conformer will be used. When more than one conformer is used, the log file line starting with “Semiempirical charge averaging” reports how many were used.
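The counting rule above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the charging code itself: the function name elf10_conformer_count is hypothetical, and rounding the 2% population down is an assumption inferred from the 327-conformer example (2% of 327 is 6.54, and 6 conformers are used).

```python
def elf10_conformer_count(n_conformers, percent=2, target=10):
    """Estimate how many conformers enter the ELF10 charge average.

    Assumption: the 2% ELF population is rounded down, at least one
    conformer is always used, and never more than `target` conformers.
    """
    if n_conformers < 1:
        raise ValueError("need at least one starting conformer")
    # Integer arithmetic avoids floating-point surprises in the 2% cut.
    elf_population = n_conformers * percent // 100
    return max(1, min(target, elf_population))


for n in (40, 50, 327, 500, 40000):
    print(n, "starting conformers ->", elf10_conformer_count(n), "used for charging")
```

Under these assumptions, any starting set of 500 or more conformers yields the full 10-conformer average.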

Command Line Interface

A description of the command line interface can be obtained by executing freeform with the --help all option.

prompt> freeform --help all

This will generate the following output:

Calculation type
  -calc : Calculation type to be performed: conf or solv (default: conf)

Input/output options
  -in : Input molecular filename for constructing the ensemble for the
        partition function.
  -out : Output molecular filename
  -param : Parameter file of *freeform* settings
  -prefix : Prefix for generic output files
  -report : Graphic output filename

Other general options
  -ensemble : The conformers from the -in flag will be used directly as the
              starting conformational ensemble
  -ionic : Determines the ionic state of the molecule: pH74, input,
           uncharged (default: pH74)
  -rms : RMSD threshold (0.0 to 5.0) for de-duplication in conformation
         generation with Omega (default: calculation dependent)
  -solvcharges : Type of partial charges used for solvation: am1bccSymOpt,
                 mmff94, input (default: calculation dependent)

Options for only -calc conf
  -PBsolvent_dielectric : Solvent dielectric to use in PB solvation: between
                          2.0 and 80.0 (default: 80)
  -ewindow : Energy window to be used in the initial conformational search:
             0 to 200.0 (default: 15)
  -ff : Forcefield used for minimization and thermodynamics: mmff94s or
        mmff94 (default: mmff94s)
  -maxconfs : Maximum number of starting conformations to be generated: 0 to
              40000 (default: 40000)
  -solvent : Solvent model: sheffield, PB, or vacuum (default: sheffield)
  -track : Input filename of molecular conformers for which strain energies
           will be estimated.

Required Parameters

There are two required parameters: -calc and -in.

Calculation type

-calc <run_type>

Selects the type of calculation to be performed. There are two possible values for run_type: conf and solv. The first requests the evaluation of conformational free energies; the second requests a solvation free energy calculation. The default value is conf.

Input/output options

-in <filename>

Molecular input file name in any format supported by the OpenEye product OMEGA. Warning: freeform.oeb is the default output file name, so it should be avoided as an input file name.

-i <filename>

An alias to -in

-out <filename>

Molecular output file name in any format supported by OEChem. The default name is freeform.oeb. For conformer free energy estimation, the output file contains all unique solution conformations identified by freeform. For solvation free energy estimation, the output file contains the lowest-energy vacuum conformation for which the PB solvation calculation was actually done.

-param <filename>

Command line options will be read from the specified file. This file may have been generated from a previous run or may be constructed de novo. The default name of the file is freeform.param. Any parameter in the parameter setup file is superseded by the parameter on the command line.
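The precedence rule above (command line over parameter file over built-in defaults) can be pictured as a layered dictionary merge. The sketch below is only a model of that rule, not freeform's own option handling; the function name effective_settings is hypothetical, and the default values are copied from the --help output shown earlier.

```python
# Built-in defaults, taken from the --help listing (subset only).
DEFAULTS = {"calc": "conf", "maxconfs": 40000, "ewindow": 15.0, "solvent": "sheffield"}


def effective_settings(param_file, command_line):
    """Merge option layers so that later layers override earlier ones."""
    settings = dict(DEFAULTS)      # lowest priority
    settings.update(param_file)    # values read via -param
    settings.update(command_line)  # highest priority: explicit flags
    return settings


# Hypothetical run: the parameter file asks for PB solvation and 20000
# conformers, but the command line explicitly requests vacuum.
from_file = {"maxconfs": 20000, "solvent": "PB"}
from_cli = {"solvent": "vacuum"}
print(effective_settings(from_file, from_cli))
```

In this hypothetical run the -maxconfs value comes from the parameter file, the -solvent value from the command line, and -ewindow keeps its default.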

-prefix <p>

Replaces the default freeform prefix in the .log, .param, .oeb and .pdf output files with the input string p.

-report <filename>

Name of the graphical output file. Currently only pdf and ps formats are allowed. The default name is freeform.pdf.

-track <filename>

The molecular input of conformers to be tracked is read from the file name; any 3D molecule file format supported by OpenEye is acceptable. Of critical importance is that the molecule graph be identical to that specified in the -in input file, and that atom and bond ordering also be identical. For the most common uses of freeform, the same input file is specified for both the -in and -track options.

Advanced options

-ensemble

The ensemble of conformations is read from the -in input file instead of being generated internally. With either -calc option, this ensemble is treated as if it already contains all the conformations needed. In conformer free energy estimation, when -ensemble is used together with -track, the tracked conformations are included in the ensemble.

-ewindow

The -ewindow flag sets the energy window (in kcal/mol) used as an energy cutoff in the initial conformation generation stage of freeform. An ewindow of at least 15.0 is encouraged to better cover accessible conformational space; increasing ewindow will increase the number of conformers. [default = 15.0]

-ff <type>

This option specifies which forcefield to use in the energy minimizations and thermodynamics calculations following the initial conformer search. By default MMFF94S is used; this can be specified explicitly by setting parameter type to mmff94s. Alternatively, setting parameter type to mmff94 means the MMFF94 forcefield will be used instead. The two differ in how conjugated trivalent nitrogens are treated: with MMFF94S they tend to be more planar, whereas with MMFF94 they tend to be more pyramidal.

-ionic <type>

By default, the charge state used in the calculations corresponds to pH = 7.4 (the value of parameter type is pH74). The two remaining values are uncharged and input. The former corresponds to an uncharged ionic state (i.e. no formal charges); the latter to the preexisting ionic state found in the molecule input file.

-maxconfs <value>

The -maxconfs flag sets the maximum number of conformations to be generated using OMEGA for the initial conformation generation stage of freeform. A large set is desirable to cover the conformational space necessary to include all reasonable conformers that might contribute to the partition function. When this initial set is minimized a large reduction is expected in the number of unique minima. [default = 40000]

-rms <value>

RMSD threshold for conformation generation with OMEGA. The default value is calculationDependent, meaning that the threshold used depends on whether “-calc conf” or “-calc solv” is being run: 0.3 Å is used with “-calc conf” and 0.6 Å with “-calc solv”. Values ranging from 0 to 5 are accepted.

-solvent <type>

By default Sheffield solvation is used with dielectric 80.0 to account for aqueous solvation of the unbound ensemble; this can be specifically requested by setting the parameter type to sheffield. Setting type to PB will result in Poisson-Boltzmann (PB) single-point solvation energies being calculated at the Sheffield-based minima. The default solvent dielectric for the PB calculation is 80.0, but this can be changed with the -PBsolvent_dielectric option. Setting type to vacuum means that no solvation energy will be calculated.

-PBsolvent_dielectric <value>

This option allows the user to change the solvent dielectric used in the PB single-point solvation energies requested with -solvent PB. The default value is 80.0 to approximate aqueous solvation; it can adopt any value between 1.0 and 80.0.

-solvcharges <type>

Selects the type of partial charges to be used for the solvent forces. By default the value of type is calculationDependent meaning that the charge type used depends on whether “-calc conf” or “-calc solv” is being run. AM1BCC charges that are AM1-optimized (Opt) (constrained to starting geometry) and symmetric by 2D-bond symmetry (Sym) are used for conformer free energy estimation and non-symmetric (NoSym) AM1-single-point (SPt) for solvation free energy calculations. For net-charged species however, mmff94 charges are used for the conformer free energy estimation, even if the flag value selected is different than mmff94. The defaults are different because the science behind the solvation free energy calculation ([Nicholls-2010]) was specifically developed using the “NoSymSPt” variant of AM1BCC charges whereas the conformer free energies require the robustness of the canonical AM1BCC charging scheme (symmetric charges from a constrained AM1 optimization) towards large changes in conformation over the course of geometry optimization. In addition to the default calculationDependent, the other possible values of the parameter type are: am1bccSymOpt, am1bccNoSymOpt, am1bccSymSPt, am1bccNoSymSPt, mmff94 and input. The first four refer to different variants of the AM1BCC charging scheme; the next applies MMFF94 charges and the last is used to specify user-defined charges (which are read in from the input structure). If input is selected then the value of “-ionic” is automatically set to input.
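To make the calculationDependent defaults for -rms and -solvcharges concrete, here is a minimal Python sketch of how they resolve. It is an illustration of the rules described above, not freeform's implementation: the function name resolve_settings is hypothetical, and the special case of net-charged species in conformer free energy runs is deliberately not modeled.

```python
# Calculation-dependent defaults as described for -rms and -solvcharges.
CALC_DEFAULTS = {
    "conf": {"rms": 0.3, "solvcharges": "am1bccSymOpt"},
    "solv": {"rms": 0.6, "solvcharges": "am1bccNoSymSPt"},
}


def resolve_settings(calc="conf", rms=None, solvcharges=None, ionic="pH74"):
    """Fill in calculation-dependent defaults; None means 'not given'."""
    if calc not in CALC_DEFAULTS:
        raise ValueError("calc must be 'conf' or 'solv'")
    defaults = CALC_DEFAULTS[calc]
    if rms is None:
        rms = defaults["rms"]
    if not 0.0 <= rms <= 5.0:
        raise ValueError("rms must lie between 0.0 and 5.0")
    if solvcharges is None:
        solvcharges = defaults["solvcharges"]
    # Input charges only make sense for the input ionic state.
    if solvcharges == "input":
        ionic = "input"
    return {"calc": calc, "rms": rms, "solvcharges": solvcharges, "ionic": ionic}


print(resolve_settings("conf"))
print(resolve_settings("solv"))
print(resolve_settings("conf", solvcharges="input"))
```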

Examples

Conformer free energies

The default calculation type in freeform is -calc conf, so this is what it will run if no -calc option is specified. However, in the following examples we will specify this calculation type explicitly. The two main types of conformer free energy calculations are 1) when the input 3D structure is not known or not of special significance, and 2) when there are one or more input 3D structures we would like to track, e.g. if they are believed to be bioactive conformers. In the latter case, -track is specified, and freeform reports specifically on the global and local strain energies for these conformers.

Example 1

Let us begin with case 1 above: when the input 3D structure is not known or not of special significance. In this example the input is the SMILES string for Januvia in file januvia.smi:

Fc1cc(c(F)cc1F)C[C@@H]([NH3+])CC(=O)N3Cc2nnc(n2CC3)C(F)(F)F

The command is:

prompt> freeform -calc conf -in januvia.smi -prefix januvia.pfn

This produces five output files:

januvia.pfn.pdf: the results report summarizing key results for each molecule in the run.

januvia.pfn.log: the log file of per-molecule results. This also shows the progress of the calculation if it is taking a long time, for example if the run has to minimize a conformer ensemble containing 20000 conformers (this can take over half an hour).

januvia.pfn.csv: the csv file of per-conformer results (energies, entropies, and free energies)

januvia.pfn.oeb: the file of all the final minima for the molecule.

januvia.pfn.param: all the input parameters of the calculation are written here.

Let us first look at the results report in file januvia.pfn.pdf, shown below in Figure: Results report from Januvia SMILES input:

Results report from Januvia SMILES input


A 2-D depiction of the molecule is given in the upper left, and underneath is a table with two entries giving the relative energies (E(MMFF) + E(solv)) and conformer free energies (including entropy) for the relative energy (Erel) and free energy (dG) minimum conformations. All energies in freeform are in kcal/mol. The graph on the right-hand side shows the conformer free energy versus the relative energy for all the conformers produced in the calculation for the unbound ensemble. In general the graph cuts off at 5 kcal/mol for Erel and 7 kcal/mol for the conformer free energy; higher energy conformers are simply not shown. At the top of the graph we note that this molecule has six rotors (rotatable bonds) sampled in the conformation generation, and ultimately after the entropy calculation there were 116 unique minima for this molecule. The yellow dot on the graph corresponds to the relative energy minimum (conformer 7 in the logfile table, the .oeb file, and the .csv file), and we can see from the table and the graph that while it is the global energy minimum in terms of enthalpy, it would cost 2.34 kcal/mol in conformer free energy to select this conformer from the aqueous free-ligand ensemble. In contrast, conformer 0 with a relative energy of 0.93 kcal/mol is the lowest free energy conformer (corresponding to the blue dot on the graph), costing only 0.82 kcal/mol to select it from the ensemble.
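One way to get a feel for these numbers is to convert a conformer free energy into an approximate population. The sketch below assumes the usual Boltzmann relationship, population = exp(-dG/RT), and a temperature of 298.15 K; both are assumptions made for illustration here (the exact definition is given in the Freeform Theory section), and the function name population_from_dg is hypothetical.

```python
import math

R_KCAL = 0.0019872041  # gas constant in kcal/(mol K)


def population_from_dg(dg_kcal, temperature=298.15):
    """Approximate ensemble fraction for a conformer free energy in kcal/mol.

    Assumes dG = -RT ln(p), i.e. dG is the cost of selecting this one
    conformer out of the whole unbound ensemble.
    """
    return math.exp(-dg_kcal / (R_KCAL * temperature))


for label, dg in (("lowest Erel (conformer 7)", 2.34),
                  ("lowest dG (conformer 0)", 0.82)):
    print(f"{label}: dG = {dg:.2f} kcal/mol, population ~ {population_from_dg(dg):.3f}")
```

Under these assumptions, the lowest-dG conformer would account for roughly a quarter of the ensemble, while the lowest-Erel conformer would account for about 2%.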

Example 2

In the second example, the input file is sitagliptin.mol2; it is in .mol2 format and contains the X-ray crystal coordinates for Januvia (sitagliptin). Here we are particularly interested in the conformer free energy for this input X-ray structure since it is the bioactive conformation. We would like to know how much free energy it costs to select this bioactive conformation from the aqueous free ligand ensemble. To track the input conformation throughout the calculation, we specify -track sitagliptin.mol2 to indicate that we wish to use the 3D coordinates of the input structure and to track this conformer. Note that -in sitagliptin.mol2 must also be specified so that we use the same structure to derive the conformers for the unbound aqueous ensemble. To be clear: -in is always required to specify the molecule to be used for the unbound aqueous ensemble; -track is optional and only needed for tracking one or more specific conformers. The command is thus:

prompt> freeform -calc conf -track sitagliptin.mol2 -in sitagliptin.mol2 \
-prefix sitagliptin.3D.pfn

This produces five output files analogous to those above for Januvia and, in addition, three files related to the tracked conformer:

sitagliptin.3D.pfn.tracked_input.oeb: the input tracked conformer.

sitagliptin.3D.pfn.tracked_rstr.oeb: the restrained minimum from the input tracked conformer.

sitagliptin.3D.pfn.tracked_free.oeb: the unrestrained minimum from the input tracked conformer.
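The output file names in Examples 1 and 2 all follow from the -prefix value. The short Python helper below lists the files to expect from a -calc conf run; it is a convenience sketch based only on the names shown in these examples, the function name expected_outputs is hypothetical, and it ignores any renaming done with -out or -report.

```python
def expected_outputs(prefix="freeform", track=False):
    """List the files a -calc conf run is expected to write for a prefix."""
    names = [prefix + ext for ext in (".pdf", ".log", ".csv", ".oeb", ".param")]
    if track:
        # Extra files written when -track is given.
        names += [f"{prefix}.tracked_{kind}.oeb" for kind in ("input", "rstr", "free")]
    return names


print(expected_outputs("januvia.pfn"))
print(expected_outputs("sitagliptin.3D.pfn", track=True))
```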

The results report is in file sitagliptin.3D.pfn.pdf:

Results report from sitagliptin.3D.pfn

Results report from sitagliptin.mol2, using the “-track” flag

The results report in this case looks very similar to that of our first example except for two extra lines added to the table and the corresponding extra dot (in red) on the graph. The two extra lines in the table relate to the bioactive conformer specified in the input with the -track flag. “TrConf0” is an abbreviation for “Tracked Conformer 0”, that is, the first of the tracked conformers (in this case there was only one). The line beginning “TrConf0 Free” gives information on the nearest unbound minimum to the tracked conformer, locating the unrestrained minimum from that conformer (in file sitagliptin.3D.pfn.tracked_free.oeb) as corresponding to conformer 9 of the entire unbound ensemble, with relative energy 2.78 kcal/mol and conformer free energy 2.68 kcal/mol. The next line, beginning with “TrConf0 Strain Energies”, gives a local strain energy of 3.02 kcal/mol associated with Tracked Conformer 0 and a global strain energy of 5.70 kcal/mol for the same conformer.

These quantities are defined in the Freeform Theory section and depicted there in the figure “The components of Local and Global Strain Energies”, but to briefly summarize: the local strain energy is the relative energy difference (internal energy + solvation energy) between the restrained energy minimum (the structure in file sitagliptin.3D.pfn.tracked_rstr.oeb) and the nearest unbound minimum (the structure in file sitagliptin.3D.pfn.tracked_free.oeb). Conceptually it corresponds to the energy required to distort the nearest unbound minimum to fit into the active site. The global strain energy is the sum of the local strain energy (3.02 kcal/mol) and the conformer free energy of the nearest unbound minimum (2.68 kcal/mol from the line above in the table), yielding the value of 5.70 kcal/mol as given.
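The bookkeeping in the summary above is simple enough to check by hand or with a few lines of Python. The sketch below just restates the two definitions and reproduces the Example 2 total; the function names are hypothetical and the energies are in kcal/mol.

```python
def local_strain(e_restrained, e_free):
    """Relative energy of the restrained minimum above its nearest free minimum."""
    return e_restrained - e_free


def global_strain(local_strain_energy, dg_nearest_minimum):
    """Local strain plus the conformer free energy of the nearest unbound minimum."""
    return local_strain_energy + dg_nearest_minimum


# Values from the TrConf0 lines of the Example 2 report.
total = global_strain(3.02, 2.68)
print(f"global strain energy = {total:.2f} kcal/mol")
```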

Comparing the “Lowest Erel” and “Lowest dG” lines in the table between this example and the previous one, there are slight differences in the numbers (0.05 kcal/mol or less). Such differences are expected between non-identical runs; in this case they are most likely attributable to slight differences in the AM1BCC ELF10 charges due to including the bioactive conformer. Conformational variance in the AM1 charges is an expected consequence of using any QM-based charging method; the ELF10 approach acts to minimize its effects.

Example 3

The third example is like the second except that PB single-point solvation energies are requested with the -solvent PB option. This is a higher level of continuum solvation theory compared to Sheffield solvation and so the solvation energies are expected to be more accurate. Sheffield solvation is still used in finding the minima for the unbound ensemble because it is fast and has the necessary analytic second derivatives. The command is:

prompt> freeform -calc conf -solvent PB -track sitagliptin.mol2 \
-in sitagliptin.mol2 -prefix sitagliptin.3D.pb.pfn

This produces eight output files analogous to those above for Example 2.

Results report from sitagliptin.3D.pb.pfn

Results report from sitagliptin.mol2, using “-solvent PB”

Note that the graph differs markedly from that of Example 2, and the energies in the table have also changed markedly. The partial charges, the number of unique minima, and even the minima themselves are identical between Example 2 and this example; in both cases they are found using Sheffield solvation. The difference is simply the solvation energy for each conformer: in Example 2, the Sheffield solvation energy is used whereas in this example the Sheffield solvation energy has been replaced with a PB single-point solvation energy evaluated at the coordinates of the Sheffield-based minimum. The impact of changing to a PB energy is more pronounced with charged ligands as in this case.

Example 4

In the fourth and final example for freeform -calc conf, the same input file is used as in the second example, but now we would like to use the input charges in that file for the solvation energies. Note that in addition to specifying -solvcharges input we also need to specify -ionic input, because the default behavior of potentially changing the ionic state for pH 7.4 cannot be allowed with input charges. The command is thus:

prompt> freeform -calc conf -solvcharges input -ionic input -track sitagliptin.mol2 \
-in sitagliptin.mol2 -prefix sitagliptin.3D.inp.pfn

The results report is in file sitagliptin.3D.inp.pfn.pdf and looks like:

Results report from sitagliptin.3D.inp.pfn

Results report using “-solvcharges input”

Although only the atomic partial charges differ between this example and Example 2, we see significant changes in all the energies, yielding quite a different profile for the conformational ensemble. Now the nearest unbound minimum to TrConf0 has a conformer free energy of 4.95 kcal/mol as opposed to 2.68 kcal/mol in Example 2. This follows through to cause a similar increase in the Global Strain Energy, up to 7.70 kcal/mol from 5.70 kcal/mol in Example 2. These differences reflect the impact of using different charge models, especially on charged ligands; we recommend the default AM1BCC ELF10 charges.

Solvation free energies

Example 5

To estimate a solvation free energy, the user must specify the option -calc with the value solv. The default charge state of the input compound corresponds to pH = 7.4 (see option -ionic), so the following run:

prompt> freeform -calc solv -prefix januvia_ph74 januvia.smi

will estimate the solvation free energy of Januvia at physiological pH. The graphical output file for this run is displayed in Figure: Solvation of januvia at physiological pH. On the left the result of the solvation free energy calculation is shown, including a depiction of the group-wise decomposition of the solvation free energy. It suggests that, as expected, the cationic \(NH_3^+\) group is responsible for almost all (-57.2 kcal/mol) of the strongly negative solvation free energy of this molecule, while the trifluorophenyl fragment is the most hydrophobic part of this drug, adding 3.0 kcal/mol to the solvation free energy. To the right, the calculated XlogP partition coefficient is given together with its fragment decomposition into group contributions. Note that the XlogP partition coefficient is always calculated for the uncharged molecule, irrespective of its ionic state in water at any pH.


Solvation of januvia at physiological pH

Example 6

In the next example, the solvation free energy of Januvia in its uncharged form is calculated with:

prompt> freeform -calc solv -ionic uncharged -prefix januvia_neu januvia.smi

The output is shown in Figure: Solvation of the neutral form of januvia. Note that changing the ionic state of the ligand only affects the solvation free energy on the left-hand side; the XlogP partition coefficient is always calculated for the uncharged molecule, so it is unchanged from the previous example. Since the solvation free energy is now also calculated for the uncharged form, in this example (unlike the last) the solvation free energy and the XlogP calculation correspond to the same (uncharged) form of Januvia.


Solvation of the neutral form of januvia