OMEGA Theory

OMEGA is a conformation generator for molecules. It has two main components: model building from molecular fragments and torsion driving. Model generation may be bypassed by importing fragment structures from external sources ([Hawkins-2010], [Stahl-2001], [Stahl-2002]).

OMEGA builds initial models of structures by assembling fragment templates along sigma bonds. The graph of each input molecule is fragmented at exocyclic sigma bonds and at acyclic (but not exocyclic) carbon-to-heteroatom sigma bonds. Conformations for the fragments are either retrieved from pregenerated libraries built with makefraglib, or constructed on-the-fly using the same protocol that makefraglib uses: distance constraints followed by geometry optimization. Because all inter-fragment joints lie along sigma bonds, molecule assembly is accomplished by simple vector alignment.

Once an initial model of a structure is constructed, or given as input, OMEGA generates additional models by enumerating ring conformations and invertible nitrogen atoms. Ring conformations are taken from the same fragment library used to build an initial model. OMEGA detaches all exocyclic substituents from a ring system, then aligns and reattaches them relative to the new ring conformation. OMEGA attempts to generate every combination of ring conformations possible for a given structure.

The next step in model generation is to detect and enumerate invertible nitrogens. OMEGA considers a nitrogen invertible if it has pyramidal geometry, has no stereochemistry specified, carries no more than one hydrogen, is three-valent, and has no more than three ring bonds. Invertible in this context simply means that at room temperature a pyramidal nitrogen is likely to interconvert rapidly (on an NMR timescale) between two puckered forms. All multiconformer ring models are further expanded by enumerating all possible nitrogen puckers. The resulting model set is the starting point for conformer search by torsion driving.
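The invertibility criteria listed above can be written as a simple predicate. The following is a minimal sketch, not OMEGA's implementation: the `NitrogenInfo` record and its field names are hypothetical stand-ins for values a cheminformatics toolkit would supply, and the count of two pucker states per invertible nitrogen is an upper bound that ignores symmetry.

```python
from dataclasses import dataclass


@dataclass
class NitrogenInfo:
    """Hypothetical per-atom record; field names are assumptions for illustration."""
    is_pyramidal: bool
    has_specified_stereo: bool
    hydrogen_count: int
    valence: int
    ring_bond_count: int


def is_invertible(n: NitrogenInfo) -> bool:
    """Apply the five criteria from the text to one nitrogen atom."""
    return (
        n.is_pyramidal
        and not n.has_specified_stereo
        and n.hydrogen_count <= 1
        and n.valence == 3
        and n.ring_bond_count <= 3
    )


def count_pucker_states(nitrogens) -> int:
    """Upper bound on pucker combinations: two forms per invertible nitrogen."""
    k = sum(1 for n in nitrogens if is_invertible(n))
    return 2 ** k
```

Under these assumptions, a ring model with two invertible nitrogens expands to at most four pucker variants.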

OMEGA begins the torsion search process by examining the molecular graph and determining the bonds that may freely rotate. By default, OMEGA selects acyclic sigma bonds that have at least one non-hydrogen atom attached to each end of the bond. By default, hydrogen rotors (e.g. hydroxyl groups) are not altered during the torsion search; however, sampling them may be enabled as of version 2.5.1. The final ensemble selection is based on RMS distance of heavy atoms and sampled hydrogen atoms; unsampled hydrogen atoms do not affect this RMS distance. A list of possible dihedral angles is then assigned to each rotatable bond. The current mechanism for assignment is based on SMARTS matching, although alternate strategies for assigning angles based on experimental data (e.g. X-ray structures) or theoretical data (e.g. fragment optimization studies) are possible.

The molecular graph is then subjected to pattern and geometric symmetry detection. Common patterns such as para-disubstituted benzene are used to reduce the number of symmetry-equivalent dihedral angles that need to be searched. All torsions are altered by 120 and 180 degrees, and an RMS calculation that takes symmetry-equivalent atoms into account is performed in order to detect two-fold and three-fold symmetries.

An exhaustive depth-first torsion search is performed on each of the fragments, and the resulting conformers are placed into a list sorted by energy. Entire structures are assembled by combining the lowest energy set of fragments, and then the next lowest set, until the search is terminated. The search terminates when the limit on the total number of conformers that may be generated is exceeded, the fragment list is exhausted, or the sum of the fragment energies exceeds the energy window of the global minimum structure. The best conformers identified in the torsion search are rank ordered by energy. A final ensemble is selected by sequentially testing the conformers using the RMS distance cutoff. To be accepted into the final ensemble, a conformer must have an RMS distance to every other member of the ensemble that exceeds the user defined cutoff value. The final ensemble is populated up to the user defined maximum ensemble size limit, or until the list of low energy conformers is exhausted.
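The final ensemble selection described above amounts to a greedy pass over the energy-sorted conformers. The sketch below illustrates that idea under simplifying assumptions: conformers are represented as `(energy, coords)` tuples, coordinates are assumed to be already superimposed, and the RMS function ignores symmetry-equivalent atoms. The function and parameter names are hypothetical and are not OMEGA options.

```python
import math


def rms_distance(a, b):
    """Plain RMS distance between two equally sized lists of (x, y, z) tuples.

    Assumes both conformers are already superimposed; symmetry is ignored.
    """
    total = sum((p - q) ** 2 for pa, pb in zip(a, b) for p, q in zip(pa, pb))
    return math.sqrt(total / len(a))


def select_ensemble(conformers, rms_cutoff, max_size):
    """Greedily keep low-energy conformers that differ from all kept ones.

    conformers: iterable of (energy, coords) tuples.
    """
    ensemble = []
    for energy, coords in sorted(conformers, key=lambda c: c[0]):
        if len(ensemble) >= max_size:
            break  # user defined ensemble size limit reached
        if all(rms_distance(coords, kept) > rms_cutoff for _, kept in ensemble):
            ensemble.append((energy, coords))
    return ensemble
```

Because conformers are tested in order of increasing energy, a higher-energy conformer is only dropped when a lower-energy one already covers the same region of conformational space.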

Filtering

Filtering based on graph (and possibly physical) properties should always be carried out prior to generating multi-conformer databases using OMEGA. Eliminating undesirable compounds before generating conformers saves execution time for both OMEGA and downstream applications, as well as disk space. Large polypeptides or proteins, very flexible molecules, or simply molecules that would never be considered useful for the ultimate modeling application are best eliminated from a data set as early as possible.

The most important graph filter to apply is rotatable bond count. Although OMEGA may be able to generate conformers for molecules with more than 20 rotatable bonds, the results of such an exercise would be dubious at best for any conformer generation method. The number of rings, especially flexible rings, should also be used to exclude irrelevant molecules. OMEGA will handle molecules that have many thousands of possible ring combinations, although the time expenditure may be prohibitive and the results as questionable as those for molecules with an unreasonably large number of rotatable bonds. Simple element filters may be useful as well, although OMEGA will discard compounds for which no force field parameters exist. Applying element filters beforehand makes it easier to track the rationale for discarding compounds than searching through OMEGA log files for failure modes.

The filter program from OpenEye Scientific Software provides all of the functionality outlined above, and additional physical property filters. It is highly recommended that filter or a program similar in functionality be used for input file preparation of large datasets.
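The graph filters discussed in this section can be illustrated with a small pre-filter. This is a sketch only and does not reproduce the filter program: the `MoleculeSummary` record, the ring limit, and the allowed element set are all hypothetical, and the rotatable bond limit of 20 is taken from the discussion above as an illustrative threshold rather than a documented default.

```python
from dataclasses import dataclass

# Illustrative element set; NOT the actual force field coverage.
ALLOWED_ELEMENTS = frozenset({"H", "C", "N", "O", "F", "P", "S", "Cl", "Br", "I"})


@dataclass
class MoleculeSummary:
    """Hypothetical precomputed graph properties for one input molecule."""
    name: str
    rotatable_bonds: int
    ring_count: int
    elements: frozenset


def rejection_reason(mol, max_rotors=20, max_rings=10, allowed=ALLOWED_ELEMENTS):
    """Return None if the molecule passes, else a short reason string."""
    if mol.rotatable_bonds > max_rotors:
        return f"too many rotatable bonds ({mol.rotatable_bonds})"
    if mol.ring_count > max_rings:
        return f"too many rings ({mol.ring_count})"
    unsupported = mol.elements - allowed
    if unsupported:
        return "unsupported elements: " + ", ".join(sorted(unsupported))
    return None
```

Returning a reason string rather than a bare boolean keeps a record of why each compound was discarded, which is the tracking benefit mentioned above.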

Stereochemistry Enumeration

Compounds that contain unspecified or ambiguous definitions of stereochemistry may be preprocessed before generating conformers to add explicitly specified stereochemistry. Input molecules that have three dimensional coordinates inherently have stereochemistry specified, but SMILES or two dimensional SD files may have atoms (R/S) or bonds (E/Z) for which the stereochemistry is unknown or unspecified. OMEGA will generate structures for compounds of unknown configuration; however, in virtual screening exercises it may be beneficial to simply enumerate possible stereoisomers and treat each stereoisomer as a separate compound.

OMEGA distributions contain a utility called flipper that enumerates unspecified stereochemistry within user defined limits. For an explanation of how to use flipper, consult the flipper chapter in the OMEGA application manual. Stereochemistry enumeration is an exponential task. A molecule with N atoms or bonds that each have two possible stereochemical ‘states’ has \(2^{N}\) possible stereoisomers. Enumerating all possible stereoisomers may therefore be unreasonable in terms of CPU time or storage space. flipper has user defined limits to regulate the enumeration process and keep it practical for routine applications.
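The exponential growth, and the effect of capping the enumeration, can be shown with a short sketch. It does not reproduce flipper: each stereo center is reduced to a binary state, and the `max_isomers` parameter is a hypothetical stand-in for a user defined limit.

```python
from itertools import islice, product


def enumerate_stereo_states(n_centers, max_isomers=None):
    """List assignments of a binary state (0 or 1) to each unspecified center.

    Without a cap this yields 2**n_centers assignments; max_isomers truncates
    the enumeration so it stays practical for large n_centers.
    """
    states = product((0, 1), repeat=n_centers)
    if max_isomers is not None:
        states = islice(states, max_isomers)
    return list(states)
```

For example, three unspecified centers give eight assignments, while ten centers would give 1024 unless a cap such as `max_isomers=16` is applied.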

Enumerating stereoisomers provides an information gain that may aid in operations downstream from conformer generation. flipper, or a similar stereochemistry enumeration tool, should be used prior to generating conformers with OMEGA.

Fragment Library Generation

Prior to conformer generation, OMEGA builds a set of three dimensional models of a compound that contain the bond lengths, angles, and ring conformations that will be held fixed during the torsion search. OMEGA is capable of generating fragment templates on-the-fly; however, using pregenerated templates is far more efficient and speeds up conformer generation. OMEGA distributions include a program called makefraglib, which creates libraries of molecular fragments and ring conformers that OMEGA can use to build three dimensional models of molecules. For an explanation of how to use makefraglib, consult the Makefraglib chapter in the OMEGA application manual. Corporate and vendor databases can be used to construct an extensive fragment library. In practice, such a library needs to be constructed only once; after that it rarely needs to be updated and provides a significant enhancement in performance.

Macrocycle Conformations

Torsion driving conformational sampling methods, such as the one described in the OMEGA Theory section, often perform poorly for macrocyclic molecules because of the problem of ring closure after the torsion driving step. Conformational sampling of macrocycles therefore requires a different approach. For example, distance geometry (DG) [Crippen-1988], molecular dynamics (MD/LLMOD) [Watts-2014], MD with perturbation along low energy eigenvectors [Labute-2010], and inverse kinematics [Coutsias-2016] have all been applied to this problem.

OMEGA’s method of conformational sampling of macrocycles is an adaptation of the distance geometry method of Spellmeyer et al. [Spellmeyer-1997]. In this method a traditional embedding DG algorithm is replaced with a direct error function minimization of the random atomic coordinates, followed by force field refinement. Each initial Cartesian atomic coordinate x is assigned by choosing a random number r between -1 and 1 and multiplying it by a factor \(f\sqrt{N}\), which sets the maximum extent of the box for a molecule with N atoms:

\[x = f\sqrt{N}\,r \tag{1}\]
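Equation (1) can be illustrated with a few lines of code. This is a minimal sketch: the value of the factor f is not stated here, so the default of 1.0 is purely an assumption, and the function name and `seed` parameter are hypothetical.

```python
import math
import random


def random_coordinates(n_atoms, f=1.0, seed=None):
    """Assign each Cartesian coordinate as x = f * sqrt(N) * r, r in [-1, 1].

    f=1.0 is an assumed placeholder value, not a documented parameter.
    """
    rng = random.Random(seed)
    scale = f * math.sqrt(n_atoms)
    return [
        tuple(scale * rng.uniform(-1.0, 1.0) for _ in range(3))
        for _ in range(n_atoms)
    ]
```

The box grows with the square root of the atom count, so larger molecules start from a proportionally larger random cloud.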

Alternatively, instead of randomly placing atoms in Cartesian space, the method allows for random placement of rigid fragments such as aromatic rings, nitro groups, etc. After randomly placing the molecule's atoms or rigid fragments, an error function of the form:

\[F = \sum_{i,j} (d_{ij}-c_{ij})^2 + \sum_k V_k \tag{2}\]

is optimized. In the above equation the first sum runs over all pairs of atoms in the molecule, where \(d_{ij}\) are the interatomic distances and \(c_{ij}\) are the elements of the constraint matrix; the constraints are obtained from the MMFF94 force field parameters. When atoms (i,j) are bonded or are the first and last atoms in a bond angle, the upper and lower bounds are the same and are taken as the corresponding equilibrium force field distances. When atoms (i,j) are the first and last atoms of a torsion angle, the lower bound corresponds to the cis configuration and the upper bound to the trans configuration of those atoms. Finally, when atoms (i,j) are separated by more than three bonds, the lower bound is taken as the sum of the vdW radii and the upper bound as the sum of the bond lengths that separate the pair. The second summation in equation (2) is over the tetrahedral constraints which result from:

  1. planarity
  2. chirality
  3. cis-trans isomerism
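The distance part of equation (2) can be evaluated as sketched below. This is an illustration under simplifying assumptions rather than OMEGA's implementation: each constrained pair is given a single target value \(c_{ij}\) as the equation is written, the tetrahedral terms \(V_k\) are omitted, and the `targets` dictionary layout is hypothetical.

```python
import math


def distance_error(coords, targets):
    """Sum of (d_ij - c_ij)**2 over constrained atom pairs.

    coords:  list of (x, y, z) tuples, one per atom.
    targets: dict mapping (i, j) index pairs to the target distance c_ij.
    The tetrahedral terms V_k of equation (2) are not included here.
    """
    total = 0.0
    for (i, j), c_ij in targets.items():
        d_ij = math.dist(coords[i], coords[j])  # Euclidean distance
        total += (d_ij - c_ij) ** 2
    return total
```

The function is zero only when every constrained pair sits exactly at its target distance, which is what the minimization drives towards.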

Optimization of the error function (2) leads to a rough conformation. No hydrogen atoms except those which are bonded to chiral atoms are included in the error function minimization. Each rough conformation is checked for chirality correctness before refinement. If the rough conformation passes the chirality checks, it is refined against a force field, MMFF94 ([Halgren-1996-1], [Halgren-1996-2], [Halgren-1996-3], [Halgren-1996-4], [Halgren-1996-5]). Solvent forces can be included in the refinement step using a simple continuum solvation model (the Sheffield model [Grant-2007]). When the sequence of random placement, error function minimization, and force field refinement is repeated a sufficiently large number of times for a given macrocycle, its conformational space is reasonably well covered.
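The repeated sequence described above can be outlined as a simple driver loop. The sketch below only shows the control flow: the four callables are hypothetical placeholders for the real placement, minimization, chirality check, and force field refinement steps, and discarding rough conformations that fail the chirality check is an assumption.

```python
def sample_conformations(n_trials, place, minimize, chirality_ok, refine):
    """Repeat placement, error function minimization, and refinement.

    place():          returns random starting coordinates
    minimize(c):      returns a rough conformation from the error function
    chirality_ok(c):  True if the rough conformation has correct chirality
    refine(c):        returns the force field refined conformation
    """
    refined = []
    for _ in range(n_trials):
        rough = minimize(place())
        if not chirality_ok(rough):
            continue  # assumed behaviour: failed rough conformations are dropped
        refined.append(refine(rough))
    return refined
```

Each trial is independent of the others, so coverage of conformational space depends mainly on how many trials are run.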

This method of conformation sampling is completely general, and can be used to generate conformations for any molecule, whether or not it contains a large ring. However, the ‘macrocycle’ mode in OMEGA has been specifically developed and parameterised to perform well on macrocycles, so its performance on linear and small ring molecules is worse than that of ‘classic’ OMEGA.