Theory

OMEGA Theory

OMEGA is a conformation generator for molecules. It is composed of two main components: model building of molecular fragments and torsion driving. Model generation may be bypassed by importing fragment structures from external sources ([Hawkins-2010], [Stahl-2001], [Stahl-2002]).

OMEGA builds initial models of structures by assembling fragment templates along sigma bonds. The graph of each input molecule is fragmented at exocyclic sigma bonds and at acyclic (but not exocyclic) carbon-to-heteroatom sigma bonds. Conformations for the fragments are either retrieved from pregenerated libraries built with MakeFraglib, or constructed on the fly using the same distance constraints and geometry optimization protocol that MakeFraglib uses. Molecule assembly is accomplished by simple vector alignment, since all inter-fragment joints are along sigma bonds.

Once an initial model of a structure is constructed, or given as input, OMEGA generates additional models by enumerating ring conformations and invertible nitrogen atoms. Ring conformations are taken from the same fragment library used to build an initial model. OMEGA detaches all exocyclic substituents from a ring system, then aligns and reattaches them relative to the new ring conformation. OMEGA attempts to generate every possible combination of ring conformations for a given structure.

The next step in model generation is to detect and enumerate invertible nitrogens. OMEGA considers a nitrogen invertible if it has pyramidal geometry, has no stereochemistry specified, carries no more than one hydrogen, is three valent, and has no more than three ring bonds. Invertible in this context simply means that at room temperature a pyramidal nitrogen is likely to interconvert rapidly (on an NMR timescale) between two puckered forms. All multiconformer ring models are further expanded by enumerating all possible nitrogen puckers. The resulting model set is the starting point for conformer search by torsion driving.
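The invertibility criteria can be written as a simple predicate. The following is a minimal sketch, not OMEGA's implementation: the `Nitrogen` record, its field names, and the 0/1 encoding of the two puckered forms are hypothetical stand-ins for whatever a real toolkit would report about an atom.

```python
from dataclasses import dataclass
from itertools import product


@dataclass
class Nitrogen:
    """Hypothetical per-atom record; field names are illustrative only."""
    pyramidal: bool
    has_stereo: bool
    hydrogen_count: int
    valence: int
    ring_bond_count: int


def is_invertible(n: Nitrogen) -> bool:
    # Mirrors the criteria listed in the text.
    return (
        n.pyramidal
        and not n.has_stereo
        and n.hydrogen_count <= 1
        and n.valence == 3
        and n.ring_bond_count <= 3
    )


def enumerate_puckers(nitrogens):
    """Return every combination of pucker states (0 or 1, an assumed
    encoding) for the invertible nitrogens in the list."""
    count = sum(1 for n in nitrogens if is_invertible(n))
    return list(product((0, 1), repeat=count))
```

Under these assumptions, a model with n invertible nitrogens expands into 2^n pucker combinations, which is why this enumeration multiplies the size of the model set passed on to torsion driving.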

OMEGA begins the torsion search by examining the molecular graph and determining which bonds may rotate freely. By default, OMEGA selects acyclic sigma bonds that have at least one non-hydrogen atom attached to each end of the bond. Hydrogen rotors (e.g. hydroxyl groups) are not altered during the torsion search by default; however, sampling them may be enabled as of version 2.5.1. The final ensemble selection is based on the RMS distance of heavy atoms and sampled hydrogen atoms; unsampled hydrogen atoms do not affect this RMS distance.

A list of possible dihedral angles is then assigned to each rotatable bond. The current mechanism for assignment is based on SMARTS matching, although alternate strategies for assigning angles based on experimental (e.g. X-ray) or theoretical (e.g. fragment optimization studies) data are possible. The molecular graph is then subjected to pattern and geometric symmetry detection. Common patterns such as para-disubstituted benzene are used to reduce the number of symmetry-equivalent dihedral angles that need to be searched. All torsions are altered by 120 and 180 degrees, and an RMS calculation that takes symmetry-equivalent atoms into account is performed in order to detect two- and three-fold symmetries.

An exhaustive depth-first torsion search is performed on each of the fragments, and the resulting conformers are placed into a list sorted by energy. Entire structures are assembled by combining the lowest energy set of fragments, and then the next lowest set, until the search is terminated. The search terminates when the limit on the total number of conformers that may be generated is exceeded, the fragment list is exhausted, or the sum of the fragment energies exceeds the energy window of the global minimum structure.

The best conformers identified in the torsion search are rank ordered by energy. A final ensemble is selected by sequentially testing the conformers using the RMS distance cutoff. To be accepted into the final ensemble, a conformer must have an RMS distance to every other member of the ensemble that exceeds the user-defined cutoff value. The final ensemble is populated up to the user-defined maximum ensemble size limit, or until the list of low energy conformers is exhausted.
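The final ensemble selection amounts to a greedy filter over the energy-sorted conformer list. The sketch below illustrates that idea only. It assumes each conformer is an `(energy, coords)` pair, where `coords` lists the positions of the atoms that take part in the comparison. The `rmsd` helper is a plain coordinate RMSD without superposition or symmetry handling, which is a deliberate simplification. All function and parameter names are hypothetical.

```python
import math


def rmsd(a, b):
    """Plain coordinate RMSD; no superposition or symmetry handling
    (a simplification made for this sketch)."""
    total = sum((p - q) ** 2 for u, v in zip(a, b) for p, q in zip(u, v))
    return math.sqrt(total / len(a))


def select_ensemble(conformers, rms_cutoff, max_confs):
    """Greedy selection from (energy, coords) pairs, lowest energy first."""
    ensemble = []
    for energy, coords in sorted(conformers, key=lambda c: c[0]):
        if len(ensemble) >= max_confs:
            break  # user-defined maximum ensemble size reached
        # Accept only if the candidate differs from every kept conformer.
        if all(rmsd(coords, kept) > rms_cutoff for _, kept in ensemble):
            ensemble.append((energy, coords))
    return ensemble
```

Because candidates are visited in order of increasing energy, the lowest energy conformer is always kept, and each later conformer survives only if it is geometrically distinct from everything already accepted.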

Note

Conformers generated by OMEGA with default options do not maintain the relative stereochemistry of the input, i.e. cis/trans for cyclohexane or other cyclic molecules.

Macrocycle Conformations

Torsion-driving conformational sampling methods, such as the one described in the OMEGA Theory section, often perform poorly for macrocyclic molecules because of the problem of ring closure after the torsion driving step. Conformational sampling of macrocycles therefore requires a different approach; for example, distance geometry (DG) [Crippen-1988], molecular dynamics (MD/LLMOD) [Watts-2014], MD with perturbation along low energy eigenvectors [Labute-2010], and inverse kinematics [Coutsias-2016] have all been applied to this problem.

OMEGA’s method of conformational sampling of macrocycles is an adaptation of the distance geometry method of Spellmeyer et al. [Spellmeyer-1997]. In this method a traditional embedding DG algorithm is replaced with a direct error function minimization of the random atomic coordinates, followed by force field refinement. Each initial Cartesian atomic coordinate x is assigned by choosing a random number r between -1 and 1 and multiplying it by a factor \(f\sqrt{N}\), which determines the maximum extent of the box for a molecule with N atoms:

(1)\[x = f\sqrt{N}r\]
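As a minimal illustration of equation (1), the sketch below draws every Cartesian component independently. The function name, the use of a uniform distribution for r, and the default value of `f` are assumptions; the text does not state the distribution or the value of the factor f.

```python
import math
import random


def random_coordinates(n_atoms, f=1.0, rng=None):
    """Return n_atoms random (x, y, z) tuples, each component being
    f * sqrt(N) * r with r drawn uniformly from [-1, 1].

    The default f=1.0 is a placeholder, not a documented value.
    """
    rng = rng or random.Random()
    scale = f * math.sqrt(n_atoms)
    return [
        tuple(scale * rng.uniform(-1.0, 1.0) for _ in range(3))
        for _ in range(n_atoms)
    ]
```

For example, with N = 16 and f = 0.5 every coordinate falls inside a cube of half-width 0.5 × 4 = 2, so the starting box grows with the square root of the atom count.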

Alternatively, instead of randomly placing atoms in Cartesian space, the method allows for random placement of rigid fragments such as aromatic rings, nitro groups, etc. After the molecule's atoms or rigid fragments have been randomly placed, an error function of the form:

(2)\[F = \sum_{i,j} (d_{ij}-c_{ij})^2 + \sum_k V_k\]

is optimized. In the above equation the first sum runs over all pairs of atoms in the molecule, where \(d_{ij}\) are the interatomic distances and \(c_{ij}\) are elements of the constraint matrix; the constraints are obtained from the MMFF94 force field parameters. When atoms (i,j) are bonded or are the first and last atoms in a bond angle, the upper and lower bounds are the same and are taken as the corresponding equilibrium force field distances. When atoms (i,j) are the first and last atoms of a torsion angle, the lower bound corresponds to the cis configuration and the upper bound to the trans configuration of those atoms. Finally, when atoms (i,j) are separated by more than three bonds, the lower bound is taken as the sum of the vdW radii and the upper bound as the sum of the bond lengths which separate the pair. The second summation in equation (2) is over the tetrahedral constraints which result from:

  1. planarity

  2. chirality

  3. cis-trans isomerism
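The distance term of equation (2) can be evaluated as in the sketch below. It covers the first sum only and omits the \(V_k\) terms. One point is an assumption rather than something the text states: where lower and upper bounds differ, \(c_{ij}\) is taken to be the nearest violated bound, so a distance inside its bounds contributes nothing. When the two bounds coincide this reduces to \((d_{ij}-c_{ij})^2\). The function name and the `bounds` dictionary layout are hypothetical.

```python
import math


def distance_error(coords, bounds):
    """Sum of squared bound violations over atom pairs.

    coords: list of (x, y, z) positions.
    bounds: {(i, j): (lower, upper)} distance bounds per atom pair.
    """
    total = 0.0
    for (i, j), (lower, upper) in bounds.items():
        d = math.dist(coords[i], coords[j])
        if d < lower:
            total += (d - lower) ** 2  # too close: measure against lower bound
        elif d > upper:
            total += (d - upper) ** 2  # too far: measure against upper bound
    return total
```

A function of this shape could be handed to any general-purpose minimizer; the text does not specify which optimizer OMEGA uses.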

Optimization of the error function (2) leads to a rough conformation. No hydrogen atoms except those which are bonded to chiral atoms are included in the error function minimization. Each rough conformation is checked for chirality correctness before refinement. If the rough conformation passes the chirality checks, it is refined against a force field, MMFF94 ([Halgren-I-1996], [Halgren-II-1996], [Halgren-III-1996], [Halgren-IV-1996], [Halgren-V-1996]). Solvent forces can be included in the refinement step using a simple continuum solvation model (the Sheffield model [Grant-2007]). When the sequence of random placement, error function minimization, and force field refinement is repeated a sufficiently large number of times for a given macrocycle, its conformational space is reasonably well covered.
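The text does not say how the chirality check is carried out. A common device in distance geometry is to compare the sign of the signed volume spanned by four atoms around a stereocenter with the expected handedness. The sketch below illustrates that general technique under this assumption and should not be read as OMEGA's actual check; both function names are hypothetical.

```python
def signed_volume(a, b, c, d):
    """Signed volume of the tetrahedron with vertices a, b, c, d,
    computed as (a - d) . ((b - d) x (c - d)) / 6."""
    u = [a[k] - d[k] for k in range(3)]
    v = [b[k] - d[k] for k in range(3)]
    w = [c[k] - d[k] for k in range(3)]
    cross = (
        v[1] * w[2] - v[2] * w[1],
        v[2] * w[0] - v[0] * w[2],
        v[0] * w[1] - v[1] * w[0],
    )
    return sum(u[k] * cross[k] for k in range(3)) / 6.0


def chirality_matches(neighbors, expected_sign):
    """True if the four ordered positions give the expected handedness
    (expected_sign is +1 or -1)."""
    return signed_volume(*neighbors) * expected_sign > 0
```

Swapping any two of the four positions flips the sign of the volume, which is what makes the quantity usable as a handedness test. A volume near zero indicates a nearly planar arrangement.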

Special measures are taken for zwitterionic molecules (containing both positively and negatively charged groups, e.g. \(CO_2^-\) and \(NH_3^+\)). In order to prevent possible Coulombic collapse in the absence of a stabilizing receptor, zwitterionic molecules are neutralized before performing the distance geometry calculation and the refined structures are recharged before writing them out. This approach meaningfully improves OMEGA’s ability to reproduce solid-state conformations of zwitterionic ligands.

This method of conformation sampling is completely general, and can be used to generate conformations for any molecule, whether or not it contains a large ring. However, the ‘macrocycle’ mode in OMEGA has been specifically developed and parameterized to perform well on macrocycles, so its performance on linear and small ring molecules is worse than that of ‘classic’ OMEGA.