When dealing with tautomers there are four primary tasks to be addressed. These include:
- Generating a single representative tautomer suitable for molecular visualization
- Generating a list of tautomers expected to be present in aqueous phase
- Generating a unique (canonical) representation for a set of tautomers
- Generating a broad list of possible tautomers suitable as input to further calculations
Each of these four tasks is covered in a section below. In every case, the tautomerization can be handled alone, but it is generally more useful to handle tautomerization in conjunction with ionization. Each function includes the potential for pKa normalization alongside the tautomer normalization or enumeration.
Visualization
While a unique or canonical tautomer is useful for storage in a database, it does not necessarily represent a low-energy tautomer suitable for visualization by modelers and chemists. To generate a structure suitable for visualization, we recommend the function OEGetReasonableProtomer. This function sets the molecule into a low-energy, neutral-pH, aqueous-phase tautomeric state that should be pleasing for visualization in a scientific setting.
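A minimal Python sketch of this call, assuming the OpenEye Python toolkits (`openeye.oechem` and `openeye.oequacpac`) are installed and licensed. The helper name `reasonable_protomer_smiles` is hypothetical, and the single-argument, in-place form of `OEGetReasonableProtomer` is an assumption rather than something stated in this section.

```python
def reasonable_protomer_smiles(smiles):
    """Return SMILES for a display-friendly protomer of the input SMILES."""
    # Imported inside the function so the module can be loaded even where
    # the OpenEye toolkits are not available.
    from openeye import oechem, oequacpac

    mol = oechem.OEGraphMol()
    if not oechem.OESmilesToMol(mol, smiles):
        raise ValueError(f"could not parse SMILES: {smiles!r}")

    # Assumption: the function modifies the molecule in place.
    # Its return value is not inspected in this sketch.
    oequacpac.OEGetReasonableProtomer(mol)
    return oechem.OEMolToSmiles(mol)
```

The returned string is intended for depiction or inspection only; it is not a registration key.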
Reasonable Tautomer Ensemble
In the course of molecular modeling, it is often desirable to generate a small ensemble of low-energy, neutral-pH, aqueous-phase tautomers. The function OEGetReasonableTautomers returns such an ensemble. In order to generate low-energy tautomers reliably, the function works with a form of the molecule that has formal charges removed. By default, each tautomer’s ionization state is set to a neutral-pH form, but ionization states are not enumerated.
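The ensemble call can be sketched in Python as follows, assuming the OpenEye Python toolkits are installed and licensed. The helper name `reasonable_tautomer_smiles` is hypothetical, and the three-argument call (molecule, options, pKa-normalization flag) is an assumption about the signature rather than something given in this section.

```python
def reasonable_tautomer_smiles(smiles, pka_norm=True):
    """Return SMILES strings for a small ensemble of reasonable tautomers."""
    # Imported inside the function so the module can be loaded even where
    # the OpenEye toolkits are not available.
    from openeye import oechem, oequacpac

    mol = oechem.OEGraphMol()
    if not oechem.OESmilesToMol(mol, smiles):
        raise ValueError(f"could not parse SMILES: {smiles!r}")

    options = oequacpac.OETautomerOptions()  # default options
    # Convert each tautomer to SMILES as it is yielded, rather than
    # holding on to the molecule objects produced by the iterator.
    return [
        oechem.OEMolToSmiles(tautomer)
        for tautomer in oequacpac.OEGetReasonableTautomers(mol, options, pka_norm)
    ]
```

With `pka_norm=True`, each returned tautomer is expected to be set to a neutral-pH ionization state, matching the default behavior described above.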
The following depictions show some examples of tautomers that are favored as “reasonable”:
The OEGetUniqueProtomer function is used for canonicalizing the tautomeric forms of a small molecule. Canonicalization converts any of the tautomeric forms of a given molecule into a single unique representation and removes all formal charges that can be appropriately neutralized. This is useful for database registration, where alternate representations of tautomeric compounds often lead to duplicate entries in a database.
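As an illustration of the registration use case, the sketch below groups input SMILES by the SMILES of their unique protomer. It assumes the OpenEye Python toolkits are installed and licensed, and that `OEGetUniqueProtomer` accepts an output molecule followed by an input molecule; both helper names (`unique_protomer_key`, `group_duplicates`) are hypothetical.

```python
from collections import defaultdict


def unique_protomer_key(smiles):
    """Return the SMILES of the unique protomer, for use as a lookup key."""
    # Imported inside the function so the grouping helper below can be
    # used with another key function where the toolkits are unavailable.
    from openeye import oechem, oequacpac

    mol = oechem.OEGraphMol()
    if not oechem.OESmilesToMol(mol, smiles):
        raise ValueError(f"could not parse SMILES: {smiles!r}")

    canonical = oechem.OEGraphMol()
    # Assumption: (output, input) argument order. The return value is
    # not inspected in this sketch.
    oequacpac.OEGetUniqueProtomer(canonical, mol)
    return oechem.OEMolToSmiles(canonical)


def group_duplicates(smiles_list, key_fn=unique_protomer_key):
    """Map each canonical key to the input SMILES that share it."""
    groups = defaultdict(list)
    for smi in smiles_list:
        groups[key_fn(smi)].append(smi)
    return dict(groups)
```

Any group with more than one member marks inputs that would otherwise be registered as separate entries.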
It is important to remember that a time limit cannot be used to control a canonical process, as it might lead to hardware-dependent behavior. Thus the identification of a unique protomer can take significant time when the largest contiguous set of tautomeric atoms approaches or exceeds 30 atoms.
The tautomer returned by OEGetUniqueProtomer as the “canonical” representation often is not the physiologically preferred form. If a representative form is necessary, please use the functions referred to in the Visualization section above. If an ensemble of biologically relevant tautomers is necessary, please see the functions in the Reasonable Tautomer Ensemble section above.
OEGetUniqueProtomer is not a conformer generation function and will not create coordinates for molecules that are read in with no coordinates. When used on molecules with three-dimensional coordinates, OEGetUniqueProtomer attempts to place hydrogens in a reasonable manner. However, OEGetUniqueProtomer does not modify the heavy-atom coordinates of the molecule. In cases where the change in tautomer state dictates a change in conformation, one will need to use a conformer-generation tool (such as OMEGA) to generate reasonable conformations for the output from tautomers. We recommend that in the preparation of small molecules for study, charge-state and tautomer enumeration be performed before conformer generation.
The OEEnumerateTautomers function is the core algorithm used to implement all functions listed above. It is useful for enumerating the tautomeric forms of a small molecule. Using the parameters in OETautomerOptions, a user can tailor the behavior of OEEnumerateTautomers to their particular application.
We recommend two preparation steps before passing a molecule to OEEnumerateTautomers: first, normalize any dative bonds to the hypervalent form using OEHypervalentNormalization; second, remove formal charges using OERemoveFormalCharge. These two steps improve the tautomers that are returned.
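The two preparation steps followed by enumeration can be sketched in Python as follows, assuming the OpenEye Python toolkits are installed and licensed. The helper name `enumerate_tautomer_smiles` is hypothetical, and the two-argument call to `OEEnumerateTautomers` (molecule, options) is an assumption about the signature.

```python
def enumerate_tautomer_smiles(smiles, options=None):
    """Normalize a molecule, then return SMILES of its enumerated tautomers."""
    # Imported inside the function so the module can be loaded even where
    # the OpenEye toolkits are not available.
    from openeye import oechem, oequacpac

    mol = oechem.OEGraphMol()
    if not oechem.OESmilesToMol(mol, smiles):
        raise ValueError(f"could not parse SMILES: {smiles!r}")

    # Step 1: convert dative bonds to the hypervalent form.
    oequacpac.OEHypervalentNormalization(mol)
    # Step 2: remove formal charges.
    oequacpac.OERemoveFormalCharge(mol)

    if options is None:
        options = oequacpac.OETautomerOptions()  # default options

    # Convert each tautomer to SMILES as it is yielded, rather than
    # holding on to the molecule objects produced by the iterator.
    return [
        oechem.OEMolToSmiles(tautomer)
        for tautomer in oequacpac.OEEnumerateTautomers(mol, options)
    ]
```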
Tautomer generation is a combinatorial process, and the time and memory requirements can grow quite rapidly. There are two mechanisms to help a user control this growth. First, the number of tautomers generated and the number of tautomers returned can be controlled with OETautomerOptions::SetMaxTautomersGenerated and OETautomerOptions::SetMaxTautomersToReturn, respectively. Please be aware that if you require that few tautomers be generated, it is possible that no low-energy tautomers will be generated. Second, one can limit the time the algorithm spends generating tautomers for each molecule using OETautomerOptions::SetMaxSearchTime.
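A small helper can apply both kinds of limit in one place. The sketch below uses only the three setters named above; the helper name `apply_search_limits`, the sanity check, and the assumption that the search time is given in seconds are illustrative choices, not part of the toolkit.

```python
def apply_search_limits(options, max_generated, max_to_return, max_seconds):
    """Set count and time limits on an OETautomerOptions-like object."""
    # Sanity check added for this sketch: returning more tautomers than
    # are generated is not meaningful.
    if max_to_return > max_generated:
        raise ValueError("max_to_return cannot exceed max_generated")

    options.SetMaxTautomersGenerated(max_generated)
    options.SetMaxTautomersToReturn(max_to_return)
    # Assumption: the time limit is expressed in seconds per molecule.
    options.SetMaxSearchTime(max_seconds)
    return options
```

With the toolkit available, this could be called as `apply_search_limits(oequacpac.OETautomerOptions(), 4096, 256, 5.0)`; these values are placeholders, not recommended settings or toolkit defaults.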
OEEnumerateTautomers is not a conformer generation function and will not create coordinates for molecules that are read in with no coordinates. When used on molecules with three-dimensional coordinates, OEEnumerateTautomers attempts to place hydrogens in a reasonable manner. However, OEEnumerateTautomers does not modify the heavy-atom coordinates of the molecule. In cases where the change in tautomer state dictates a change in conformation, one will need to use a conformer-generation tool (such as OMEGA) to generate reasonable conformations for the output from tautomers. We recommend that in the preparation of small molecules for study, charge-state and tautomer enumeration be performed before conformer generation.