Tautomers¶
When dealing with tautomers there are four primary tasks to be addressed. These include:
Generating a single representative tautomer suitable for molecular visualization
Generating a list of tautomers expected to be present in aqueous phase
Generating a unique (canonical) representation for a set of tautomers
Generating a large list covering most possible tautomers, suitable as input to further calculations
Each of these four tasks is covered in a section below. In every case, the tautomerization can be handled alone, but it is generally more useful to handle tautomerization in conjunction with ionization. Each function includes the potential for pKa normalization alongside the tautomer normalization or enumeration.
Visualization¶
While the unique (canonical) tautomer described in the Canonicalization section
is useful for storage in a database, it does not necessarily represent a
low-energy tautomer suitable for visualization by modelers and chemists. To
generate a structure suitable for visualization, we recommend the function
OEGetReasonableProtomer. This function sets the molecule into a low-energy,
neutral-pH, aqueous-phase tautomeric state that should be pleasing for
visualization in a scientific setting.
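A minimal sketch of this call from Python, assuming the licensed OpenEye toolkits are installed and importable as `openeye.oechem` and `openeye.oequacpac`. The helper name `reasonable_protomer_smiles` is hypothetical, and treating a false return value as failure is an assumption of this sketch.

```python
def reasonable_protomer_smiles(smiles: str) -> str:
    """Return SMILES for the tautomer chosen by OEGetReasonableProtomer."""
    # Imported lazily so this module still loads on machines where the
    # licensed OpenEye toolkits are not installed.
    from openeye import oechem, oequacpac

    mol = oechem.OEGraphMol()
    if not oechem.OESmilesToMol(mol, smiles):
        raise ValueError(f"could not parse SMILES: {smiles!r}")

    # The molecule is modified in place. This sketch assumes a false
    # return value signals that no reasonable protomer was set.
    if not oequacpac.OEGetReasonableProtomer(mol):
        raise RuntimeError(f"no reasonable protomer for {smiles!r}")

    return oechem.OEMolToSmiles(mol)
```

Working from SMILES keeps the example short; for depiction, the modified molecule object would normally be passed on to a drawing or file-writing routine instead of being converted back to a string.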
Reasonable Tautomer Ensemble¶
In the course of molecular modeling, it is often desirable to generate a small
ensemble of low-energy, neutral-pH, aqueous-phase tautomers. The function
OEGetReasonableTautomers returns such an ensemble. To generate low-energy
tautomers reliably, the function works with a form of the molecule that has
formal charges removed. By default, each tautomer’s ionization state is set to
a neutral-pH form, but ionization states are not enumerated.
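The ensemble call could be sketched as follows, again assuming the OpenEye Python toolkits are available. The helper name and the `pka_norm` keyword are hypothetical; the three-argument call (molecule, options, pKa normalization flag) reflects our understanding of the Python API and should be checked against the API reference for your toolkit version.

```python
from typing import List


def reasonable_tautomer_smiles(smiles: str, pka_norm: bool = True) -> List[str]:
    """Return SMILES for each tautomer from OEGetReasonableTautomers."""
    # Lazy import: requires the licensed OpenEye toolkits at call time only.
    from openeye import oechem, oequacpac

    mol = oechem.OEGraphMol()
    if not oechem.OESmilesToMol(mol, smiles):
        raise ValueError(f"could not parse SMILES: {smiles!r}")

    # Default options. pka_norm=True asks for a neutral-pH ionization
    # state on each tautomer; ionization states are not enumerated.
    options = oequacpac.OETautomerOptions()

    # Convert each tautomer to SMILES while iterating, rather than
    # holding on to molecule objects owned by the iterator.
    return [
        oechem.OEMolToSmiles(tautomer)
        for tautomer in oequacpac.OEGetReasonableTautomers(mol, options, pka_norm)
    ]
```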
The following depictions show some examples of tautomers that are favored as “reasonable”:
Canonicalization¶
The OEGetUniqueProtomer function is used for canonicalizing the tautomeric
forms of a small molecule. Canonicalization converts any of the tautomeric
forms of a given molecule into a single unique representation and removes all
formal charges that can be appropriately neutralized. This is useful for
database registration, where alternate representations of tautomeric compounds
often lead to duplicate entries in a database.
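As an illustration of the registration use case, the sketch below groups input structures by their canonical form so that tautomeric duplicates end up under one key. The helpers `openeye_unique_smiles` and `group_by_canonical_form` are hypothetical names. The canonicalizer is passed in as a parameter so that the grouping logic can be exercised without the toolkit, and treating a false return from OEGetUniqueProtomer as failure is an assumption.

```python
from typing import Callable, Dict, Iterable, List


def openeye_unique_smiles(smiles: str) -> str:
    """Canonical-tautomer SMILES via OEGetUniqueProtomer."""
    # Lazy import: requires the licensed OpenEye toolkits at call time only.
    from openeye import oechem, oequacpac

    mol = oechem.OEGraphMol()
    if not oechem.OESmilesToMol(mol, smiles):
        raise ValueError(f"could not parse SMILES: {smiles!r}")

    unique = oechem.OEGraphMol()
    # The canonical form is written to the first argument; the input
    # molecule is left unchanged.
    if not oequacpac.OEGetUniqueProtomer(unique, mol):
        raise RuntimeError(f"canonicalization failed for {smiles!r}")
    return oechem.OEMolToSmiles(unique)


def group_by_canonical_form(
    smiles_list: Iterable[str],
    canonicalize: Callable[[str], str] = openeye_unique_smiles,
) -> Dict[str, List[str]]:
    """Map each canonical key to the input structures that share it."""
    groups: Dict[str, List[str]] = {}
    for smiles in smiles_list:
        groups.setdefault(canonicalize(smiles), []).append(smiles)
    return groups
```

Any key with more than one entry marks inputs that would otherwise have been registered as separate compounds.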
It is important to remember that a time limit cannot be used to bound a canonicalization process, because the result would then depend on the hardware. Consequently, identifying a unique protomer can take significant time when the largest contiguous set of tautomeric atoms approaches or exceeds 30 atoms.
The tautomer returned by OEGetUniqueProtomer as the “canonical” representation
is often not the physiologically preferred form. If a representative form is
necessary, please use the function referred to in the Visualization section
above. If an ensemble of biologically relevant tautomers is necessary, please
see the function in the Reasonable Tautomer Ensemble section above.
OEGetUniqueProtomer is not a conformer generation function and will not create
coordinates for molecules that are read in with no coordinates. When used on
molecules with three-dimensional coordinates, OEGetUniqueProtomer attempts to
place hydrogens in a reasonable manner. However, OEGetUniqueProtomer does not
modify the heavy-atom coordinates of the molecule. In cases where the change in
tautomer state dictates a change in conformation, one will need to use a
conformer-generation tool (such as OMEGA) to generate reasonable conformations
for the output tautomers. We recommend that in the preparation of small
molecules for study, charge-state and tautomer enumeration be performed before
conformer generation.
Complete Enumeration¶
The OEEnumerateTautomers function is the core algorithm used to implement all
functions listed above. It is useful for enumerating the tautomeric forms of a
small molecule. Using the parameters in OETautomerOptions, a user can tune the
behavior of OEEnumerateTautomers to suit their particular application.
Before passing a molecule to OEEnumerateTautomers, it is recommended that,
first, any dative bonds be normalized to the hypervalent form using
OEHypervalentNormalization and, second, formal charges be removed using
OERemoveFormalCharge. These two steps improve the tautomers that are returned.
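The recommended order of operations might look like the sketch below, assuming the OpenEye Python toolkits are installed. The helper name `enumerate_tautomer_smiles` is hypothetical, the return values of the two normalization calls are deliberately ignored, and default options are used.

```python
from typing import List


def enumerate_tautomer_smiles(smiles: str) -> List[str]:
    """Normalize a molecule, then enumerate its tautomers as SMILES."""
    # Lazy import: requires the licensed OpenEye toolkits at call time only.
    from openeye import oechem, oequacpac

    mol = oechem.OEGraphMol()
    if not oechem.OESmilesToMol(mol, smiles):
        raise ValueError(f"could not parse SMILES: {smiles!r}")

    # Step 1: convert dative bonds to the hypervalent form.
    oequacpac.OEHypervalentNormalization(mol)
    # Step 2: remove formal charges that can be neutralized.
    oequacpac.OERemoveFormalCharge(mol)

    # Step 3: enumerate with default options, converting each result
    # to SMILES while iterating.
    options = oequacpac.OETautomerOptions()
    return [
        oechem.OEMolToSmiles(tautomer)
        for tautomer in oequacpac.OEEnumerateTautomers(mol, options)
    ]
```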
Tautomer generation is a combinatorial process, and the time and memory
requirements can grow quite rapidly. There are two mechanisms to help a user
control this growth. First, the number of tautomers generated and the number of
tautomers returned can be controlled with
OETautomerOptions::SetMaxTautomersGenerated and
OETautomerOptions::SetMaxTautomersToReturn, respectively. Please be aware that
if you allow only a few tautomers to be generated, it is possible that no
low-energy tautomers will be among them. Second, one can limit the time the
algorithm spends generating tautomers for each molecule using
OETautomerOptions::SetMaxSearchTime.
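The three limits can be set together in one place, as in the sketch below. The helper `apply_limits` is a hypothetical name, and both sanity checks are this sketch's own additions rather than rules enforced by the toolkit. The time value is passed through unchanged, so consult the API reference for its units.

```python
def apply_limits(options, max_generated: int, max_returned: int,
                 max_search_time: float):
    """Set the growth limits on an OETautomerOptions-like object."""
    # Sanity checks added by this sketch, not by the toolkit.
    if max_returned > max_generated:
        raise ValueError("cannot return more tautomers than are generated")
    if max_search_time <= 0:
        raise ValueError("search time must be positive")

    # Mechanism 1: cap how many tautomers are generated and returned.
    options.SetMaxTautomersGenerated(max_generated)
    options.SetMaxTautomersToReturn(max_returned)
    # Mechanism 2: cap the per-molecule search time.
    options.SetMaxSearchTime(max_search_time)
    return options
```

With the toolkits installed, the intended use would be along the lines of `options = apply_limits(oequacpac.OETautomerOptions(), 4096, 256, 5.0)`, with the result then passed to OEEnumerateTautomers. The numbers here are placeholders, not recommended values.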
OEEnumerateTautomers is not a conformer generation function and will not create
coordinates for molecules that are read in with no coordinates. When used on
molecules with three-dimensional coordinates, OEEnumerateTautomers attempts to
place hydrogens in a reasonable manner. However, OEEnumerateTautomers does not
modify the heavy-atom coordinates of the molecule. In cases where the change in
tautomer state dictates a change in conformation, one will need to use a
conformer-generation tool (such as OMEGA) to generate reasonable conformations
for the output tautomers. We recommend that in the preparation of small
molecules for study, charge-state and tautomer enumeration be performed before
conformer generation.