OEChem’s PDB residue perception code now follows the changes required
by the PDB version 3.0 standard. This includes disambiguation of the DNA
residues “DA”, “DC” and “DG” from the RNA residues “D”, “C” and “G”. Nucleic
acid backbone atom names now end in an apostrophe instead of a star/asterisk.
The default ligand name in OEChem is now “UNL” instead of “MOL” or “LIG”.
There have been significant improvements to OEChem’s bond order
perception code including phosphates, thiophosphates, dithioic acids,
oximes, aldoximes, sulfur oxides, sulfites and iron-sulfur clusters.
The cis/trans detection logic in OE3DToBondStereo has been made more
robust for 2D depictions containing bonds that are collinear with a
chiral double bond or have zero length. The code now continues searching
for additional incident bonds that have non-zero length and aren’t
collinear.
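The search described above can be pictured with a small sketch. This is not OEChem's implementation: the function name `pick_reference_bond`, the vector representation, and the tolerance `eps` are all assumptions made for illustration. It skips incident bonds that have zero length or are collinear with the double bond and returns the first usable one.

```python
import math


def pick_reference_bond(double_bond, incident_bonds, eps=1e-6):
    """Return the index of the first usable incident bond, or None.

    double_bond and each incident bond are 2D vectors (dx, dy).
    A bond is skipped if it has zero length or is collinear with
    the double bond. Hypothetical helper, not an OEChem function.
    """
    dx, dy = double_bond
    double_len = math.hypot(dx, dy)
    if double_len < eps:
        return None  # a zero-length double bond defines no direction
    for i, (bx, by) in enumerate(incident_bonds):
        length = math.hypot(bx, by)
        if length < eps:
            continue  # zero-length bond: keep searching
        # Normalized 2D cross product equals the sine of the angle
        # between the two vectors; near zero means collinear.
        sine = (dx * by - dy * bx) / (double_len * length)
        if abs(sine) < eps:
            continue  # collinear bond: keep searching
        return i
    return None
```

With a double bond along the x axis, a zero-length bond and a collinear bond are both passed over, so `pick_reference_bond((1, 0), [(0, 0), (2, 0), (1, 1)])` selects the third bond.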
Significant improvements have been made to OEChem’s DNA/RNA perception
code. The code now handles truncated RNA biopolymers, and
recognizes the bases “1MA”, “2MC”, “5MC”, “5MU”, “7MG”, “M2G”, “OMG”, “OMC”,
OEChem’s residue perception code now recognizes a much larger
set of common co-factors and ligands, including “ADP”, “ATP”, “DMS”, “EDO”,
“FAD”, “HEM”, “NAD”, “NAG”, “PEO”, etc.
OEChem’s residue perception code now handles the non-standard amino
acid ornithine (ORN).
The OpenEye formal charge model has been extended such that a
four-valent aluminum (aluminium) now has an implicit negative charge,
and aluminum ions have a +3 formal charge. The model has also been
tweaked to consider the sulfurs in iron-sulfur clusters as neutral
The OpenEye hydrogen count model has been tweaked to prefer
five-valent phosphorus, such as O=[PH2]O over three-valent
The values returned by OEGetAverageWeight have been updated
and revised to follow the latest (2007) recommendations of the IUPAC
Commission on Isotopic Abundances and Atomic Weights.
The OEParseSmarts and OEParseSmirks
functions have been enhanced to allow a TAB character \t to be treated
as a separator after a SMARTS pattern. This matches the behavior of the
SMILES parser, OEParseSmiles, and simplifies the task of writing
The OEChem SMILES and SMARTS parsers have been tweaked to allow
the backslash used in specifying cis/trans stereochemistry to be
duplicated in the input string, i.e. C\\C=C\\C is now interpreted as C\C=C\C. This is
convenient when working with programming languages such as C and C++
where the backslash is used as an escape character. Embedding SMILES in
C/C++ source files requires that the strings look like
C\\C=C\\C, which previously couldn’t be cut and pasted like regular SMILES strings.
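As a rough illustration of the equivalence, the sketch below collapses doubled backslashes the way the note says the parsers now interpret them. The helper name `collapse_doubled_backslashes` is hypothetical; OEChem performs this inside its own parsers, and this is only a stand-in showing the before and after strings.

```python
def collapse_doubled_backslashes(smiles: str) -> str:
    """Treat each doubled backslash as a single cis/trans bond symbol.

    Hypothetical helper for illustration only; it mimics the described
    behavior and is not part of the OEChem API.
    """
    return smiles.replace("\\\\", "\\")


# Raw strings keep the backslashes exactly as typed.
pasted_from_c_source = r"C\\C=C\\C"
print(collapse_doubled_backslashes(pasted_from_c_source))
```

A string that already uses single backslashes passes through unchanged.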
The interpretation of acyclic aromatic elements by OEChem’s SMILES
reader now more accurately follows the Daylight toolkit. For example,
n is interpreted as [NH2] and not [N], and nn now means N=N
instead of [N]=[N].
Fixed an obscure corner case in the OEChem SMILES writer, when
not performing aromaticity perception and using the low-level SMILES
writer. We need to preserve the explicit single bond (hyphen/minus) in
[cH2]-[cH2] otherwise [cH2][cH2] would get
interpreted like cc and result in c=c and [cH2]=[cH2].
The MDL file format reader has been enhanced to allow TABs in addition
to spaces as separators in MCHG, MRAD and MISO lines.
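The sketch below shows why accepting TABs is straightforward once fields are split on any whitespace. It is a simplification under stated assumptions: real MDL property lines are fixed-width, the function name `parse_charge_line` is hypothetical, and only the charge line (written `M  CHG` in MDL files) is handled.

```python
def parse_charge_line(line: str):
    """Parse an 'M  CHG' property line into (atom_index, charge) pairs.

    Simplified sketch: fields are split on any whitespace, so spaces
    and TABs are treated alike. Not the OEChem reader.
    """
    fields = line.split()
    if fields[:2] != ["M", "CHG"]:
        raise ValueError("not an M  CHG line")
    count = int(fields[2])  # number of (atom, charge) pairs that follow
    values = [int(f) for f in fields[3:3 + 2 * count]]
    return list(zip(values[0::2], values[1::2]))
```

Both `"M  CHG  2   1   1   4  -1"` and its TAB-separated equivalent yield the same two pairs.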
The Sybyl .mol2 file format reader has been enhanced to recognize
pyrylium-like ring systems, containing charged oxygen atoms. A minor
bug has also been fixed that could assign inappropriate formal charges
to substituted nitrates.
In Sybyl .mol2 format files, the atom types d and t are now
treated identically to D and T and interpreted as Deuterium and
Tritium respectively. Previously, they’d be interpreted as hydrogen
atoms, but the isotope specification wasn’t getting set.
The Tripos bond types in Sybyl .mol2 format files are now treated
as case-insensitive. We now treat AR and Ar as identical to ar,
and AM and Am as identical to am, etc.
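A minimal sketch of case-insensitive bond type handling follows. The function name `normalize_bond_type` is hypothetical, and the set of accepted tokens is the usual Tripos list (1, 2, 3, am, ar, du, un, nc) rather than anything taken from OEChem's source.

```python
# Bond type tokens defined by the Tripos mol2 format.
TRIPOS_BOND_TYPES = {"1", "2", "3", "am", "ar", "du", "un", "nc"}


def normalize_bond_type(token: str) -> str:
    """Fold a mol2 bond type token to lower case and validate it.

    Hypothetical helper for illustration; not the OEChem reader.
    """
    folded = token.strip().lower()
    if folded not in TRIPOS_BOND_TYPES:
        raise ValueError(f"unknown bond type: {token!r}")
    return folded
```

Under this scheme `AR`, `Ar` and `ar` all normalize to `ar`.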
The CambridgeSoft CDX file format reader has been significantly
rewritten to address bugs in the reading/writing of 3D coordinates.
The OpenEye OEB file format reader is now more robust to invalid,
corrupt and/or truncated input files.
If SD data was attached to an OEMCMolBase and an
OEConfBase, then written to OEB and read back into an
OEMolBase, the data from the OEConfBase
would appear to be missing. That data would then be lost if the molecule was written to SDF.
The OEReadPDBFile and OEWritePDBFile functions can
read and write ANISOU records, respectively. ANISOU
records, an atom property representing anisotropic
temperature factors in PDB files, are scaled by a factor of \(10^4\) and
represented as integers.
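The scaling can be made concrete with a short sketch. The helper names are hypothetical and the example values are made up for illustration; the only fact relied on is the factor of \(10^4\) between the stored integers and the underlying temperature factors (given in Å² in the PDB format).

```python
ANISOU_SCALE = 10_000  # stored integer = temperature factor * 10^4


def anisou_to_float(raw):
    """Convert the six stored ANISOU integers to unscaled values."""
    return tuple(value / ANISOU_SCALE for value in raw)


def float_to_anisou(values):
    """Convert unscaled temperature factors back to stored integers."""
    return tuple(round(value * ANISOU_SCALE) for value in values)


# Illustrative record values (U11, U22, U33, U12, U13, U23).
stored = (2406, 1892, 1614, 198, 519, -328)
unscaled = anisou_to_float(stored)
assert float_to_anisou(unscaled) == stored  # round trip is lossless
```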
The OEGetCenterOfMass function, which computes the center of
mass of a molecule (with or without atomic weights), was added to
the OEChem namespace.
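For readers who want the arithmetic spelled out, here is a plain-Python sketch of a center of mass computed with or without weights. It does not call OEGetCenterOfMass and makes no claim about that function's signature; the name `center_of_mass` and its arguments are assumptions for illustration.

```python
def center_of_mass(coords, weights=None):
    """Return the (x, y, z) center of a set of points.

    coords  -- sequence of (x, y, z) tuples
    weights -- optional per-point weights (e.g. atomic weights);
               when omitted every point counts equally.
    """
    if weights is None:
        weights = [1.0] * len(coords)
    total = sum(weights)
    if total == 0:
        raise ValueError("total weight is zero")
    return tuple(
        sum(w * point[axis] for w, point in zip(weights, coords)) / total
        for axis in range(3)
    )
```

Without weights, two points at x = 0 and x = 2 give a center at x = 1; weighting the second point three times as heavily moves it to x = 1.5.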
The algorithm that generates canonical SMILES did not ignore
cis/trans stereo hydrogens and produced [H]N=CC, rather
than the correct N=CC canonical SMILES.
Even though this bug fix has affected only a small
percentage of canonical SMILES, we highly recommend the
regeneration of all canonical SMILES.
Small improvements have been made to the generation of canonical
A problem has been fixed in OEChem’s Kekulization algorithms for
large molecules (with between 250 and 1000 atoms) that can’t be
assigned a valid Kekulé form. The changes in OEChem 1.5.0 that
attempted to assign as much of a Kekulé form as possible upon
failure could occasionally lead to OEKekulize
returning true for an invalid molecule.
A performance problem in OEChem’s aromaticity perception has been
resolved. Previously, pathological substituted fullerenes and PAHs could
cause OEChem’s aromaticity routines to take over a minute to perceive all
of the conjugated cycles. Algorithmic improvements to OEChem’s aromaticity
perception now allow all of the reported cases to be processed in a fraction
of a second.
A rare problem interpreting the stereo from wedge/hash bonds around
atoms of degree three has been resolved. When we have two bonds in the
plane, and the third marked as a wedge or a hash, we need to determine
whether the raised/lowered bond is in the larger or smaller sector
subtended by the two in-plane bonds. A bug in this code failed to handle
the case when all three bonds lay in the same half-circle. This problem
is extremely rare; for example, no cases were found in the 250,251 MDL
connection tables distributed by the NCI as the NCI August 2000 database.
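The half-circle condition can be tested geometrically. The sketch below is one way to detect it, not the code OEChem uses: it sorts the bond directions by angle and checks whether the largest gap between neighboring directions is at least 180 degrees. The function name and tolerance are assumptions.

```python
import math


def all_in_half_circle(bond_vectors, eps=1e-9):
    """Return True if all 2D bond vectors fit within one half-circle.

    The vectors fit in a half-circle exactly when the largest angular
    gap between consecutive directions is at least pi radians.
    """
    angles = sorted(math.atan2(y, x) % (2 * math.pi) for x, y in bond_vectors)
    gaps = [later - earlier for earlier, later in zip(angles, angles[1:])]
    gaps.append(angles[0] + 2 * math.pi - angles[-1])  # wrap-around gap
    return max(gaps) >= math.pi - eps
```

Three bonds pointing at 0, 45 and 90 degrees satisfy the condition, whereas three bonds spaced 120 degrees apart do not.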
The OEChem MDL mol file reader has been improved to allow the
dimension field in the connection table header line to be omitted,
and still correctly decide whether to process wedge/hash bonds or
determine chirality from 3D coordinates. Previously, the molecule’s
stereochemistry would be set incorrectly if the optional header line
The MDL file reader now perceives aromatic cycles using the MDL
aromaticity model prior to calling OEPerceiveChiral. This
ensures that alternate Kekulé forms of substituted phenyl rings (for
example) don’t inappropriately split symmetry groups, causing achiral
double bonds to acquire specified cis/trans stereochemistry.
An aesthetic improvement has been made to the rules used in
the OEMDLPerceiveBondStereo function that assigns wedge and
hash bonds to depictions. For acyclic bonds, we now prefer to place
the wedge or hash on bonds to non-ring atoms. A typo in the previous
rules reversed this priority.
The OEChem SMILES writer was being miscompiled on IBM AIX 5.x
resulting in canonical SMILES that differed from those on other platforms.
The code has been rewritten to avoid the issue in IBM’s xlC compiler,
so the SMILES are once again identical to the other platforms.
The OEChem PDB file parser has been updated to reflect the latest
atom name exceptions in the RCSB/wwPDB database. These changes should
eliminate the spurious Holmium and Helium atoms perceived in recently
added ligand residues.
Numerous small performance improvements have been made to OEChem.
The torsion cutoff values for perceiving cis/trans
bond stereo from 3D have been relaxed in the OE3DToBondStereo function.
The cis cutoff has been increased from 15 to 30, and the trans cutoff
lowered from 165 to 150.
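To show how such cutoffs act on a torsion, the sketch below computes a torsion angle from four 3D points and classifies it against the new values of 30 and 150. Several things here are assumptions rather than documented behavior: that the cutoffs are in degrees, that the boundaries are inclusive, and the function names themselves. It is not the OE3DToBondStereo implementation.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def torsion_degrees(p0, p1, p2, p3):
    """Signed torsion angle p0-p1-p2-p3 in degrees."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)
    b2_len = math.sqrt(_dot(b2, b2))
    m1 = _cross(n1, tuple(c / b2_len for c in b2))
    return math.degrees(math.atan2(_dot(m1, n2), _dot(n1, n2)))


def classify_stereo(torsion, cis_cutoff=30.0, trans_cutoff=150.0):
    """Return 'cis', 'trans', or None when the torsion is ambiguous."""
    magnitude = abs(torsion)
    if magnitude <= cis_cutoff:
        return "cis"
    if magnitude >= trans_cutoff:
        return "trans"
    return None
```

Under these assumptions a torsion of 20 is classified as cis, whereas with the old cutoff of 15 it would have been left undefined.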
Even when the maximum number of matches is set, the MCS search
cannot be terminated upon reaching this limit, since there is no
guarantee that the maximum common substructure has been detected.
Instead, the search continues, then the best N matches are returned,
where N is set by OEMCSSearch.SetMaxMatches.
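The reasoning above, search everything and then keep the best N, can be sketched without any toolkit calls. In this illustration the candidate matches are plain strings and the score is simply their length; both are stand-ins, and `best_matches` is a hypothetical helper rather than part of OEMCSSearch.

```python
import heapq


def best_matches(all_matches, max_matches, score):
    """Return the max_matches highest-scoring matches.

    Every candidate is examined before any are returned, because the
    largest common substructure may be found late in the search.
    """
    return heapq.nlargest(max_matches, all_matches, key=score)


# Candidates in the order a search might produce them (illustrative).
candidates = ["CC", "CCCO", "C", "CCC"]

stopped_early = candidates[:2]                  # stop at the limit
kept_best = best_matches(candidates, 2, len)    # finish, then rank

print(stopped_early)
print(kept_best)
```

Stopping at the limit would keep the first two candidates found, while finishing the search keeps the two largest.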
The exhaustive and the approximate MCS algorithms no longer use different
functions to determine whether a match is unique or not. Several other small
problems were fixed in order to ensure that all matches located by the
approximate method are also detected by the exhaustive one.
A rare problem occurred in the substructure search when hydrogen
atoms were matched first. This problem has been solved by
rearranging the order in which atoms are taken into consideration,
moving hydrogens to the end of the match order. Other small
modifications have been made to improve the performance of
OEDeleteSDData was improperly documented in the theory
The old documentation stated that only the first instance of
a tag was deleted when all instances of the tag were actually
deleted. The documentation has been corrected to state that all
instances of a tag are deleted.
Maximum Common Substructure Search has been revised,
adding new examples, explaining the difference between the exhaustive
and the approximate methods and providing more details about the built-in
MCS scoring functions.
Figures have been added to OEExprOpts Namespace in order to
demonstrate the effect of various atom and bond expression options
on pattern matching.
C++, Python, and Java manuals brought into closer alignment with
The memory allocation performance of multi-threaded OEChem
applications on both Windows and recent Linux/UNIX distributions
(that use pthreads) has been dramatically improved. A new thread
hashing algorithm is now used in OESystem’s memory pooling code
which should dramatically reduce contention in allocation heavy
A number of minor performance and numerical stability improvements
have been made to OEMath’s geometry routines.
When parsing the command line, --help-foo is no
longer sensitive to the case of -foo.