Version 1.3.3
OEChem 1.3.3
New Features
Several enhancements have been made to the protein perception algorithms used in OEPerceiveResidues. These allow OEChem to recognize the N-terminal capping group ACE, and the nonstandard amino acid residues ABA, CGU, CME, CSD, MLY, MSE, PCA, PTR, SEP and TPO. Support for these additional amino acid types has also been added to OEGetResidueIndex and friends. The sidechain pattern matching algorithm now has improved fallback functionality for better handling of modified/substituted residues.
Improved support for aromatic boron and aromatic silicon in OEKekulize. The OEChem toolkit currently doesn’t perceive either boron or silicon to be aromatic (with any aromaticity model), but this enhancement allows us to Kekulize structures so specified.
Added improved support for parsing SMILES containing aromatic boron and aromatic silicon, allowing the OEChem toolkit to parse b1ccccc1 (borinine).
A new OEGetDelphiRadius function has been added to OEChem to return the default radius for a given element used by Accelrys’ Delphi program for electrostatics calculations.
A new function OEGetAminoAcidCode can be used to convert an index from the OEResidueIndex namespace to an IUBMB single-character code (A for alanine, R for arginine, etc.).
Several new convenience functions, OEAssignCovalentRadii, OEAssignDelphiRadii, OEAssignBondiVdWRadii, OEAssignPaulingVdWRadii and OEAssignHonigIonicCavityRadii, are now provided to set the radius property on each atom of a molecule to the value specified by the corresponding OEGet...Radius function.
A new function OEIsBinary is provided to determine whether the specified file format is binary or not, for example .oeb, .bin and .cdx.
The new function OEGetFormatExtension can be used to return a comma-separated list of lowercase file format extensions, which can aid in implementing directory scans and file format dialog boxes.
A new OEMCSFunc functor, OEMCSMaxBondsCompleteCycles, can be used as an objective function to OEChem’s maximal common subgraph matching algorithms.
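As an illustration of the directory-scan use case mentioned for OEGetFormatExtension, the following is a minimal sketch in plain Python. It does not call OEChem: the helper names `parse_extension_list` and `filter_by_extension` are hypothetical, and the string `"oeb,bin"` is only an assumed example of a comma-separated, lowercase extension list.

```python
import os


def parse_extension_list(ext_list):
    """Turn a comma-separated extension list such as "oeb,bin" into a set
    of normalized suffixes such as {".oeb", ".bin"}."""
    return {
        "." + ext.strip().lower().lstrip(".")
        for ext in ext_list.split(",")
        if ext.strip()
    }


def filter_by_extension(filenames, ext_list):
    """Keep only the file names whose final suffix appears in ext_list.

    The comparison is case-insensitive; compound suffixes such as
    ".oeb.gz" are not handled by this sketch.
    """
    wanted = parse_extension_list(ext_list)
    return [
        name for name in filenames
        if os.path.splitext(name)[1].lower() in wanted
    ]


if __name__ == "__main__":
    names = ["ligands.oeb", "notes.txt", "OLD.BIN", "README"]
    print(filter_by_extension(names, "oeb,bin"))
```

In a real directory scan the list of names would typically come from `os.listdir` and the extension list from the toolkit.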
Major bug fixes
A problem in OEChem’s graph canonicalization algorithm was identified by the NCBI’s PubChem project for the single molecule C12C3C4C3C5C4C1C25. This problem has been fixed in OEChem 1.3.3. Unfortunately, this failure didn’t show up in our testing of 100 random permutations of a 2.5 million compound test set. Efforts are now ongoing to validate OpenEye’s canonicalization against all theoretical connection tables with fewer than \(N\) atoms, for some \(N > 10\).
A bug in the OEB file format readers and writers that could cause the titles and/or comments attached to molecules or conformers to be lost has been corrected.
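The random-permutation testing described for the canonicalization fix can be sketched as follows. This is not OEChem's algorithm: `canonical_form` is a deliberately naive brute-force stand-in that is only practical for very small graphs, and all function names here are hypothetical. The point is the shape of the test: relabel the atoms at random and check that the canonical form does not change.

```python
import itertools
import random


def canonical_form(n_atoms, bonds):
    """Brute-force canonical form of an undirected graph: the
    lexicographically smallest sorted bond list over all relabelings.
    Exponential in n_atoms, so only usable for tiny graphs."""
    best = None
    for perm in itertools.permutations(range(n_atoms)):
        relabeled = sorted(
            tuple(sorted((perm[a], perm[b]))) for a, b in bonds
        )
        if best is None or relabeled < best:
            best = relabeled
    return best


def random_relabeling(n_atoms, bonds, rng):
    """Return the same graph with its atom indices randomly permuted."""
    order = list(range(n_atoms))
    rng.shuffle(order)
    return [(order[a], order[b]) for a, b in bonds]


def check_invariance(n_atoms, bonds, trials=20, seed=0):
    """True if the canonical form is unchanged by random relabelings."""
    rng = random.Random(seed)
    reference = canonical_form(n_atoms, bonds)
    return all(
        canonical_form(n_atoms, random_relabeling(n_atoms, bonds, rng))
        == reference
        for _ in range(trials)
    )


if __name__ == "__main__":
    # A five-membered ring with one substituent (six atoms).
    ring = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0), (0, 5)]
    print(check_invariance(6, ring))
```

As the release note observes, random sampling of this kind can miss rare failures, which is the motivation for exhaustive validation over small connection tables.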
Minor bug fixes
Fixed a bug in the OEChem SMARTS parser that failed to follow the Daylight semantics for patterns such as [H], [2H] and [H+], where the H specifies that the pattern must match a hydrogen atom, and not the expected hydrogen count on an atom.
The OEChem SMILES writers have been modified to prevent them generating atoms such as [C@H2] or [C@@H2] for centers that have stereo explicitly specified (on non-chiral centers) with explicit hydrogens, when the hydrogens are being automatically suppressed by the output SMILES flavor.
The methods OEAtomBase::SetStereo, OEAtomBase::GetStereo, OEBondBase::SetStereo and OEBondBase::GetStereo have been enhanced such that the internal representation of stereochemistry is invariant to hydrogen suppression. The functions OESuppressHydrogens and OEAddExplicitHydrogens no longer invalidate stereochemistry.
The old-style OE binary, .bin, file format reader now automatically sets the dimension property of molecules and conformers to 3. Whilst new-style OE binary, .oeb, files explicitly record the dimensionality of the stored coordinates, the old format didn’t, and its contents should be assumed to be 3-dimensional.
Corrected a minor logic problem in OEQMolBase::BuildExpressions when constructing the expressions to match bond orders but not aromaticity.
Fixed a problem in the SMILES parser, which would cause a segmentation fault if ever a SMILES string longer than 4096 characters encountered a syntax or Kekulization error. We no longer try to report the location of the syntax error for SMILES strings longer than 2048 characters.
A bug in OEPerceiveBondOrders that assumed/required that the incoming molecule not have any aromaticity specified has been fixed by calling OEClearAromaticFlags on the incoming molecule. This assumption was valid for its existing use by the high-level file format readers, but meant that calling OEPerceiveBondOrders twice in a row could sometimes produce different results.
Fixed a potential problem in several file format readers that caused a run-time abort in Microsoft’s runtime libraries on Windows when reading corrupt or binary files. The Microsoft implementation of the standard <ctype.h> functions, such as isdigit and isupper, will abort when passed negative values, such as when interpreting the bytes of a file as (signed) char.
Fixed a segmentation fault in OEScrambleMolecule that was triggered by chiral molecules.
Fixed a bug in OEMDLCorrectBondStereo that could cause that routine to crash if the chiral atom on which the stereochemistry needed to be corrected was degree three instead of degree four. This routine has been made more robust, and can now correct wedges and hashes around degree three atoms that conflict with the specified MDL parity bit.
The OEChem MDL mol file reader has been made more robust by checking for negative values in the atom count, bond count and list count fields. These are now interpreted as being zero. Corrupted SD files could previously cause OEChem to crash.
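The defensive handling of the MDL count fields can be illustrated with a small stand-alone sketch. It is not OEChem's reader: the function names are hypothetical, and it assumes the V2000 counts-line layout in which the atom count, bond count and atom-list count occupy the first three 3-character columns. Treating non-numeric fields as zero is an additional assumption of this sketch; the release note only states that negative values are read as zero.

```python
def parse_count_field(text):
    """Parse one 3-character count field, clamping bad values to zero."""
    try:
        value = int(text)
    except ValueError:
        # Assumption of this sketch: blank or non-numeric fields count as 0.
        return 0
    # Negative counts are treated as zero rather than trusted.
    return max(value, 0)


def parse_counts_line(line):
    """Return (atom_count, bond_count, list_count) from a counts line."""
    padded = line.ljust(9)  # tolerate truncated lines
    atoms = parse_count_field(padded[0:3])
    bonds = parse_count_field(padded[3:6])
    lists = parse_count_field(padded[6:9])
    return atoms, bonds, lists


if __name__ == "__main__":
    print(parse_counts_line("  6  6  0  0  0  0            0999 V2000"))
    print(parse_counts_line(" -1  5 -3"))
```

Clamping at the parsing boundary means later loops that allocate or iterate over atoms and bonds never see a negative size.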
Calling close on an oemolistream that wraps OEPlatform::oein will now correctly make oemolistream::operator bool return false, and stop it reading (even though OEPlatform::oein itself shouldn’t be closed).
The OEChem SMILES parser function, OEParseSmiles, has been fixed to set the default bond order of unspecified external bonds, i.e. C&1, to be single. Previously these were left initialized as bond order zero, although C&=1 and C&#1 were correctly handled as double and triple bonds respectively.
The function OEPDBOrderAtoms has been improved to only compare atom names for recognized residues when sorting. This prevents atoms being needlessly reordered.
OEPerceiveResidues has been improved to assign unique atom names to every atom within an unknown or unrecognized residue. Previously, all six atoms in benzene would be given the same atom name C, which confuses software that assumes PDB atom names are unique within a residue. OEChem now assigns C1, C2, etc.
Added goof-proofing to return early from calls to OEInvertCenter where the specified atom is not trivially invertible (i.e. a center with 3 or more ring bonds).
Improved handling of the hydrogen isotopes D and T when reading MDL connection tables. These symbols now automatically set the isotope field appropriately. Previous versions of OEChem interpreted these symbols as forms of hydrogen, but relied on the MDL mass field or M ISO line being correctly set to specify which isotope.
A very minor bug in OEPerceiveResidues has been fixed that prevented residue information from being assigned to lone protons. The algorithm previously assumed all hydrogens were bonded to a heavy atom parent.
In OESubsetMol the dummy atoms used to represent attachment points are now assigned map indices starting from one, i.e. R1, R2, R3, instead of from zero, i.e., R1, R2.
OESubsetMol now attempts to preserve or undefine the specified stereochemistry at atoms and bonds affected by attachment points.
The performance of OEDetermineConnectivity has been dramatically improved for very large molecules. This speeds up the reading of proteins like pdb1jj2.ent (which contains 98,543 atoms) several fold.
Replaced an inefficient \(O(n^2)\) algorithm in the OEChem::OEMolBaseImpl::OrderAtoms method that checked that the input vector was a valid permutation of a subset of the atoms in the molecule. This dramatically improves the performance of writing large PDB files.
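The kind of check described for OrderAtoms, confirming that an input vector is a valid permutation of a subset of a molecule's atoms, can be done in linear time with a "seen" table. The sketch below is an assumption about the general technique, not OEChem's implementation: atoms are represented by integer indices, and the function name is hypothetical.

```python
def is_valid_partial_permutation(order, n_atoms):
    """Return True if `order` lists distinct atom indices in [0, n_atoms).

    A single pass with a boolean table replaces the quadratic approach of
    comparing every entry against every other entry.
    """
    seen = [False] * n_atoms
    for idx in order:
        if not isinstance(idx, int) or idx < 0 or idx >= n_atoms:
            return False  # not an atom of this molecule
        if seen[idx]:
            return False  # the same atom listed twice
        seen[idx] = True
    return True


if __name__ == "__main__":
    print(is_valid_partial_permutation([2, 0, 1], 3))
    print(is_valid_partial_permutation([0, 0], 3))
```

The pass costs \(O(n)\) time and \(O(n)\) extra memory, which matters for molecules with tens of thousands of atoms.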
The performance of many of the OEMolBase, OEAtomBase and OEBondBase methods has been improved in OEChem 1.3.3.
The methods oemolistream::operator bool, oemolostream::operator bool and oemolistream::eof have been marked const to enable better compiler optimization.
Java wrappers
New Features
With this release of OEChem, Java wrappers are now provided. This first version only supports Sun’s JVM version 1.4.2.
Python wrappers
New Features
The OEInterface class and associated machinery for creating and parsing command lines is now available in Python. While Python has native command line argument support, this provides an alternative that is functionally similar to the C++ OEChem version. The example program molextract.py has been updated to demonstrate this new feature.
Major bug fixes
Fixed a bug in PyAtomPredicate, PyBondPredicate and PyConfPredicate where a syntax error in the Python callable function would silently fail. Now, if there is an error in the Python function, the exception will propagate back to the Python interpreter.
OESystem 1.3.3
New Features
By default the OpenEye toolkits now use thread-safe memory management internally to allow multiple molecules (and other objects) to be manipulated by different concurrent threads. Modifying the same object concurrently is still unsafe. On some operating systems, OEChem-intensive applications may experience a slight overhead; thread safety may be explicitly disabled with the new OESetThreadSafe function call. Timings on modern GNU/Linux systems show almost no overhead, and the performance benefits of upgrading to g++ 3.4.x mean that most applications should run faster with OEChem 1.3.3 than with previous releases even with thread-safety enabled.
The --help functionality of the OEInterface class has been improved to indent and wrap the on-line help text at 80 columns. The default screen width can be controlled by specifying the column width on the command line, for example --help all 100.
The OEInterface parser has been improved to allow !CATEGORY names to be quoted, allowing names to contain spaces.
The OESystem::OEFizzGrid class now has an OESystem::OEFizzGrid::operator bool method, which returns true if either floats or integers have been set.
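The indent-and-wrap behavior described for --help can be approximated with Python's standard textwrap module. This is only a sketch of the formatting idea, not OEInterface itself: `format_help`, its parameters and the sample option text are hypothetical.

```python
import textwrap


def format_help(name, description, width=80, indent=4):
    """Return a parameter name followed by its description, indented and
    wrapped so that no line exceeds `width` columns."""
    pad = " " * indent
    body = textwrap.fill(
        description,
        width=width,
        initial_indent=pad,
        subsequent_indent=pad,
    )
    return name + "\n" + body


if __name__ == "__main__":
    text = ("Input molecule file. " * 12).strip()
    print(format_help("-in", text))             # wrapped at 80 columns
    print(format_help("-in", text, width=100))  # wider, as with an explicit column width
```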
Major bug fixes
The semantics of how quaternions are represented within the OpenEye toolkits have now been standardized as scalar-first. Hence, of the four floating point values that define a quaternion, the first represents the scalar component and the final three values represent the vector component. The failure to explicitly document which of the two possible forms was used resulted in some OEMath functions assuming scalar-first whilst others assumed scalar-last. (The quaternion functions in OELib, for example, used scalar-last.) Functions affected by this include OEMath::OEGeomQuaternionMultiply, OEMath::OEGeom3DUnitQuaternionRotate, OEMath::OEGeom3DQuaternionToRotMatrix and OEMath::OEGeom3DRotMatrixToQuaternion.
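To make the scalar-first convention concrete, here is a small stand-alone sketch in plain Python. It does not use the OEMath functions; the names `quat_multiply`, `quat_to_rot_matrix` and `scalar_last_to_first` are hypothetical. Quaternions are stored as (w, x, y, z), with w the scalar component.

```python
import math


def scalar_last_to_first(q):
    """Convert (x, y, z, w) storage to scalar-first (w, x, y, z)."""
    x, y, z, w = q
    return (w, x, y, z)


def quat_multiply(p, q):
    """Hamilton product of two scalar-first quaternions."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return (
        pw * qw - px * qx - py * qy - pz * qz,
        pw * qx + px * qw + py * qz - pz * qy,
        pw * qy - px * qz + py * qw + pz * qx,
        pw * qz + px * qy - py * qx + pz * qw,
    )


def quat_to_rot_matrix(q):
    """3x3 rotation matrix (row-major) for a unit scalar-first quaternion."""
    w, x, y, z = q
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]


if __name__ == "__main__":
    # A 90 degree rotation about the z axis, scalar component first.
    half = math.pi / 4
    q = (math.cos(half), 0.0, 0.0, math.sin(half))
    for row in quat_to_rot_matrix(q):
        print(row)
```

Passing a scalar-last quaternion to code that expects scalar-first silently produces a different rotation, which is why mixing the two conventions is hard to spot without an explicit statement of the layout.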
Minor bug fixes
Fixed a potential memory leak in OEBinaryNot.
OEInterface’s methods OEInterface::DeleteInterface and OEInterface::DeleteParameter now recursively search through sub-interfaces for the object to delete.
The OEStringTokenize and OEStringTokenizeQuoted functions have been completely rewritten. Both previous implementations could potentially throw C++ exceptions, and the latter was just plain broken.
A minor bug in the OEInterface class, that in some cases caused the detailed description to end with !END, has been fixed.
The behavior of the !REQUIRED keyword has changed in OEInterface files. If an option has a default value, specified by the !DEFAULT keyword, then the !REQUIRED option is ignored.
OEPlatform 1.3.3
Minor bug fixes
Modified OEFileDeterminePathAndName to canonicalize directory separators to the appropriate form for the host operating system.
Improved the performance of OEMutex when using g++, by using the low-level gthr API, rather than the higher-level locking primitives used by the libstdc++ STL library.
Fixes to oestream classes to prevent accidentally closing stdin. Minor bug fix to oeiwrapperstream implementation.