Version 1.3.3

OEChem 1.3.3

New Features

  • Several enhancements have been made to the protein perception algorithms used in OEPerceiveResidues. These allow OEChem to recognize the N-terminal capping group ACE, and the nonstandard amino acid residues ABA, CGU, CME, CSD, MLY, MSE, PCA, PTR, SEP and TPO. Support for these additional amino acid types has also been added to OEGetResidueIndex and friends. The sidechain pattern matching algorithm now has improved fallback functionality for better handling of modified/substituted residues.

  • Improved support for aromatic boron and aromatic silicon in OEKekulize. The OEChem toolkit currently doesn’t perceive either boron or silicon to be aromatic (with any aromaticity model), but this enhancement allows us to Kekulize structures so specified.

  • Improved support for parsing SMILES containing aromatic boron and aromatic silicon, allowing the OEChem toolkit to parse b1ccccc1 (borinine).

  • A new OEGetDelphiRadius function has been added to OEChem to return the default radius for a given element used by Accelrys’ Delphi program for electrostatics calculations.

  • A new function OEGetAminoAcidCode can be used to convert an index from the OEResidueIndex namespace to an IUBMB single character code (A for alanine, R for arginine, etc.).

  • Several new convenience functions, OEAssignCovalentRadii, OEAssignDelphiRadii, OEAssignBondiVdWRadii, OEAssignPaulingVdWRadii and OEAssignHonigIonicCavityRadii, are now provided to set the radius property on each atom of a molecule to the value specified by the corresponding OEGet...Radius function.

  • A new function OEIsBinary is provided to determine whether the specified file format is binary or not, for example, .oeb, .bin and .cdx.

  • The new function OEGetFormatExtension can be used to return a comma separated list of lowercase file format extensions that can be used to aid implementing directory scans and file format dialog boxes.

  • A new OEMCSFunc functor, OEMCSMaxBondsCompleteCycles, can be used as an objective function to OEChem’s maximal common subgraph matching algorithms.
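
As an illustration of the directory-scan use case mentioned for OEGetFormatExtension above, the following is a minimal sketch of matching a file name against a comma separated list of lowercase extensions. It does not call OEChem: the helper name MatchesExtensionList and the example list "sdf,mol" are assumptions made for illustration, not values returned by the toolkit.

```cpp
#include <algorithm>
#include <cctype>
#include <sstream>
#include <string>

// Hypothetical helper (not part of OEChem): returns true if `filename`
// ends in one of the extensions in `extList`, a comma separated list of
// lowercase extensions such as "sdf,mol" (an illustrative value).
bool MatchesExtensionList(const std::string &filename,
                          const std::string &extList)
{
  const std::string::size_type dot = filename.rfind('.');
  if (dot == std::string::npos || dot + 1 == filename.size())
    return false;  // no extension at all, or a trailing dot

  // Lowercase the file's extension so that "LIGAND.SDF" matches "sdf".
  std::string ext = filename.substr(dot + 1);
  std::transform(ext.begin(), ext.end(), ext.begin(), [](unsigned char c) {
    return static_cast<char>(std::tolower(c));
  });

  // Walk the comma separated list looking for an exact match.
  std::istringstream list(extList);
  std::string candidate;
  while (std::getline(list, candidate, ','))
    if (candidate == ext)
      return true;
  return false;
}
```

In a real application the second argument would presumably be the string returned by OEGetFormatExtension for the format of interest; its exact signature is not given in these notes.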

Major bug fixes

  • A problem in OEChem’s graph canonicalization algorithm was identified by the NCBI’s PubChem project for the single molecule: C12C3C4C3C5C4C1C25. This problem has been fixed in OEChem 1.3.3. Unfortunately, this failure didn’t show up in our testing of 100 random permutations of a 2.5 million compound test set. Efforts are now ongoing to validate OpenEye’s canonicalization against all theoretical connection tables with fewer than \(N\) atoms, for some \(N > 10\).

  • A bug in the OEB file format readers and writers that could cause the titles and/or comments attached to molecules or conformers to be lost, has been corrected.

Minor bug fixes

  • Fixed a bug in the OEChem SMARTS parser that failed to follow the Daylight semantics for patterns such as [H], [2H] and [H+], where the H specifies that the pattern must match a hydrogen, and not the expected hydrogen count on an atom.

  • The OEChem SMILES writers have been modified to prevent them generating atoms such as [C@H2] or [C@@H2] for centers that have stereo explicitly specified (on non-chiral centers) with explicit hydrogens, when the hydrogens are being automatically suppressed by the output SMILES flavor.

  • The methods OEAtomBase::SetStereo, OEAtomBase::GetStereo, OEBondBase::SetStereo and OEBondBase::GetStereo have been enhanced such that the internal representation of stereochemistry is invariant of hydrogen suppression. The functions OESuppressHydrogens and OEAddExplicitHydrogens no longer invalidate stereochemistry.

  • The old-style OE binary, .bin, file format reader now automatically sets the dimension property of molecules and conformers to 3. Whilst new-style OE binary, .oeb, files explicitly record the dimensionality of the stored coordinates, the old format didn’t and its contents should be assumed to be 3-dimensional.

  • Corrected a minor logic problem in OEQMolBase::BuildExpressions when constructing the expressions to match bond orders but not aromaticity.

  • Fixed a problem in the SMILES parser, which would cause a segmentation fault if ever a SMILES string longer than 4096 characters encountered a syntax or Kekulization error. We no longer try to report the location of the syntax error for SMILES strings longer than 2048 characters.

  • A bug in OEPerceiveBondOrders that assumed/required that the incoming molecule not have any aromaticity specified, has been fixed by calling OEClearAromaticFlags on the incoming molecule. This assumption was valid for its existing use by the high-level file format readers, but meant that calling OEPerceiveBondOrders twice in a row could sometimes produce different results.

  • Fixed a potential problem in several file format readers that caused a run-time abort in Microsoft’s runtime libraries on Windows when reading corrupt or binary files. The Microsoft implementation of the standard <ctype.h> functions, such as isdigit and isupper, will abort when passed negative values, such as when interpreting the bytes of a file as (signed) char.

  • Fixed a segmentation fault in OEScrambleMolecule that was triggered by chiral molecules.

  • Fixed a bug in OEMDLCorrectBondStereo that could cause that routine to crash if the chiral atom on which the stereochemistry needed to be corrected was degree three instead of degree four. This routine has been made more robust, and can now correct wedges and hashes around degree three atoms that conflict with the specified MDL parity bit.

  • The OEChem MDL mol file reader has been made more robust by checking for negative values in the atom count, bond count and list count fields. These are now interpreted as being zero. Corrupted SD files could previously cause OEChem to crash.

  • Calling close on an oemolistream that wraps OEPlatform::oein will now correctly make oemolistream::operator bool return false, and stop it reading (even though OEPlatform::oein itself shouldn’t be closed).

  • The OEChem SMILES parser, OEParseSmiles function, has been fixed to set the default bond order of unspecified external bonds, i.e. C&1, to be single. Previously these were left initialized as bond order zero, although C&=1 and C&#1 were correctly handled as double and triple bonds respectively.

  • The function OEPDBOrderAtoms has been improved to only compare atom names for recognized residues when sorting. This prevents atoms from being needlessly reordered.

  • OEPerceiveResidues has been improved to assign unique atom names to every atom within an unknown or unrecognized residue. Previously, all six atoms in benzene would be given the same atom name C, which confuses software that assumes PDB atom names are unique within a residue. OEChem now assigns C1, C2, etc.

  • Added goof-proofing to return early from calls to OEInvertCenter where the specified atom is not trivially invertible (i.e. a center with 3 or more ring bonds).

  • Improved handling of the hydrogen isotopes D and T when reading MDL connection tables. These symbols now automatically set the isotope field appropriately. Previous versions of OEChem interpreted these symbols as forms of hydrogen, but relied on the MDL’s mass field or M ISO line being correctly set to specify a/which isotope.

  • A very minor bug in OEPerceiveResidues has been fixed that prevented residue information from being assigned to lone protons. The algorithm previously assumed all hydrogens were bonded to a heavy atom parent.

  • In OESubsetMol the dummy atoms used to represent attachment points are now assigned map indices starting from one, i.e. R1, R2, R3, instead of from zero, i.e. R0, R1, R2.

  • OESubsetMol now attempts to preserve or undefine the specified stereochemistry at atoms and bonds affected by attachment points.

  • The performance of OEDetermineConnectivity has been dramatically improved for very large molecules. This speeds up the reading of proteins like pdb1jj2.ent (which contains 98,543 atoms) several fold.

  • Replaced an inefficient \(O(n^2)\) algorithm in the OEChem::OEMolBaseImpl::OrderAtoms method that checked that the input vector was a valid permutation of a subset of the atoms in the molecule. This dramatically improves the performance of writing large PDB files.

  • The performance of many of the OEMolBase, OEAtomBase and OEBondBase methods has been improved in OEChem 1.3.3.

  • The methods oemolistream::operator bool, oemolostream::operator bool and oemolistream::eof have been marked const to enable better compiler optimization.
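
The <ctype.h> problem described in the list above (negative values reaching isdigit or isupper when file bytes are read as signed char) is commonly avoided by casting each byte to unsigned char first. The following is a minimal sketch of that idiom; the helper name CountDigits is hypothetical and this is not the code used inside the OEChem readers.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: counts decimal digits in a buffer that may hold
// arbitrary bytes, e.g. the contents of a corrupt or binary file.
std::size_t CountDigits(const std::string &bytes)
{
  std::size_t count = 0;
  for (char c : bytes) {
    // On platforms where char is signed, bytes >= 0x80 are negative.
    // Passing a negative value other than EOF to std::isdigit is
    // undefined behaviour, so convert to unsigned char first.
    if (std::isdigit(static_cast<unsigned char>(c)))
      ++count;
  }
  return count;
}
```

The cast costs nothing at run time and keeps every argument within the range the C standard requires for these functions.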

Java wrappers

New Features

  • With this release of OEChem, Java wrappers are now provided. This first version only supports Sun’s JVM version 1.4.2.

Python wrappers

New Features

  • The OEInterface class and associated machinery for creating and parsing command lines is now available in Python. While Python has native command line argument support, this provides an alternative that is functionally similar to the C++ OEChem version. The example program molextract.py has been updated to demonstrate this new feature.

Major bug fixes

  • Fixed a memory leak in OENot, OEAnd and OEOr predicates.

  • Fixed a bug in PyAtomPredicate, PyBondPredicate and PyConfPredicate where a syntax error in the Python callable function would silently fail. Now, if there is an error in the Python function, the exception will propagate back to the Python interpreter.

OESystem 1.3.3

New Features

  • By default the OpenEye toolkits now use thread-safe memory management internally to allow multiple molecules (and other objects) to be manipulated by different concurrent threads. Modifying the same object concurrently is still unsafe. On some operating systems, OEChem intensive applications may experience a slight overhead, which may be explicitly disabled with the new OESetThreadSafe function call. Timings on modern GNU/Linux systems show almost no overhead, and the performance benefits of upgrading to g++ 3.4.x mean that most applications should run faster with OEChem 1.3.3 than with previous releases even with thread-safety enabled.

  • The --help functionality of the OEInterface class has been improved to indent and wrap the on-line help text at 80 columns. The default screen width can be controlled by specifying the column width on the command line, for example --help all 100.

  • The OEInterface parser has been improved to allow !CATEGORY names to be quoted, allowing names to contain spaces.

  • The OESystem::OEFizzGrid class now has an OESystem::OEFizzGrid::operator bool method, which returns true if either floats or integers have been set.

Major bug fixes

  • The semantics of how quaternions are represented within the OpenEye toolkits have now been standardized as scalar-first. Hence, of the four floating point values that define a quaternion, the first represents the scalar component and the final three values represent the vector component. The failure to explicitly document which of the two possible forms was used resulted in some OEMath functions assuming scalar-first whilst others assumed scalar-last. (The quaternion functions in OELib, for example, used scalar-last.) Functions affected by this include OEMath::OEGeomQuaternionMultiply, OEMath::OEGeom3DUnitQuaternionRotate, OEMath::OEGeom3DQuaternionToRotMatrix and OEMath::OEGeom3DRotMatrixToQuaternion.
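
To make the scalar-first convention concrete, the following is a minimal sketch of the Hamilton product for quaternions stored as (w, x, y, z), plus a conversion from the scalar-last layout (x, y, z, w). The type alias Quaternion and the helpers Multiply and FromScalarLast are hypothetical names for illustration; they are not the OEMath functions listed above, whose signatures are not given in these notes.

```cpp
#include <array>

// Scalar-first layout: q[0] is the scalar part, q[1..3] the vector part.
using Quaternion = std::array<double, 4>;

// Hamilton product a*b for scalar-first quaternions (hypothetical helper).
Quaternion Multiply(const Quaternion &a, const Quaternion &b)
{
  Quaternion r;
  r[0] = a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3];
  r[1] = a[0] * b[1] + a[1] * b[0] + a[2] * b[3] - a[3] * b[2];
  r[2] = a[0] * b[2] - a[1] * b[3] + a[2] * b[0] + a[3] * b[1];
  r[3] = a[0] * b[3] + a[1] * b[2] - a[2] * b[1] + a[3] * b[0];
  return r;
}

// Reorders a scalar-last quaternion (x, y, z, w) into scalar-first
// (w, x, y, z), as code written against the old convention would need.
Quaternion FromScalarLast(const Quaternion &q)
{
  Quaternion r;
  r[0] = q[3];
  r[1] = q[0];
  r[2] = q[1];
  r[3] = q[2];
  return r;
}
```

Mixing the two layouts silently yields a different rotation rather than an error, which is why code that previously passed scalar-last values should be audited when upgrading.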

Minor bug fixes

  • Fixed a potential memory leak in OEBinaryNot.

  • OEInterface’s methods OEInterface::DeleteInterface and OEInterface::DeleteParameter now recursively search through sub-interfaces for the object to delete.

  • The OEStringTokenize and OEStringTokenizeQuoted functions have been completely rewritten. Both previous implementations could potentially throw C++ exceptions, and the latter was just plain broken.

  • A minor bug in the OEInterface class, that in some cases caused the detailed description to end with !END, has been fixed.

  • The behavior of the !REQUIRED keyword has changed in OEInterface files. If an option has a default value, specified by the !DEFAULT keyword, then the !REQUIRED option is ignored.

OEPlatform 1.3.3

Minor bug fixes

  • Modified OEFileDeterminePathAndName to canonicalize directory separators to the appropriate form for the host operating system.

  • Improved the performance of OEMutex when using g++, by using the low-level gthr API, rather than using the higher-level locking primitives used by the libstdc++ STL library.

  • Fixes to oestream classes to prevent accidentally closing stdin. Minor bug fix to oeiwrapperstream implementation.
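
The separator canonicalization mentioned for OEFileDeterminePathAndName above can be pictured with the following minimal sketch, which rewrites every directory separator in a path to the one chosen for the host. The helper CanonicalizeSeparators and its hostSep parameter are assumptions for illustration only; the actual OEPlatform implementation and its handling of edge cases (for example, a backslash being a legal file name character on Unix) are not described in these notes.

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper: replaces the "foreign" directory separator with
// the host's separator, e.g. '\\' on Windows and '/' elsewhere.
std::string CanonicalizeSeparators(std::string path, char hostSep)
{
  const char other = (hostSep == '/') ? '\\' : '/';
  std::replace(path.begin(), path.end(), other, hostSep);
  return path;
}
```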