Version 1.4.0

OEChem 1.4.0

OEChem 1.4.0 is a major new feature release. OpenEye is introducing OEBio, a new programming library extending OEChem’s convenience in handling biopolymers. In this initial release, OEBio’s API is small but useful. Over the life of the 1.4.x OEChem release series the OEBio API will grow. The purpose of OEBio is not to cover bioinformatics, but to extend OEChem’s strong cheminformatics foundation to conveniently support protein modeling.

The source code and examples in /openeye/examples/oechem have long been caught in a conflict. They served both as useful tools and as didactic coding examples. To fulfill the role of tools, they needed good command-line interfaces and error reporting. Unfortunately, these features lead to more complex code. To fulfill the role of code examples, the programs need to be as simple as possible, highlighting one or two programming principles. To better serve both purposes, the example programs have now been split into /openeye/utilities and /openeye/examples: the former holds programs with more complex code and better interfaces, and the latter holds simple OEChem code examples. In addition, nine new example programs have been included to demonstrate common uses of the OEBio API.

In addition to OEBio, the 1.4.0 release includes many new features and bug fixes in the OEChem, OESystem and OEPlatform libraries.

New Features

  • Split the programs previously in the examples directory into examples and utilities. The utilities directory will contain programs or versions of programs that may be useful and convenient for modelers to carry out common tasks. The examples directory will contain programs that may also be useful, but their primary purpose will be to provide didactic code examples of how to program common tasks using the OEChem library.
  • New support for highly compact rotor-offset compressed oeb files.
  • Added support for MDL ISIS Sketch file format with the .skc suffix.
  • Added support for writing hydrogens that are required for specifying cis-trans stereo.
  • Added support for [Ds] and [Rg] in SMILES and SMARTS.
  • Added support for writing high-atomic number atoms in SMILES using [#123] notation.
  • New OEWriteConstMolecule function class to support high-level writing of const molecules. Introduced return-codes for the high-level writers that reflect that some molecules are inherently not supported by certain file formats (e.g. >999 atoms in .sdf).
  • Added an OEOFlavor_MOL2_Substructure high-level writer flavor to force an @<TRIPOS>SUBSTRUCTURE idiom in the .mol2 file.
  • New OEHasStereoHydrogens function that determines if an atom has a proton that is required to specify stereochemistry.
  • Added a retainStereo=false default argument to OESuppressHydrogens; when set to true, it keeps the hydrogens indicated by the OEHasStereoHydrogens function.
  • Added OEMatchBase.Clear method.
  • Dramatically improved efficiency of DeleteConf for deleting large numbers of conformers in order. Worst case behavior of the algorithm was changed from \(O(N^2)\) to \(O(N)\).
  • Allow the SD file reader to handle a blank line between the M END line and the ‘$$$$’ line.
  • Added convenience functions for getting and setting the MDL parity on atoms.
  • Added new bitmask initialization parameters to OEInitDefaultHandler that allow easy specification of which handlers to initialize.
  • New support for h, d, t, [T] and [t] non-standard SMILES representations.
  • Improved support for multiple NMR models in PDB files by reading, retaining and writing model number.
  • Added fully supported OEPDBData and OEPDBDataPair classes as well as the necessary function to store and retrieve them from molecules.
  • Three new convenience functions for clearing tag data: OEClearTagData, OEClearSDData and OEClearPDBData.
  • Added support for determining whether the library is properly licensed with OEChemIsLicensed function.
  • Added OEResidueHydrogens function that will rename hydrogens on a heavy atom to their proper PDB atom names.
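
The DeleteConf note above mentions a change from quadratic to linear worst-case behavior. The release notes do not describe how this was achieved, so the following is only a generic sketch of the idea in plain Python, not OEChem code: removing items one at a time from a contiguous sequence shifts the tail on every removal, whereas marking the doomed items and compacting once touches each item a single time. The function names and the use of integer ids to stand in for conformers are assumptions made for illustration.

```python
def delete_one_at_a_time(items, doomed):
    """Quadratic in the worst case: every removal searches and shifts the list."""
    remaining = list(items)
    for d in doomed:
        remaining.remove(d)  # O(N) search plus O(N) shift per deletion
    return remaining


def delete_in_one_pass(items, doomed):
    """Linear: mark everything to delete, then compact the list once."""
    doomed_set = set(doomed)  # constant-time membership tests
    return [x for x in items if x not in doomed_set]


# Hypothetical conformer ids 0..9; delete the even-numbered ones in order.
conformer_ids = list(range(10))
to_delete = list(range(0, 10, 2))
survivors = delete_in_one_pass(conformer_ids, to_delete)
```

Both functions return the same survivors; only the amount of work differs as the number of deletions grows.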

Major bug fixes

  • Fixed oemolistream.seek and oemolistream.tell to take into account any cached molecules that may exist in the stream.

  • Fixed low-level MDL reader to accept multiple SD data fields with the same tag.

    Note

    It is not clear from the SD file specification if this is a valid SD format.

Minor bug fixes

  • Added PDBData readers and writers to OEBinary file handlers.
  • Added defensive code to OEMolBase.DeleteAtom and OEMolBase.DeleteBond to confirm that the atom or bond is owned by that specific molecule.
  • Fixed rotation bug in inertial frame alignment.
  • Converted inconsistent / and \ into a warning rather than an error, allowing the molecule to be parsed in a racemic fashion.
  • Added an upper bound to the degree of the atoms at either end of a cis-trans chiral double bond.
  • Added defensive code to prevent creation of atoms with atomic number greater than 255.
  • Improved perception of non-aromatic exo double-bonds. This corrects a problem perceiving the progesterone in pdb1a28.
  • Improved perception of exo-cyclic double bonds to sulfur. This improves the connectivity perception in 1hnv, 1rev, 1usn, 1uwb, 2usn, 3usn and 1zxv.
  • Improved the bond order perception of nitroso, oxime, azide, and arylhydroxylamine functional groups.
  • Improved bond order perception of clashed structures by allowing hydrogens to only bond to their nearest heavy atoms.
  • Prevented alternate conformation representations from being bonded to one another during bond perception.
  • Made Up/Down choice for the first stereo bond in each resonance system canonical for writing isomeric SMILES files.
  • Made OECanonicalOrderBonds also order the bonds obtained with the OEAtomBase.GetBonds function call.
  • Fixed bug in binary search for atomic number 0 used in OEIsCommonIsotope, OEGetAverageWeight and OEGetIsotopicWeight.
  • Fixed the high-level pdb writer to preserve residue information found on the molecule.
  • Corrected OEIsReadable to return false for the MOPAC file format.
  • Added MOPAC flavors to the high-level molecule writers.
  • Changed the hybridization assignment of negatively charged resonant nitrogens such as *S(=O)(=O)[N-]C(=O)*.
  • Fixed bug in OESet3DHydrogenGeom that could use a hydrogen’s own coordinates as a reference for determining its geometry.
  • Fixed ring perception bug in OEMCSMaxAtomsCompleteCycles.
  • Eliminated the redundancy between OEChem::OEMDLSetBondStereo and OE3DToBondStereo by allowing OE3DToBondStereo to take an optional bond mask and work on 2D as well as 3D molecules.
  • Corrected a bug in the OEChem interpretation of MDL wedge and hash bonds. In MDL connection tables, wedges and hashes only imply a specified stereo-center at the thin end (i.e. OEBondBase.GetBgn). This has been confirmed by comparing the wedge/hash bonds with the atom stereo parity bit in MDL ISIS output (including large vendor databases such as the entire Asinex 2005 collection).
  • Fixed MDL reader bug where unrecognized atomic symbols would ignore subsequent fields in the atom block such as stereo parity, reaction role and valence.
  • Added copy constructors and assignment operators to OEMiniMols, OEChem::OEMiniBonds and OEMiniAtoms.
  • Fixed a sign error in OESetAngle.
  • Added a length==0.0 check for OESetDistance and OESetAngle.
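
The length==0.0 check noted above guards a normalization step: to place one point at a target distance from another, the vector between them is divided by its length, which is undefined when the two points coincide. The following plain-Python sketch shows that guard in isolation. It is not the OESetDistance implementation; the function name, the tuple-based coordinates and the choice to return None on failure are all assumptions made for illustration.

```python
import math


def set_distance(a, b, target):
    """Return a new position for b at `target` distance from a, along a->b.

    Returns None when a and b coincide, because the direction is undefined.
    """
    dx, dy, dz = b[0] - a[0], b[1] - a[1], b[2] - a[2]
    length = math.sqrt(dx * dx + dy * dy + dz * dz)
    if length == 0.0:
        return None  # nothing sensible to normalize; report failure instead
    scale = target / length
    return (a[0] + dx * scale, a[1] + dy * scale, a[2] + dz * scale)


moved = set_distance((0.0, 0.0, 0.0), (2.0, 0.0, 0.0), 1.5)
degenerate = set_distance((1.0, 1.0, 1.0), (1.0, 1.0, 1.0), 1.5)
```

Without the check, the division by a zero length would raise an exception in Python, or produce non-finite coordinates in languages that permit floating-point division by zero.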

New Example Programs

These examples show the best features of OEChem. Though most are less than 100 lines of simple code, they demonstrate protein-protein sequence alignment, 2D and 3D structure manipulation, residue perception, robust multi-format I/O, STL integration, canonicalization, chirality perception and manipulation, and many other complex cheminformatics tasks. While the main loop of each program is often only 30 lines long, it brings to bear thousands of lines of OEChem code and years of cumulative cheminformatics experience to easily combine 2D and 3D structure analysis and manipulation.

  • backbone.cpp: Code to show the use of functors to select and write the backbone atoms of a protein.
  • cischeck.cpp: Demonstrates how to loop over residues and check the omega torsion for cis amides.
  • makealpha.cpp: A code example of protein structure manipulation. This example modifies any protein into an alpha-helical structure with extended side-chains.
  • phipsi.cpp: Simple code to report the phi-psi angles of a protein.
  • rescount.cpp: Demonstrates an easy way to loop over the residues of a protein and query their information.
  • reshist.cpp: Demonstrates an easy way to loop over a protein’s residues and integrate the acquired data into an STL dictionary class.
  • seqalign.cpp: This is perhaps the most complex program of the examples. It carries out protein-protein sequence alignment, alignment evaluation and printing as well as 3D structural alignment.
  • subsetres.cpp: Simple code example of how to pull a specific residue out of a protein using its common name (e.g. ARG B 52).
  • swapaieres.cpp: Demonstrates how a user can select a residue using its common name (e.g. GLN 252) and swap the ambiguous iso-electronic atoms.
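
Several of the examples above (cischeck, phipsi, makealpha) revolve around backbone torsion angles. As background, a torsion is the signed dihedral angle defined by four consecutive atoms: phi uses C(i-1), N, CA, C; psi uses N, CA, C, N(i+1); and omega uses CA, C, N(i+1), CA(i+1). The sketch below computes such an angle from raw coordinates in plain Python. It does not use the OEBio API, and the helper names are assumptions made for illustration.

```python
import math


def _sub(u, v):
    return (u[0] - v[0], u[1] - v[1], u[2] - v[2])


def _dot(u, v):
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]


def _cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def torsion_degrees(p0, p1, p2, p3):
    """Signed dihedral angle (degrees, in [-180, 180]) about the p1-p2 bond."""
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)  # normal of the plane through p0, p1, p2
    n2 = _cross(b2, b3)  # normal of the plane through p1, p2, p3
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Four made-up points: cis (0 degrees), trans (180 degrees), and a 90 degree twist.
cis = torsion_degrees((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (1.0, 0.0, 1.0))
trans = torsion_degrees((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (-1.0, 0.0, 1.0))
twist = torsion_degrees((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 1.0, 1.0))
```

A cis amide check in this picture amounts to testing whether the omega torsion is close to 0 rather than close to 180 degrees.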

OEBio 1.4.0

New Features

  • Added OESequenceAlignment class with associated features for pairwise sequence alignment (including PAM250, BLOSUM62 and GONNET), writing an alignment to an oeostream and carrying out RMSD alignment between two proteins based on the sequence alignment.
  • Simple methods for accessing and manipulating the torsion angles of biopolymers.
  • Introduced classes that allow a hierarchical view of the Chains, Fragments and Residues of a protein while maintaining the efficient OEChem internal data structures.
  • Added facility for swapping the terminal atoms of residues that are commonly ambiguous in protein crystal structures (e.g. terminal N,O of ASN).
  • Added nine new example programs demonstrating the use of the new OEBio API points. These include: backbone, cischeck, makealpha, phipsi, rescount, reshist, seqalign, subsetres and swapaieres.
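
For readers unfamiliar with pairwise sequence alignment, the sketch below shows a textbook global alignment by dynamic programming (the Needleman-Wunsch scheme) in plain Python. It is not the OESequenceAlignment implementation, whose algorithm is not described in these notes. It also uses a simple match/mismatch score and a linear gap penalty as stand-ins for a substitution matrix such as PAM250 or BLOSUM62. The function name and default scores are assumptions made for illustration.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-1):
    """Return (aligned_a, aligned_b, score) for a global alignment of a and b."""
    n, m = len(a), len(b)
    # score[i][j] is the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                              score[i - 1][j] + gap,      # gap in b
                              score[i][j - 1] + gap)      # gap in a
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


aligned_a, aligned_b, total = global_align("ACGT", "AGT")
```

Once residues are paired this way, the matched positions can serve as the correspondence for a structural superposition, which is the role the sequence alignment plays in the RMSD alignment mentioned above.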

OESystem 1.4.0

New Features

  • Moved superpose and tensor2mat API points from OEChem to OESystem. Added deprecated support for their use in OEChem.
  • Added ability to assign an OEIterBase<foo>* to an OEIter<const foo> object. This allows much wider use of iterators of const objects.
  • Made OEIter.Sort a stable sort.
  • Additional physical constants added to OEConst.
  • Added the ability to parse OEInterface parameter files without use of command-line parsing.
  • New OEChem::OEPDBOFlag::ELEMENT and OEChem::OEPDBOFlag::FORMALCHARGE flavors for pdb writer. OEChem::OEPDBOFlag::ELEMENT adds the atomic symbol to columns 77-78 and OEChem::OEPDBOFlag::FORMALCHARGE adds non-zero formal charges in columns 79-80.
  • Extended the OEOFlavor_SMI_ExtBonds option from the .smi writer to the .can and .ism writers.
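
To make the column layout of the ELEMENT and FORMALCHARGE flavors concrete, the sketch below pads a PDB ATOM record to 76 columns and then fills columns 77-78 and 79-80. It is a plain-Python illustration, not the OEChem writer. The right-justified uppercase element symbol and the digit-then-sign charge form (for example 1+ or 2-) follow the common PDB convention and are assumptions here, since the notes above only state which columns are used. The function name and the sample record are hypothetical.

```python
def add_element_and_charge(line, element, formal_charge=0):
    """Return an 80-column PDB record with element and charge fields filled in."""
    if not 1 <= len(element) <= 2:
        raise ValueError("element symbol must be one or two characters")
    if abs(formal_charge) > 9:
        raise ValueError("formal charge does not fit in two columns")
    body = line.rstrip("\r\n")[:76].ljust(76)  # columns 1-76, space padded
    elem = element.upper().rjust(2)            # columns 77-78
    if formal_charge == 0:
        charge = "  "                          # zero charges are left blank
    else:
        sign = "+" if formal_charge > 0 else "-"
        charge = f"{abs(formal_charge)}{sign}"  # columns 79-80
    return body + elem + charge


# Hypothetical, truncated ATOM record used only to show the padding.
record = add_element_and_charge("ATOM      1  N   LYS A   1", "N", 1)
```

Columns are 1-based in the PDB format, so columns 77-78 correspond to the Python slice record[76:78] and columns 79-80 to record[78:80].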

Major bug fixes

  • Protected the OEIter.Sort function from NaN (not a number) members.
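
NaN values are a hazard for comparison sorts because every ordered comparison involving NaN is false, so the comparator stops describing a consistent ordering. The notes do not say how OEIter.Sort was protected, so the sketch below only shows one common remedy in plain Python: map each value to a key that is always comparable, pushing NaN entries to the end. Python's sorted is guaranteed stable, so ties (and the NaN entries) keep their original relative order, which also illustrates the stable-sort behavior listed under New Features. The function names and sample records are assumptions made for illustration.

```python
import math


def nan_safe_key(x):
    """Sort key that orders ordinary numbers normally and places NaN last."""
    return (1, 0.0) if math.isnan(x) else (0, x)


def sort_records_by_score(records):
    """Stable sort of (name, score) pairs by score, with NaN scores at the end."""
    return sorted(records, key=lambda record: nan_safe_key(record[1]))


# Hypothetical records: two ties at 2.0 and two NaN scores.
records = [("a", 2.0), ("b", float("nan")), ("c", 1.0), ("d", 2.0), ("e", float("nan"))]
ordered_names = [name for name, _ in sort_records_by_score(records)]
```

In the result, "a" stays ahead of "d" and "b" stays ahead of "e", because equal keys never change places in a stable sort.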

Minor bug fixes

  • Fixed OEGrid and OEMultiGrid constructor bug that could cause no memory to be allocated for the grid elements.
  • Corrected behavior of OEGrid.Clear to clear the OEBase data, remove the title and reinitialize all the elements of the grid.
  • Fixed rotation bug in inertial-frame alignment.
  • Fixed bug in the atom index into coordinates used while calculating the center of mass.
  • Fixed bug in the calculation of OEMultiGrid.SetSpacing and OEMultiGrid.SetMid functions.
  • Fixed OEInterface category name bug, !KEYLESS bug and unterminated category bug.

OEPlatform 1.4.0

New Features

  • Improved binary data handling in streams.
  • Significantly improved the licensing code for user convenience, so that future versions of OpenEye applications can manage licensing failures in a friendly manner.

Major bug fixes

  • Fixed bug that prevented reading the final molecule in a file and then seeking to other positions in the file.
  • Fixed a 64-bit stream seek and read bug that could cause memory overflows and crashes.

Minor bug fixes

  • Fixed bug in cross-platform directory searching and checking for files on a file system.
  • Fixed bug in OEPlatform::oeigzstream::size that reported incorrect sizes in some instances.
  • Added the ability to detect moved home directories under Windows.