The meaning of the .smi file extension has changed in OEChem 2.0: files written with the .smi extension now retain stereochemistry information. This could cause problems for systems that relied on .smi to strip away stereochemistry.
The following changes have been made to how OEChem defines the various flavors of SMILES (see images demonstrating these changes in Table: SMILES File Formats):
OEFormat::CSV file format added to OEChem for round-tripping molecules and SD data to other software packages that support the comma-separated-value (.csv) file format. The .csv format has become largely ubiquitous and is now standardized by RFC 4180. The following APIs were added for handling CSV files in OEChem:
Added automatic 2D coordinate generation to the following file formats when invoking the OEWriteMolecule high-level molecule writer:
OEGenerate2DCoordinates added to assign 2D coordinates to the given molecule.
OERoleSet abstract base class added as a mixin class to allow a class to contain a set of OERole objects for classification purposes. This is similar to how the OEBase class provides associative data container behavior as “generic data”. The following classes already derive from OERoleSet:
More classes across the toolkits may be added in the future as dictated by needs.
oemolstreambase::GetFileName method added to all molecule streams to return the file name used to open the stream, if a file name was used.
OEMatchBase::IsValid method added for determining whether the match contains any atoms or bonds.
OEAssignZap7Radii function added to assign radii from the ZAP7 set.
OEParseSmilesOptions class added to support more complex SMILES parsing options for the OEParseSmiles function. This adds the ability to make OEParseSmiles quiet with regard to parsing failures through the OEParseSmilesOptions::SetQuiet method.
OEChem::OEConfBase::operator C * has been removed from the API. C++ users who used this operator for direct coordinate access, or who derived their own classes from OEConfBase, will need to switch to a different idiom.
The operator has been removed in favor of the new OEConfBase::GetCoordsPtr methods. This allows implementations of OEConfBase to use alternative data storage for coordinates, e.g., 64-bit double-precision floating point. Users of OEConfBase should not call OEConfBase::GetCoordsPtr directly; they should rely on the OEConstCoords and OEMutableCoords convenience classes instead.
Due to the above change, the following code will now fail to compile:
OEMol mol;
for (OEIter<OEConfBase> conf = mol.GetConfs(); conf; ++conf)
  mol.NewConf(*conf);
This code was not doing what most programmers thought it was doing. It was actually implicitly casting the OEConfBase & into a float * and then calling the version of NewConf that takes a float pointer. Users should change their code to the following:
OEMol mol;
for (OEIter<OEConfBase> conf = mol.GetConfs(); conf; ++conf)
  mol.NewConf(conf);
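The overload pitfall described above can be reproduced outside OEChem. The following is a minimal sketch, assuming two hypothetical stand-in types (Conf and Mol) that are not the OEChem classes; it only illustrates how an implicit pointer conversion operator can silently route a call to a raw-coordinate overload.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a conformer class; NOT the OEChem OEConfBase.
struct Conf {
    float xyz[3] = {1.0f, 2.0f, 3.0f};
    std::string title = "conf-1";

    // Implicit conversion to a raw coordinate pointer, analogous in spirit
    // to the operator that was removed.
    operator float *() { return xyz; }
};

// Hypothetical stand-in for a molecule with two NewConf overloads.
struct Mol {
    std::string last_overload;

    // Overload taking a whole conformer.
    void NewConf(const Conf *conf) {
        last_overload = (conf != nullptr) ? "conformer" : "null";
    }

    // Overload taking raw coordinates only.
    void NewConf(const float *coords) {
        last_overload = (coords != nullptr) ? "raw coordinates" : "null";
    }
};

// Passing the object itself: the only viable overload is reached through
// the implicit operator float *, so the raw-coordinate version is called.
std::string CallWithObject() {
    Mol mol;
    Conf conf;
    mol.NewConf(conf);
    return mol.last_overload;
}

// Passing a pointer: the conformer overload is an exact match.
std::string CallWithPointer() {
    Mol mol;
    Conf conf;
    mol.NewConf(&conf);
    return mol.last_overload;
}
```

In this sketch, CallWithObject() reports "raw coordinates" and CallWithPointer() reports "conformer". Deleting the conversion operator from Conf turns the first call into a compile error, which mirrors the behavior change described above.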
The following methods that rely on OEConfBaseT<float, 2> and OEConfBaseT<double, 3> template instantiations were removed and replaced with OEMolBase versions:
This is part of a larger effort to migrate away from the OEChem::OEConfBaseT template class in favor of a pure abstract OEConfBase class, to allow for conformer storage in formats other than 32-bit precision float. This will only affect C++ users of the toolkit. Even then, users who only used OEConfBase as a typedef should not be affected.
Added support for handling the comma-separated-value, CSV, format specified by RFC 4180 with the following two free functions:
The following low-level functions were added to support CSV handling but can be ignored by most users:
OEBinaryTagMaxLength constant added and set to 1024, the maximum length of an explicit string tag in the .oeb file format.
OEStringTokenizeQuoted will no longer treat a quote as the end quote of a field if it is escaped by another quote. Table: OEStringTokenizeQuoted Change demonstrates the change to support proper CSV parsing.
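The escaped-quote rule can be illustrated with a small standalone tokenizer. This is a minimal sketch of RFC 4180-style field splitting, assuming a hypothetical helper named SplitCsvLine; it is not the implementation or signature of OEStringTokenizeQuoted, and it handles a single line only.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, NOT OEStringTokenizeQuoted. Splits one line into
// fields, treating a doubled quote inside a quoted field as a literal quote
// rather than as the end of the field.
std::vector<std::string> SplitCsvLine(const std::string &line,
                                      char delim = ',', char quote = '"') {
    std::vector<std::string> fields;
    std::string current;
    bool in_quotes = false;

    for (std::size_t i = 0; i < line.size(); ++i) {
        const char c = line[i];
        if (in_quotes) {
            if (c == quote) {
                if (i + 1 < line.size() && line[i + 1] == quote) {
                    current += quote;   // escaped quote: keep one, skip one
                    ++i;
                } else {
                    in_quotes = false;  // genuine closing quote
                }
            } else {
                current += c;           // delimiters are literal inside quotes
            }
        } else if (c == quote) {
            in_quotes = true;           // opening quote
        } else if (c == delim) {
            fields.push_back(current);  // field boundary
            current.clear();
        } else {
            current += c;
        }
    }
    fields.push_back(current);          // last field has no trailing delimiter
    return fields;
}
```

For the input line a,"b ""c"", d",e this sketch produces three fields: a, then b "c", d, then e. A tokenizer that treated every quote as a potential end quote would instead end the second field prematurely at the first doubled quote.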
The size of an OEBitVector object has been increased from 12 bytes to 16 bytes on 64-bit machines. The size is still 8 bytes on 32-bit machines.