Version 2.0.0¶
OEChem 2.0.0¶
New features¶
OEMolDatabase class added for providing fast random read-only access to all the file formats OEChem supports. Please see the new Molecular Database Handling chapter of the documentation for a more thorough description.
Warning
The meaning of the .smi file extension has changed in OEChem 2.0. The .smi file extension will now retain stereochemistry information. This could cause problems for systems that relied upon .smi to strip away stereochemistry.
The following changes have been made to how OEChem defines the various flavors of SMILES (see images demonstrating these changes in Table: SMILES File Formats):
The .smi, OEFormat::SMI, file format outputs canonical isomeric SMILES.
OEFormat::USM file format added (with the .usm file extension) that allows the generation of non-canonical, non-isomeric SMILES, i.e., the OEChem 1.x definition of the .smi file extension.
OEGetFormatString returns more descriptive values for the various flavors of SMILES regarding canonicalization and stereo information.
The default file format for molecule streams is now OEFormat::SMI and is identical to the OEFormat::ISM format.
OEFormat::CSV file format added to OEChem for round-tripping molecules and SD data to other software packages that support the comma-separated-value, .csv, file format. The .csv file format has largely become ubiquitous and is now standardized by RFC 4180. The following APIs were added for handling CSV files in OEChem:
Added automatic 2D coordinate generation to the following file formats when invoking the OEWriteMolecule high-level molecule writer:
In case of the OEFormat::MDL and the OEFormat::SDF file formats, 2D coordinates are generated only if the molecule has no coordinates, i.e., existing 2D or 3D coordinates will be left intact.
In case of the OEFormat::CDX file format, 2D coordinates are generated if the molecule has no 2D coordinates.
This default behavior can be turned off by using the OEOFlavor::MDL::Add2D, OEOFlavor::SDF::Add2D and OEOFlavor::CDX::Add2D flags, respectively.
OEGetFormatExtension will now return the most common file extension as the first element in the comma-separated list for the OEFormat::PDB, OEFormat::SDF, and OEFormat::RDF file formats.
OEGenerate2DCoordinates added to assign 2D coordinates to the given molecule.
OEMatchBase::IsValid and operator bool methods added.
OERoleSet abstract base class added as a mixin class to allow a class to contain a set of OERole objects for classification purposes. This is similar to how the OEBase class provides associative data container behavior as “generic data”. The following classes already derive from OERoleSet:
More classes across the toolkits may be added in the future as dictated by needs.
oemolstreambase::GetFileName method added to all molecule streams to return the file name used to open the stream, if a file name was used.
OEMatchBase::IsValid method added for determining whether the match contains any atoms or bonds.
OEAssignZap7Radii function added to assign radii from the ZAP7 set.
OEParseSmilesOptions class added for passing more complex SMILES parsing options to the OEParseSmiles function. This adds the ability to make OEParseSmiles quiet with regard to parsing failures through the OEParseSmilesOptions::SetQuiet method.
Warning
OEChem::OEConfBase::operator C * has been removed from the API. C++ users who used this API for direct coordinate access or derived their own classes from OEConfBase will need to change to the idiom described below.
OEChem::OEConfBase::operator C * has been removed in favor of the new OEConfBase::GetCoordsPtr methods. This allows implementations of OEConfBase to use alternative data storage for coordinates, e.g., 64-bit double precision floating point. Users of OEConfBase should not use OEConfBase::GetCoordsPtr directly, instead relying on the OEConstCoords and OEMutableCoords convenience classes.
Due to the above change, the following code will now fail to compile:
    OEMol mol;
    for (OEIter<OEConfBase> conf = mol.GetConfs(); conf; ++conf)
      mol.NewConf(*conf);
This code was not doing what most programmers thought it was doing. It was actually implicitly casting the OEConfBase & into a float * and then calling the version of NewConf that takes a float pointer. Users should change their code to the following:
    OEMol mol;
    for (OEIter<OEConfBase> conf = mol.GetConfs(); conf; ++conf)
      mol.NewConf(conf);
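The overload-resolution pitfall described above can be reproduced outside of OEChem. The following is a minimal sketch with hypothetical stand-in types (ToyConf, ToyIter and ToyMol are invented for illustration and are not OEChem classes); it only shows how an implicit conversion operator to float * can silently select a pointer-taking overload.

```cpp
#include <string>

// Hypothetical stand-in for a conformer that still exposes an implicit
// conversion to a raw coordinate pointer (the kind of operator that was removed).
struct ToyConf {
    float coords[3] = {0.0f, 0.0f, 0.0f};
    operator float*() { return coords; }
};

// Hypothetical stand-in for an iterator over conformers.
struct ToyIter {
    ToyConf* ptr;
    ToyConf& operator*() const { return *ptr; }
};

// Hypothetical stand-in for a molecule with two NewConf overloads.
struct ToyMol {
    std::string NewConf(const float*) { return "raw coordinates"; }
    std::string NewConf(const ToyIter&) { return "conformer iterator"; }
};

// Passing the dereferenced iterator compiles only because ToyConf converts
// implicitly to float*, so the pointer overload is chosen.
std::string overload_for_dereferenced() {
    ToyConf conf;
    ToyIter it{&conf};
    ToyMol mol;
    return mol.NewConf(*it);
}

// Passing the iterator itself selects the intended overload.
std::string overload_for_iterator() {
    ToyConf conf;
    ToyIter it{&conf};
    ToyMol mol;
    return mol.NewConf(it);
}
```

If the conversion operator is deleted from ToyConf, the first function stops compiling, which mirrors the compile failure described in these notes.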
The following methods that rely on OEConfBaseT<float, 2> and OEConfBaseT<double, 3> template instantiations were removed and replaced with OEMolBase versions:
Note
This is part of a larger effort to migrate away from the OEChem::OEConfBaseT template class in favor of a pure abstract OEConfBase class to allow for conformer storage in formats other than 32-bit precision float. This will only affect C++ users of the toolkit; even then, users who only used OEConfBase as a typedef should not be affected.
Major bug fixes¶
The OEFormat::MDL V3000 file format was unable to read multiple molecules from the same file if they were all in the V3000 format. Note, this did not affect the OEFormat::SDF files containing V3000 as those files contain ‘$ $ $ $’ to delimit separate molecule records.
OEChem::OEReadHeader will no longer corrupt the stack by arbitrarily zeroing out bytes on the stack whenever no OEHeader record is found in the OEFormat::OEB file.
OEMatch::AddPair will no longer crash if OEMatch::Clear was previously called.
oemolthreadbase::PutMol will now destroy the pointer passed to it if the underlying buffer returns false, e.g., the buffer has already been closed.
OEAddMols, OESuppressHydrogens, and OEPerceiveSymmetry will no longer cause a stack overflow and crash when the molecule has a large number of atoms.
Minor bug fixes¶
Removed unimplemented OEChem::OEFormat::TDT file format.
Removed OEChem::OEFormat::BIN format after being deprecated for 10 years.
OEMolToSmiles no longer outputs the title of the molecule.
If OE3DToAtomStereo or OE3DToBondStereo throws a warning message during a call to OEMolToSmiles, the warning message will no longer erroneously say it is during a call to OEWriteMolecule.
OESetComment and OEGetComment have been slightly optimized for speed. There is also a larger optimization for memory and file space in the .oeb format. Previously, setting the comment to an empty string would write superfluous data to the .oeb file.
OECopySDData and OECopyPDBData will no longer increase the memory consumption of the destination molecule whenever the source molecule does not contain any data. This was causing OEFormat::PDB files read into an OEMCMolBase to use more memory than necessary.
OEFormat::MOL2 parser will no longer create bonds with zero-order from “dummy bonds”. Zero-order bonds cause problems for many OEChem algorithms like OEKekulize.
OEFormat::MOL2 parser will now properly ignore lines between molecule records that start with the pound sign, “#”. Previously, these lines would cause OEChem to emit many warnings and cause the parser to fail.
Documentation fixes¶
The chapter about InChI failures has been removed since most of the InChI failures were fixed in the last release, OEChem 1.9.3 (October 2013).
Release notes section re-organized to make the current release more prominent.
Added documentation and code example for OESmartsLexReplace. The old function named SmartsLexReplace without the leading OE is considered deprecated and has been removed from the documentation.
OESystem 2.0.0¶
New features¶
Added support for handling the comma-separated-value, CSV, format specified by RFC 4180 with the following two free functions:
OEStringCSVJoin for creating a CSV record
OEStringCSVTokenize for parsing a CSV record
The following low-level functions were added to support CSV handling but can be ignored by most users:
OEStringCSVQuote added
OEStringCollapseQuotes added
OEStringStripQuotes added
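To make the RFC 4180 quoting rules behind these functions concrete, here is a minimal, self-contained sketch of joining and tokenizing a single CSV record. The names csv_quote, csv_join and csv_tokenize are hypothetical helpers written for this illustration; they are not the OESystem functions, whose signatures are not given in these notes, and the sketch handles one record at a time only.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Quote a field only when it contains a comma, quote, or line break;
// embedded quotes are escaped by doubling them, as RFC 4180 describes.
std::string csv_quote(const std::string& field) {
    if (field.find_first_of(",\"\r\n") == std::string::npos) return field;
    std::string out = "\"";
    for (char c : field) {
        if (c == '"') out += '"';  // double every embedded quote
        out += c;
    }
    out += '"';
    return out;
}

// Build one CSV record from a list of fields.
std::string csv_join(const std::vector<std::string>& fields) {
    std::string record;
    for (std::size_t i = 0; i < fields.size(); ++i) {
        if (i != 0) record += ',';
        record += csv_quote(fields[i]);
    }
    return record;
}

// Split one CSV record back into fields, removing the surrounding quotes
// and collapsing doubled quotes into a single quote character.
std::vector<std::string> csv_tokenize(const std::string& record) {
    std::vector<std::string> fields(1);
    bool in_quotes = false;
    for (std::size_t i = 0; i < record.size(); ++i) {
        const char c = record[i];
        if (in_quotes) {
            if (c == '"') {
                if (i + 1 < record.size() && record[i + 1] == '"') {
                    fields.back() += '"';  // escaped quote inside a quoted field
                    ++i;
                } else {
                    in_quotes = false;     // closing quote
                }
            } else {
                fields.back() += c;
            }
        } else if (c == '"') {
            in_quotes = true;              // opening quote
        } else if (c == ',') {
            fields.emplace_back();         // start the next field
        } else {
            fields.back() += c;
        }
    }
    return fields;
}
```

Joining and then tokenizing a record should return the original fields, which is the round-trip property that CSV support is meant to provide.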
OEStringJoin now accepts a parameter to determine if the output string should end with the selected delimiter.
OEStringTokenizeQuoted added a parameter for whether consecutive delimiters should be treated as a single delimiter. The default behavior is exactly the same as previous versions.
OEHalfFloat class added for storing floating point data as a 16-bit representation as specified by the IEEE 754-2008 standard to save on memory consumption and bandwidth.
Collections of OEHalfFloat objects, std::vector and arrays, can be attached to OEBase objects and round-tripped through .oeb files to save on disk size.
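As a rough illustration of what a 16-bit IEEE 754-2008 (binary16) representation involves, the sketch below converts between float and a 16-bit pattern. It is not the OEHalfFloat implementation: float_to_half and half_to_float are hypothetical names, the mantissa is truncated rather than rounded, and subnormal values are flushed to zero to keep the example short.

```cpp
#include <cstdint>
#include <cstring>

// Convert a 32-bit float to a binary16 bit pattern (simplified sketch:
// truncating conversion, subnormals flushed to signed zero).
std::uint16_t float_to_half(float value) {
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof bits);
    const std::uint32_t sign = (bits >> 16) & 0x8000u;
    const std::uint32_t exp32 = (bits >> 23) & 0xFFu;
    const std::uint32_t mantissa = bits & 0x007FFFFFu;
    if (exp32 == 0xFFu)  // infinity or NaN
        return static_cast<std::uint16_t>(sign | 0x7C00u | (mantissa ? 0x0200u : 0u));
    const std::int32_t exp16 = static_cast<std::int32_t>(exp32) - 127 + 15;  // re-bias
    if (exp16 >= 31)     // too large for binary16: becomes infinity
        return static_cast<std::uint16_t>(sign | 0x7C00u);
    if (exp16 <= 0)      // too small: flushed to zero in this sketch
        return static_cast<std::uint16_t>(sign);
    return static_cast<std::uint16_t>(
        sign | (static_cast<std::uint32_t>(exp16) << 10) | (mantissa >> 13));
}

// Convert a binary16 bit pattern back to a 32-bit float.
float half_to_float(std::uint16_t half) {
    const std::uint32_t sign = static_cast<std::uint32_t>(half & 0x8000u) << 16;
    const std::uint32_t exp16 = (half >> 10) & 0x1Fu;
    const std::uint32_t mantissa = half & 0x03FFu;
    std::uint32_t bits;
    if (exp16 == 0x1Fu)      // infinity or NaN
        bits = sign | 0x7F800000u | (mantissa << 13);
    else if (exp16 == 0u)    // zero (subnormals treated as zero in this sketch)
        bits = sign;
    else                     // normal number: re-bias exponent from 15 to 127
        bits = sign | ((exp16 + 112u) << 23) | (mantissa << 13);
    float out;
    std::memcpy(&out, &bits, sizeof out);
    return out;
}
```

The storage saving is the point: each value takes two bytes instead of four, at the cost of roughly three decimal digits of precision and a maximum finite value of 65504.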
The performance of reading plain-old-data attached to OEBase objects from .oeb files has been improved by about 10%.
OEErrorLevelToString free function added.
OEBFPosTEndian added for allowing 64-bit integers to be round-tripped to binary formats regardless of machine endianness.
OEWriteData added to allow easily writing any data type as binary.
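Round-tripping a 64-bit integer regardless of machine endianness generally means fixing the byte order in the file and building the bytes with shifts rather than copying memory. The sketch below shows that technique with hypothetical helpers (encode_u64_le and decode_u64_le); the little-endian on-disk order is an assumption made for the example and says nothing about the byte order actually used by the .oeb format.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Serialize a 64-bit integer in a fixed (little-endian) byte order.
// Shifts operate on values, not memory, so the result is the same on any host.
std::array<unsigned char, 8> encode_u64_le(std::uint64_t value) {
    std::array<unsigned char, 8> bytes{};
    for (std::size_t i = 0; i < bytes.size(); ++i)
        bytes[i] = static_cast<unsigned char>((value >> (8 * i)) & 0xFFu);
    return bytes;
}

// Rebuild the integer from the fixed byte order.
std::uint64_t decode_u64_le(const std::array<unsigned char, 8>& bytes) {
    std::uint64_t value = 0;
    for (std::size_t i = 0; i < bytes.size(); ++i)
        value |= static_cast<std::uint64_t>(bytes[i]) << (8 * i);
    return value;
}
```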
Major bug fixes¶
OEThrow mutex handling has been migrated from OEErrorHandler down a level into OESystem::OEErrorHandlerStreamImpl. This fixes the following issues:
The mutex can now be properly released during a process exit like OEErrorHandler::Fatal.
A deadlock will no longer occur if the implementation of OEErrorHandlerImplBase::Msg needs to throw a message itself.
Allows alternative faster and more scalable implementations of OEErrorHandlerImplBase to be created and used with OEThrow.
Minor bug fixes¶
OEBitVectorNumWords and OEBitVectorNumBytes now take and return size_t.
OESystem::OEBinaryTagMaxLength constant added and set to 1024, the maximum length of an explicit string tag in the .oeb file format.
OEStringTokenizeQuoted will no longer treat a quote as the end quote of a field if it is escaped by another quote. Table: OEStringTokenizeQuoted Change demonstrates the change to support proper CSV parsing.
Table: OEStringTokenizeQuoted Change
OEChem 1.x
Input: "foo"",bar",blah
Tokens: "foo"" | bar" | blah
OEChem 2.0+
Input: "foo"",bar",blah
Tokens: "foo"",bar" | blah
The size of an OEBitVector object has been increased from 12 bytes to 16 bytes on 64-bit machines. The size is still 8 bytes on a 32-bit machine.
Documentation fixes¶
IsTrue and IsFalse are deprecated and will be removed in OEChem 3.0; please convert to using OEIsTrue and OEIsFalse.
OEUnaryTrue and OEUnaryFalse documented as acceptable synonyms of OEIsTrue and OEIsFalse.
OEPlatform 2.0.0¶
New features¶
OEAddLicenseData function added to parse a string as if it is an OpenEye license file and then license the current process with it.
OELockCondition class added to provide a scoped lock on OECondition objects.
Major bug fixes¶
OEPlatform::oeistream::size was incorrectly being truncated to 32 bits on 32-bit machines. This resulted in incorrect file sizes being reported for files over 4 gigabytes in size on 32-bit machines.
OEPlatform::oeifstream::tell would return incorrect values if called after OEPlatform::oeifstream::size and after some bytes were already read from the stream.
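The effect of truncating a file size to 32 bits can be checked with a small worked example. The helper below is hypothetical and unrelated to the oeistream implementation; it only shows that keeping the low 32 bits of a size above 4 gigabytes yields a much smaller, wrong value.

```cpp
#include <cstdint>

// Hypothetical helper: the value a 32-bit size field would report
// for a given true size, i.e. the true size modulo 2^32.
std::uint32_t truncate_to_32_bits(std::uint64_t true_size) {
    return static_cast<std::uint32_t>(true_size);  // keeps only the low 32 bits
}
```

For instance, a 5 GiB file (5 × 2^30 bytes) would be reported as 1 GiB, because 5 × 2^30 = 2^32 + 2^30 and the 2^32 term is discarded.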
Minor bug fixes¶
OEPlatform::OEMallocaPtr::GetPtr added to allow the object to be explicitly converted to a pointer type.
OEMutex and OETryMutex destructors will now destroy themselves on pthread-based systems. Destroying mutexes is optional according to the pthread standard, but helpful in debugging possible deadlocks.
OEMutex and OETryMutex no longer use gthreads, instead using pthreads, allowing for integration with non-GCC systems like libc++ on OSX.
OEGrid 1.4.5¶
Major bug fixes¶
OEGridFileType::Ascii file writer will no longer sometimes corrupt the stack and crash.
When reading a CCP4 file, the standard deviation stored in the CCP4 header is now used to normalize the file.
OENormalizeGrid now properly normalizes by sigma (not variance).
Writing CCP4 maps now uses the original map statistics when possible.
Rotated skew grids attached as generic data are now round-trippable when saved to OEB.
Fixed a crash when interpolating grids where the rotation matrix inverts the target grid's bounding box.
MTZ files with more than 18 columns are now read properly.
Fixed a memory leak in OESequenceAlignment.
The constructor for the predicate OEHasResidueNumber now takes an int rather than an unsigned int because residue numbers can be negative.