GraphSim TK 2.3.0
New features
A new API has been added to perform rapid searching of fingerprints using the popcount method:
- OECreateFastFPDatabaseFile function and OECreateFastFPDatabaseOptions class to generate binary fingerprint files
- OEFastFPDatabase class and OEFastFPDatabaseMemoryType namespace to perform rapid in-memory or memory-mapped fingerprint searches using the popcount method
- OEFastFPDatabaseParams utility class and OEAreCompatibleDatabases utility function
- OEIsFastFPDatabaseReady and OEGetPopCountMethod functions and OEPopCountMethod namespace to check whether the popcount method is supported on the current hardware
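As a rough illustration of what a popcount-based similarity calculation looks like, the sketch below computes a similarity score from bit counts on 256-bit fingerprints. It uses only the C++ standard library. The names kNumBits, Fingerprint, and Tanimoto are hypothetical and not part of GraphSim TK, Tanimoto is assumed here only as an example measure, and the sketch makes no claim about how OEFastFPDatabase works internally.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

// Hypothetical fixed-size fingerprint (illustration only, not a GraphSim TK type).
constexpr std::size_t kNumBits = 256;
using Fingerprint = std::bitset<kNumBits>;

// Tanimoto similarity from bit counts: |A & B| / |A | B|.
// std::bitset::count() returns the number of set bits (the "popcount");
// compilers may lower it to a hardware popcount instruction where available.
double Tanimoto(const Fingerprint& a, const Fingerprint& b) {
    assert(a.size() == b.size());
    const std::size_t both = (a & b).count();
    const std::size_t either = (a | b).count();
    // Two empty fingerprints share no bits; report 0 rather than divide by zero.
    return either == 0 ? 0.0
                       : static_cast<double>(both) / static_cast<double>(either);
}
```

The score depends only on two bit counts, which is why the speed of the popcount operation dominates the cost of a search over many fingerprints.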
Note
GraphSim TK currently only supports the popcount search method for fingerprints whose size is a multiple of 256. This means that the OEFPType::MACCS166 fingerprint type is currently not supported. See the User-defined Fingerprint section.
Note
OEFastFPDatabase gives identical results to OEFPDatabase. However, OEFPDatabase calculates similarity scores in single precision (float), while OEFastFPDatabase uses double precision. As a result, small differences in similarity scores can be observed.
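The size of such differences can be estimated with a minimal sketch that evaluates the same ratio of bit counts in float and in double. The function names below are hypothetical and the arithmetic is only an assumed stand-in for a similarity calculation; it is not taken from either database class.

```cpp
#include <cassert>
#include <cmath>

// The same ratio of bit counts, evaluated in single precision.
float RatioFloat(unsigned both, unsigned either) {
    assert(either != 0);
    return static_cast<float>(both) / static_cast<float>(either);
}

// The same ratio, evaluated in double precision.
double RatioDouble(unsigned both, unsigned either) {
    assert(either != 0);
    return static_cast<double>(both) / static_cast<double>(either);
}

// Absolute difference between the two results.
double PrecisionGap(unsigned both, unsigned either) {
    return std::fabs(static_cast<double>(RatioFloat(both, either)) -
                     RatioDouble(both, either));
}
```

For a ratio such as 1/3, which neither type represents exactly, the gap is nonzero but far below any precision usually reported for similarity scores. Exact ratios such as 1/2 show no gap at all.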
API Change
The OESimScore class, which is used by both the OEFPDatabase class and the new OEFastFPDatabase class, stores a similarity value with a corresponding molecule index. This pair is now stored as “double” and “size_t” rather than “float” and “unsigned int”. This non-breaking API change allows GraphSim TK to support molecular files that contain more than \(2^{32}\) entries and to return the double-precision similarity scores calculated by the new OEFastFPDatabase class.
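The reason a 32-bit index cannot address such files can be shown with a short sketch. It uses fixed-width standard types in place of "unsigned int" and "size_t" so that the behavior does not depend on the platform; the names kMax32, FitsIn32, and NarrowIndex are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Largest value a 32-bit unsigned index can hold: 2^32 - 1.
constexpr std::uint64_t kMax32 = std::numeric_limits<std::uint32_t>::max();

// True if a 64-bit index survives narrowing to 32 bits unchanged.
bool FitsIn32(std::uint64_t idx) {
    return idx <= kMax32;
}

// Narrowing keeps only the low 32 bits, i.e. the value modulo 2^32,
// so distinct large indices can collapse onto small ones.
std::uint32_t NarrowIndex(std::uint64_t idx) {
    return static_cast<std::uint32_t>(idx);
}
```

An index just past the 32-bit limit silently maps back to a small value, which would point at the wrong molecule in a very large file.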
C++-specific changes
The above-mentioned OESimScore class API change means that the following warning messages could be encountered when compiling old code with the new GraphSim TK:
implicit conversion loses floating point precision: 'double' to 'float'
    for (OEIter<OESimScore> si = fpdb.GetScores(mol); si; ++si)
      float score = si->GetScore(); // GetScore() now returns double
implicit conversion loses integer precision: 'size_t' to 'unsigned int'
    for (OEIter<OESimScore> si = fpdb.GetScores(mol); si; ++si)
      unsigned int molidx = si->GetIdx(); // GetIdx() now returns size_t
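One way to avoid these warnings is to declare the receiving variables with the new return types. The sketch below shows the idea on a hypothetical stand-in class, SimScoreStandIn, which only mimics the GetScore() and GetIdx() return types described above; it is not the real OESimScore, and IndexOfBestScore is an illustrative helper rather than a toolkit function.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in that mirrors only the return types of the updated
// OESimScore accessors (double score, size_t index).
class SimScoreStandIn {
public:
    SimScoreStandIn(double score, std::size_t idx) : score_(score), idx_(idx) {}
    double GetScore() const { return score_; }
    std::size_t GetIdx() const { return idx_; }

private:
    double score_;
    std::size_t idx_;
};

// Caller code with locals declared as double and std::size_t, so no implicit
// narrowing takes place and no conversion warning is triggered.
std::size_t IndexOfBestScore(const std::vector<SimScoreStandIn>& scores) {
    assert(!scores.empty());
    double best = scores.front().GetScore();
    std::size_t bestIdx = scores.front().GetIdx();
    for (const SimScoreStandIn& s : scores) {
        const double score = s.GetScore();
        if (score > best) {
            best = score;
            bestIdx = s.GetIdx();
        }
    }
    return bestIdx;
}
```

Where existing code genuinely needs the narrower type, an explicit static_cast states that intent and also silences the warning, at the cost of the precision or range described earlier.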
Java-specific changes
The above-mentioned OESimScore class API change means that a similarity value with a corresponding molecule index is now expected to be “double” and “long”, respectively.
Documentation changes
A new section, Building and searching fingerprint database, has been added.
New Generating fingerprint file for fast fingerprint search and Searching fast fingerprint database sections have been added with examples showing generation and searching of fingerprints using the OEFastFPDatabase class.