OECreateIDStrings
std::string OECreateIDString(const OEChem::OEMolBase &mol)
OESystem::OEIterBase<const std::string> * OECreateIDStrings(const OEChem::OEMolBase &mol)
This pair of functions exposes the graph-edit similarity used in clustering BROOD hitlists.
Because clustering by pairwise comparison is naturally O(N^2), the API is designed for
efficiency using an asymmetric approach. For any ‘accepted’ molecule, the OECreateIDStrings
function is used to generate the set of canonical SMILES of all fragments that would be
considered similar to the input molecule (this set can be stored for fast lookup). To determine
whether a new molecule is similar to the original molecule, OECreateIDString is called on the
new molecule, and the single string generated is compared against all the strings previously
created. If the new string exactly matches any of the strings created from the prior molecule,
the two molecules are deemed similar.
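
The lookup pattern described above can be sketched as follows. This is a minimal illustration, not part of the toolkit: the AcceptedIndex class and its member names are hypothetical, and it operates on plain strings so that it stays self-contained. In real use, the strings passed to AddAccepted would presumably be collected by iterating the result of OECreateIDStrings for an accepted molecule, and the string passed to IsSimilar would be the result of OECreateIDString for the new molecule.

```cpp
#include <cstddef>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical helper (not an OpenEye class): stores the ID strings of all
// accepted molecules so that a new molecule can be checked with one lookup.
class AcceptedIndex {
public:
  // Store every ID string generated for an accepted molecule.
  // Assumption: the caller has copied the strings out of the iterator
  // returned by OECreateIDStrings into a std::vector.
  void AddAccepted(const std::vector<std::string>& idStrings) {
    ids_.insert(idStrings.begin(), idStrings.end());
  }

  // Return true if the single ID string of a new molecule exactly matches
  // any stored string, i.e. the new molecule is deemed similar to some
  // previously accepted molecule.
  bool IsSimilar(const std::string& queryId) const {
    return ids_.count(queryId) != 0;
  }

  // Number of distinct ID strings currently stored.
  std::size_t Size() const { return ids_.size(); }

private:
  std::unordered_set<std::string> ids_;
};
```

With a hash set, each similarity check is a single exact-match lookup with constant average cost, rather than a comparison against every accepted molecule in turn. That is one way to take advantage of the asymmetric design: the larger set of strings is generated once per accepted molecule, and only one string is generated per query.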