This is a preliminary API and may be improved based on user feedback. It is currently available in C++ and Python.

class OESubSearchQuery

The OESubSearchQuery class is used to submit queries to be searched in a database (OESubSearchDatabase).


OESubSearchQuery(const OEQMolBase &query, const size_t maxmatches=1000u)

Creates an OESubSearchQuery object.


query: The query molecule (OEQMolBase).


maxmatches: The maximum number of matches that will be kept.


void SetFilter(const OESystem::OEUnaryPredicate<OEMolBase>&)

Sets a molecule predicate that can be used to filter out molecules based on molecular properties other than the existence of a certain substructure.

The following code snippet shows how to use the OESubSearchQuery.SetFilter method to identify molecules that match the given SMARTS pattern and also have a molecular weight in the given range:

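from openeye import oechem

# dbfname (database file name) and nrthreads (number of search threads)
# are expected to be defined by the caller
ssdb = oechem.OESubSearchDatabase(dbfname, oechem.OESubSearchDatabaseType_Default, nrthreads)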

qmol = oechem.OEQMol()
oechem.OEParseSmarts(qmol, "c1c[n,o]cc1")

maxmatches = 100
query = oechem.OESubSearchQuery(qmol, maxmatches)

result = oechem.OESubSearchResult()
ssdb.Search(result, query)
print("Number of total matches = {}".format(result.NumTotalMatches()))

# search filtered by molecular weight

minweight, maxweight = 200.0, 350.0
query.SetFilter(MoleculeWeightPredicate(minweight, maxweight))

filteredresult = oechem.OESubSearchResult()
ssdb.Search(filteredresult, query)
print("Number of total matches (filtered) = {}".format(filteredresult.NumTotalMatches()))

mol = oechem.OEGraphMol()
for index in filteredresult.GetMatchIndices():
    if ssdb.GetMolecule(mol, index):
        print("weight= {:.3f} {}".format(oechem.OECalculateMolecularWeight(mol), oechem.OEMolToSmiles(mol)))

The output of the code snippet above might look like this:

Number of total matches = 20
Number of total matches (filtered) = 7
weight= 204.225 c1ccc2c(c1)c(c[nH]2)C[C@H](C(=O)O)N
weight= 218.252 CN[C@@H](Cc1c[nH]c2c1cccc2)C(=O)O
weight= 245.277 CC(=O)N[C@@H](Cc1c[nH]c2c1cccc2)C(=O)N
weight= 275.303 c1ccc2c(c1)c(c[nH]2)C[C@@H](C(=O)O)NC(=O)CCN
weight= 260.288 CC(=O)N[C@@H](Cc1c[nH]c2c1cccc2)C(=O)OC
weight= 274.315 CCOC(=O)[C@H](Cc1c[nH]c2c1cccc2)NC(=O)C
weight= 254.327 CN1CC(C=C2[C@H]1Cc3c[nH]c4c3c2ccc4)CO

where MoleculeWeightPredicate is a molecule predicate that is defined as:

class MoleculeWeightPredicate(oechem.OEUnaryMolBasePred):
    def __init__(self, minweight, maxweight):
        # initialize the wrapped C++ base class before storing state
        oechem.OEUnaryMolBasePred.__init__(self)
        self.minweight = minweight
        self.maxweight = maxweight

    def __call__(self, mol):
        # accept only molecules whose weight lies in [minweight, maxweight]
        weight = oechem.OECalculateMolecularWeight(mol)
        return self.minweight <= weight <= self.maxweight

    def CreateCopy(self):
        # __disown__ is required to allow C++ to take ownership of this
        # object and its memory
        return MoleculeWeightPredicate(self.minweight, self.maxweight).__disown__()


During the search, the predicate set by the OESubSearchQuery.SetFilter method is applied after the screening phase and before the atom-by-atom validation of the substructure match. Note that, to minimize its memory footprint, the OESubSearchDatabase stores only the molecular graphs (no coordinates) and the titles of the molecules.
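
The ordering of the three phases can be illustrated with a small, self-contained model. This is a conceptual sketch only and does not reflect how OESubSearchDatabase is implemented: the Record type, the search function, and the string-based "screen" and "validation" tests are hypothetical stand-ins. It is meant to show that a cheap predicate placed between screening and validation reduces the number of expensive validation steps.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """Hypothetical stand-in for a database entry (not an OEChem type)."""
    title: str
    smiles: str
    weight: float


def search(records, fragment, mol_filter=None):
    """Return (matching indices, number of validations performed)."""
    matches = []
    validations = 0
    for index, rec in enumerate(records):
        # Phase 1: cheap screen (toy rule: every character of the
        # fragment must occur somewhere in the record's string).
        if not set(fragment) <= set(rec.smiles):
            continue
        # Phase 2: optional user filter, applied before the costly step.
        if mol_filter is not None and not mol_filter(rec):
            continue
        # Phase 3: "validation" (toy rule: exact substring test).
        validations += 1
        if fragment in rec.smiles:
            matches.append(index)
    return matches, validations


records = [
    Record("a", "CCO", 46.07),
    Record("b", "CCCO", 60.10),
    Record("c", "OCC", 46.07),
    Record("d", "CCCCO", 74.12),
    Record("e", "CCN", 45.08),
]

unfiltered, n_unfiltered = search(records, "CO")
filtered, n_filtered = search(records, "CO", lambda r: 50.0 <= r.weight <= 80.0)

print(unfiltered, n_unfiltered)  # prints: [0, 1, 3] 4
print(filtered, n_filtered)      # prints: [1, 3] 2
```

In this toy data set the filter halves the number of validations (from 4 to 2), because records rejected by the predicate never reach the third phase.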



size_t GetMaxMatches() const

Returns the maximum number of matches that are kept when searching an OESubSearchDatabase. The OESubSearchDatabase.GetMatchIndices and OESubSearchDatabase.GetMatchTitles methods will terminate when this limit is reached. The default is 1000.


void SetMaxMatches(const size_t limit)

Sets the maximum match limit.


While there is no upper limit on how many matches can be retrieved by the search, it is not recommended to set this limit very high (>10K). Searching a very large database with a very generic query can result in millions of indices or titles being stored internally. The total number of matches can be determined (without storing all matches) either by using the OESubSearchDatabase.NumMatches method or via the OESubSearchResult.NumTotalMatches counter when using the OESubSearchDatabase.Search method.
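
The difference between the number of stored matches and the total number of matches can be sketched with a minimal model. The collect_matches function below is hypothetical and is not part of OEChem. It only illustrates the idea of bounding storage with a match limit while still counting every hit, on the assumption that the total counter keeps counting after the limit is reached.

```python
def collect_matches(hits, maxmatches=1000):
    """Keep at most `maxmatches` indices while counting every hit.

    `hits` is an iterable of booleans, one per database entry
    (a toy replacement for the actual substructure test).
    """
    kept = []
    total = 0
    for index, is_hit in enumerate(hits):
        if not is_hit:
            continue
        total += 1
        # Storage is bounded by the limit; the counter is not.
        if len(kept) < maxmatches:
            kept.append(index)
    return kept, total


# Toy data: every third entry out of 20 is a hit.
hits = [i % 3 == 0 for i in range(20)]
kept, total = collect_matches(hits, maxmatches=4)

print(kept)   # prints: [0, 3, 6, 9]
print(total)  # prints: 7
```

Memory use grows with the limit rather than with the number of hits, which is why a modest limit combined with a total counter is preferable to a very large limit.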