OESubSearchQuery¶
Attention
This API is currently available in C++ and Python.
class OESubSearchQuery
The OESubSearchQuery class is used to submit queries to be searched in a database (OESubSearchDatabase).
See also
Code Example
Constructors¶
OESubSearchQuery(const OEQMolBase &query, const size_t maxmatches=1000u)
Creates an OESubSearchQuery object.
- query
The query molecule (OEQMolBase).
- maxmatches
The maximum number of matches that will be kept.
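As a minimal sketch of the constructor in use, the helper below builds a query from a SMARTS string. The name make_query is hypothetical and not part of the toolkit. It assumes the OpenEye Python toolkits are installed and licensed; the import is deferred into the function so that defining the helper does not itself require the toolkit.

```python
def make_query(smarts, maxmatches=1000):
    """Build an OESubSearchQuery from a SMARTS string (hypothetical helper)."""
    # Deferred import: calling this requires the OpenEye toolkits and a license.
    from openeye import oechem

    qmol = oechem.OEQMol()
    if not oechem.OEParseSmarts(qmol, smarts):
        raise ValueError("cannot parse SMARTS: {}".format(smarts))
    # maxmatches caps how many matches are kept, as described above.
    return oechem.OESubSearchQuery(qmol, maxmatches)
```

With the toolkit available, make_query("c1c[n,o]cc1", 100) should return a query whose GetMaxMatches() reports 100.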
SetFilter¶
void SetFilter(const OESystem::OEUnaryPredicate<OEMolBase>&)
Sets a molecule predicate that can be used to filter out molecules based on molecular properties other than the existence of a certain substructure.
The following code snippet shows how to use the OESubSearchQuery.SetFilter method to identify molecules that match the given SMARTS pattern and also have a molecular weight in the given range:
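from openeye import oechem

# dbfname (database file name) and nrthreads (number of search threads)
# are assumed to be defined elsewhere
ssdb = oechem.OESubSearchDatabase(dbfname, oechem.OESubSearchDatabaseType_Default, nrthreads)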
qmol = oechem.OEQMol()
oechem.OEParseSmarts(qmol, "c1c[n,o]cc1")
maxmatches = 100
query = oechem.OESubSearchQuery(qmol, maxmatches)
result = oechem.OESubSearchResult()
ssdb.Search(result, query)
print("Number of total matches = {}".format(result.NumTotalMatches()))
# search filtered by molecule weight
minweight, maxweight = 200.0, 350.0
query.SetFilter(MoleculeWeightPredicate(minweight, maxweight))
filteredresult = oechem.OESubSearchResult()
ssdb.Search(filteredresult, query)
print("Number of total matches (filtered) = {}".format(filteredresult.NumTotalMatches()))
mol = oechem.OEGraphMol()
for index in filteredresult.GetMatchIndices():
    if ssdb.GetMolecule(mol, index):
        print("weight= {:.3f} {}".format(oechem.OECalculateMolecularWeight(mol), oechem.OEMolToSmiles(mol)))
The output of the code snippet above might look like this:
Number of total matches = 20
Number of total matches (filtered) = 7
weight= 204.225 c1ccc2c(c1)c(c[nH]2)C[C@H](C(=O)O)N
weight= 218.252 CN[C@@H](Cc1c[nH]c2c1cccc2)C(=O)O
weight= 245.277 CC(=O)N[C@@H](Cc1c[nH]c2c1cccc2)C(=O)N
weight= 275.303 c1ccc2c(c1)c(c[nH]2)C[C@@H](C(=O)O)NC(=O)CCN
weight= 260.288 CC(=O)N[C@@H](Cc1c[nH]c2c1cccc2)C(=O)OC
weight= 274.315 CCOC(=O)[C@H](Cc1c[nH]c2c1cccc2)NC(=O)C
weight= 254.327 CN1CC(C=C2[C@H]1Cc3c[nH]c4c3c2ccc4)CO
where MoleculeWeightPredicate is a molecule predicate that is defined as:
class MoleculeWeightPredicate(oechem.OEUnaryMolBasePred):
    def __init__(self, minweight, maxweight):
        oechem.OEUnaryMolBasePred.__init__(self)
        self.minweight = minweight
        self.maxweight = maxweight

    def __call__(self, mol):
        weight = oechem.OECalculateMolecularWeight(mol)
        return (weight >= self.minweight and weight <= self.maxweight)

    def CreateCopy(self):
        # __disown__ is required to allow C++ to take ownership of this
        # object and its memory
        return MoleculeWeightPredicate(self.minweight, self.maxweight).__disown__()
Note
During the search, the predicate set by the OESubSearchQuery.SetFilter method is applied after the screening phase and before the atom-by-atom validation of the substructure search match.
To minimize its memory footprint, the OESubSearchDatabase stores only the molecular graphs (no coordinates) and the titles of the molecules, so a filter predicate should rely only on properties that can be derived from the graph or the title.
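The ordering described in this note can be illustrated with a small, toolkit-independent sketch. It is a conceptual model only, not OpenEye's implementation: plain strings stand in for molecules, and staged_search, screen, size_filter, and validate are hypothetical names. It shows why a filter placed before the expensive validation step reduces the number of validations performed.

```python
def staged_search(records, screen, predicate, validate):
    """Yield indices of records that pass all three stages, in order."""
    for index, record in enumerate(records):
        if not screen(record):  # cheap screening phase
            continue
        if predicate is not None and not predicate(record):  # user filter
            continue
        if validate(record):  # expensive exact match, run last
            yield index


# Toy stand-ins: strings instead of molecules.
records = ["ccncc", "cccccc", "ccncccccccc", "cnc"]
validated = []  # records that reached the validation stage


def screen(record):
    return "n" in record


def size_filter(record):
    return 3 <= len(record) <= 5


def validate(record):
    validated.append(record)
    return "cnc" in record


hits = list(staged_search(records, screen, size_filter, validate))
print(hits)       # indices of matching records
print(validated)  # only records that passed screening and the filter
```

In this toy run, the third record passes screening but is rejected by the filter, so it never reaches validate.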
See also
Predicate Functors chapter
GetMaxMatches¶
size_t GetMaxMatches() const
Returns the maximum match limit when searching OESubSearchDatabase.
The OESubSearchDatabase.GetMatchIndices and OESubSearchDatabase.GetMatchTitles methods will terminate when this limit is reached.
The default is 1000.
SetMaxMatches¶
void SetMaxMatches(const size_t limit)
Sets the maximum match limit.
Note
While there is no upper limit on how many matches can be retrieved by the search, setting this limit very high (above 10,000) is not recommended.
Searching a very large database with a very generic query can result in internally storing millions of indices or titles.
The total number of matches can be determined without storing all matches, either with the OESubSearchDatabase.NumMatches method or via the OESubSearchResult.NumTotalMatches counter when using the OESubSearchDatabase.Search method.
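The difference between the number of stored matches and the total match count can be sketched without the toolkit. The function below is a conceptual model under stated assumptions, not how OESubSearchDatabase works internally: collect_matches is a hypothetical name, and the input is any iterable of matching indices.

```python
def collect_matches(match_indices, maxmatches=1000):
    """Keep at most maxmatches indices while counting every match."""
    kept = []
    total = 0
    for index in match_indices:
        total += 1
        if len(kept) < maxmatches:
            kept.append(index)  # memory use is bounded by maxmatches
    return kept, total


# 25 matching indices, but only the first 10 are stored.
kept, total = collect_matches(range(0, 50, 2), maxmatches=10)
print(len(kept), total)
```

A small limit keeps memory use bounded, while the counter still reports how many matches exist in total.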