Predicate Functors
A functor (function object) is any object that can be called as if it were a
function, i.e. an instance of a class that defines the __call__()
method.
Functors that return bool are an important special case. A unary function whose return type is bool is called a predicate.
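As a toolkit-independent illustration of the definition above, the following minimal sketch defines a class whose instances are callable and return bool. The class name IsEven is hypothetical and is not part of OEChem TK; it only shows the mechanics of __call__().

```python
class IsEven:
    """A functor: instances can be called like a function."""

    def __call__(self, value):
        # Unary (one argument) and returns bool, so this is a predicate.
        return value % 2 == 0


is_even = IsEven()       # construct the functor object
print(is_even(4))        # the call syntax invokes __call__
print(is_even(7))
print(callable(is_even))
```

Because the predicate is an ordinary object, it can be stored in a variable or passed to another function before it is ever called.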
In OEChem TK, functors are often passed as arguments to another function,
which then calls them from inside its own body.
This is the concept of a callback: the receiving
function supplies the argument and "calls back" to the functor that
was passed in.
Generator methods such as OEMolBase.GetAtoms
can take a functor
as an argument and use the callback mechanism to iterate over only those
atoms that satisfy the functor passed to them.
See the example in the Atom or Bond Subset Iteration section.
In the example below, the function CountAtoms
loops over the atoms
and calls back to the predicate functor pred
for each
atom. If the predicate returns true, a counter is incremented.
The main
program passes the predefined atom predicate OEIsOxygen
to the CountAtoms
function, which therefore counts the number of oxygen atoms in the molecule.
(Note that this functionality is already implemented in OEChem TK as the
OECount
function.)
Listing 1: Using functor callbacks
from openeye import oechem
def CountAtoms(mol, pred):
    counts = 0
    for atom in mol.GetAtoms():
        if pred(atom):
            counts += 1
    return counts
mol = oechem.OEGraphMol()
oechem.OESmilesToMol(mol, "c1cc[nH]c1CC2COCNC2")
print("Number of oxygen atoms =", CountAtoms(mol, oechem.OEIsOxygen()))
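Listing 1 applies the predicate inside its own loop. A generator method that accepts a predicate moves that test into the iteration itself. The sketch below illustrates the idea without the toolkit: ToyAtom, ToyMol, get_atoms and this IsOxygen class are hypothetical stand-ins written for this example, not OEChem TK classes, and the real GetAtoms implementation may differ.

```python
class ToyAtom:
    """Hypothetical stand-in for an atom: an index and an atomic number."""

    def __init__(self, idx, atomic_num):
        self.idx = idx
        self.atomic_num = atomic_num


class ToyMol:
    """Hypothetical stand-in for a molecule that owns a list of atoms."""

    def __init__(self, atomic_nums):
        self._atoms = [ToyAtom(i, z) for i, z in enumerate(atomic_nums)]

    def get_atoms(self, pred=None):
        # Call back to the predicate for each atom; yield only the matches.
        for atom in self._atoms:
            if pred is None or pred(atom):
                yield atom


class IsOxygen:
    """Toy predicate functor: true for atomic number 8."""

    def __call__(self, atom):
        return atom.atomic_num == 8


mol = ToyMol([6, 6, 8, 7, 8])
oxygen_indices = [atom.idx for atom in mol.get_atoms(IsOxygen())]
print(oxygen_indices)  # [2, 4]
```

The caller never writes the if test; it only chooses which predicate to hand to the generator.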
Built-in Functors
There are many useful functors already defined in OEChem TK. These can be used by programmers with little or no understanding of the details of how functors work. A programmer can simply pass them to one of the many OEChem TK functions and methods which take predicates as arguments.
Atom Functors
Access | Functor Name
---|---
ring atoms | OEAtomIsInRing
chain atoms |
atom with specified atom index |
atom with selected atom index |
atom with specified atom name |
atoms with specified atom stereo |
atoms with specified formal charge |
atoms with specified number of heavy atom neighbors |
aromatic atoms | OEIsAromaticAtom
atoms with specific hybridization |
chiral atoms |
atoms with anisotropic B-factor parameters |
atoms with specified map index |
atoms representing R-Groups |
valid atoms (by OpenEye valence conventions) |
valid atoms (by MDL valence conventions) |
\(n^{th}\) atom |
atoms that are both terminal and heavy |
atom membership in a set of atoms |
Listing 2: Using predefined atom functors
from openeye import oechem
mol = oechem.OEGraphMol()
oechem.OESmilesToMol(mol, "c1cc[nH]c1CC2COCNC2")
print("Number of heavy atoms =", oechem.OECount(mol, oechem.OEIsHeavy()))
print("Number of ring atoms =", oechem.OECount(mol, oechem.OEAtomIsInRing()))
The output of the preceding program is the following:
Number of heavy atoms = 12
Number of ring atoms = 11
Atomic Number Functors
Access | Functor Name
---|---
atoms with specified atomic number | OEHasAtomicNum
carbon atoms |
halogen atoms |
heavy atoms | OEIsHeavy
hetero atoms |
explicit hydrogen atoms |
metal atoms |
nitrogen atoms | OEIsNitrogen
oxygen atoms | OEIsOxygen
sulfur atoms |
phosphorus atoms |
non-carbon atoms |
polar hydrogen atoms |
Please note that the following two lines produce the same result.
print("Number of oxygen atoms =", oechem.OECount(mol, oechem.OEHasAtomicNum(oechem.OEElemNo_O)))
print("Number of oxygen atoms =", oechem.OECount(mol, oechem.OEIsOxygen()))
Bond Functors
Access | Functor Name
---|---
ring bonds | OEBondIsInRing
chain bonds |
bond with specified bond index |
bond with selected bond index |
bonds with specified bond order | OEHasOrder
rotatable bonds | OEIsRotor
chiral bonds |
bonds with specific bond stereo |
aromatic bonds |
bond membership in a set of bonds |
Listing 3: Using predefined bond functors
from openeye import oechem
mol = oechem.OEGraphMol()
oechem.OESmilesToMol(mol, "CC(=O)Nc1c[nH]cc1")
print("Number of ring bonds =", oechem.OECount(mol, oechem.OEBondIsInRing()))
print("Number of rotor bonds =", oechem.OECount(mol, oechem.OEIsRotor()))
The output of the preceding program is the following:
Number of ring bonds = 5
Number of rotor bonds = 2
Group Functors
Access | Functor Name
---|---
groups with a specific atom |
groups with a specific bond |
groups with a specific type |
groups that store MDL stereo information |
Reaction Component Functors
Access | Functor Name
---|---
atoms of the catalysts or solvents of a reaction |
atoms of the product molecule(s) |
atoms of the reactant molecule(s) |
Conformer Functors
Access | Functor Name
---|---
conformer with specified index |
conformer with selected index |
Residue Data Functors
Access | Functor Name
---|---
atoms with specified residue properties |
atoms with specified chain id |
atoms with specified residue number |
atoms with an alternate location |
atoms with specified fragment number |
atoms with specified PDB index |
alpha carbon |
backbone atom |
water |
nucleic acid base |
nucleic acid sugar |
nucleic acid phosphate |
Composite Functors
Occasionally, one may want to use a logical operator to join two or more functors. The following table shows the composite functors defined in OEChem TK.
Logical Not | Logical or | Logical and | Description
---|---|---|---
OENotAtom | OEOrAtom | OEAndAtom | atom composite functors
OENotBond | OEOrBond | OEAndBond | bond composite functors
 | | | conformation composite functors
 | | | group composite functors
 | | | roleset composite functors
Each composite functor takes the appropriate number of predicates as arguments and generates a single unary predicate. The following example demonstrates how to use composite functors to build expressions from OEChem TK’s predefined atom predicates.
Listing 4: Combining predefined atom predicates
from openeye import oechem
mol = oechem.OEGraphMol()
oechem.OESmilesToMol(mol, "c1cnc(O)cc1CCCBr")
print("Number of chain atoms =", end=" ")
print(oechem.OECount(mol, oechem.OENotAtom(oechem.OEAtomIsInRing())))
print("Number of aromatic nitrogens =", end=" ")
print(oechem.OECount(mol, oechem.OEAndAtom(oechem.OEIsNitrogen(), oechem.OEIsAromaticAtom())))
print("Number of non-carbons =", end=" ")
print(oechem.OECount(mol, oechem.OENotAtom(oechem.OEHasAtomicNum(oechem.OEElemNo_C))))
print("Number of nitrogen and oxygen atoms =", end=" ")
print(oechem.OECount(mol, oechem.OEOrAtom(oechem.OEHasAtomicNum(oechem.OEElemNo_N),
                                          oechem.OEHasAtomicNum(oechem.OEElemNo_O))))
The OECount
function returns the number of objects (in this case atoms)
matching the given predicate argument.
The output of the preceding program is the following:
Number of chain atoms = 5
Number of aromatic nitrogens = 1
Number of non-carbons = 3
Number of nitrogen and oxygen atoms = 2
Composite functors can be used similarly to combine predefined bond predicates.
Listing 5: Combining predefined bond predicates
from openeye import oechem
mol = oechem.OEGraphMol()
oechem.OESmilesToMol(mol, "N#CCC1CCNC=C1")
print("Number of non-rotatable bonds =", end=" ")
print(oechem.OECount(mol, oechem.OENotBond(oechem.OEIsRotor())))
print("Number of ring double bonds =", end=" ")
print(oechem.OECount(mol, oechem.OEAndBond(oechem.OEBondIsInRing(), oechem.OEHasOrder(2))))
print("Number of double or triple bonds =", end=" ")
print(oechem.OECount(mol, oechem.OEOrBond(oechem.OEHasOrder(2), oechem.OEHasOrder(3))))
The output of the preceding program is the following:
Number of non-rotatable bonds = 8
Number of ring double bonds = 1
Number of double or triple bonds = 2
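To make the composition mechanism concrete, here is a toolkit-independent sketch of how "not", "and" and "or" combinators can wrap unary predicates and yield a new unary predicate. The classes Not, And and Or and the two helper functions are hypothetical names chosen for this illustration; they are an assumption about the general pattern, not the OEChem TK implementation.

```python
class Not:
    """Wraps one predicate and negates its result."""

    def __init__(self, pred):
        self.pred = pred

    def __call__(self, item):
        return not self.pred(item)


class And:
    """Wraps two predicates; true only if both are true."""

    def __init__(self, left, right):
        self.left = left
        self.right = right

    def __call__(self, item):
        return bool(self.left(item) and self.right(item))


class Or:
    """Wraps two predicates; true if at least one is true."""

    def __init__(self, left, right):
        self.left = left
        self.right = right

    def __call__(self, item):
        return bool(self.left(item) or self.right(item))


def is_positive(x):
    return x > 0


def is_even(x):
    return x % 2 == 0


values = [-4, -1, 0, 3, 6]
not_positive = [v for v in values if Not(is_positive)(v)]
positive_and_even = [v for v in values if And(is_positive, is_even)(v)]
positive_or_even = [v for v in values if Or(is_positive, is_even)(v)]
print(not_positive)        # [-4, -1, 0]
print(positive_and_even)   # [6]
print(positive_or_even)    # [-4, 0, 3, 6]
```

Since each combinator is itself a unary predicate, combinators can be nested, for example Not(And(is_positive, is_even)).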
User Defined Functors
While many predefined functors exist in OEChem TK, situations that call for a new user-defined functor are easy to find.
A user-defined functor can be written by deriving from either the OEUnaryAtomPred or the OEUnaryBondPred class.
The following example shows a user-defined atom functor that returns true for aliphatic nitrogens.
Listing 6: User defined atom predicate
from openeye import oechem
class PredAliphaticNitrogen(oechem.OEUnaryAtomPred):
    def __call__(self, atom):
        return atom.IsNitrogen() and not atom.IsAromatic()
    def CreateCopy(self):
        # __disown__ is required to allow C++ to take ownership of this
        # object and its memory
        return PredAliphaticNitrogen().__disown__()
mol = oechem.OEGraphMol()
oechem.OESmilesToMol(mol, "c1cc[nH]c1CC2COCNC2")
print("Number of aliphatic N atoms =", end=" ")
print(oechem.OECount(mol, PredAliphaticNitrogen()))
The output of the preceding program is the following:
Number of aliphatic N atoms = 1
The previous example can alternatively be written using the PyAtomPredicate class. PyAtomPredicate takes a Python function as its single argument. The passed function must take a single OEAtomBase argument and return a boolean value. In essence, we are creating a predicate that itself holds a predicate.
def AliphaticNitrogen(atom):
    return atom.IsNitrogen() and not atom.IsAromatic()
print("Number of aliphatic N atoms =", end=" ")
print(oechem.OECount(mol, oechem.PyAtomPredicate(AliphaticNitrogen)))
A bond predicate can be similarly defined by deriving from the
OEUnaryBondPred
class.
Listing 7: User defined bond predicate
from openeye import oechem
class PredHasDoubleBondO(oechem.OEUnaryAtomPred):
    def __call__(self, atom):
        for bond in atom.GetBonds():
            if bond.GetOrder() == 2 and bond.GetNbr(atom).IsOxygen():
                return True
        return False
    def CreateCopy(self):
        # __disown__ is required to allow C++ to take ownership of this
        # object and its memory
        return PredHasDoubleBondO().__disown__()
class PredAmideBond(oechem.OEUnaryBondPred):
    def __call__(self, bond):
        if bond.GetOrder() != 1:
            return False
        atomB = bond.GetBgn()
        atomE = bond.GetEnd()
        pred = PredHasDoubleBondO()
        if atomB.IsCarbon() and atomE.IsNitrogen() and pred(atomB):
            return True
        if atomB.IsNitrogen() and atomE.IsCarbon() and pred(atomE):
            return True
        return False
    def CreateCopy(self):
        # __disown__ is required to allow C++ to take ownership of this
        # object and its memory
        return PredAmideBond().__disown__()
mol = oechem.OEGraphMol()
oechem.OESmilesToMol(mol, "CC(=O)Nc1c[nH]cc1")
print("Number of amide bonds =", oechem.OECount(mol, PredAmideBond()))
The output of the preceding program is the following:
Number of amide bonds = 1
Similarly, the previous example can be rewritten using the PyBondPredicate class. PyBondPredicate takes a Python function as the single argument. This passed function has to take a single OEBondBase argument and return a boolean value.
def AmideBond(bond):
    if bond.GetOrder() != 1:
        return False
    atomB = bond.GetBgn()
    atomE = bond.GetEnd()
    pred = PredHasDoubleBondO()
    if atomB.IsCarbon() and atomE.IsNitrogen() and pred(atomB):
        return True
    if atomB.IsNitrogen() and atomE.IsCarbon() and pred(atomE):
        return True
    return False
print("Number of amide bonds =", oechem.OECount(mol, oechem.PyBondPredicate(AmideBond)))
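The idea behind PyAtomPredicate and PyBondPredicate, a predicate object that holds a plain function, can be sketched without the toolkit as a small adapter class. FunctionPredicate, create_copy and is_short below are hypothetical names used only for this illustration; how the real wrapper classes are implemented is not shown in this document.

```python
class FunctionPredicate:
    """Hypothetical adapter: turns a plain function into a predicate object."""

    def __init__(self, func):
        if not callable(func):
            raise TypeError("func must be callable")
        self._func = func

    def __call__(self, item):
        # Delegate to the wrapped function and normalize the result to bool.
        return bool(self._func(item))

    def create_copy(self):
        # A copy only needs to share the same wrapped function.
        return FunctionPredicate(self._func)


def is_short(word):
    return len(word) < 4


pred = FunctionPredicate(is_short)
words = ["ring", "N", "bond", "OH"]
short_words = [w for w in words if pred(w)]
print(short_words)  # ['N', 'OH']
```

The adapter saves writing a class with __call__ and a copy method for every small test, at the cost of one extra level of indirection per call.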
One advantage of functors over function pointers is that they can hold state. Because this state is held by the individual instance, it can be thread safe (unlike static variables inside functions used with function pointers). The state of a functor can be initialized at construction. For instance, the OEHasAtomicNum functor takes an argument at construction that specifies which atomic number an atom must have for the functor to return true.
Listing 8: User defined atom predicate with state
from openeye import oechem
class PredAtomicNumList(oechem.OEUnaryAtomPred):
    def __init__(self, alist):
        oechem.OEUnaryAtomPred.__init__(self)
        self.atomiclist = alist
    def __call__(self, atom):
        return (atom.GetAtomicNum() in self.atomiclist)
    def CreateCopy(self):
        # __disown__ is required to allow C++ to take ownership of this
        # object and its memory
        return PredAtomicNumList(self.atomiclist).__disown__()
mol = oechem.OEGraphMol()
oechem.OESmilesToMol(mol, "c1cc[nH]c1CC2COCNC2")
alist = [oechem.OEElemNo_O, oechem.OEElemNo_N]
print("Number of oxygen or nitrogen atoms =", end=" ")
print(oechem.OECount(mol, PredAtomicNumList(alist)))
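The point about per-instance state can be illustrated without the toolkit. In the sketch below, two instances of the same predicate class carry different state, and a copy reproduces the state of its source, which is why CreateCopy in Listing 8 passes self.atomiclist to the new object. InSet and create_copy are hypothetical names for this example, and the predicate is applied to bare atomic numbers rather than to atom objects.

```python
class InSet:
    """Hypothetical stateful predicate: true if the value is in the allowed set."""

    def __init__(self, allowed):
        # State is fixed at construction and belongs to this instance only.
        self.allowed = frozenset(allowed)

    def __call__(self, value):
        return value in self.allowed

    def create_copy(self):
        # The copy must carry the same state, otherwise it would match nothing.
        return InSet(self.allowed)


is_n_or_o = InSet([7, 8])
is_halogen = InSet([9, 17, 35, 53])

atomic_nums = [6, 7, 8, 17, 6]
n_or_o_count = sum(1 for z in atomic_nums if is_n_or_o(z))
halogen_count = sum(1 for z in atomic_nums if is_halogen(z))
copy_count = sum(1 for z in atomic_nums if is_n_or_o.create_copy()(z))
print(n_or_o_count, halogen_count, copy_count)  # 2 1 2
```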
Functor substructure-based matching
Listing 6 shows how to create a user-defined atom predicate. OEChem TK also provides a functor template, called OEMatchFunc, that allows convenient substructure-based atom matching.
In the following example, functors are initialized with a SMARTS string. Each functor returns true only if the atom matches the substructure pattern specified at construction.
Listing 9: Functor substructure-based matching
from openeye import oechem
mol = oechem.OEGraphMol()
oechem.OESmilesToMol(mol, "C1(Cl)C(N)C(F)OC1C(=O)NCCCN")
NonAmideNitrogenPred = oechem.OEMatchAtom("[N;!$(NC=O)]")
print("Number of non-amide nitrogen =", oechem.OECount(mol, NonAmideNitrogenPred))
FiveMemberedRingOxygenPred = oechem.OEMatchAtom("[O;r5]")
print("Number of 5-membered ring oxygen =", oechem.OECount(mol, FiveMemberedRingOxygenPred))
CarbonAttachedToHalogenPred = oechem.OEMatchAtom("[#6][Cl,Br,F]")
print("Number of carbon attached to halogen =", oechem.OECount(mol, CarbonAttachedToHalogenPred))
The output of Listing 9
is the following:
Number of non-amide nitrogen = 2
Number of 5-membered ring oxygen = 1
Number of carbon attached to halogen = 2
Molecule Partitioning
The OESubsetMol
function can take any atom predicate
as an argument and generate a subset molecule from only atoms for which the
specified predicate returns true. In the following example, ring atoms are
extracted from a molecule by using the OEAtomIsInRing
atom functor.
Listing 10: Ring system extraction
from openeye import oechem
mol = oechem.OEGraphMol()
oechem.OESmilesToMol(mol, "c1cc[nH]c1CC2COCNC2")
submol = oechem.OEGraphMol()
oechem.OESubsetMol(submol, mol, oechem.OEAtomIsInRing(), True)
print(oechem.OEMolToSmiles(submol))
The output of Listing 10
is the following:
c1cc[nH]c1.C1CNCOC1
In the following example, ring systems are extracted from a molecule
one at a time by using the OEPartPredAtom
functor.
Listing 11: Ring system extraction
from openeye import oechem
mol = oechem.OEGraphMol()
oechem.OESmilesToMol(mol, "c1cc[nH]c1CC2COCNC2")
nrrings, rings = oechem.OEDetermineRingSystems(mol)
pred = oechem.OEPartPredAtom(rings)
print("Number of rings =", nrrings)
for r in range(1, nrrings + 1):
    pred.SelectPart(r)
    ringmol = oechem.OEGraphMol()
    oechem.OESubsetMol(ringmol, mol, pred, True)
    print(r, "->", oechem.OEMolToSmiles(ringmol))
The output of Listing 11
is the following:
Number of rings = 2
1 -> c1cc[nH]c1
2 -> C1CNCOC1
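The select-then-filter pattern of Listing 11 can be sketched without the toolkit as a predicate that holds a per-item part label and a currently selected part. PartPredicate and select_part are hypothetical names, and the parts list is written by hand to mirror the molecule of Listing 11 under the assumption that atoms are indexed in SMILES order and that 0 marks an atom outside any ring system; it is not toolkit output.

```python
class PartPredicate:
    """Hypothetical partition predicate over item indices."""

    def __init__(self, parts):
        # parts[i] is the part number of item i; 0 means "in no part".
        self.parts = list(parts)
        self.selected = 0

    def select_part(self, part):
        # Change which part the predicate accepts from now on.
        self.selected = part

    def __call__(self, index):
        return self.selected != 0 and self.parts[index] == self.selected


# Hand-written labels: five atoms in part 1, one linker atom, six atoms in part 2.
parts = [1, 1, 1, 1, 1, 0, 2, 2, 2, 2, 2, 2]
nparts = max(parts)

pred = PartPredicate(parts)
ring_members = {}
for r in range(1, nparts + 1):
    pred.select_part(r)
    ring_members[r] = [i for i in range(len(parts)) if pred(i)]
    print(r, "->", ring_members[r])
```

A single predicate object is reused for every part; only its selected part changes between iterations, which is another use of the state that functors can hold.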