Predicate Functors

A functor (function object) is any object that can be called as if it were a function, i.e., an object of a class that defines the __call__() method.

Functors that return bool are an important special case. A unary function whose return type is bool is called a predicate.
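
The definition above can be illustrated with a minimal plain-Python sketch that does not require OEChem TK. The names ToyAtom and IsToyOxygen are hypothetical stand-ins invented for this illustration; they are not OEChem TK classes. Because IsToyOxygen defines __call__ and returns a bool, its instances behave as unary predicates.

```python
from dataclasses import dataclass


@dataclass
class ToyAtom:
    """Hypothetical stand-in for an atom; not an OEChem TK class."""
    atomic_num: int


class IsToyOxygen:
    """Predicate functor: instances are callable and return a bool."""

    def __call__(self, atom):
        return atom.atomic_num == 8


pred = IsToyOxygen()
# The instance is called exactly as if it were a function.
print(pred(ToyAtom(8)))
print(pred(ToyAtom(6)))
```

The instance pred can be stored, passed to other functions, and called later, which is the property the rest of this section relies on.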

In OEChem TK, these functors are often passed into another function, which then calls them. This is the concept of a callback: the second function supplies the argument and calls back to the functor that was passed to it. Generator methods such as OEMolBase.GetAtoms can take a functor as an argument and use the callback mechanism to iterate over only those atoms that satisfy the functor. See the example in the Atom or Bond Subset Iteration section.
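
The callback mechanism can be sketched in plain Python as follows. This is only an illustration of the pattern, not the OEChem TK implementation: ToyAtom, ToyMol and get_atoms are hypothetical names, and get_atoms merely imitates a generator method that accepts an optional predicate.

```python
from dataclasses import dataclass


@dataclass
class ToyAtom:
    """Hypothetical stand-in for an atom; not an OEChem TK class."""
    atomic_num: int


class ToyMol:
    """Hypothetical container whose generator accepts an optional predicate."""

    def __init__(self, atoms):
        self._atoms = list(atoms)

    def get_atoms(self, pred=None):
        for atom in self._atoms:
            # The callback: the container supplies the argument and
            # calls back to the predicate it was given.
            if pred is None or pred(atom):
                yield atom


def is_nitrogen(atom):
    return atom.atomic_num == 7


mol = ToyMol(ToyAtom(z) for z in (6, 6, 7, 8, 7))
nitrogens = list(mol.get_atoms(is_nitrogen))
print("Number of nitrogen atoms =", len(nitrogens))
```

The caller never loops and tests by hand; it only states the condition, and the container decides when to evaluate it.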

In the example below, the function CountAtoms loops over the atoms and performs a callback to the predicate functor pred for each atom. If the predicate returns true, a counter is incremented. The main program passes the predefined atom predicate OEIsOxygen to CountAtoms, which therefore counts the number of oxygen atoms in the molecule. (Please note that this functionality is already implemented in OEChem TK as the OECount function.)

Listing 1: Using functor callbacks

from openeye import oechem


def CountAtoms(mol, pred):
    counts = 0
    for atom in mol.GetAtoms():
        if pred(atom):
            counts += 1
    return counts


mol = oechem.OEGraphMol()
oechem.OESmilesToMol(mol, "c1cc[nH]c1CC2COCNC2")
print("Number of oxygen atoms =", CountAtoms(mol, oechem.OEIsOxygen()))

Built-in Functors

There are many useful functors already defined in OEChem TK. These can be used by programmers with little or no understanding of the details of how functors work. A programmer can simply pass them to one of the many OEChem TK functions and methods which take predicates as arguments.

Atom Functors

Access                                               Functor Name
---------------------------------------------------  -----------------------
ring atoms                                           OEAtomIsInRing
chain atoms                                          OEAtomIsInChain
atom with specified atom index                       OEHasAtomIdx
atom with selected atom index                        OEAtomIdxSelected
atom with specified atom name                        OEHasAtomName
atoms with specified atom stereo                     OEHasAtomStereoSpecified
atoms with specified formal charge                   OEHasFormalCharge
atoms with specified number of heavy atom neighbors  OEHasHvyDegree
aromatic atoms                                       OEIsAromaticAtom
atoms with specific hybridization                    OEIsAtomHybridization
chiral atoms                                         OEIsChiralAtom
atoms with anisotropic B-factor parameters           OEHasAnisou
atoms with specified map index                       OEHasMapIdx
atoms representing R-Groups                          OEIsRGroup
valid atoms (by OpenEye valence conventions)         OEIsValidAtomValence
valid atoms (by MDL valence conventions)             OEIsValidMDLAtomValence
\(n^{th}\) atom                                      OENthAtom
atoms that are both terminal and heavy               OEIsTermHeavyAtom
atom membership in a set of atoms                    OEIsAtomMember

Listing 2: Using predefined atom functors

from openeye import oechem

mol = oechem.OEGraphMol()
oechem.OESmilesToMol(mol, "c1cc[nH]c1CC2COCNC2")

print("Number of heavy atoms =", oechem.OECount(mol, oechem.OEIsHeavy()))
print("Number of ring atoms  =", oechem.OECount(mol, oechem.OEAtomIsInRing()))

The output of the preceding program is the following:

Number of heavy atoms = 12
Number of ring atoms  = 11

Atomic Number Functors

Access                              Functor Name
----------------------------------  -----------------
atoms with specified atomic number  OEHasAtomicNum
carbon atoms                        OEIsCarbon
halogen atoms                       OEIsHalogen
heavy atoms                         OEIsHeavy
hetero atoms                        OEIsHetero
explicit hydrogen atoms             OEIsHydrogen
metal atoms                         OEIsMetal
nitrogen atoms                      OEIsNitrogen
oxygen atoms                        OEIsOxygen
sulfur atoms                        OEIsSulfur
phosphorus atoms                    OEIsPhosphorus
polar atoms                         OEIsPolar
polar hydrogen atoms                OEIsPolarHydrogen

Please note that the following two lines produce the same result.

print("Number of oxygen atoms =", oechem.OECount(mol, oechem.OEHasAtomicNum(oechem.OEElemNo_O)))
print("Number of oxygen atoms =", oechem.OECount(mol, oechem.OEIsOxygen()))

Bond Functors

Access                             Functor Name
---------------------------------  ------------------------
ring bonds                         OEBondIsInRing
chain bonds                        OEBondIsInChain
bond with specified bond index     OEHasBondIdx
bond with selected bond index      OEBondIdxSelected
bonds with specified bond order    OEHasOrder
rotatable bonds                    OEIsRotor
chiral bonds                       OEIsChiralBond
bonds with specific bond stereo    OEHasBondStereoSpecified
aromatic bonds                     OEIsAromaticBond
bond membership in a set of bonds  OEIsBondMember

Listing 3: Using predefined bond functors

from openeye import oechem

mol = oechem.OEGraphMol()
oechem.OESmilesToMol(mol, "CC(=O)Nc1c[nH]cc1")

print("Number of ring bonds  =", oechem.OECount(mol, oechem.OEBondIsInRing()))
print("Number of rotor bonds =", oechem.OECount(mol, oechem.OEIsRotor()))

The output of the preceding program is the following:

Number of ring bonds  = 5
Number of rotor bonds = 2

Group Functors

Access                                    Functor Name
----------------------------------------  ------------------
groups with a specific atom               OEHasAtomInGroup
groups with a specific bond               OEHasBondInGroup
groups with a specific type               OEHasGroupType
groups that store MDL stereo information  OEIsMDLStereoGroup

Reaction Component Functors

Access                                            Functor Name
------------------------------------------------  ------------------
atoms of the catalysts or solvents of a reaction  OEAtomIsInAgent
atoms of the product molecule(s)                  OEAtomIsInProduct
atoms of the reactant molecule(s)                 OEAtomIsInReactant

Conformer Functors

Access                          Functor Name
------------------------------  -----------------
conformer with specified index  OEHasConfIdx
conformer with selected index   OEConfIdxSelected

Residue Data Functors

Access                                   Functor Name
---------------------------------------  ------------------------
atoms with specified residue properties  OEAtomMatchResidue
atoms with specified chain id            OEHasChainID
atoms with specified residue number      OEHasResidueNumber
atoms with an alternate location         OEHasAlternateLocation
atoms with specified fragment number     OEHasFragmentNumber
atoms with specified PDB index           OEHasPDBAtomIndex
alpha carbon                             OEIsCAlpha
backbone atom                            OEIsBackboneAtom
water                                    OEIsWater
nucleic acid base                        OEIsNucleicAcidBase
nucleic acid sugar                       OEIsNucleicAcidSugar
nucleic acid phosphate                   OEIsNucleicAcidPhosphate

Composite Functors

Occasionally, one may want to use a logical operator to join two or more functors. The following table shows the composite functors defined in OEChem TK.

Composite Functors

Logical Not   Logical Or   Logical And   Description
------------  -----------  ------------  -------------------------------
OENotAtom     OEOrAtom     OEAndAtom     atom composite functors
OENotBond     OEOrBond     OEAndBond     bond composite functors
OENotConf     OEOrConf     OEAndConf     conformation composite functors
OENotGroup    OEOrGroup    OEAndGroup    group composite functors
OENotRoleSet  OEOrRoleSet  OEAndRoleSet  roleset composite functors

Each composite functor takes the appropriate number of predicates as arguments and generates a single unary predicate. The following example demonstrates how to use composite functors to build expressions from OEChem TK’s predefined atom predicates.

Listing 4: Combining predefined atom predicates

from openeye import oechem

mol = oechem.OEGraphMol()
oechem.OESmilesToMol(mol, "c1cnc(O)cc1CCCBr")

print("Number of chain atoms =", end=" ")
print(oechem.OECount(mol, oechem.OENotAtom(oechem.OEAtomIsInRing())))

print("Number of aromatic nitrogens =", end=" ")
print(oechem.OECount(mol, oechem.OEAndAtom(oechem.OEIsNitrogen(), oechem.OEIsAromaticAtom())))

print("Number of non-carbons =", end=" ")
print(oechem.OECount(mol, oechem.OENotAtom(oechem.OEHasAtomicNum(oechem.OEElemNo_C))))

print("Number of nitrogen and oxygen atoms =", end=" ")
print(oechem.OECount(mol, oechem.OEOrAtom(oechem.OEHasAtomicNum(oechem.OEElemNo_N),
                                          oechem.OEHasAtomicNum(oechem.OEElemNo_O))))

The OECount function returns the number of objects (in this case atoms) matching the given predicate argument.

The output of the preceding program is the following:

Number of chain atoms = 5
Number of aromatic nitrogens = 1
Number of non-carbons = 3
Number of nitrogen and oxygen atoms = 2

Composite functors can be used similarly to combine predefined bond predicates.

Listing 5: Combining predefined bond predicates

from openeye import oechem

mol = oechem.OEGraphMol()
oechem.OESmilesToMol(mol, "N#CCC1CCNC=C1")

print("Number of non-rotatable bonds =", end=" ")
print(oechem.OECount(mol, oechem.OENotBond(oechem.OEIsRotor())))

print("Number of ring double bonds =", end=" ")
print(oechem.OECount(mol, oechem.OEAndBond(oechem.OEBondIsInRing(), oechem.OEHasOrder(2))))

print("Number of double or triple bonds =", end=" ")
print(oechem.OECount(mol, oechem.OEOrBond(oechem.OEHasOrder(2), oechem.OEHasOrder(3))))

The output of the preceding program is the following:

Number of non-rotatable bonds = 8
Number of ring double bonds = 1
Number of double or triple bonds = 2
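
How a composite functor turns one or two predicates into a single unary predicate can be sketched in plain Python. The classes NotPred, AndPred and OrPred below are hypothetical illustrations of the idea and make no claim about how the OEChem TK composites are implemented; ToyAtom and count_if are likewise invented for this sketch.

```python
from dataclasses import dataclass


@dataclass
class ToyAtom:
    """Hypothetical stand-in for an atom; not an OEChem TK class."""
    atomic_num: int
    in_ring: bool


class NotPred:
    """Wraps one predicate and negates its result."""

    def __init__(self, pred):
        self.pred = pred

    def __call__(self, item):
        return not self.pred(item)


class AndPred:
    """Wraps two predicates; true only if both are true."""

    def __init__(self, lhs, rhs):
        self.lhs, self.rhs = lhs, rhs

    def __call__(self, item):
        return bool(self.lhs(item) and self.rhs(item))


class OrPred:
    """Wraps two predicates; true if either is true."""

    def __init__(self, lhs, rhs):
        self.lhs, self.rhs = lhs, rhs

    def __call__(self, item):
        return bool(self.lhs(item) or self.rhs(item))


def is_nitrogen(atom):
    return atom.atomic_num == 7


def is_oxygen(atom):
    return atom.atomic_num == 8


def in_ring(atom):
    return atom.in_ring


def count_if(items, pred):
    """Count the items for which the predicate returns true."""
    return sum(1 for item in items if pred(item))


atoms = [ToyAtom(6, True), ToyAtom(7, True), ToyAtom(7, False),
         ToyAtom(8, False), ToyAtom(6, False)]

chain_count = count_if(atoms, NotPred(in_ring))
ring_n_count = count_if(atoms, AndPred(is_nitrogen, in_ring))
n_or_o_count = count_if(atoms, OrPred(is_nitrogen, is_oxygen))
print(chain_count, ring_n_count, n_or_o_count)
```

Because each composite is itself a unary predicate, composites can be nested to build larger expressions.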

User Defined Functors

While many predefined functors exist in OEChem TK, it is not difficult to find a situation which calls for a new user-defined functor.

A user-defined functor can be written by deriving from either the OEUnaryAtomPred or the OEUnaryBondPred class.

The following example shows a user-defined atom functor that returns true for aliphatic nitrogens.

Listing 6: User defined atom predicate

from openeye import oechem


class PredAliphaticNitrogen(oechem.OEUnaryAtomPred):
    def __call__(self, atom):
        return atom.IsNitrogen() and not atom.IsAromatic()

    def CreateCopy(self):
        # __disown__ is required to allow C++ to take ownership of this
        # object and its memory
        return PredAliphaticNitrogen().__disown__()


mol = oechem.OEGraphMol()
oechem.OESmilesToMol(mol, "c1cc[nH]c1CC2COCNC2")
print("Number of aliphatic N atoms =", end=" ")
print(oechem.OECount(mol, PredAliphaticNitrogen()))

The output of the preceding program is the following:

Number of aliphatic N atoms = 1

The previous example can be alternatively rewritten using the PyAtomPredicate class. PyAtomPredicate takes a Python function as the single argument. This passed function has to take a single OEAtomBase argument and return a boolean value. In essence, we are creating a predicate that itself holds a predicate.

def AliphaticNitrogen(atom):
    return atom.IsNitrogen() and not atom.IsAromatic()


print("Number of aliphatic N atoms =", end=" ")
print(oechem.OECount(mol, oechem.PyAtomPredicate(AliphaticNitrogen)))
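
The idea of "a predicate that itself holds a predicate" can be sketched in plain Python. FunctionPredicate below is a hypothetical wrapper written for this illustration; it only imitates the pattern and is not the PyAtomPredicate implementation.

```python
class FunctionPredicate:
    """Hypothetical wrapper: a predicate object that holds a plain function."""

    def __init__(self, func):
        if not callable(func):
            raise TypeError("func must be callable")
        self._func = func

    def __call__(self, item):
        # Delegate to the held function and normalize the result to bool.
        return bool(self._func(item))

    def create_copy(self):
        # A copy shares the same underlying function.
        return FunctionPredicate(self._func)


def is_small(value):
    return value < 10


pred = FunctionPredicate(is_small)
matches = [v for v in (3, 12, 7, 40) if pred(v)]
print(matches)
```

The wrapper adds nothing to the logic of the function; its only role is to give a plain function the interface expected of a predicate object.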

A bond predicate can be similarly defined by deriving from the OEUnaryBondPred class.

Listing 7: User defined bond predicate

from openeye import oechem


class PredHasDoubleBondO(oechem.OEUnaryAtomPred):
    def __call__(self, atom):
        for bond in atom.GetBonds():
            if bond.GetOrder() == 2 and bond.GetNbr(atom).IsOxygen():
                return True
        return False

    def CreateCopy(self):
        # __disown__ is required to allow C++ to take ownership of this
        # object and its memory
        return PredHasDoubleBondO().__disown__()


class PredAmideBond(oechem.OEUnaryBondPred):
    def __call__(self, bond):
        if bond.GetOrder() != 1:
            return False
        atomB = bond.GetBgn()
        atomE = bond.GetEnd()
        pred = PredHasDoubleBondO()
        if atomB.IsCarbon() and atomE.IsNitrogen() and pred(atomB):
            return True
        if atomB.IsNitrogen() and atomE.IsCarbon() and pred(atomE):
            return True
        return False

    def CreateCopy(self):
        # __disown__ is required to allow C++ to take ownership of this
        # object and its memory
        return PredAmideBond().__disown__()


mol = oechem.OEGraphMol()
oechem.OESmilesToMol(mol, "CC(=O)Nc1c[nH]cc1")
print("Number of amide bonds =", oechem.OECount(mol, PredAmideBond()))

The output of the preceding program is the following:

Number of amide bonds = 1

Similarly, the previous example can be rewritten using the PyBondPredicate class. PyBondPredicate takes a Python function as the single argument. This passed function has to take a single OEBondBase argument and return a boolean value.

def AmideBond(bond):
    if bond.GetOrder() != 1:
        return False
    atomB = bond.GetBgn()
    atomE = bond.GetEnd()
    pred = PredHasDoubleBondO()
    if atomB.IsCarbon() and atomE.IsNitrogen() and pred(atomB):
        return True
    if atomB.IsNitrogen() and atomE.IsCarbon() and pred(atomE):
        return True
    return False


print("Number of amide bonds =", oechem.OECount(mol, oechem.PyBondPredicate(AmideBond)))

One advantage of functors over function pointers is that they can hold state. Since this state is held by the instance of the object, it can be thread safe (unlike static variables inside functions used with function pointers). The state of a functor can be initialized at construction. For instance, the OEHasAtomicNum functor takes an argument on construction that defines which atomic number is required for the functor to return true.

Listing 8: User defined atom predicate with state

from openeye import oechem


class PredAtomicNumList(oechem.OEUnaryAtomPred):
    def __init__(self, alist):
        oechem.OEUnaryAtomPred.__init__(self)
        self.atomiclist = alist

    def __call__(self, atom):
        return (atom.GetAtomicNum() in self.atomiclist)

    def CreateCopy(self):
        # __disown__ is required to allow C++ to take ownership of this
        # object and its memory
        return PredAtomicNumList(self.atomiclist).__disown__()


mol = oechem.OEGraphMol()
oechem.OESmilesToMol(mol, "c1cc[nH]c1CC2COCNC2")
alist = [oechem.OEElemNo_O, oechem.OEElemNo_N]
print("Number of oxygen or nitrogen atoms =", end=" ")
print(oechem.OECount(mol, PredAtomicNumList(alist)))
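
The point about per-instance state can be illustrated in plain Python, without OEChem TK. InAtomicNumSet and create_copy are hypothetical names for this sketch; the copy method mirrors the pattern of passing the stored state to the new instance, but omits the __disown__ step, which only matters when C++ takes ownership of the object.

```python
class InAtomicNumSet:
    """Hypothetical stateful predicate over plain atomic numbers."""

    def __init__(self, atomic_nums):
        # State is set at construction and belongs to this instance only.
        self.atomic_nums = frozenset(atomic_nums)

    def __call__(self, atomic_num):
        return atomic_num in self.atomic_nums

    def create_copy(self):
        # The copy must receive the state, otherwise it would not behave
        # like the original.
        return InAtomicNumSet(self.atomic_nums)


halogens = InAtomicNumSet([9, 17, 35, 53])
n_or_o = InAtomicNumSet([7, 8])
halogens_copy = halogens.create_copy()

# Two instances of the same class answer differently for the same input.
print(halogens(17), n_or_o(17), halogens_copy(17))
```

Since no state is shared between the instances, configuring one predicate cannot affect another, which is the property a function with a static variable cannot offer.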

Functor substructure-based matching

Listing 6 shows how to create a user-defined atom predicate. OEChem TK also provides a functor template, called OEMatchFunc, that allows convenient substructure-based atom matching.

In the following example, functors are initialized with a SMARTS string. These functors return true only if the atom matches the substructure pattern specified at construction.

Listing 9: Functor substructure-based matching

from openeye import oechem

mol = oechem.OEGraphMol()
oechem.OESmilesToMol(mol, "C1(Cl)C(N)C(F)OC1C(=O)NCCCN")

NonAmideNitrogenPred = oechem.OEMatchAtom("[N;!$(NC=O)]")
print("Number of non-amide nitrogen =", oechem.OECount(mol, NonAmideNitrogenPred))

FiveMemberedRingOxygenPred = oechem.OEMatchAtom("[O;r5]")
print("Number of 5-membered ring oxygen =", oechem.OECount(mol, FiveMemberedRingOxygenPred))

CarbonAttachedToHalogenPred = oechem.OEMatchAtom("[#6][Cl,Br,F]")
print("Number of carbon attached to halogen =", oechem.OECount(mol, CarbonAttachedToHalogenPred))

The output of Listing 9 is the following:

Number of non-amide nitrogen = 2
Number of 5-membered ring oxygen = 1
Number of carbon attached to halogen = 2

Molecule Partitioning

The OESubsetMol function can take any atom predicate as an argument and generate a subset molecule from only atoms for which the specified predicate returns true. In the following example, ring atoms are extracted from a molecule by using the OEAtomIsInRing atom functor.

Listing 10: Ring atom extraction

from openeye import oechem

mol = oechem.OEGraphMol()
oechem.OESmilesToMol(mol, "c1cc[nH]c1CC2COCNC2")
submol = oechem.OEGraphMol()
oechem.OESubsetMol(submol, mol, oechem.OEAtomIsInRing(), True)
print(oechem.OEMolToSmiles(submol))

The output of Listing 10 is the following:

c1cc[nH]c1.C1CNCOC1

In the following example, ring systems are extracted from a molecule by using the OEPartPred functor.

Listing 11: Ring system extraction

from openeye import oechem

mol = oechem.OEGraphMol()
oechem.OESmilesToMol(mol, "c1cc[nH]c1CC2COCNC2")
nrrings, rings = oechem.OEDetermineRingSystems(mol)
pred = oechem.OEPartPredAtom(rings)
print("Number of rings =", nrrings)
for r in range(1, nrrings + 1):
    pred.SelectPart(r)
    ringmol = oechem.OEGraphMol()
    oechem.OESubsetMol(ringmol, mol, pred, True)
    print(r, "->", oechem.OEMolToSmiles(ringmol))

The output of Listing 11 is the following:

Number of rings = 2
1 -> c1cc[nH]c1
2 -> C1CNCOC1
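
The select-then-filter pattern of Listing 11 can be sketched in plain Python. PartPredicate, select_part and the hand-written parts list are hypothetical and only imitate the pattern; in particular, indexing the list by item position and using 0 for items that belong to no part are assumptions of this sketch.

```python
class PartPredicate:
    """Hypothetical part-selection predicate; not an OEChem TK class."""

    def __init__(self, parts):
        # parts[i] is the part number assigned to item i.
        self.parts = list(parts)
        self.selected = None

    def select_part(self, part):
        # Change which part the predicate accepts; the instance is reused.
        self.selected = part

    def __call__(self, index):
        return self.parts[index] == self.selected


# Hand-written toy data: one part number per item index.
# Treating 0 as "belongs to no part" is an assumption of this sketch.
parts = [1, 1, 1, 1, 1, 0, 2, 2, 2, 2, 2, 2]
nparts = max(parts)

pred = PartPredicate(parts)
members = {}
for part in range(1, nparts + 1):
    pred.select_part(part)
    members[part] = [i for i in range(len(parts)) if pred(i)]
    print(part, "->", members[part])
```

A single predicate instance is reconfigured on each pass of the loop, so one object suffices to extract every part in turn.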