Identifying Acceptor and Donor Atoms
Problem
You want to identify acceptor and donor atoms in a molecule.
Solution
While MolProp TK provides functions that count the acceptor and donor atoms in a molecule, it does not report which atoms are counted.
Lipinski [Lipinski-1997] provides a simple way to identify them. A Lipinski acceptor is either an oxygen or a nitrogen:
from openeye import oechem


class IsLipinskiAcceptor(oechem.OEUnaryAtomPred):
    def __call__(self, atom):
        # Any oxygen or nitrogen counts as a Lipinski acceptor.
        return atom.GetAtomicNum() in [oechem.OEElemNo_O, oechem.OEElemNo_N]
A Lipinski donor is either an oxygen or a nitrogen with at least one hydrogen:
class IsLipinskiDonor(oechem.OEUnaryAtomPred):
    def __call__(self, atom):
        if atom.GetAtomicNum() not in [oechem.OEElemNo_O, oechem.OEElemNo_N]:
            return False
        return has_hydrogen(atom)
Here has_hydrogen is defined as follows, so that both implicit and explicit hydrogens are handled:
def has_hydrogen(atom):
    # Implicit hydrogens are stored as a count on the atom.
    if atom.GetImplicitHCount() > 0:
        return True
    # Explicit hydrogens appear as neighbor atoms.
    for neigh in atom.GetAtoms():
        if neigh.IsHydrogen():
            return True
    return False
After implementing the definitions as predicates, we can use them in any API that takes an atom predicate. For example:
def num_lipinski_acceptors(mol):
    return oechem.OECount(mol, IsLipinskiAcceptor())

def print_lipinski_donors(mol):
    # Assign atom names so that GetName() returns something to print.
    oechem.OETriposAtomNames(mol)
    print("Lipinski donor atoms in molecule '{}'".format(oechem.OEMolToSmiles(mol)))
    # Pass a predicate instance, not the class itself.
    for atom in mol.GetAtoms(IsLipinskiDonor()):
        print(atom.GetName())
For more information, see the Predicate Functors chapter of the OEChem TK manual.
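To make the predicate pattern concrete without an OpenEye installation, here is a minimal pure-Python sketch. The `ToyAtom` class, the two `Toy...` predicates, and the `count_matching` helper are hypothetical stand-ins invented for this illustration; they are not part of OEChem TK. The sketch only shows the idea that a predicate is a callable object and that an instance of it is what gets passed to a counting or filtering routine.

```python
from dataclasses import dataclass

OXYGEN, NITROGEN = 8, 7


# Hypothetical stand-in for an OEChem atom: only the fields the
# Lipinski rules need. Not part of the OpenEye API.
@dataclass
class ToyAtom:
    name: str
    atomic_num: int
    hydrogen_count: int  # implicit + explicit hydrogens


class ToyLipinskiAcceptor:
    """Callable predicate: any oxygen or nitrogen."""

    def __call__(self, atom):
        return atom.atomic_num in (OXYGEN, NITROGEN)


class ToyLipinskiDonor:
    """Callable predicate: oxygen or nitrogen carrying a hydrogen."""

    def __call__(self, atom):
        return atom.atomic_num in (OXYGEN, NITROGEN) and atom.hydrogen_count > 0


def count_matching(atoms, pred):
    """Hypothetical analogue of a counting routine: atoms accepted by pred."""
    return sum(1 for atom in atoms if pred(atom))


# Heavy atoms of acetamide, CC(=O)N
acetamide = [
    ToyAtom("C1", 6, 3),
    ToyAtom("C2", 6, 0),
    ToyAtom("O1", 8, 0),
    ToyAtom("N1", 7, 2),
]

# Note the parentheses: an instance of the predicate is passed, not the class.
acceptor_count = count_matching(acetamide, ToyLipinskiAcceptor())
donor_pred = ToyLipinskiDonor()
donor_names = [atom.name for atom in acetamide if donor_pred(atom)]
print(acceptor_count, donor_names)
```

In this toy acetamide, both the carbonyl oxygen and the amide nitrogen count as acceptors, while only the nitrogen carries hydrogens and so qualifies as a donor.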
Discussion
Another possible implementation is based on Daylight SMARTS definitions.
The following predicate identifies hydrogen bond acceptor atoms only in carbonyl and nitroso functional groups:
class IsHBondAcceptorCarbonylNitroso(oechem.OEUnaryAtomPred):
    # [#6,#7;R0]=[#8]
    def __call__(self, atom):
        # Must be a terminal oxygen (exactly one heavy-atom neighbor).
        if not atom.IsOxygen():
            return False
        if atom.GetHvyDegree() != 1:
            return False
        # Every bond to the oxygen must be a double bond.
        for bond in atom.GetBonds():
            if bond.GetOrder() != 2:
                return False
        # The neighbor must be carbon or nitrogen, per the SMARTS above.
        # Note: the R0 (acyclic) constraint is not checked here.
        for neigh in atom.GetAtoms():
            if neigh.GetAtomicNum() not in [oechem.OEElemNo_C, oechem.OEElemNo_N]:
                return False
        return True
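The same test can be sketched on a toy molecular graph, following the SMARTS comment `[#6,#7;R0]=[#8]`. The dictionary-and-tuple representation and the function name `is_carbonyl_or_nitroso_oxygen` are assumptions made for this illustration, not OEChem TK constructs, and the acyclic (`R0`) constraint is deliberately left out to keep the example short.

```python
def is_carbonyl_or_nitroso_oxygen(idx, elements, bonds):
    """True if atom idx is a terminal oxygen double-bonded to C or N.

    elements maps atom index -> element symbol (heavy atoms only);
    bonds is a list of (i, j, order) tuples. Both are toy structures.
    """
    if elements[idx] != "O":
        return False
    # Collect the bonds that involve this atom.
    attached = [(i, j, order) for (i, j, order) in bonds if idx in (i, j)]
    # A terminal oxygen has exactly one heavy-atom neighbor.
    if len(attached) != 1:
        return False
    i, j, order = attached[0]
    neighbor = j if i == idx else i
    return order == 2 and elements[neighbor] in ("C", "N")


# Methyl acetate, CC(=O)OC (heavy atoms only)
elements = {0: "C", 1: "C", 2: "O", 3: "O", 4: "C"}
bonds = [(0, 1, 1), (1, 2, 2), (1, 3, 1), (3, 4, 1)]

matches = [i for i in elements if is_carbonyl_or_nitroso_oxygen(i, elements, bonds)]
print(matches)
```

Only the carbonyl oxygen (index 2) is selected; the ester oxygen (index 3) has two heavy-atom neighbors and single bonds, so it is rejected by this narrow definition.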
The following predicate identifies a hydrogen bond donor as a nitrogen, oxygen, or fluorine atom with at least one hydrogen:
class IsHBondDonor(oechem.OEUnaryAtomPred):
    # [!H0;#7,#8,#9]
    def __call__(self, atom):
        if atom.GetAtomicNum() not in [oechem.OEElemNo_O, oechem.OEElemNo_N, oechem.OEElemNo_F]:
            return False
        return has_hydrogen(atom)
The following predicate identifies a hydrogen bond donor as a heteroatom that is not negatively charged and has at least one hydrogen:
class IsHBondDonorInclusive(oechem.OEUnaryAtomPred):
    # [!$([#6,H0,-,-2,-3])]
    def __call__(self, atom):
        # Carbon and hydrogen are never donors under this definition.
        if atom.GetAtomicNum() in [oechem.OEElemNo_C, oechem.OEElemNo_H]:
            return False
        # Reject negatively charged atoms (the SMARTS excludes -, -2, -3).
        if atom.GetFormalCharge() < 0:
            return False
        return has_hydrogen(atom)
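To see how the three donor definitions discussed so far differ, the following pure-Python sketch applies them to a handful of toy atom records. The tuples and the three rule functions are hypothetical stand-ins written from the prose and SMARTS comments above, not OEChem TK code, and the inclusive rule is approximated as "formal charge not below zero".

```python
# Toy atom records: (label, atomic number, total hydrogen count, formal charge).
# These tuples are illustrative stand-ins, not OEChem atom objects.
ATOMS = [
    ("hydroxyl O", 8, 1, 0),
    ("amine N", 7, 2, 0),
    ("thiol S", 16, 1, 0),
    ("HF fluorine", 9, 1, 0),
    ("hydroxide O", 8, 1, -1),
    ("methyl C", 6, 3, 0),
    ("carbonyl O", 8, 0, 0),
]


def lipinski_donor(z, h, q):
    # Oxygen or nitrogen with at least one hydrogen.
    return z in (7, 8) and h > 0


def nof_donor(z, h, q):
    # [!H0;#7,#8,#9]
    return z in (7, 8, 9) and h > 0


def inclusive_donor(z, h, q):
    # [!$([#6,H0,-,-2,-3])]: not carbon (or hydrogen), has a hydrogen,
    # and is not negatively charged.
    return z not in (1, 6) and h > 0 and q >= 0


DEFINITIONS = [
    ("Lipinski", lipinski_donor),
    ("N/O/F", nof_donor),
    ("inclusive", inclusive_donor),
]

# Map each atom label to (Lipinski, N/O/F, inclusive) verdicts.
results = {
    label: tuple(rule(z, h, q) for _, rule in DEFINITIONS)
    for label, z, h, q in ATOMS
}

for label, flags in results.items():
    print("{:12s} {}".format(label, flags))
```

In this toy set, the thiol sulfur is a donor only under the inclusive definition, the fluorine of HF is a donor under every definition except Lipinski's, and the hydroxide oxygen is rejected only by the inclusive definition because of its negative charge.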
If a SMARTS pattern already exists, it is easy to create a predicate from it with the built-in OEMatchAtom predicate:
def print_donors(mol):
    # Assign atom names so that GetName() returns something to print.
    oechem.OETriposAtomNames(mol)
    donorpred = oechem.OEMatchAtom("[!H0;#7,#8,#9]")
    print("Donor atoms in molecule '{}'".format(oechem.OEMolToSmiles(mol)))
    for atom in mol.GetAtoms(donorpred):
        print(atom.GetName())
For comparison, the simplified SMARTS definitions of acceptor and donor atoms in OpenEye’s ROCS application are the following:
- donor: [$([#7,#8,#15,#16]);H]
- acceptor: [#8&!$(*~N~[OD1]),#7&H0;!$([D4]);!$([D3]-*=,:[$([#7,#8,#15,#16])])]
Please note that the built-in Mills and Dean [MillsDean-1996] definitions used in ROCS are significantly more precise and elaborate.
Figure 1. Example of color definition in ROCS
See also in OEChem TK manual
Theory
- Predicate Functors chapter
API
- OEMatchAtom atom predicate
- OEUnaryAtomPred base atom predicate
See also in OEMolProp TK manual
- OEGetLipinskiAcceptorCount function
- OEGetLipinskiDonorCount function
- OEGetHBondAcceptorCount function
- OEGetHBondDonorCount function