Atom and Bond Traversal
OEChem TK molecules contain atoms and bonds, whose APIs are described by the OEAtomBase and OEBondBase abstract base classes, respectively. Atoms and bonds in OEChem TK can only be created and destroyed in the context of an OEChem TK molecule. Although they can be accessed as pointers through various member functions of a molecule, their memory is owned by that molecule, and they are deallocated when the molecule is destroyed. Attempting to use references to atoms or bonds of a molecule after the molecule has gone out of scope results in undefined behavior.
Iterators
The standard way of processing each member of a set or collection in OEChem TK is to use an iterator. Iterators are a common abstraction (or design pattern) in object-oriented programming because they hide from the user how the collection or container is implemented. A set of atoms could therefore be implemented internally as an array, a linked list, a hash table, or any similar data structure, yet its behavior as seen by the programmer is independent of the actual implementation. An iterator can be thought of as a current position indicator.
OEChem TK makes an extensive effort to support the Python iteration syntax:
for x in y:
    # do something with x
The normal Python user does not have to care that an iterator is providing this convenient, yet powerful, abstraction.
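To make the abstraction concrete, the following is a minimal sketch in plain Python that does not use OEChem TK. It assumes a hypothetical `LinkedAtoms` class (a hand-rolled linked list) and a hypothetical `collect` helper, and shows that the same `for` loop works unchanged whether the items live in a built-in list or in the linked list.

```python
class LinkedAtoms:
    """Hypothetical singly linked list of atomic numbers (not an OEChem class)."""

    class _Node:
        def __init__(self, value, next_node=None):
            self.value = value
            self.next_node = next_node

    def __init__(self, values):
        self._head = None
        # Build the list back to front so iteration yields the original order.
        for value in reversed(list(values)):
            self._head = self._Node(value, self._head)

    def __iter__(self):
        # Walk the chain of nodes, yielding one value at a time.
        node = self._head
        while node is not None:
            yield node.value
            node = node.next_node


def collect(container):
    """Visit every item with the same for loop, whatever the container is."""
    seen = []
    for x in container:
        seen.append(x)
    return seen


furan = [6, 6, 8, 6, 6]  # atomic numbers of c1cocc1, written by hand
as_list = collect(furan)
as_linked = collect(LinkedAtoms(furan))
print(as_list == as_linked)
```

The loop body in `collect` never needs to know which data structure it is walking, which is the property the iterator abstraction provides.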
Atom and Bond Iteration
Listing 1 shows the minimal use of OEChem TK’s iterators. These examples use the OEMolBase methods GetAtoms and GetBonds, which return iterators over the atoms and bonds of a molecule, respectively.
Listing 1: Using iterators to loop over atoms and bonds
from openeye import oechem

mol = oechem.OEGraphMol()
oechem.OESmilesToMol(mol, "c1cocc1")

print("atoms")
for atom in mol.GetAtoms():
    print(atom.GetAtomicNum())

print("bonds")
for bond in mol.GetBonds():
    print(bond.GetOrder())
Note
Listing 1 introduced the GetAtomicNum and GetOrder methods. These and other OEAtomBase and OEBondBase methods are covered in more detail in the chapters Atom Properties and Bond Properties, respectively.
Bonds of an Atom Iteration
The exact same idiom is used for iterating over the bonds attached to an atom. The GetBonds method of an atom returns an iterator over the bonds connected to that atom. Listing 2 shows how to use this iterator to determine the explicit degree of an atom, i.e. the number of bonds to it, not including bonds to implicit hydrogen atoms.
Listing 2: Looping over the bonds of an atom
from openeye import oechem


def MyGetExplicitDegree(atom):
    result = 0
    for bond in atom.GetBonds():
        result += 1
    return result


mol = oechem.OEGraphMol()
oechem.OESmilesToMol(mol, "c1cocc1Br")

for atom in mol.GetAtoms():
    print("Atom", atom.GetIdx(), "has degree", MyGetExplicitDegree(atom))
Atom Neighbor Iteration
Often it is not the bonds around an atom that you wish to loop over, but the neighboring atoms. One way to do this is to use the GetBonds method described in the previous section and call the GetNbr method on each OEBondBase to get the atom across the bond from the input atom.
Listing 3: Finding the neighbors of an atom (version 1)
from openeye import oechem

mol = oechem.OEGraphMol()
oechem.OESmilesToMol(mol, "c1cocc1Br")

for atom in mol.GetAtoms():
    print("Atom:", end=" ")
    print(atom.GetIdx(), end=" ")
    print("Neighbors:", end=" ")
    for bond in atom.GetBonds():
        nbor = bond.GetNbr(atom)
        print(nbor.GetIdx(), end=" ")
    print()
However, this can be done even more conveniently by calling the GetAtoms method of an OEAtomBase directly, which returns an iterator over the neighboring atoms.
Listing 4: Finding the neighbors of an atom (version 2)
from openeye import oechem

mol = oechem.OEGraphMol()
oechem.OESmilesToMol(mol, "c1cocc1Br")

for atom in mol.GetAtoms():
    print("Atom:", end=" ")
    print(atom.GetIdx(), end=" ")
    print("Neighbors:", end=" ")
    for nbor in atom.GetAtoms():
        print(nbor.GetIdx(), end=" ")
    print()
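The neighbor loop is also a convenient way to collect connectivity into an ordinary Python data structure. The sketch below is one possible helper, `build_adjacency` (a hypothetical name, not part of OEChem TK), which relies only on the calls already used in Listing 4: `GetAtoms` on the molecule, and `GetAtoms` and `GetIdx` on each atom. So that it runs without the toolkit, it is exercised here with two hypothetical stand-in classes, `StubAtom` and `StubMol`, whose connectivity for c1cocc1Br is written by hand on the assumption that atoms are numbered in SMILES order.

```python
def build_adjacency(mol):
    """Map each atom index to the sorted indices of its neighbors.

    Uses only mol.GetAtoms(), atom.GetAtoms() and atom.GetIdx().
    """
    return {
        atom.GetIdx(): sorted(nbor.GetIdx() for nbor in atom.GetAtoms())
        for atom in mol.GetAtoms()
    }


# Hypothetical stand-ins so the sketch runs without the toolkit.
class StubAtom:
    def __init__(self, idx):
        self._idx = idx
        self._nbrs = []

    def GetIdx(self):
        return self._idx

    def GetAtoms(self):
        return iter(self._nbrs)


class StubMol:
    def __init__(self, natoms, bonds):
        self._atoms = [StubAtom(i) for i in range(natoms)]
        # Record each bond on both of its end atoms.
        for a, b in bonds:
            self._atoms[a]._nbrs.append(self._atoms[b])
            self._atoms[b]._nbrs.append(self._atoms[a])

    def GetAtoms(self):
        return iter(self._atoms)


# Hand-written bonds for c1cocc1Br (assumed SMILES-order numbering).
stub = StubMol(6, [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0), (4, 5)])
adjacency = build_adjacency(stub)
print(adjacency)
```

With the toolkit available, the same function could be given a molecule such as the one built in Listing 4 in place of the stand-in.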
Atom or Bond Subset Iteration
It can sometimes be useful to loop over a subset of the atoms or bonds of a molecule. Traditionally, this is done with if statements inside a loop, but it can be cleaner and more convenient to select the subset inside the iterator itself. To do this, many of OEChem TK’s iterator generation functions (such as GetAtoms) can take an argument that determines which subset of the object to loop over. These arguments are called predicate functors and are detailed in the chapter Predicate Functors. The details of how they work are not important here. Instead, a programmer can simply use the predefined functors to control their loops.
Listing 5 shows the use of the predicate OEHasAtomicNum to loop over only the carbon atoms in a molecule.
Listing 5: Looping over carbon atoms only
from openeye import oechem

mol = oechem.OEGraphMol()
oechem.OESmilesToMol(mol, "c1c(Br)occ1CCC")

print("Carbon atoms:", end=" ")
for atom in mol.GetAtoms(oechem.OEHasAtomicNum(oechem.OEElemNo_C)):
    print(atom.GetIdx(), end=" ")
print()
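Conceptually, a predicate is a callable object that answers yes or no for each item, and the iterator skips the items it rejects. The plain-Python sketch below illustrates that idea only; it is not how OEChem TK implements its functors. The names `Atom`, `HasAtomicNum`, and `get_atoms` are hypothetical stand-ins, and the atomic numbers for c1c(Br)occ1CCC are written by hand in SMILES order.

```python
from collections import namedtuple

# Hypothetical stand-in for an atom: just an index and an atomic number.
Atom = namedtuple("Atom", ["idx", "atomic_num"])


class HasAtomicNum:
    """Hypothetical predicate: true for atoms with the given atomic number."""

    def __init__(self, atomic_num):
        self.atomic_num = atomic_num

    def __call__(self, atom):
        return atom.atomic_num == self.atomic_num


def get_atoms(atoms, pred=None):
    """Yield every atom, or only those accepted by pred when one is given."""
    for atom in atoms:
        if pred is None or pred(atom):
            yield atom


# Hand-written atomic numbers for c1c(Br)occ1CCC, listed in SMILES order.
numbers = [6, 6, 35, 8, 6, 6, 6, 6, 6]
atoms = [Atom(i, z) for i, z in enumerate(numbers)]

carbon_indices = [a.idx for a in get_atoms(atoms, HasAtomicNum(6))]
print(carbon_indices)
```

Because the filtering happens inside the generator, the calling loop stays free of if statements, which is the convenience the predicate argument offers.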
See also
For a complete list of built-in predicates, see the Built-in Functors section.
Iterator Methods
Iterators offer a much wider range of iteration possibilities than a plain for loop. For example, an iterator can be reused by calling its ToFirst method, and the order of iteration can be rearranged with the Sort method.
The following table describes how to use the same rich set of iterator operations that C++ offers.
Description | C++ | Python |
---|---|---|
Increment | | |
Increment by n | | |
Decrement | | Prev() |
Decrement by n | | |
Go to first | | ToFirst() |
Go to last | | ToLast() |
Current Access | | Target() |
Validity | | IsValid() |
Listing 6 shows how to use an OEAtomBase iterator to loop over the atoms in a molecule in reverse order and print their atomic numbers.
Note
The order of the atoms returned by OEMolBase.GetAtoms can be controlled by OEMolBase.OrderAtoms.
Listing 6: Looping over atoms in reverse order
from openeye import oechem

mol = oechem.OEGraphMol()
oechem.OESmilesToMol(mol, "n1ccccc1")

aitr = mol.GetAtoms()
aitr.ToLast()
while aitr.IsValid():
    print(aitr.Target().GetAtomicNum())
    aitr.Prev()
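The "current position indicator" view of an iterator can be illustrated without the toolkit. The sketch below defines a hypothetical `Cursor` class over a plain Python list. It borrows the method names ToFirst, ToLast, Prev, IsValid, and Target from the text and Listing 6, and assumes a `Next` method as the forward counterpart of `Prev`. It is a teaching stand-in, not OEChem TK's iterator.

```python
class Cursor:
    """Hypothetical position indicator over a Python list (not an OEChem class)."""

    def __init__(self, items):
        self._items = list(items)
        self._pos = 0

    def ToFirst(self):
        self._pos = 0

    def ToLast(self):
        self._pos = len(self._items) - 1

    def Next(self):
        # Assumed forward counterpart of Prev.
        self._pos += 1

    def Prev(self):
        self._pos -= 1

    def IsValid(self):
        # Valid only while the position points at an existing item.
        return 0 <= self._pos < len(self._items)

    def Target(self):
        return self._items[self._pos]


pyridine = [7, 6, 6, 6, 6, 6]  # atomic numbers of n1ccccc1, written by hand

cur = Cursor(pyridine)

# Forward pass: step until the position runs off the end.
forward = []
while cur.IsValid():
    forward.append(cur.Target())
    cur.Next()

# Reuse the same cursor for a backward pass, mirroring Listing 6.
cur.ToLast()
backward = []
while cur.IsValid():
    backward.append(cur.Target())
    cur.Prev()

print(forward)
print(backward)
```

After the forward pass the cursor is no longer valid, so it has to be repositioned with `ToLast` (or `ToFirst`) before it can be used again, which is the reuse pattern described at the start of this section.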