Fingerprint Coverage

Fingerprints are usually generated by enumerating various fragments of a molecule and then hashing them into a fixed-length bitvector. The OEGetFPCoverage function provides access to these fragments by returning an iterator over OEAtomBondSet objects, each of which stores the atoms and bonds of a specific fragment.
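The enumerate-then-hash idea can be illustrated with a minimal pure-Python sketch. It is not the toolkit's algorithm: the toy graph, the helper names (neighbors, enumerate_paths, fold_into_bits), the SHA-1 hash, the 64-bit vector size, and the element-string labels are all assumptions chosen only to make the concept concrete.

```python
import hashlib

# Toy molecular graph for CCNCC: atom index -> element, plus a bond list.
# This stands in for a real molecule object and is purely illustrative.
ATOMS = {0: "C", 1: "C", 2: "N", 3: "C", 4: "C"}
BONDS = [(0, 1), (1, 2), (2, 3), (3, 4)]


def neighbors(atom):
    """Return the atoms bonded to the given atom."""
    return [b if a == atom else a for a, b in BONDS if atom in (a, b)]


def enumerate_paths(max_bonds):
    """Yield every simple path (tuple of atom indices) with at most max_bonds bonds."""
    def extend(path):
        yield path
        if len(path) - 1 < max_bonds:
            for nxt in neighbors(path[-1]):
                if nxt not in path:
                    yield from extend(path + (nxt,))

    for start in ATOMS:
        yield from extend((start,))


def fold_into_bits(paths, nbits=64):
    """Hash each path's element string into one position of a fixed-length bit list."""
    bits = [0] * nbits
    for path in paths:
        label = "".join(ATOMS[i] for i in path)
        # Make the label direction-independent so a path and its reverse agree.
        label = min(label, label[::-1])
        digest = hashlib.sha1(label.encode()).digest()
        bits[int.from_bytes(digest[:4], "big") % nbits] = 1
    return bits


paths = list(enumerate_paths(max_bonds=4))
bits = fold_into_bits(paths)
print(len(paths), "paths enumerated;", sum(bits), "bits set")
```

Because many fragments can hash to the same position, a bit in the fingerprint cannot in general be traced back to a single fragment, which is why a function that returns the fragments directly is useful.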

The following example shows how to retrieve the unique fragments that are enumerated when generating a path fingerprint. The obtained fragments are depicted in Table: Example of path fragments.

Listing 17: Example of accessing patterns encoded into a fingerprint

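#!/usr/bin/env python
from openeye.oechem import *
from openeye.oegraphsim import *

# Build the molecule from its SMILES string.
mol = OEGraphMol()
OESmilesToMol(mol, "CCNCC")

# Request the default path fingerprint type and only unique fragments.
fptype = OEGetFPType(OEFPType_Path)
unique = True
# Each item is an OEAtomBondSet holding the atoms and bonds of one fragment.
for idx, abset in enumerate(OEGetFPCoverage(mol, fptype, unique)):
    print("%2d %s" % ((idx + 1), "".join([str(a) for a in abset.GetAtoms()])))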

The output of Listing 17 is the following:

 1  0 C
 2  0 C  1 C
 3  0 C  1 C  2 N
 4  0 C  1 C  2 N  3 C
 5  0 C  1 C  2 N  3 C  4 C
 6  1 C
 7  1 C  2 N
 8  1 C  2 N  3 C
 9  1 C  2 N  3 C  4 C
10  2 N
Table: Example of unique path fragments. The numbers displayed next to atoms are the atom indices.
[Images: ../_images/FPCoverage-01.png through ../_images/FPCoverage-10.png]

In Listing 17, the OEGetFPCoverage function is called with the unique option set. This means that it returns only unique fragments, where a fragment (i.e., a subgraph) is considered unique if it differs from all previously identified subgraphs by at least one atom or bond. For example, executing the same code with the non-unique option would generate the five additional paths depicted in Table: Example of additional non-unique path fragments.

Table: Example of additional non-unique path fragments. The numbers displayed next to atoms are the atom indices.
[Images: ../_images/FPCoverage-NonUnique-11.png through ../_images/FPCoverage-NonUnique-15.png]
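The uniqueness criterion stated above (a fragment is kept only if it differs from every earlier fragment by at least one atom or bond) can be sketched in plain Python. This is a toy illustration of that criterion only, not the toolkit's implementation and not a reproduction of its output: the fragment list over a three-atom chain and the helper name unique_fragments are hypothetical.

```python
# Hypothetical fragments over a three-atom chain 0-1-2, each given as
# (atom indices in traversal order, bond indices in traversal order).
fragments = [
    ((0,), ()),
    ((0, 1), (0,)),
    ((0, 1, 2), (0, 1)),
    ((1,), ()),
    ((1, 0), (0,)),       # same atoms and bonds as (0, 1): a duplicate
    ((1, 2), (1,)),
    ((2,), ()),
    ((2, 1), (1,)),       # duplicate of (1, 2)
    ((2, 1, 0), (1, 0)),  # duplicate of (0, 1, 2)
]


def unique_fragments(frags):
    """Keep a fragment only if its atom set or bond set has not been seen before."""
    seen = set()
    kept = []
    for atoms, bonds in frags:
        # Order is ignored: two fragments are equal when they cover the
        # same atoms and the same bonds.
        key = (frozenset(atoms), frozenset(bonds))
        if key not in seen:
            seen.add(key)
            kept.append((atoms, bonds))
    return kept


unique = unique_fragments(fragments)
print(len(fragments), "fragments,", len(unique), "unique")
```

The first occurrence of each atom and bond set is the one retained, so the result depends on enumeration order only in which traversal direction is reported, not in which fragments survive.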

See also