Fingerprint Coverage

Fingerprints are usually generated by enumerating various fragments of a molecule and then hashing them into a fixed-length bitvector. The OEGetFPCoverage function provides access to these fragments by returning an iterator over OEAtomBondSet objects, each of which stores the atoms and bonds of a specific fragment.
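The enumerate-then-hash idea described above can be sketched in plain Java without the toolkit. This is a minimal illustration only: the class PathHashSketch, its methods, the string encoding of a path, the use of String.hashCode, and the 64-bit vector length are all assumptions made for the example, and none of them reflects how the OpenEye toolkit types atoms or hashes fragments.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Hypothetical sketch: NOT the toolkit's fingerprint algorithm.
public class PathHashSketch {

    // Enumerate every contiguous path of at most maxAtoms atoms in a
    // linear chain. Each path is encoded as a string such as "C-C-N".
    public static List<String> enumeratePaths(String[] atoms, int maxAtoms) {
        List<String> paths = new ArrayList<>();
        for (int start = 0; start < atoms.length; start++) {
            StringBuilder sb = new StringBuilder();
            for (int end = start; end < atoms.length && end - start < maxAtoms; end++) {
                if (end > start) {
                    sb.append('-');
                }
                sb.append(atoms[end]);
                paths.add(sb.toString());
            }
        }
        return paths;
    }

    // Map each fragment string to one bit position in a bit vector of
    // numBits bits. Different fragments may collide on the same bit.
    public static BitSet fold(List<String> fragments, int numBits) {
        BitSet bits = new BitSet(numBits);
        for (String fragment : fragments) {
            bits.set(Math.floorMod(fragment.hashCode(), numBits));
        }
        return bits;
    }

    public static void main(String[] args) {
        // The heavy-atom chain of the SMILES string CCNCC.
        String[] atoms = {"C", "C", "N", "C", "C"};
        List<String> paths = enumeratePaths(atoms, 5);
        BitSet fp = fold(paths, 64);
        System.out.println(paths.size() + " paths enumerated");
        System.out.println(fp.cardinality() + " of 64 bits set");
    }
}
```

Because several fragments can hash to the same position, the number of bits set is at most the number of distinct fragment strings. Recovering which fragments contributed to a fingerprint therefore requires access to the enumeration step itself, which is what a coverage function exposes.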

The following example shows how to retrieve the unique fragments that are enumerated when generating a path fingerprint. The obtained fragments are depicted in Table: Example of path fragments.

Listing 17: Example of accessing patterns encoded into a fingerprint

package openeye.docexamples.oegraphsim;

import openeye.oechem.*;
import openeye.oegraphsim.*;

public class FPCoverage {

    public static void main(String[] argv) {

        // Build the example molecule from its SMILES string.
        OEGraphMol mol = new OEGraphMol();
        oechem.OESmilesToMol(mol, "CCNCC");

        // Select the path fingerprint type.
        OEFPTypeBase fptype = oegraphsim.OEGetFPType(OEFPType.Path);

        // Request only unique fragments.
        boolean unique = true;
        int idx = 0;
        for (OEAtomBondSet abset : oegraphsim.OEGetFPCoverage(mol, fptype, unique)) {
            idx++;
            // Print the fragment counter, then the index and element symbol of each atom.
            System.out.printf("%2d ", idx);
            for (OEAtomBase a : abset.GetAtoms()) {
                System.out.printf("% d %s", a.GetIdx(), oechem.OEGetAtomicSymbol(a.GetAtomicNum()));
            }
            System.out.printf("\n");
        }
    }
}

The output of Listing 17 is the following:

 1  0 C
 2  0 C  1 C
 3  0 C  1 C  2 N
 4  0 C  1 C  2 N  3 C
 5  0 C  1 C  2 N  3 C  4 C
 6  1 C
 7  1 C  2 N
 8  1 C  2 N  3 C
 9  1 C  2 N  3 C  4 C
10  2 N

Table: Example of unique path fragments. The numbers displayed next to atoms are the atom indices.
[Images: FPCoverage-01.png through FPCoverage-10.png]

In Listing 17, the OEGetFPCoverage function is called with the unique option set to true. This means that it returns only unique fragments, where a fragment (i.e., a subgraph) is considered unique if it differs from all previously identified subgraphs by at least one atom or bond. For example, executing the same code with the non-unique option (unique set to false) would generate the five additional paths depicted in Table: Example of additional non-unique path fragments.

Table: Example of additional non-unique path fragments. The numbers displayed next to atoms are the atom indices.
[Images: FPCoverage-NonUnique-11.png through FPCoverage-NonUnique-15.png]
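The idea of keeping only fragments that differ from every previously seen fragment can be illustrated with a small, toolkit-free sketch. Everything here is an assumption made for the example: the class UniqueFragmentSketch, the walk enumeration on a linear chain, and the choice to identify a fragment purely by its set of atom indices (in a linear chain, the atom set also determines the bonds). The toy enumeration is not the toolkit's, so its counts are not expected to match the fragment counts reported for Listing 17.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: NOT the toolkit's enumeration or uniqueness test.
public class UniqueFragmentSketch {

    // Walk a linear chain of numAtoms atoms forward and backward from
    // every start atom, recording the atom indices covered by each walk.
    // The same atom set can therefore be reached by more than one walk.
    public static List<Set<Integer>> enumerateWalks(int numAtoms) {
        List<Set<Integer>> walks = new ArrayList<>();
        for (int start = 0; start < numAtoms; start++) {
            for (int step : new int[] {1, -1}) {
                Set<Integer> visited = new TreeSet<>();
                for (int i = start; i >= 0 && i < numAtoms; i += step) {
                    visited.add(i);
                    // Record the single-atom walk only once per start atom.
                    if (visited.size() > 1 || step == 1) {
                        walks.add(new TreeSet<>(visited));
                    }
                }
            }
        }
        return walks;
    }

    // Keep a fragment only if its atom set differs from that of every
    // fragment seen earlier; the order of first occurrence is preserved.
    public static List<Set<Integer>> keepUnique(List<Set<Integer>> fragments) {
        return new ArrayList<>(new LinkedHashSet<>(fragments));
    }

    public static void main(String[] args) {
        List<Set<Integer>> all = enumerateWalks(3);
        List<Set<Integer>> unique = keepUnique(all);
        System.out.println(all.size() + " walks, " + unique.size() + " unique atom sets");
        for (Set<Integer> fragment : unique) {
            System.out.println(fragment);
        }
    }
}
```

For a three-atom chain, the nine walks reduce to six distinct atom sets, because each multi-atom fragment is reached once from each of its two ends.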

See also