Fingerprint Coverage

Fingerprints are usually generated by enumerating various fragments of a molecule and then hashing them into a fixed-length bitvector. The OEGetFPCoverage function provides access to these fragments by returning an iterator over OEAtomBondSet objects, each of which stores the atoms and bonds of a specific fragment.
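The hashing step mentioned above can be illustrated with a minimal, toolkit-independent sketch. It is not the algorithm used by OEGraphSim: the FNV-1a hash, the 64-bit vector size, the class name FragmentHashingSketch and the fragment strings are all assumptions chosen for illustration only.

```csharp
using System;
using System.Collections;
using System.Text;

// Illustrative only: maps fragment labels onto a fixed-length bitvector.
public static class FragmentHashingSketch
{
    // FNV-1a (32-bit), used here as a simple deterministic string hash.
    public static uint Fnv1a(string s)
    {
        uint hash = 2166136261u;
        foreach (byte b in Encoding.UTF8.GetBytes(s))
        {
            unchecked
            {
                hash ^= b;
                hash *= 16777619u;
            }
        }
        return hash;
    }

    // Sets one bit per fragment; distinct fragments may collide on the same bit.
    public static BitArray HashFragments(string[] fragments, int numBits)
    {
        BitArray bits = new BitArray(numBits);
        foreach (string frag in fragments)
        {
            int pos = (int)(Fnv1a(frag) % (uint)numBits);
            bits[pos] = true;
        }
        return bits;
    }

    public static void Main()
    {
        // Hypothetical fragment labels standing in for enumerated paths.
        string[] fragments = { "C", "CC", "CCN", "CCNC", "CCNCC", "CN", "CNC", "CNCC", "N" };
        BitArray bits = HashFragments(fragments, 64);

        StringBuilder sb = new StringBuilder();
        foreach (bool bit in bits)
        {
            sb.Append(bit ? '1' : '0');
        }
        Console.WriteLine(sb.ToString());
    }
}
```

Because several fragments can hash to the same position, the bitvector alone does not reveal which fragments set a given bit. This is the information that a coverage function such as OEGetFPCoverage makes accessible.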

The following example shows how to retrieve the unique fragments that are enumerated when generating a path fingerprint. The obtained fragments are depicted in Table: Example of path fragments.

Listing 17: Example of accessing patterns encoded into a fingerprint

using System;

using OpenEye.OEChem;
using OpenEye.OEGraphSim;

public class FPCoverage
{
    public static int Main(string[] args)
    {
        // Build the example molecule from its SMILES string.
        OEGraphMol mol = new OEGraphMol();
        OEChem.OESmilesToMol(mol, "CCNCC");

        // Select the path fingerprint type.
        OEFPTypeBase fptype = OEGraphSim.OEGetFPType(OEFPType.Path);

        // Request only the unique fragments.
        bool unique = true;
        uint idx = 0;
        foreach (OEAtomBondSet abset in OEGraphSim.OEGetFPCoverage(mol, fptype, unique))
        {
            // Print a running fragment number, then the index and element symbol of each atom.
            idx++;
            Console.Write("{0,2} ", idx);
            foreach (OEAtomBase a in abset.GetAtoms())
            {
                Console.Write(" {0} {1}", a.GetIdx(), OEChem.OEGetAtomicSymbol(a.GetAtomicNum()));
            }
            Console.WriteLine();
        }
        return 0;
    }
}

The output of Listing 17 is the following:

 1  0 C
 2  0 C  1 C
 3  0 C  1 C  2 N
 4  0 C  1 C  2 N  3 C
 5  0 C  1 C  2 N  3 C  4 C
 6  1 C
 7  1 C  2 N
 8  1 C  2 N  3 C
 9  1 C  2 N  3 C  4 C
10  2 N
Example of unique path fragments. The numbers displayed next to atoms are the atom indices.
../_images/FPCoverage-01.png ../_images/FPCoverage-02.png ../_images/FPCoverage-03.png
../_images/FPCoverage-04.png ../_images/FPCoverage-05.png ../_images/FPCoverage-06.png
../_images/FPCoverage-07.png ../_images/FPCoverage-08.png ../_images/FPCoverage-09.png
../_images/FPCoverage-10.png    

The OEGetFPCoverage function in Listing 17 is called with the unique option set to true. It therefore returns only unique fragments, where a fragment (i.e., a subgraph) is considered unique if it differs by at least one atom or bond from every subgraph identified before it. Executing the same code with the unique option set to false would generate five additional paths, depicted in Table: Example of additional non-unique path fragments.

Example of additional non-unique path fragments. The numbers displayed next to atoms are the atom indices.
../_images/FPCoverage-NonUnique-11.png ../_images/FPCoverage-NonUnique-12.png ../_images/FPCoverage-NonUnique-13.png
../_images/FPCoverage-NonUnique-14.png ../_images/FPCoverage-NonUnique-15.png  
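The non-unique enumeration can be tried with a small variation of Listing 17. The sketch below uses only the calls that appear in that listing and passes false as the third argument of OEGetFPCoverage. The class name FPCoverageNonUnique is an illustrative assumption, and the order in which the additional fragments are returned is not stated here, so no output is shown.

```csharp
using System;

using OpenEye.OEChem;
using OpenEye.OEGraphSim;

// Hypothetical class name; the API calls are the ones used in Listing 17.
public class FPCoverageNonUnique
{
    public static int Main(string[] args)
    {
        OEGraphMol mol = new OEGraphMol();
        OEChem.OESmilesToMol(mol, "CCNCC");

        OEFPTypeBase fptype = OEGraphSim.OEGetFPType(OEFPType.Path);

        // Assumption: false requests all enumerated fragments, not only the unique ones.
        bool unique = false;
        uint count = 0;
        foreach (OEAtomBondSet abset in OEGraphSim.OEGetFPCoverage(mol, fptype, unique))
        {
            count++;
            Console.Write("{0,2} ", count);
            foreach (OEAtomBase a in abset.GetAtoms())
            {
                Console.Write(" {0} {1}", a.GetIdx(), OEChem.OEGetAtomicSymbol(a.GetAtomicNum()));
            }
            Console.WriteLine();
        }
        return 0;
    }
}
```

Based on the description above, this run should report the ten fragments printed by Listing 17 plus the five additional paths shown in the table.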

See also