Fingerprint Coverage

Fingerprints are usually generated by enumerating fragments (patterns) of a molecule and then hashing them into a fixed-length bit vector. The OEGetFPCoverage function provides access to these fragments by returning an iterator over OEAtomBondSet objects, each of which stores the atoms and bonds of a specific fragment.

See also

  • The Fingerprint Patterns chapter, which shows how to access more information about the fragments enumerated during fingerprint generation

The following example shows how to retrieve the unique fragments that are enumerated when generating a path fingerprint. The obtained fragments are depicted in Table: Example of unique path fragments.

Listing 17: Example of accessing patterns encoded into a fingerprint

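#include <cstdio>

#include <openeye.h>
#include <oechem.h>
#include <oegraphsim.h>

using namespace OESystem;
using namespace OEChem;
using namespace OEGraphSim;

int main()
{
  OEGraphMol mol;
  OESmilesToMol(mol, "CCNCC");

  // Select the path fingerprint type whose fragments will be enumerated.
  const OEFPTypeBase* fptype = OEGetFPType(OEFPType::Path);

  const auto unique = true;  // return only unique fragments
  auto idx = 0u;
  for (OEIter<const OEAtomBondSet> abset = OEGetFPCoverage(mol, fptype, unique); abset; ++abset)
  {
    idx++;
    printf("%2u ", idx);
    // Print the index and element symbol of every atom in the fragment.
    for (OEIter<const OEAtomBase> atom = abset->GetAtoms(); atom; ++atom)
    {
      printf(" %u %s", atom->GetIdx(), OEGetAtomicSymbol(atom->GetAtomicNum()));
    }
    printf("\n");
  }
  return 0;
}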

The output of Listing 17 is the following:

 1  0 C
 2  0 C  1 C
 3  0 C  1 C  2 N
 4  0 C  1 C  2 N  3 C
 5  0 C  1 C  2 N  3 C  4 C
 6  1 C
 7  1 C  2 N
 8  1 C  2 N  3 C
 9  1 C  2 N  3 C  4 C
10  2 N

Example of unique path fragments. The numbers displayed next to atoms are the atom indices.
[Images: FPCoverage-01.png to FPCoverage-10.png]

The OEGetFPCoverage function in Listing 17 is called with the unique option set to true. This means that it returns only unique fragments, where a fragment (i.e. a subgraph) is considered unique if it differs from all previously identified subgraphs by at least one atom or bond. For example, executing the same code with the unique option set to false would generate the five additional paths depicted in Table: Example of additional non-unique path fragments.

Example of additional non-unique path fragments. The numbers displayed next to atoms are the atom indices.
[Images: FPCoverage-NonUnique-11.png to FPCoverage-NonUnique-15.png]

See also