Fingerprints are usually generated by enumerating various fragments (patterns)
of a molecule and then hashing them into a fixed-length bitvector.
The OEGetFPCoverage function provides access to these
fragments by returning an iterator over OEAtomBondSet objects,
each of which stores the atoms and bonds of a specific fragment.
See also the Fingerprint Patterns chapter, which shows how to access more information about the fragments enumerated during fingerprint generation.
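To make the "enumerate fragments, then hash them into a fixed-length bitvector" idea concrete, the following is a minimal, toolkit-independent sketch in plain Python. It is not the algorithm used by OEGraphSim: the helper names (enumerate_paths, path_fingerprint), the parameters (num_bits, max_bonds), the element-only atom labels, and the use of SHA-256 as the hash are all assumptions chosen for illustration.

```python
import hashlib


def enumerate_paths(adjacency, max_bonds):
    """Yield every simple path (tuple of atom indices) with at most max_bonds bonds.

    Paths are walked from every start atom, so each multi-atom path
    is produced once in each direction.
    """
    def extend(path):
        yield tuple(path)
        if len(path) - 1 == max_bonds:
            return
        for nbr in adjacency[path[-1]]:
            if nbr not in path:  # keep the path simple (no revisited atoms)
                yield from extend(path + [nbr])

    for start in adjacency:
        yield from extend([start])


def path_fingerprint(symbols, adjacency, num_bits=64, max_bonds=4):
    """Hash each path's label into one position of a fixed-length bit list."""
    bits = [0] * num_bits
    for path in enumerate_paths(adjacency, max_bonds):
        forward = "-".join(symbols[i] for i in path)
        backward = "-".join(symbols[i] for i in reversed(path))
        # A path and its reverse describe the same fragment, so use one label.
        label = min(forward, backward)
        digest = hashlib.sha256(label.encode("ascii")).digest()
        bits[int.from_bytes(digest[:4], "big") % num_bits] = 1
    return bits


# Hand-built toy graph for the SMILES "CCNCC" (hydrogens omitted).
symbols = ["C", "C", "N", "C", "C"]
adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}

paths = list(enumerate_paths(adjacency, max_bonds=4))
fingerprint = path_fingerprint(symbols, adjacency)
print(len(paths), "directed paths,", sum(fingerprint), "bits set")
```

Because different fragments can hash to the same position, the number of bits set can be smaller than the number of distinct fragments. This is why the bitvector alone cannot tell which fragments were enumerated, and why a function such as OEGetFPCoverage is needed to get back to the atoms and bonds.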
The following example shows how to retrieve the unique fragments that are enumerated when generating a path fingerprint. The obtained fragments are depicted in Table: Example of path fragments.
Listing 17: Example of accessing patterns encoded into a fingerprint
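from openeye import oechem
from openeye import oegraphsim

# Build the molecule from SMILES and select the path fingerprint type.
mol = oechem.OEGraphMol()
oechem.OESmilesToMol(mol, "CCNCC")
fptype = oegraphsim.OEGetFPType(oegraphsim.OEFPType_Path)

# Iterate over the unique fragments and print the atoms of each one.
unique = True
for idx, abset in enumerate(oegraphsim.OEGetFPCoverage(mol, fptype, unique)):
    print("%2d %s" % ((idx + 1), "".join([str(a) for a in abset.GetAtoms()])))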
The output of
Listing 17 is the following:
 1 0 C
 2 0 C 1 C
 3 0 C 1 C 2 N
 4 0 C 1 C 2 N 3 C
 5 0 C 1 C 2 N 3 C 4 C
 6 1 C
 7 1 C 2 N
 8 1 C 2 N 3 C
 9 1 C 2 N 3 C 4 C
10 2 N
In the Listing 17 example, the OEGetFPCoverage function
is called with the unique option set to True.
This means that it returns only unique fragments, where a fragment
(i.e. subgraph) is considered unique if it differs from all
previously identified subgraphs by at least one atom or bond.
For example, executing the same code with the unique option set to False would
generate five additional paths, which are depicted in
Table: Example of additional non-unique path fragments.