Fingerprints are usually generated by enumerating various fragments (patterns) of a molecule and then hashing them into a fixed-length bitvector. The OEGetFPCoverage function provides access to these fragments by returning an iterator over OEAtomBondSet objects, each of which stores the atoms and bonds of a specific fragment.
The following example shows how to retrieve the unique fragments that are enumerated when generating a path fingerprint. The obtained fragments are depicted in Table: Example of path fragments.
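The enumerate-then-hash idea described above can be illustrated with a minimal, toolkit-independent sketch. It is not the algorithm used by the toolkit: the adjacency-list molecule, the helper names `enumerate_paths` and `path_fingerprint`, the 64-bit length, and the use of CRC32 as the hash are all assumptions made for illustration only.

```python
import zlib

# Toy representation of CCNCC: element symbols and an adjacency list
# (atom index -> indices of bonded atoms). Hypothetical, for illustration.
SYMBOLS = ["C", "C", "N", "C", "C"]
NEIGHBORS = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}


def enumerate_paths(neighbors, max_bonds):
    """Yield every simple path (tuple of atom indices) with at most max_bonds bonds."""
    def extend(path):
        yield path
        if len(path) - 1 < max_bonds:
            for nxt in neighbors[path[-1]]:
                if nxt not in path:  # never revisit an atom
                    yield from extend(path + (nxt,))

    for start in neighbors:
        yield from extend((start,))


def path_fingerprint(symbols, neighbors, num_bits=64, max_bonds=5):
    """Hash the label of each enumerated path into a fixed-length list of bits."""
    bits = [0] * num_bits
    for path in enumerate_paths(neighbors, max_bonds):
        label = "".join(symbols[i] for i in path)
        # Pick one reading direction so a path and its reverse set the same bit.
        label = min(label, label[::-1])
        bits[zlib.crc32(label.encode()) % num_bits] = 1
    return bits


fp = path_fingerprint(SYMBOLS, NEIGHBORS)
print("".join(map(str, fp)))
print("bits set:", sum(fp))
```

Because several fragments can hash to the same position, the number of bits set may be smaller than the number of distinct fragments. That loss of information is why access to the underlying fragments is useful.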
Listing 17: Example of accessing patterns encoded into a fingerprint
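from openeye import oechem
from openeye import oegraphsim

# Build the example molecule from its SMILES string.
mol = oechem.OEGraphMol()
oechem.OESmilesToMol(mol, "CCNCC")

# Enumerate only the unique fragments of the path fingerprint type.
fptype = oegraphsim.OEGetFPType(oegraphsim.OEFPType_Path)
unique = True
for idx, abset in enumerate(oegraphsim.OEGetFPCoverage(mol, fptype, unique)):
    print("%2d %s" % ((idx + 1), "".join([str(a) for a in abset.GetAtoms()])))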
The output of Listing 17 is the following:
 1 0 C
 2 0 C 1 C
 3 0 C 1 C 2 N
 4 0 C 1 C 2 N 3 C
 5 0 C 1 C 2 N 3 C 4 C
 6 1 C
 7 1 C 2 N
 8 1 C 2 N 3 C
 9 1 C 2 N 3 C 4 C
10 2 N
The OEGetFPCoverage function in Listing 17 is called with the unique option set to True. This means that it returns only unique fragments, where a fragment (i.e., subgraph) is considered unique if it differs from every previously identified subgraph by at least one atom or bond. For example, executing the same code with the non-unique option (unique = False) would generate the five additional paths depicted in Table: Example of additional non-unique path fragments.
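The uniqueness criterion stated above (a fragment is kept only if its set of atoms or its set of bonds differs from every fragment seen so far) can be sketched without the toolkit. The code below is an illustrative assumption, not the toolkit's implementation: the helper names `walk_paths` and `coverage` are hypothetical, and the toy walker visits each path in both directions, so its fragment counts are not meant to reproduce the toolkit's output.

```python
# Toy adjacency list for CCNCC (atom index -> bonded atom indices).
NEIGHBORS = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}


def walk_paths(neighbors):
    """Yield every simple path as a tuple of atom indices, in both directions."""
    def extend(path):
        yield path
        for nxt in neighbors[path[-1]]:
            if nxt not in path:  # never revisit an atom
                yield from extend(path + (nxt,))

    for start in neighbors:
        yield from extend((start,))


def coverage(neighbors, unique):
    """Yield paths; with unique=True, skip any path whose atom set and bond set
    are both identical to those of a path yielded earlier."""
    seen = set()
    for path in walk_paths(neighbors):
        atoms = frozenset(path)
        bonds = frozenset(frozenset(pair) for pair in zip(path, path[1:]))
        key = (atoms, bonds)
        if unique and key in seen:
            continue
        seen.add(key)
        yield path


unique_paths = list(coverage(NEIGHBORS, unique=True))
all_paths = list(coverage(NEIGHBORS, unique=False))
print(len(unique_paths), "unique fragments")
print(len(all_paths) - len(unique_paths), "additional non-unique fragments")
```

In this sketch, a path such as (1, 0) is dropped in unique mode because it covers the same atoms and bond as the earlier path (0, 1), while the non-unique mode reports both.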