Fingerprint Types
A fingerprint is a bit-vector. To reflect this, the OEFingerPrint class derives from the OEBitVector class. The difference is that an OEFingerPrint also has a type that records how the fingerprint was generated. Fingerprints may only be compared if they were generated in the same way. Therefore, the following restriction is introduced:
Warning
Two fingerprints can be used in a similarity calculation only if their types are identical.
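The restriction can be illustrated without the toolkit. The following is a minimal plain-Python sketch, not the OEGraphSim implementation: `ToyFingerprint` and `tanimoto` are hypothetical names, the fingerprint type is modelled as a plain string, and the bit-vector as a set of on-bit indices.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyFingerprint:
    """Hypothetical stand-in for a typed bit-vector (not an OpenEye class)."""
    fptype: str            # label describing how the bits were generated
    on_bits: frozenset     # indices of the bits that are set


def tanimoto(a: ToyFingerprint, b: ToyFingerprint) -> float:
    """Tanimoto similarity that refuses to mix fingerprint types."""
    if a.fptype != b.fptype:
        raise ValueError(f"cannot compare {a.fptype!r} with {b.fptype!r}")
    union = len(a.on_bits | b.on_bits)
    # Two empty fingerprints share no bits; report 0.0 rather than divide by zero.
    return len(a.on_bits & b.on_bits) / union if union else 0.0


path_a = ToyFingerprint("Path", frozenset({1, 4, 9}))
path_b = ToyFingerprint("Path", frozenset({1, 4, 7}))
lingo = ToyFingerprint("Lingo", frozenset({1, 4, 9}))

print(tanimoto(path_a, path_b))   # 2 shared bits out of 4 distinct bits
try:
    tanimoto(path_a, lingo)
except ValueError as err:
    print("rejected:", err)
```

Even though `path_a` and `lingo` carry the same bit indices in this toy example, the comparison is rejected, because bits produced by different generation methods do not mean the same thing.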
Listing 1 shows how to create different fingerprint objects (OEFingerPrint) and identify or compare their types.
Listing 1: Fingerprint type
from openeye import oechem
from openeye import oegraphsim

fpA = oegraphsim.OEFingerPrint()
fpB = oegraphsim.OEFingerPrint()

# A default-constructed fingerprint is not yet valid.
if not fpA.IsValid():
    print("uninitialized fingerprint")

mol = oechem.OEGraphMol()
oechem.OESmilesToMol(mol, "c1ccccc1")

# Generate two fingerprints of different types for the same molecule.
oegraphsim.OEMakeFP(fpA, mol, oegraphsim.OEFPType_Path)
oegraphsim.OEMakeFP(fpB, mol, oegraphsim.OEFPType_Lingo)

# Identify the type of a single fingerprint.
if oegraphsim.OEIsFPType(fpA, oegraphsim.OEFPType_Lingo):
    print("Lingo")
if oegraphsim.OEIsFPType(fpA, oegraphsim.OEFPType_Path):
    print("Path")

# Compare the types of two fingerprints.
if oegraphsim.OEIsSameFPType(fpA, fpB):
    print("same fingerprint types")
else:
    print("different fingerprint types")
The output of Listing 1 is the following:
uninitialized fingerprint
Path
different fingerprint types
Two fingerprints are considered to be equivalent only if they have the same fingerprint type (OEFPTypeBase) and have identical bit-vectors (OEBitVector). The following code snippet shows how to compare two OEFingerPrint objects.
if fpA == fpB:
    print("same fingerprints")
else:
    print("different fingerprints")
The following code snippet shows how to initialize an OEFingerPrint object by using the type of another fingerprint. The type of a fingerprint is accessed by the OEFingerPrint.GetFPTypeBase method.
fpA = oegraphsim.OEFingerPrint()
oegraphsim.OEMakePathFP(fpA, mol)
fpB = oegraphsim.OEFingerPrint()
oegraphsim.OEMakeFP(fpB, mol, fpA.GetFPTypeBase())
Fingerprint parameters
The User-defined Fingerprint chapter gives examples of how user-defined fingerprints can be generated by specifying, for example, the atom and bond properties that are encoded into the fingerprints.
To ensure that only equivalent fingerprints can be compared, the fingerprint type stores the parameters used in the generation process. The OEFPTypeBase.GetFPTypeString method returns the string representation of the fingerprint type, which includes information about those parameters.
fp = oegraphsim.OEFingerPrint()
oegraphsim.OEMakeFP(fp, mol, oegraphsim.OEFPType_Path)
print(fp.GetFPTypeBase().GetFPTypeString())
The output of the preceding snippet is the following:
Path,ver=2.0.0,size=4096,bonds=0-5,atype=AtmNum|Arom|Chiral|FCharge|HvyDeg|Hyb|EqHalo,
btype=Order|Chiral
Note
The returned string does not include newline characters; it is broken into two lines here only for readability.
Listing 2 shows how to extract the parameters of a fingerprint from a string representation by using the OEFPTypeParams class.
Listing 2: Fingerprint parameters
fptype = oegraphsim.OEGetFPType(oegraphsim.OEFPType_Path)
prms = oegraphsim.OEFPTypeParams(fptype.GetFPTypeString())
print("version = %s" % oegraphsim.OEGetFingerPrintVersionString(prms.GetVersion()))
print("number of bits = %d" % prms.GetNumBits())
print("min bonds = %d" % prms.GetMinDistance())
print("max bonds = %d" % prms.GetMaxDistance())
print("atom types = %s" % oegraphsim.OEGetFPAtomType(prms.GetAtomTypes()))
print("bond types = %s" % oegraphsim.OEGetFPBondType(prms.GetBondTypes()))
The output of Listing 2 is the following:
version = 2.0.0
number of bits = 4096
min bonds = 0
max bonds = 5
atom types = AtmNum|Arom|Chiral|FCharge|HvyDeg|Hyb|EqHalo
bond types = Order|Chiral
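To make the layout of the type string concrete, the sketch below reads the same fields with plain string handling. It is not the toolkit's parser and does not replace OEFPTypeParams: `parse_fp_type_string` is a hypothetical helper, and it assumes the comma-separated `key=value` layout of the Path type string shown above. Other fingerprint types may use other fields, which this sketch does not handle.

```python
def parse_fp_type_string(type_string: str) -> dict:
    """Split a Path-style fingerprint type string into its name and fields.

    Hypothetical helper for illustration; assumes the keys ver, size,
    bonds, atype and btype are all present.
    """
    name, *fields = type_string.split(",")
    params = {"name": name}
    for field in fields:
        key, _, value = field.partition("=")
        params[key] = value
    # "bonds" is a range such as "0-5"; flag lists are separated by "|".
    low, _, high = params["bonds"].partition("-")
    params["bonds"] = (int(low), int(high))
    params["size"] = int(params["size"])
    params["atype"] = params["atype"].split("|")
    params["btype"] = params["btype"].split("|")
    return params


# The type string printed earlier, joined back into a single line.
type_string = ("Path,ver=2.0.0,size=4096,bonds=0-5,"
               "atype=AtmNum|Arom|Chiral|FCharge|HvyDeg|Hyb|EqHalo,"
               "btype=Order|Chiral")
params = parse_fp_type_string(type_string)
print(params["name"], params["ver"], params["size"], params["bonds"])
print(params["atype"])
print(params["btype"])
```

In toolkit code, OEFPTypeParams remains the appropriate way to read these values; the sketch only shows which piece of the string each value comes from.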
See also
User-defined Fingerprint chapter
OEIsValidFPTypeString function
OEGetFPType function
OEFPAtomType namespace
OEGetFPAtomType function
OEFPBondType namespace
OEGetFPBondType function
Fingerprint version number
Each fingerprint type additionally has a version number. Version numbers are introduced to keep track of changes in the fingerprint generation algorithm itself. The OEFPTypeBase.GetFPVersionString method returns the string representation of the fingerprint version.
fp = oegraphsim.OEFingerPrint()
oegraphsim.OEMakeFP(fp, mol, oegraphsim.OEFPType_Path)
print(fp.GetFPTypeBase().GetFPVersionString())
The output of the preceding snippet is the following:
2.0.0
Warning
The version number of a fingerprint does not change with every release. It is incremented only if modifications or bug fixes to the corresponding algorithm would generate a different bit-vector for the same molecule.
Fingerprints with an old version number remain readable and comparable with each other, but not with fingerprints that have a different version number.
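One practical consequence is that a collection of stored fingerprints may need to be partitioned before any comparison. The following plain-Python sketch groups records by their type string, so that only records in the same group would be compared. It makes several assumptions: `fp_version` and `records` are hypothetical, the type strings are shortened for illustration, and the `ver=1.0.0` entry is invented to stand for an older version rather than taken from any toolkit release.

```python
from collections import defaultdict


def fp_version(type_string: str) -> str:
    """Return the value of the ver= field of a fingerprint type string."""
    for field in type_string.split(","):
        if field.startswith("ver="):
            return field[len("ver="):]
    raise ValueError(f"no version field in {type_string!r}")


# Hypothetical stored records: (identifier, shortened type string).
records = [
    ("mol-1", "Path,ver=2.0.0,size=4096,bonds=0-5"),
    ("mol-2", "Path,ver=1.0.0,size=4096,bonds=0-5"),  # invented older version
    ("mol-3", "Path,ver=2.0.0,size=4096,bonds=0-5"),
]

# Records are comparable only within a group that shares one type string,
# which includes the version number.
groups = defaultdict(list)
for ident, type_string in records:
    groups[type_string].append(ident)

for type_string, idents in groups.items():
    print(fp_version(type_string), idents)
```

Grouping on the whole type string rather than on the version alone is a deliberate choice in this sketch: it also separates fingerprints that share a version but differ in other generation parameters.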