Fingerprint Overlap

The OEGetFPOverlap function provides access to the fragments of two molecules that are considered equivalent based on a specific fingerprint type. That is, each returned fragment pair sets the same bit “on” when fingerprints are generated. The following example shows how to retrieve the common patterns of two molecules that are exactly five bonds in size (bonds=5-5 in the fingerprint type string).

Listing 18: Example of accessing common patterns based on a fingerprint

#include <cstdio>
#include <openeye.h>
#include <oechem.h>
#include <oegraphsim.h>

using namespace OESystem;
using namespace OEChem;
using namespace OEGraphSim;

int main()
{
  OEGraphMol pmol;  // pattern molecule
  OESmilesToMol(pmol, "c1cnc2c(c1)CC(CC2O)CF");

  OEGraphMol tmol;  // target molecule
  OESmilesToMol(tmol, "c1cc2c(cc1)CC(CCl)CC2N");

  const OEFPTypeBase* fptype = OEGetFPType("Tree,ver=2.0.0,size=4096,bonds=5-5,atype=AtmNum|HvyDeg|EqHalo,btype=Order");

  unsigned int idx = 0;
  for (OEIter<const OEMatchBase> match = OEGetFPOverlap(pmol, tmol, fptype); match; ++match)
  {
    printf("match %2u:", ++idx);  // matches are numbered from 1
    for (OEIter<const OEMatchPair<OEAtomBase> > mpair = match->GetAtoms(); mpair; ++mpair)
      printf(" %u%s-%u%s", mpair->pattern->GetIdx(), OEGetAtomicSymbol(mpair->pattern->GetAtomicNum()),
             mpair->target->GetIdx(), OEGetAtomicSymbol(mpair->target->GetAtomicNum()));
    printf("\n");  // one match per output line
  }
  return 0;
}

Listing 18 produces the output shown below. The first three matches it returns are depicted in the table that follows the output.

match  1: 3C-2C  9C-11C  4C-3C  8C-10C  7C-7C  11C-8C
match  2: 3C-2C  4C-3C  5C-4C  6C-6C  7C-7C  8C-8C
match  3: 3C-2C  4C-3C  5C-4C  6C-6C  7C-7C  8C-10C
match  4: 3C-2C  4C-3C  5C-4C  6C-6C  7C-7C  11C-8C
match  5: 3C-2C  4C-3C  5C-4C  6C-6C  7C-7C  11C-10C
match  6: 3C-2C  9C-11C  4C-3C  6C-6C  7C-7C  11C-8C

... truncated  ...
Example of matches returned by the OEGetFPOverlap function. The numbers depicted next to the atoms are the atom indices.


Even though the OEGetFPOverlap function returns an iterator over OEMatchBase objects, these are not matches in the traditional sense: a one-to-one correspondence between the atoms and bonds of the pattern and those of the target is not guaranteed. See the example depicted below.


The two highlighted patterns set the same bit when fingerprints are generated, but the returned match is not pairwise. The numbers depicted next to the atoms are the atom indices.
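
To make the distinction concrete, the following is a minimal, self-contained sketch of why two fragments can set the same bit without corresponding atom by atom. It is a toy model and not the GraphSim TK algorithm: the names Fragment, CanonicalKey, BitIndex, ToyOverlap and PositionwiseMatch are hypothetical, a fragment is reduced to an ordered list of atom labels, and the FNV-1a hash is an assumption chosen only because it is deterministic.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy type: a fragment is an ordered list of atom labels.
using Fragment = std::vector<std::string>;

// Direction-independent key: the lexicographically smaller of the
// forward and the reversed label sequence, joined with '-'.
std::string CanonicalKey(const Fragment& frag)
{
  const Fragment rev(frag.rbegin(), frag.rend());
  const Fragment& best = std::min(frag, rev);
  std::string key;
  for (const std::string& label : best)
  {
    key += label;
    key += '-';
  }
  return key;
}

// Maps a fragment to a bit position by hashing its canonical key
// with 32-bit FNV-1a (an assumption of this sketch).
unsigned int BitIndex(const Fragment& frag, unsigned int nbits = 4096)
{
  assert(nbits > 0);
  std::uint32_t h = 2166136261u;
  for (unsigned char c : CanonicalKey(frag))
  {
    h ^= c;
    h *= 16777619u;
  }
  return h % nbits;
}

// Returns every (pattern index, target index) pair of fragments
// that set the same bit.
std::vector<std::pair<std::size_t, std::size_t>>
ToyOverlap(const std::vector<Fragment>& pattern, const std::vector<Fragment>& target)
{
  std::vector<std::pair<std::size_t, std::size_t>> pairs;
  for (std::size_t i = 0; i < pattern.size(); ++i)
    for (std::size_t j = 0; j < target.size(); ++j)
      if (BitIndex(pattern[i]) == BitIndex(target[j]))
        pairs.emplace_back(i, j);
  return pairs;
}

// True only if the two fragments agree label by label in the order listed.
bool PositionwiseMatch(const Fragment& a, const Fragment& b)
{
  return a == b;
}
```

In this sketch the fragments {"C", "C", "N"} and {"N", "C", "C"} share a canonical key and therefore a bit, so ToyOverlap reports them as a pair, yet PositionwiseMatch is false for them: pairing their atoms in the listed order would put a carbon against a nitrogen. The atom pairs returned by OEGetFPOverlap should be read with the same caution.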

See also


The OEGetFPOverlap function can also be used to visualize molecule similarity based on a given fingerprint type. See more details in: