Depicting Molecule Similarity Based on Fingerprints

Problem

You want to depict the 2D similarity of two molecules based on their fingerprints. Figure 1 shows an example.

../_images/fpoverlap2img1.png

Figure 1. Example of depiction of 2D molecule similarity

Ingredients

Note

Requires OpenEye toolkits version 2017.Feb or later.

Difficulty Level

../_images/chilly1.png ../_images/chilly1.png

Solution

The GraphSim TK not only encodes 2D molecular graph information into fingerprints, but also gives access to the fragments that are enumerated during fingerprint generation. The OEGetFPOverlap function, used in this example, returns all common fragments found between two molecules for the given fingerprint type. The SetFingerPrintSimilarity function uses these fragments to identify the similar parts of the two molecules. It iterates over the bonds of the common fragments and counts how often each bond occurs; this count serves as the bond's overlap score (lines 6-10). The scores are then attached to the corresponding bonds as generic data (lines 15-18). The function also calculates and returns the maximum overlap score.

def SetFingerPrintSimilarity(qmol, tmol, fptype, tag, maxvalue=0):

    qbonds = OEUIntArray(qmol.GetMaxBondIdx())
    tbonds = OEUIntArray(tmol.GetMaxBondIdx())

    for match in OEGetFPOverlap(qmol, tmol, fptype):
        for bond in match.GetPatternBonds():
            qbonds[bond.GetIdx()] += 1
        for bond in match.GetTargetBonds():
            tbonds[bond.GetIdx()] += 1

    maxvalue = max(maxvalue, max(qbonds))
    maxvalue = max(maxvalue, max(tbonds))

    for bond in qmol.GetBonds():
        bond.SetData(tag, qbonds[bond.GetIdx()])
    for bond in tmol.GetBonds():
        bond.SetData(tag, tbonds[bond.GetIdx()])

    return maxvalue
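
The counting step can be tried without the toolkits. The following is a minimal, toolkit-free sketch, not the GraphSim TK implementation: it assumes each common fragment is given as a plain list of bond indices (a hypothetical stand-in for the bonds that match.GetPatternBonds() yields), and the name count_bond_overlaps is made up for this illustration.

```python
def count_bond_overlaps(fragments, num_bonds):
    """Count how many fragments cover each bond index."""
    counts = [0] * num_bonds
    for fragment in fragments:
        for bond_idx in fragment:
            counts[bond_idx] += 1
    return counts


# Hypothetical common fragments of a molecule with 5 bonds (indices 0-4).
fragments = [[0, 1], [1, 2], [0, 1, 2]]

scores = count_bond_overlaps(fragments, num_bonds=5)
max_score = max(scores, default=0)  # plays the role of maxvalue
print(scores, max_score)
```

Bonds that never appear in a common fragment keep a score of zero, which is what later marks them as dissimilar.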

These bond overlap scores can be used to highlight the similar and dissimilar parts of the molecules. The ColorBondByOverlapScore bond annotation class takes a linear color gradient and draws a “stick” underneath each bond (lines 20-24). The color of the “stick” is determined by the overlap score of the bond (lines 17-18).

class ColorBondByOverlapScore(OEBondGlyphBase):
    def __init__(self, cg, tag):
        OEBondGlyphBase.__init__(self)
        self.colorg = cg
        self.tag = tag

    def RenderGlyph(self, disp, bond):

        bdisp = disp.GetBondDisplay(bond)
        if bdisp is None or not bdisp.IsVisible():
            return False

        if not bond.HasData(self.tag):
            return False

        linewidth = disp.GetScale() / 3.0
        color = self.colorg.GetColorAt(bond.GetData(self.tag))
        pen = OEPen(color, color, OEFill_Off, linewidth)

        adispB = disp.GetAtomDisplay(bond.GetBgn())
        adispE = disp.GetAtomDisplay(bond.GetEnd())

        layer = disp.GetLayer(OELayerPosition_Below)
        layer.DrawLine(adispB.GetCoords(), adispE.GetCoords(), pen)

        return True

    def CreateCopy(self):
        return ColorBondByOverlapScore(self.colorg, self.tag).__disown__()
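
How a score turns into a color can be pictured as piecewise-linear interpolation between color stops. The sketch below is only an illustration of that idea in plain Python. It assumes stops at 0, 1, and the maximum score (as in the recipe's DepictMoleculeOverlaps function), uses approximate RGB triples that are not the actual values of OEPinkTint, OEYellow, or OEDarkGreen, and does not claim to reproduce the internals of OELinearColorGradient.

```python
# Approximate RGB values, chosen for illustration only.
PINK = (255, 220, 220)
YELLOW = (255, 255, 0)
DARK_GREEN = (0, 100, 0)


def color_at(stops, value):
    """Piecewise-linear interpolation between (position, (r, g, b)) stops."""
    stops = sorted(stops)
    if value <= stops[0][0]:
        return stops[0][1]
    if value >= stops[-1][0]:
        return stops[-1][1]
    for (p0, c0), (p1, c1) in zip(stops, stops[1:]):
        if p0 <= value < p1:
            t = (value - p0) / (p1 - p0)
            return tuple(round(a + t * (b - a)) for a, b in zip(c0, c1))


max_score = 5  # hypothetical maximum overlap score
stops = [(0.0, PINK), (1.0, YELLOW), (float(max_score), DARK_GREEN)]

for score in range(max_score + 1):
    print(score, color_at(stops, score))
```

Because overlap scores are integer counts, a bond is either exactly at the first stop (score 0) or somewhere between the second and third stops (score 1 or more).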

The DepictMoleculeOverlaps function shows how to depict the 2D similarity of the two molecules:

  1. Calculate the bond overlap scores for both molecules, then construct an OELinearColorGradient object that the ColorBondByOverlapScore class uses to annotate the bonds based on their overlap scores (lines 6-10).
  2. Prepare both molecules for depiction. The target molecule is aligned to the query by calling the OEPrepareMultiAlignedDepiction function (lines 12-14). The OEGetFPOverlap function returns all common fragments found between the two molecules for the given fingerprint type. These common fragments identify the similar parts of the two molecules, and the OEPrepareMultiAlignedDepiction function uses them to find the best alignment between the molecules.
  3. Divide the image into two cells using the OEImageGrid class and render the molecules next to each other (lines 16-30).
  4. Generate the fingerprints and calculate the similarity score by calling the OETanimoto function (lines 32-38).
  5. Render the score into the image (lines 40-42).

You can see the result in Figure 1. The DepictMoleculeOverlaps function uses a “yellow to dark green” linear color gradient. Bonds where 2D similarity is detected between the two molecules are highlighted in shades from yellow to dark green, and the color gets darker with increasing similarity. Pink highlights the parts of the molecules that do not share any common fragments, i.e., the parts that are 2D dissimilar.

def DepictMoleculeOverlaps(image, qmol, tmol, fptype, opts):

    tag = OEGetTag("fpoverlap")
    maxvalue = SetFingerPrintSimilarity(qmol, tmol, fptype, tag)

    colorg = OELinearColorGradient()
    colorg.AddStop(OEColorStop(0.0, OEPinkTint))
    colorg.AddStop(OEColorStop(1.0, OEYellow))
    colorg.AddStop(OEColorStop(maxvalue, OEDarkGreen))
    bondglyph = ColorBondByOverlapScore(colorg, tag)

    OEPrepareDepiction(qmol)
    overlaps = OEGetFPOverlap(qmol, tmol, fptype)
    OEPrepareMultiAlignedDepiction(tmol, qmol, overlaps)

    grid = OEImageGrid(image, 1, 2)
    grid.SetMargin(OEMargin_Bottom, 10)
    opts.SetDimensions(grid.GetCellWidth(), grid.GetCellHeight(), OEScale_AutoScale)
    opts.SetAtomColorStyle(OEAtomColorStyle_WhiteMonochrome)

    molscale = min(OEGetMoleculeScale(qmol, opts), OEGetMoleculeScale(tmol, opts))
    opts.SetScale(molscale)

    qdisp = OE2DMolDisplay(qmol, opts)
    OEAddGlyph(qdisp, bondglyph, IsTrueBond())
    OERenderMolecule(grid.GetCell(1, 1), qdisp)

    tdisp = OE2DMolDisplay(tmol, opts)
    OEAddGlyph(tdisp, bondglyph, IsTrueBond())
    OERenderMolecule(grid.GetCell(1, 2), tdisp)

    qfp = OEFingerPrint()
    OEMakeFP(qfp, qmol, fptype)

    tfp = OEFingerPrint()
    OEMakeFP(tfp, tmol, fptype)

    score = OETanimoto(qfp, tfp)

    font = OEFont(OEFontFamily_Default, OEFontStyle_Default, 16, OEAlignment_Center, OEBlack)
    center = OE2DPoint(image.GetWidth() / 2.0, image.GetHeight() - 10)
    image.DrawText(center, "Tanimoto score = %.3f" % score, font)
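
For reference, the Tanimoto coefficient is the number of bits set in both fingerprints divided by the number of bits set in either. The sketch below illustrates that arithmetic in plain Python. Representing a fingerprint as a set of "on" bit positions is an assumption made for this example, and the tanimoto helper is hypothetical; it is not the OETanimoto function, which operates on OEFingerPrint objects.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient of two collections of 'on' bit positions."""
    a, b = set(bits_a), set(bits_b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(union)


# Hypothetical fingerprints: positions of the bits that are set.
query_bits = {1, 4, 7, 9}
target_bits = {1, 4, 8}

score = tanimoto(query_bits, target_bits)
print("Tanimoto score = %.3f" % score)
```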

Download code

fpoverlap2img.py

Usage:

prompt > python3 fpoverlap2img.py -query query.mol -target target.mol -out similarity.png

Discussion

Hint

Visualizing similarity of two molecules based on their fingerprints provides insight into molecule similarity beyond a single numerical score and reveals information about the underlying fingerprint methods.

The images in the following tables illustrate how changing a core or a terminal atom in a molecule affects the Tanimoto similarity scores.

Table 1. Example of the effects of changing a core atom using various fingerprint types

Path | Tree | Circular
../_images/corechange-path.png | ../_images/corechange-tree.png | ../_images/corechange-circular.png

Table 2. Example of the effects of changing a terminal atom using various fingerprint types

Path | Tree | Circular
../_images/terminalchange-path.png | ../_images/terminalchange-tree.png | ../_images/terminalchange-circular.png
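
To get a feel for why the position of a change matters, consider a deliberately simplified toy model. The sketch below is not any GraphSim TK fingerprint: it assumes a molecule is a linear chain of atom symbols, treats every contiguous path (read in either direction) as one fragment, and compares fragment sets directly instead of hashed bits. All names in it are hypothetical.

```python
def path_fragments(atoms):
    """All contiguous paths of a linear chain, independent of direction."""
    frags = set()
    n = len(atoms)
    for start in range(n):
        for end in range(start + 1, n + 1):
            path = atoms[start:end]
            # Use the lexicographically smaller of the two reading directions.
            frags.add(min(path, path[::-1]))
    return frags


def tanimoto(a, b):
    """Tanimoto coefficient of two non-empty fragment sets."""
    return len(a & b) / len(a | b)


reference = path_fragments("CCCCC")
core_change = path_fragments("CCNCC")      # middle atom replaced
terminal_change = path_fragments("CCCCN")  # end atom replaced

core_score = tanimoto(reference, core_change)
terminal_score = tanimoto(reference, terminal_change)
print("core: %.3f  terminal: %.3f" % (core_score, terminal_score))
```

In this toy model the core change disturbs more paths than the terminal change, because more of the enumerated paths pass through a central atom. How strongly real fingerprint types react depends on their fragment definitions and parameters.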
