Visualizing Protein-Ligand B-factor

Problem

You want to visualize the B-factor of a ligand and its environment in order to reveal regions with high flexibility at the active site. See example in Figure 1.

hover the mouse over any ligand atom to display its B-factor

Figure 1. Example of depicting the B-factor of 1A1B complex

../_images/bfactor2img-pdb1a1b.svg

Ligand atoms with high B-factor (i.e. high flexibility) are colored red, while blue color indicates low B-factors.

The 2D molecule surface (arcs around the ligand) represents the environment of the ligand in 3D. Each arc is associated with a ligand atom that is the center of the arc. A red arc indicates that the average B-factor of the nearby protein atoms is high i.e. the environment is more flexible. When this average B-factor is calculated, only protein atoms that are closer than 4.0 Ångströms (user-adjustable parameter) to the given ligand is considered. Blue arcs represent low average B-factor values of the nearby protein atoms. The lack of arc indicates that there is no protein heavy atom closer than 4.0 Ångströms to the corresponding ligand atom.

The linear color gradient, used for coloring B-factors, is drawn at the bottom of the image. The number (that is 18.00) marked on the color gradient is the average B-factor of the whole complex. The black box shown on the color gradient indicates the range of B-factor values close at the active size i.e. the minimum and maximum B-factor values of any protein atom that is closer than 4.0 Ångströms to any ligand atoms.

Ingredients

  • OEChem TK - cheminformatics toolkit (including OEBio TK)
  • OEDepict TK - molecule depiction toolkit
  • Grapheme TK - molecule and property visualization toolkit

Difficulty level

../_images/chilly6.png ../_images/chilly6.png ../_images/chilly6.png

Download

Download code

bfactor2img.py

See also the Usage subsection.

Solution

The get_average_bfactor function is used to calculate the average B-factor for the whole complex. When a protein-ligand complex is read from a PDB file, an OEResidue object is attached to each atom as generic data. This OEResidue object stores a number of fields specific to processing macromolecules including the the B-factor.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
def get_average_bfactor(mol):
    nratoms, sumbfactor = 0, 0.0

    for atom in mol.GetAtoms():
        if not oechem.OEHasResidue(atom):
            continue
        res = oechem.OEAtomGetResidue(atom)
        sumbfactor += res.GetBFactor()
        nratoms += 1

    avgbfactor = sumbfactor / nratoms
    return avgbfactor

The get_min_and_max_bfactor function is utilized to calculate the minimum and maximum B-factors at the region of the active site. Therefore, only protein atoms that are close to the ligand are considered (as defined by the maxdistance parameter). These nearby protein atoms are identified by using the OENearestNbrs class. The user defined atom predicate, NotHydrogenOrWater, is also used to ignore any hydrogen atoms or water molecules.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
def get_min_and_max_bfactor(ligand, protein, maxdistance):

    minbfactor, maxbfactor = float("inf"), float("-inf")

    # ligand atoms

    for latom in ligand.GetAtoms(oechem.OEIsHeavy()):
        if not oechem.OEHasResidue(latom):
            continue
        res = oechem.OEAtomGetResidue(latom)
        minbfactor = min(minbfactor, res.GetBFactor())
        maxbfactor = max(maxbfactor, res.GetBFactor())

    # protein atoms close to ligand atoms
    considerbfactor = NotHydrogenOrWater()

    nn = oechem.OENearestNbrs(protein, maxdistance)
    for latom in ligand.GetAtoms(oechem.OEIsHeavy()):
        for neigh in nn.GetNbrs(latom):
            ratom = neigh.GetBgn()

            if considerbfactor(ratom):
                res = oechem.OEAtomGetResidue(ratom)
                minbfactor = min(minbfactor, res.GetBFactor())
                maxbfactor = max(maxbfactor, res.GetBFactor())

    return minbfactor, maxbfactor

The NotHydrogenOrWater atom predicate returns true if the given atom is not a hydrogen and not part of a water molecule.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
class NotHydrogenOrWater(oechem.OEUnaryAtomPred):
    def __call__(self, atom):

        if atom.GetAtomicNum() == oechem.OEElemNo_H:
            return False
        if not oechem.OEHasResidue(atom):
            return False

        waterpred = oechem.OEIsWater()
        if waterpred(atom):
            return False

        return True

Grapheme TK’s 2D molecule surface depiction style is used to visualize the B-factor of the protein around the ligand. The set_average_bfactor_of_nearby_protein_atoms function below loops over the ligand atoms and calculate the average B-factor of nearby protein atoms. This average B-factor will later be used to determine the color of the arc drawn by the BFactorArcFxn class.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
def set_average_bfactor_of_nearby_protein_atoms(ligand, protein, itag, maxdistance):

    considerbfactor = NotHydrogenOrWater()

    nn = oechem.OENearestNbrs(protein, maxdistance)
    for latom in ligand.GetAtoms(oechem.OEIsHeavy()):
        sumbfactor = 0.0
        neighs = []
        for neigh in nn.GetNbrs(latom):
            ratom = neigh.GetBgn()
            if considerbfactor(ratom):
                res = oechem.OEAtomGetResidue(ratom)
                sumbfactor += res.GetBFactor()
                neighs.append(ratom)

        avgbfactor = 0.0
        if len(neighs) > 0:
            avgbfactor = sumbfactor / len(neighs)
        latom.SetDoubleData(itag, avgbfactor)

The linear color gradient used in the depiction is initialized by the get_bfactor_color_gradient function. The color gradient does not depend on the depicted complex thereby allowing to easily compare sets of B-factor images.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
def get_bfactor_color_gradient():

    colorg = oechem.OELinearColorGradient()
    colorg.AddStop(oechem.OEColorStop(0.0, oechem.OEDarkBlue))
    colorg.AddStop(oechem.OEColorStop(10.0, oechem.OELightBlue))
    colorg.AddStop(oechem.OEColorStop(25.0, oechem.OEYellowTint))
    colorg.AddStop(oechem.OEColorStop(50.0, oechem.OERed))
    colorg.AddStop(oechem.OEColorStop(100.0, oechem.OEDarkRose))

    return colorg

The ColorLigandAtomByBFactor class is used to draw the colored circles below the ligand atoms indicating their B-factor. The color gradient used in this class are generated by the get_bfactor_color_gradient function.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
class ColorLigandAtomByBFactor(oegrapheme.OEAtomGlyphBase):
    def __init__(self, colorg):
        oegrapheme.OEAtomGlyphBase.__init__(self)
        self.colorg = colorg

    def RenderGlyph(self, disp, atom):
        adisp = disp.GetAtomDisplay(atom)
        if adisp is None or not adisp.IsVisible():
            return False

        if not oechem.OEHasResidue(atom):
            return False

        res = oechem.OEAtomGetResidue(atom)
        bfactor = res.GetBFactor()
        color = self.colorg.GetColorAt(bfactor)

        pen = oedepict.OEPen(color, color, oedepict.OEFill_On, 1.0)
        radius = disp.GetScale() / 3.0

        layer = disp.GetLayer(oedepict.OELayerPosition_Below)
        circlestyle = oegrapheme.OECircleStyle_Default
        oegrapheme.OEDrawCircle(layer, circlestyle, adisp.GetCoords(), radius, pen)
        return True

    def CreateCopy(self):
        return ColorLigandAtomByBFactor(self.colorg).__disown__()

The BFactorArcFxn class draws the arc segments around the depicted ligand. The color of each arc represents the average B-factor of the protein atoms around the corresponding ligand atom. These values are calculated by the set_average_bfactor_of_nearby_protein_atoms function and attached to the ligand atoms as generic data. The color gradient used in this class are generated by the get_bfactor_color_gradient function.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
class BFactorArcFxn(oegrapheme.OESurfaceArcFxnBase):
    def __init__(self, colorg, itag):
        oegrapheme.OESurfaceArcFxnBase.__init__(self)
        self.colorg = colorg
        self.itag = itag

    def __call__(self, image, arc):
        adisp = arc.GetAtomDisplay()
        if adisp is None or not adisp.IsVisible():
            return False

        atom = adisp.GetAtom()
        if atom is None:
            return False

        avgresiduebfactor = atom.GetDoubleData(self.itag)
        if avgresiduebfactor == 0.0:
            return True
        color = self.colorg.GetColorAt(avgresiduebfactor)

        pen = oedepict.OEPen(color, color, oedepict.OEFill_Off, 5.0)

        center = arc.GetCenter()
        bAngle, eAngle = arc.GetBgnAngle(), arc.GetEndAngle()
        radius = arc.GetRadius()

        oegrapheme.OEDrawDefaultSurfaceArc(image, center, bAngle, eAngle, radius, pen)

        return True

    def CreateCopy(self):
        return BFactorArcFxn(self.colorg, self.itag).__disown__()

depict_bfactor is the main function that does the depiction of the ligand and draws molecule surface around it to represent its environment.

  1. First, the depiction coordinates of the ligand are generated by calling the OEPrepareDepictionFrom3D function (line 5). Then the OEPrepareDepictionOptions class and OEPrepareDepiction function of the OEDepict TK are utilized to orient the generated 2D coordinates horizontally (lines 7-10).
  2. The BFactorArcFxn arc drawing functor is added to each ligand (see also step 7)
  3. Scaling of the depicted ligand is reduced by using the OEGetMoleculeSurfaceScale function to make sure that arcs representing the molecule surface of the ligand will not be drawn outside the image (lines 12-15).
  4. The molecule display for the ligand is constructed (line 19).
  5. If the output image file format is svg, then the OEDrawSVGHoverText function can be used to associate each display atom with a text of its B-factor value. The generated svg image will be interactive i.e. the B-factor texts will be displayed when the mouse is hovered over the ligand atoms in the image (lines 21-31).
  6. The OEAddGlyph function is used to draw a glyph (in this case a circle colored by the B-factor) underneath each ligand atom as defined in the ColorLigandAtomByBFactor class (lines 32-33).
  7. The OEDraw2DSurface function is called that will invoke the arc drawing functor set on each ligand atom in step 2. resulting in the rendering of the the molecule into the display (line 35).
  8. Finally, the OERenderMolecule function renders the molecule display into the image (line 37).
def depict_bfactor(image, ligand, opts, colorg, itag, issvg):
    """
    :type image: oedepict.OEImageBase
    :type ligand: oechem.OEMolBase
    :type opts: oechem.OE2DMolDisplayOptions
    :type colorg: oechem.OELinearColorGradient
    :type itag: int
    :type issvg: boolean
    """

    # prepare ligand for depiction

    oegrapheme.OEPrepareDepictionFrom3D(ligand)

    clearcoords, suppressH = False, False
    popts = oedepict.OEPrepareDepictionOptions(clearcoords, suppressH)
    popts.SetDepictOrientation(oedepict.OEDepictOrientation_Horizontal)
    oedepict.OEPrepareDepiction(ligand, popts)

    arcfxn = BFactorArcFxn(colorg, itag)
    for atom in ligand.GetAtoms():
        oegrapheme.OESetSurfaceArcFxn(ligand, atom, arcfxn)
    opts.SetScale(oegrapheme.OEGetMoleculeSurfaceScale(ligand, opts))

    # render ligand and visualize BFactor

    disp = oedepict.OE2DMolDisplay(ligand, opts)

    if issvg:
        font = oedepict.OEFont(oedepict.OEFontFamily_Default, oedepict.OEFontStyle_Default,
                               14, oedepict.OEAlignment_Center, oechem.OEBlack)
        for adisp in disp.GetAtomDisplays():
            atom = adisp.GetAtom()
            if not oechem.OEHasResidue(atom):
                continue
            res = oechem.OEAtomGetResidue(atom)
            hovertext = "bfactor=%.1f" % res.GetBFactor()
            oedepict.OEDrawSVGHoverText(disp, adisp, hovertext, font)

    colorbfactor = ColorLigandAtomByBFactor(colorg)
    oegrapheme.OEAddGlyph(disp, colorbfactor, oechem.OEIsTrueAtom())

    oegrapheme.OEDraw2DSurface(disp)

    oedepict.OERenderMolecule(image, disp)

Usage

Usage

bfactor2img.py

prompt > wget https://files.rcsb.org/download/1a1b.pdb
prompt > python3 bfactor2img.py -complex 1a1b.pdb -out 1a1b.svg

The above commands will generate the image shown in Figure 1 and generate the following output:

Avg B-factor of the complex = +18.00
Min B-factor of the ligand and its environment (4.0A) = +4.58
Max B-factor of the ligand and its environment (4.0A) = +37.40

Discussion

The B-factor (temperature factor) is a measure of how much an atom vibrates around the position specified in the PDB file. Atoms at side-chain termini are expected to exhibit more freedom of movement than main-chain atoms. Similarly, chain ligand atoms exposed to solvent can have more freedom of movement (see 1A4K example below).

B-factors can indicate not only the mobility of the atoms but they can also reveal where there are errors in model building. Visualization helps to make instant judgment on the quality of model.

PDB: 1A4K PDB: 1ATL

../_images/bfactor2img-pdb1a4k.svg ../_images/bfactor2img-pdb1atl.svg

See also in OEChem TK manual

Theory

API

See also in OEDepict TK manual

Theory

API

See also in GraphemeTM TK manual

Theory

API