Depicting Activities of Molecules
Problem
You want to organize and depict a set of molecules according to given
activity data and common substructure.
See the separate pages of a multi-page PDF in Table 1.
Table 1: page 1 | page 2 | .. | page 8 | ..
Solution
First, import the molecules to be depicted together with their activity information.
In this example, the ImportMolecules function reads the molecules and extracts the
activity attached to each molecule as SD data under the “activity” tag (line 15).
(See the activity.sdf input file for an example.)
The SD string is converted to a floating point number (line 17), and the activity is
added to the title of the molecule so that it is depicted later along with the molecule
(line 18).
Each imported molecule, along with its corresponding activity, is appended to
a list of (molecule, activity) tuples (line 19).
Molecules with missing or non-numeric activity data are skipped with a warning.
 1 def ImportMolecules(ifs, mollist):
 2
 3     if ifs.GetFormat() != oechem.OEFormat_SDF:
 4         oechem.OEThrow.Fatal("The input file has to be an SDF file")
 5
 6     tag = "activity"
 7
 8     for mol in ifs.GetOEGraphMols():
 9
10         id = mol.GetTitle()
11         if not oechem.OEHasSDData(mol, tag):
12             oechem.OEThrow.Warning("No activity data found for molecule '%s'. "
13                                    "Compound will be ignored." % id)
14         else:
15             activitystr = oechem.OEGetSDData(mol, tag)
16             try:
17                 activity = float(activitystr)
18                 mol.SetTitle("%s -- Activity: %s uM" % (id, activitystr))
19                 mollist.append((oechem.OEGraphMol(mol), activity))
20             except ValueError:
21                 oechem.OEThrow.Warning("Non-numeric activity data '%s' found for molecule '%s'. "
22                                        "Compound will be ignored." % (activitystr, id))
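The filtering logic of the listing above can be tried without OEChem TK. The following is a minimal sketch, assuming each input record is a plain (title, sd_data) pair in which sd_data is a dict of SD tag names to string values. The parse_activities function and this record layout are hypothetical stand-ins for illustration and are not part of OEChem TK.

```python
def parse_activities(records, tag="activity"):
    """Return (kept, skipped) from (title, sd_data) pairs.

    Hypothetical stand-in for ImportMolecules: sd_data is a plain dict
    of SD tag -> string value instead of an OEChem molecule.
    """
    kept, skipped = [], []
    for title, sd_data in records:
        if tag not in sd_data:
            # Mirrors the "no activity data" warning branch.
            skipped.append((title, "no activity data"))
            continue
        raw = sd_data[tag]
        try:
            activity = float(raw)
        except ValueError:
            # Mirrors the "non-numeric activity data" warning branch.
            skipped.append((title, "non-numeric activity data %r" % raw))
            continue
        # The original string, not the parsed float, goes into the title.
        kept.append(("%s -- Activity: %s uM" % (title, raw), activity))
    return kept, skipped


records = [
    ("cmpd-1", {"activity": "0.25"}),
    ("cmpd-2", {}),
    ("cmpd-3", {"activity": "n/a"}),
]
kept, skipped = parse_activities(records)
```

Here only cmpd-1 is kept; cmpd-2 and cmpd-3 are reported as skipped, which corresponds to the two warning branches of ImportMolecules.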
The DepictMoleculesWithActivity function shows how to depict the molecules along with their activities. It takes the following parameters:
- report
An OEReport object which is a layout manager that allows generation of multi-page images.
- mollist
A list of (molecule, activity) tuples constructed by the ImportMolecules function above.
- coresubs
An OESubSearch object that is constructed from a SMARTS pattern that defines the common core substructure of the molecules.
- opts
An OE2DMolDisplayOptions object that defines the style of the molecule depiction.
First, the activity values are extracted from the list of (molecule, activity) tuples in order to find the minimum and maximum activity of the dataset (lines 3-5). These numbers are used to construct a color gradient (lines 7-10) that colors the molecules by their activity: green at the minimum, yellow at the midpoint, and red at the maximum. The molecules are then sorted in increasing order of their activity values (line 12) using itemgetter from the Python operator module. This is the order in which they are rendered.
Before rendering, a substructure search is performed to find the common core substructure of each molecule. The first match returned by the substructure search is used to align the molecule by this common core (line 20). Then the bonds of the common core are faded by calling the FadeCoreSubstructure function (line 27), while the bonds that are not in the common core are highlighted according to the activity of the molecule by calling the HighlightByActivity function (line 28). After rendering the molecules, the color gradient is drawn in the footer of each page (lines 35-48), along with a box that indicates the range of the activities of the molecules on that page. If all molecules on a page share the same activity, a single marked value is drawn instead of a box.
 1 def DepictMoleculesWithActivity(report, mollist, coresubs, opts):
 2
 3     activities = [activity for mol, activity in mollist]
 4     minactivity = min(activities)
 5     maxactivity = max(activities)
 6
 7     midvalue = (minactivity + maxactivity) / 2.0
 8     colorg = oechem.OELinearColorGradient(oechem.OEColorStop(midvalue, oechem.OEYellow))
 9     colorg.AddStop(oechem.OEColorStop(maxactivity, oechem.OERed))
10     colorg.AddStop(oechem.OEColorStop(minactivity, oechem.OEGreen))
11
12     sortedmollist = sorted(mollist, key=itemgetter(1))
13
14     for mol, activity in sortedmollist:
15
16         unique = True
17         match = None
18         for mi in coresubs.Match(mol, unique):
19             match = mi
20             oedepict.OEPrepareAlignedDepiction(mol, coresubs.GetPattern(), match)
21             break
22
23         cell = report.NewCell()
24         disp = oedepict.OE2DMolDisplay(mol, opts)
25
26         if match is not None:
27             FadeCoreSubstructure(disp, match)
28             HighlightByActivity(disp, match, activity, colorg)
29
30         oedepict.OERenderMolecule(cell, disp)
31
32     cellsperpage = report.NumRowsPerPage() * report.NumColsPerPage()
33     opts = oegrapheme.OEColorGradientDisplayOptions()
34
35     for pageidx, footer in enumerate(report.GetFooters()):
36
37         bgnidx = pageidx * cellsperpage
38         endidx = (pageidx + 1) * cellsperpage
39         pageactivities = [activity for mol, activity in sortedmollist[bgnidx:endidx]]
40         minpageactivities = min(pageactivities)
41         maxpageactivities = max(pageactivities)
42
43         opts.ClearMarkedValues()
44         if minpageactivities == maxpageactivities:
45             opts.AddMarkedValue(minpageactivities)
46         else:
47             opts.SetBoxRange(minpageactivities, maxpageactivities)
48         oegrapheme.OEDrawColorGradient(footer, colorg, opts)
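The per-page bookkeeping in the footer loop can be checked in isolation. The following is a minimal sketch in plain Python, assuming (name, activity) tuples in place of molecules; page_activity_ranges is a hypothetical helper, and its rows and cols arguments play the role of the values returned by NumRowsPerPage() and NumColsPerPage().

```python
from operator import itemgetter


def page_activity_ranges(mollist, rows, cols):
    """Return one (min, max) activity pair per report page.

    Hypothetical helper: mollist holds (name, activity) tuples that are
    laid out in increasing order of activity, rows * cols per page.
    """
    cellsperpage = rows * cols
    ordered = sorted(mollist, key=itemgetter(1))
    ranges = []
    for bgnidx in range(0, len(ordered), cellsperpage):
        endidx = bgnidx + cellsperpage
        page = [activity for _, activity in ordered[bgnidx:endidx]]
        ranges.append((min(page), max(page)))
    return ranges


mols = [("a", 5.0), ("b", 0.5), ("c", 12.0), ("d", 2.0), ("e", 7.5)]
ranges = page_activity_ranges(mols, rows=1, cols=2)
```

With two cells per page, the five molecules fill three pages. The last page holds a single molecule, so its minimum and maximum coincide; this is the case in which the listing above marks a single value instead of drawing a box.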
The FadeCoreSubstructure function shows how to “fade” the bonds detected to be part of the core structure by using the OEHighlightByColor highlighting style.
1 def FadeCoreSubstructure(disp, corematch):
2
3     bondpred = oechem.OEIsBondMember(corematch.GetTargetBonds())
4     lineWidthScale = 0.75
5     highlightstyle = oedepict.OEHighlightByColor(oechem.OEGrey, lineWidthScale)
6     oedepict.OEAddHighlighting(disp, highlightstyle, bondpred)
The HighlightByActivity function shows how to color the bonds that are not part of the core structure according to the activity of the molecule, using the OEHighlightByStick highlighting style.
1 def HighlightByActivity(disp, corematch, activity, colorg):
2
3     bondpred = oechem.OENotBond(oechem.OEIsBondMember(corematch.GetTargetBonds()))
4     color = colorg.GetColorAt(activity)
5     highlightstyle = oedepict.OEHighlightByStick(color)
6     oedepict.OEAddHighlighting(disp, highlightstyle, bondpred)
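To illustrate how an activity value maps to a color on a three-stop linear gradient, the following is a minimal sketch in plain Python. It is not the OELinearColorGradient implementation: color_at and lerp_color are hypothetical helpers, the RGB tuples are chosen for illustration and may differ from the OEChem TK color constants, and the clamping of out-of-range values is an assumption of this sketch.

```python
def lerp_color(c0, c1, t):
    """Linearly interpolate between two RGB tuples, with t in [0, 1]."""
    return tuple(round(a + (b - a) * t) for a, b in zip(c0, c1))


def color_at(stops, value):
    """Return the RGB color for value from a list of (value, rgb) stops.

    Values outside the stop range are clamped to the nearest end stop
    (an assumption of this sketch, not a statement about OEChem TK).
    """
    stops = sorted(stops)
    if value <= stops[0][0]:
        return stops[0][1]
    if value >= stops[-1][0]:
        return stops[-1][1]
    # Find the pair of neighboring stops that brackets the value.
    for (v0, c0), (v1, c1) in zip(stops, stops[1:]):
        if v0 <= value <= v1:
            return lerp_color(c0, c1, (value - v0) / (v1 - v0))


# Illustrative RGB values for the three stops used in the recipe.
GREEN, YELLOW, RED = (0, 255, 0), (255, 255, 0), (255, 0, 0)
minactivity, maxactivity = 0.0, 10.0
midvalue = (minactivity + maxactivity) / 2.0
stops = [(minactivity, GREEN), (midvalue, YELLOW), (maxactivity, RED)]
```

In this sketch, calling color_at(stops, midvalue) returns the yellow stop, and values between two stops receive a blend of the neighboring stop colors, which is the behavior the recipe relies on when it colors the non-core bonds.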
Download code
activity2pdf.py and supporting data set activity.sdf
Usage:
prompt > python3 activity2pdf.py -in activity.sdf -out activity.pdf -core "c1ccc2cc(ccc2c1)N"
See also in OEChem TK manual
Theory
SD Tagged Data Manipulation section
API
OEGetSDData function
OEHasSDData function
OELinearColorGradient class
OEMatch class
OESubSearch class
See also in OEDepict TK manual
Theory
Highlighting chapter
Multi Page Reports section
API
OE2DMolDisplay class
OE2DMolDisplayOptions class
OEAddHighlighting function
OEHighlightByColor class
OEHighlightByStick class
OEPrepareAlignedDepiction function
OERenderMolecule function
OEReport class
See also in Grapheme™ TK manual
API
OEDrawColorGradient function