Depicting Activities of Molecules
You want to organize and depict a set of molecules according to given
activity data and common substructure.
First, import the molecules to be depicted together with their activity information.
In this example, the ImportMolecules
function reads the molecules and extracts the activity information attached
to each molecule as SD data under the “activity” tag (line 15).
(See the
activity.sdf input file for an example.)
After the SD string is converted to a floating-point number, the activity is added to
the title of the molecule (line 18) so that it can be depicted later along with the molecule.
Each imported molecule, along with its corresponding activity, is inserted into
a list of (molecule, activity) tuples (line 19). Molecules with missing or non-numeric
activity data are skipped with a warning.
 1  def ImportMolecules(ifs, mollist):
 2
 3      if ifs.GetFormat() != oechem.OEFormat_SDF:
 4          oechem.OEThrow.Fatal("The input file has to be an SDF file")
 5
 6      tag = "activity"
 7
 8      for mol in ifs.GetOEGraphMols():
 9
10          id = mol.GetTitle()
11          if not oechem.OEHasSDData(mol, tag):
12              oechem.OEThrow.Warning("No activity data found for molecule '%s'. "
13                                     "Compound will be ignored." % id)
14          else:
15              activitystr = oechem.OEGetSDData(mol, tag)
16              try:
17                  activity = float(activitystr)
18                  mol.SetTitle("%s -- Activity: %s uM" % (id, activitystr))
19                  mollist.append((oechem.OEGraphMol(mol), activity))
20              except ValueError:
21                  oechem.OEThrow.Warning("Non-numeric activity data '%s' found for molecule '%s'. "
22                                         "Compound will be ignored." % (activitystr, id))
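To see the filtering logic in isolation, the following minimal sketch mirrors the same steps using plain dictionaries instead of OEChem molecules. The record layout (the `title` and `sd` keys) and the function name `collect_activities` are assumptions made for illustration only; they are not part of any OpenEye API.

```python
def collect_activities(records, tag="activity"):
    """Return (title, activity) tuples, skipping records without usable data.

    `records` is a hypothetical stand-in for molecules read from an SD file:
    each record is a dict with a "title" and an "sd" dict of tagged data.
    """
    result = []
    for rec in records:
        raw = rec.get("sd", {}).get(tag)
        if raw is None:
            # No activity tag at all: ignore the compound
            continue
        try:
            activity = float(raw)
        except ValueError:
            # Tag present but not numeric: ignore the compound
            continue
        # Append the activity to the title so it can be shown in the depiction
        title = "%s -- Activity: %s uM" % (rec["title"], raw)
        result.append((title, activity))
    return result


records = [
    {"title": "cmpd-1", "sd": {"activity": "0.25"}},
    {"title": "cmpd-2", "sd": {}},
    {"title": "cmpd-3", "sd": {"activity": "n/a"}},
]
pairs = collect_activities(records)
```

Only the first record survives here, because the second has no activity tag and the third carries a non-numeric value.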
The DepictMoleculesWithActivity function shows how to depict the molecules along with their activities. It takes the following parameters:
An OEReport object, a layout manager that allows generation of multi-page images.
A list of (molecule, activity) tuples constructed by the ImportMolecules function above.
An OESubSearch object constructed from a SMARTS pattern that defines the common core substructure of the molecules.
An OE2DMolDisplayOptions object that defines the style of the molecule depiction.
First, the activity values are extracted from the list of (molecule, activity) tuples to find the minimum and maximum activity of the dataset (lines 3-5). These numbers are used to construct a color gradient (lines 7-10) that runs from green at the minimum through yellow at the midpoint to red at the maximum, and that is used to color the molecules by their activity. The molecules are then sorted in increasing order of their activity values (line 12). This is the order in which they are rendered.
Before rendering, a substructure search is performed to find the common core substructure in each molecule. The first match returned by the search is used to align the molecule by this common core (line 20). The bonds of the common core are then faded to grey by calling the FadeCoreSubstructure function (line 27), while the bonds outside the common core are highlighted with the color that corresponds to the molecule's activity by calling the HighlightByActivity function (line 28). After the molecules are rendered, the color gradient is drawn in the footer of each page (lines 35-48), along with a box that indicates the range of activities of the molecules on that page.
 1  def DepictMoleculesWithActivity(report, mollist, coresubs, opts):
 2
 3      activities = [activity for mol, activity in mollist]
 4      minactivity = min(activities)
 5      maxactivity = max(activities)
 6
 7      midvalue = (minactivity + maxactivity) / 2.0
 8      colorg = oechem.OELinearColorGradient(oechem.OEColorStop(midvalue, oechem.OEYellow))
 9      colorg.AddStop(oechem.OEColorStop(maxactivity, oechem.OERed))
10      colorg.AddStop(oechem.OEColorStop(minactivity, oechem.OEGreen))
11
12      sortedmollist = sorted(mollist, key=itemgetter(1))
13
14      for mol, activity in sortedmollist:
15
16          unique = True
17          match = None
18          for mi in coresubs.Match(mol, unique):
19              match = mi
20              oedepict.OEPrepareAlignedDepiction(mol, coresubs.GetPattern(), match)
21              break
22
23          cell = report.NewCell()
24          disp = oedepict.OE2DMolDisplay(mol, opts)
25
26          if match is not None:
27              FadeCoreSubstructure(disp, match)
28              HighlightByActivity(disp, match, activity, colorg)
29
30          oedepict.OERenderMolecule(cell, disp)
31
32      cellsperpage = report.NumRowsPerPage() * report.NumColsPerPage()
33      opts = oegrapheme.OEColorGradientDisplayOptions()
34
35      for pageidx, footer in enumerate(report.GetFooters()):
36
37          bgnidx = pageidx * cellsperpage
38          endidx = (pageidx + 1) * cellsperpage
39          pageactivities = [activity for mol, activity in sortedmollist[bgnidx:endidx]]
40          minpageactivities = min(pageactivities)
41          maxpageactivities = max(pageactivities)
42
43          opts.ClearMarkedValues()
44          if minpageactivities == maxpageactivities:
45              opts.AddMarkedValue(minpageactivities)
46          else:
47              opts.SetBoxRange(minpageactivities, maxpageactivities)
48          oegrapheme.OEDrawColorGradient(footer, colorg, opts)
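The per-page bookkeeping in the footer loop can be tried out without the toolkit. The sketch below is a plain-Python illustration, not part of the original example: the helper name `page_ranges` and the sample data are hypothetical. It sorts (name, activity) pairs by activity, slices them into pages of a fixed number of cells, and reports the minimum and maximum activity on each page.

```python
from operator import itemgetter


def page_ranges(pairs, cells_per_page):
    """Return one (min, max) activity tuple per page of the sorted list."""
    # Same ordering as in the depiction function: ascending by activity
    ordered = sorted(pairs, key=itemgetter(1))
    ranges = []
    for bgn in range(0, len(ordered), cells_per_page):
        values = [activity for _, activity in ordered[bgn:bgn + cells_per_page]]
        ranges.append((min(values), max(values)))
    return ranges


pairs = [("a", 5.0), ("b", 0.5), ("c", 12.0), ("d", 2.0), ("e", 7.5)]
ranges = page_ranges(pairs, cells_per_page=2)
```

With five entries and two cells per page, the last page holds a single molecule, so its minimum and maximum coincide. That is the case in which the function above marks a single value on the gradient instead of drawing a box.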
    def FadeCoreSubstructure(disp, corematch):

        # Select the bonds that belong to the matched common core
        bondpred = oechem.OEIsBondMember(corematch.GetTargetBonds())
        # Draw the core bonds in grey with a thinner line
        lineWidthScale = 0.75
        highlightstyle = oedepict.OEHighlightByColor(oechem.OEGrey, lineWidthScale)
        oedepict.OEAddHighlighting(disp, highlightstyle, bondpred)
    def HighlightByActivity(disp, corematch, activity, colorg):

        # Select the bonds that are not part of the matched common core
        bondpred = oechem.OENotBond(oechem.OEIsBondMember(corematch.GetTargetBonds()))
        # Look up the gradient color that corresponds to this activity
        color = colorg.GetColorAt(activity)
        highlightstyle = oedepict.OEHighlightByStick(color)
        oedepict.OEAddHighlighting(disp, highlightstyle, bondpred)
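As a rough picture of what a three-stop linear gradient does when it maps an activity to a color, the sketch below interpolates RGB values between green, yellow and red stops. It is a simplified illustration under stated assumptions: the function name `gradient_color`, the RGB triples and the rounding are chosen here for the example, and the toolkit's own gradient class may compute colors differently.

```python
def gradient_color(value, stops):
    """Linearly interpolate an RGB color between (value, (r, g, b)) stops."""
    stops = sorted(stops)
    # Clamp values that fall outside the range covered by the stops
    if value <= stops[0][0]:
        return stops[0][1]
    if value >= stops[-1][0]:
        return stops[-1][1]
    # Find the pair of neighboring stops that brackets the value
    for (v0, c0), (v1, c1) in zip(stops, stops[1:]):
        if v0 <= value <= v1:
            t = (value - v0) / (v1 - v0)
            return tuple(round(a + t * (b - a)) for a, b in zip(c0, c1))


# Assumed RGB values for the three colors used in the example
GREEN, YELLOW, RED = (0, 255, 0), (255, 255, 0), (255, 0, 0)

minactivity, maxactivity = 1.0, 9.0
midvalue = (minactivity + maxactivity) / 2.0
stops = [(minactivity, GREEN), (midvalue, YELLOW), (maxactivity, RED)]

color = gradient_color(7.0, stops)
```

Activities at the minimum, midpoint and maximum map exactly to green, yellow and red; values in between receive a blend of the two neighboring stop colors.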
prompt > python3 activity2pdf.py -in activity.sdf -out activity.pdf -core "c1ccc2cc(ccc2c1)N"
See also in OEChem TK manual
SD Tagged Data Manipulation section
See also in OEDepict TK manual
Multi Page Reports section