Depicting CSV or SDF in PDF
Problem
You want to depict molecules along with their associated data read from a
CSV file in a multi-page PDF file.
See the example in drugs.pdf and in Table 1.
Table 1. Page 1 and page 2 of drugs.pdf (images not reproduced here).
Ingredients
- OEChem TK 
- OEDepict TK 
Solution
The CSV file format is a text file format containing comma-separated values. In OEChem TK this file format is implemented to enable data exchange with a wide variety of other software. Each line of a CSV file stores data for a molecule that is represented by a SMILES string.
See also
- CSV File Format section of the OEChem TK documentation about the layout of the CSV file format. 
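To make the layout concrete, the sketch below parses a small CSV snippet with Python's standard csv module rather than with OEChem TK. The column names (SMILES, TITLE, MW), the sample rows, and the read_records helper are assumptions chosen only for illustration; the exact layout that OEChem TK expects is described in the CSV File Format section.

```python
import csv
import io

# Hypothetical CSV content: one column holds the SMILES string,
# the remaining columns hold data for that molecule.
CSV_TEXT = """SMILES,TITLE,MW
CC(=O)Oc1ccccc1C(=O)O,aspirin,180.16
Cn1cnc2c1c(=O)n(C)c(=O)n2C,caffeine,194.19
"""


def read_records(text):
    """Return one (smiles, data) pair per CSV line; data maps tag -> value."""
    records = []
    for row in csv.DictReader(io.StringIO(text)):
        smiles = row.pop("SMILES")
        records.append((smiles, dict(row)))
    return records


records = read_records(CSV_TEXT)
for smiles, data in records:
    print(smiles, data)
```

Each remaining column plays the role that an SD data (tag - value) pair plays once OEChem TK has read the file.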
When reading a CSV file, the fields of the file are attached to each molecule as SD data. This data can be accessed by the OEGetSDDataIter function that returns an iterator over all the SD data (tag - value) pairs of a molecule. The CollectDataTags function iterates over a list of molecules and returns the unique tags of the data attached to the molecules.
from openeye import oechem


def CollectDataTags(mollist):
    # collect the unique SD data tags in order of first appearance

    tags = []
    for mol in mollist:
        for dp in oechem.OEGetSDDataIter(mol):
            if dp.GetTag() not in tags:
                tags.append(dp.GetTag())

    return tags
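CollectDataTags tests membership against a list, which is fine for a handful of tags. As an illustration of the same idea without OEChem TK, the sketch below collects unique tags with a dict, which preserves insertion order in Python 3.7 and later and gives constant-time lookups. The records are plain lists of (tag, value) tuples standing in for what OEGetSDDataIter yields; the function name collect_unique_tags and the sample values are hypothetical.

```python
def collect_unique_tags(records):
    """Return the unique tags over all records, in order of first appearance."""
    seen = {}
    for pairs in records:
        for tag, _value in pairs:
            # setdefault leaves the position of an already seen tag unchanged
            seen.setdefault(tag, None)
    return list(seen)


# Stand-in data: each inner list mimics the SD data of one molecule.
records = [
    [("TITLE", "aspirin"), ("MW", "180.16")],
    [("TITLE", "caffeine"), ("LogP", "-0.07")],
]
tags = collect_unique_tags(records)
print(tags)
```

Keeping the order of first appearance means the data tables show the tags in the same order as the columns of the input file.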
The DepictMoleculesWithData function takes a list of molecules read from a CSV file along with the data tags returned by the CollectDataTags function. Each molecule and its corresponding data are rendered into adjacent cells of an OEReport object (lines 3-16). The OEReport class is a layout manager that makes it convenient to generate multi-page images. After the molecules are rendered, the input filename is drawn into the page headers (lines 18-25) and the page number is drawn into the footer of each page (lines 27-34).
 1def DepictMoleculesWithData(report, mollist, iname, tags, opts):
 2
 3    for mol in mollist:
 4
 5        # render molecule
 6
 7        cell = report.NewCell()
 8        oedepict.OEPrepareDepiction(mol)
 9        disp = oedepict.OE2DMolDisplay(mol, opts)
10        oedepict.OERenderMolecule(cell, disp)
11        oedepict.OEDrawCurvedBorder(cell, oedepict.OELightGreyPen, 10.0)
12
13        # render corresponding data
14
15        cell = report.NewCell()
16        RenderData(cell, mol, tags)
17
18    # add input filename to headers
19
20    headerfont = oedepict.OEFont(oedepict.OEFontFamily_Default, oedepict.OEFontStyle_Default,
21                                 12, oedepict.OEAlignment_Center, oechem.OEBlack)
22    headerpos = oedepict.OE2DPoint(report.GetHeaderWidth() / 2.0, report.GetHeaderHeight() / 2.0)
23
24    for header in report.GetHeaders():
25        header.DrawText(headerpos, iname, headerfont)
26
27    # add page number to footers
28
29    footerfont = oedepict.OEFont(oedepict.OEFontFamily_Default, oedepict.OEFontStyle_Default,
30                                 12, oedepict.OEAlignment_Center, oechem.OEBlack)
31    footerpos = oedepict.OE2DPoint(report.GetFooterWidth() / 2.0, report.GetFooterHeight() / 2.0)
32
33    for pageidx, footer in enumerate(report.GetFooters()):
34        footer.DrawText(footerpos, "- %d -" % (pageidx + 1), footerfont)
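Because every molecule occupies two cells (one for the depiction, one for the data table), the number of pages follows directly from the grid size of the report. The sketch below works through that arithmetic in plain Python. It assumes that the report fills its cells in order and opens a new page once the current one is full; the 4 x 2 grid and both helper names are hypothetical.

```python
def count_pages(nrmols, rows, cols):
    """Pages needed when each molecule takes two cells (depiction + data)."""
    cells_per_page = rows * cols
    cells_needed = 2 * nrmols
    # integer ceiling division
    return (cells_needed + cells_per_page - 1) // cells_per_page


def footer_labels(nrpages):
    """Footer text per page, formatted as in DepictMoleculesWithData."""
    return ["- %d -" % (pageidx + 1) for pageidx in range(nrpages)]


# Assumed layout: 4 rows x 2 columns per page, 10 molecules in the input.
nrpages = count_pages(10, rows=4, cols=2)
print(nrpages)
print(footer_labels(nrpages))
```

With an even number of columns, each row holds complete molecule and data pairs, so a depiction is never separated from its table by a row or page break.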
The RenderData function shows how easy it is to render the (tag - value) tuples using the OEImageTable class.
def RenderData(image, mol, tags):

    # collect a value for every tag, falling back to "N/A" when missing

    data = []
    for tag in tags:
        value = "N/A"
        if oechem.OEHasSDData(mol, tag):
            value = oechem.OEGetSDData(mol, tag)
        data.append((tag, value))

    nrdata = len(data)

    # two-column table: stub column for the tags, body column for the values

    tableopts = oedepict.OEImageTableOptions(nrdata, 2, oedepict.OEImageTableStyle_LightBlue)
    tableopts.SetColumnWidths([10, 20])
    tableopts.SetMargins(2.0)
    tableopts.SetHeader(False)
    tableopts.SetStubColumn(True)
    table = oedepict.OEImageTable(image, tableopts)

    # rows and columns are indexed from 1

    for row, (tag, value) in enumerate(data):
        cell = table.GetCell(row + 1, 1)
        table.DrawText(cell, tag + ":")
        cell = table.GetBodyCell(row + 1, 1)
        table.DrawText(cell, value)
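The first half of RenderData can be tried out without the toolkits. In the sketch below, a plain dict stands in for the SD data of one molecule; the helper name build_rows and the sample values are hypothetical.

```python
def build_rows(data, tags, missing="N/A"):
    """Return one (label, value) row per tag, using a placeholder for gaps."""
    return [(tag + ":", data.get(tag, missing)) for tag in tags]


# The tag list is the union over all molecules, so a single molecule
# may lack some of the tags.
tags = ["TITLE", "MW", "LogP"]
rows = build_rows({"TITLE": "aspirin", "MW": "180.16"}, tags)
for label, value in rows:
    print(label, value)
```

Filling missing values with a placeholder keeps the number and order of rows identical in every table, which makes the pages easier to scan.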
Download code
csv2pdf.py and the supporting data file drugs.csv
Usage
Running the command below generates the multi-page PDF file drugs.pdf.
prompt > python3 csv2pdf.py drugs.csv drugs.pdf
Discussion
Because the columns of a CSV file are read into SD data fields,
OEChem TK provides metadata interchange between SDF files and
CSV files.
Consequently, the same Python script can generate a PDF file
from an SDF file.
See also in OEChem TK manual
Theory
- SD Tagged Data Manipulation section 
- CSV File Format section 
API
- OEGetSDDataIter function 
See also in OEDepict TK manual
Theory
- Molecule Depiction chapter 
- Molecule Layout chapter 
API
- OE2DMolDisplay class 
- OE2DMolDisplayOptions class 
- OEDrawCurvedBorder function 
- OEFont class 
- OEImage class 
- OEImageTable class 
- OEImageTableOptions class 
- OEPrepareDepiction function 
- OERenderMolecule function 
- OEReport class