Creating apo OEDesignUnits from a PDB file

Preparation of a biological structure file (PDB, mmCIF) to a fully charged, hydrogenated, molecular componentized object (design unit; DU; OEDesignUnit) is one of the more advanced functionalities offered through Spruce TK. This example shows how to construct apo DUs using an input PDB file and the OEMakeDesignUnits function.

Command Line Interface

This example uses an input PDB file, and will output a set of DUs from it to a temporary directory (see OEMakeDesignUnits for details on the API).

make_apo_design_units <input biomolecular PDB> <site_residue> [<electron density mtz>] <LoopModelingTemplateDB>]

Code

Download code

make_apo_design_units.py and both the 1w50.pdb (the input PDB file), 1w50.mtz (the input MTZ file), and spruce_bace.loop_db (the input loopDB file)

#!/usr/bin/env python
# (C) 2022 Cadence Design Systems, Inc. (Cadence) 
# All rights reserved.
# TERMS FOR USE OF SAMPLE CODE The software below ("Sample Code") is
# provided to current licensees or subscribers of Cadence products or
# SaaS offerings (each a "Customer").
# Customer is hereby permitted to use, copy, and modify the Sample Code,
# subject to these terms. Cadence claims no rights to Customer's
# modifications. Modification of Sample Code is at Customer's sole and
# exclusive risk. Sample Code may require Customer to have a then
# current license or subscription to the applicable Cadence offering.
# THE SAMPLE CODE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
# EXPRESS OR IMPLIED.  OPENEYE DISCLAIMS ALL WARRANTIES, INCLUDING, BUT
# NOT LIMITED TO, WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
# PARTICULAR PURPOSE AND NONINFRINGEMENT. In no event shall Cadence be
# liable for any damages or liability in connection with the Sample Code
# or its use.

#############################################################################
# Script to prepare proteins into design units
#############################################################################
import sys
import os
from openeye import oechem
from openeye import oegrid
from openeye import oespruce


def main(argv=sys.argv):

    if len(argv) < 3 or len(argv) > 5:
        oechem.OEThrow.Usage(
            "%s <infile> <site_residue> [<mtzfile>] [<loopdbfile>]" % argv[0]
        )

    ifs = oechem.oemolistream()
    ifile = argv[1]
    if not ifs.open(ifile):
        oechem.OEThrow.Fatal("Unable to open %s for reading" % ifile)

    site_residue = argv[2]

    include_loop = False
    include_ed = False
    if len(argv) > 3:
        if len(argv) == 5 or (len(argv) == 4 and "mtz" in argv[3]):
            edfile = argv[3]
            ed = oegrid.OESkewGrid()
            if not oegrid.OEReadMTZ(edfile, ed, oegrid.OEMTZMapType_Fwt):
                oechem.OEThrow.Fatal(
                    "Unable to read electron density file %s" % edfile
                )  # noqa
            include_ed = True
        if len(argv) == 5:
            loopfile = argv[4]
            include_loop = True
        elif len(argv) == 4 and "mtz" not in argv[3]:
            loopfile = argv[3]
            include_loop = True

    if ifs.GetFormat() not in [oechem.OEFormat_PDB, oechem.OEFormat_CIF]:
        oechem.OEThrow.Fatal("Only works for .pdb or .cif input files")

    ifs.SetFlavor(oechem.OEFormat_PDB, oechem.OEIFlavor_PDB_SpruceDefault)
    ifs.SetFlavor(oechem.OEFormat_MMCIF, oechem.OEIFlavor_MMCIF_SpruceDefault)

    mol = oechem.OEGraphMol()
    if not oechem.OEReadMolecule(ifs, mol):
        oechem.OEThrow.Fatal("Unable to read molecule from %s" % ifile)

    metadata = oespruce.OEStructureMetadata()
    opts = oespruce.OEMakeDesignUnitOptions()
    opts.GetPrepOptions().GetBuildOptions().GetLoopBuilderOptions().SetBuildTails(False)
    if include_loop:
        opts.GetPrepOptions().GetBuildOptions().GetLoopBuilderOptions().SetLoopDBFilename(
            loopfile
        )

    if include_ed:
        design_units = oespruce.OEMakeDesignUnits(mol, ed, metadata, opts, site_residue)
    else:
        design_units = oespruce.OEMakeDesignUnits(mol, metadata, opts, site_residue)

    base_name = os.path.basename(ifile)[:-4] + "_DU_{}.oedu"
    for i, design_unit in enumerate(design_units):
        oechem.OEWriteDesignUnit(base_name.format(i), design_unit)


if __name__ == "__main__":
    sys.exit(main(sys.argv))

Example

make_apo_design_units.py 1w50.pdb "ASP:228: :A" 1w50.mtz  spruce_bace.loop_db

will generate the following output (failing to add partial charges to iodine):

DPI: 0.13, RFree: 0.28, Resolution: 1.75
Processing BU # 0 with title: BETA-SECRETASE 1, chains A
Found gap between ALA 157   A 1   and ALA 168   A 1  , with sequence GFPLNQSEVL
Opened database spruce_bace.loop_db
LoopDatabase Info:
    276 loops from RSCB last synced on 03-19-2020, were added to LoopTemplateDatabase on 03-19-2020 using Spruce Toolkit 1.0.0.a
    The loop database was built with a max loop length of 22, a termini crop length of 2, and excluding regular secondary structures
Warning: Failed to charge miscellaneous components
Warning: Unable to add partial charges and radii to DesignUnit: BETA-SECRETASE 1(A)__DU__biounit
Iridium Category: NA, DPI: 0.13, RFree: 0.28, Resolution: 1.75

See also