Creating apo OEDesignUnits from a PDB file
Preparation of a biological structure file (PDB, mmCIF) to a fully charged,
hydrogenated, molecular componentized object (design unit; DU; OEDesignUnit)
is one of the more advanced functionalities offered through Spruce TK.
This example shows how to construct apo DUs using an input PDB file and the
OEMakeDesignUnits function.
Command Line Interface
This example uses an input PDB file, and will output a set of DUs from it to a temporary
directory (see OEMakeDesignUnits for details on the API).
make_apo_design_units <input biomolecular PDB> <site_residue> [<electron density mtz>] <LoopModelingTemplateDB>]
Code
Download code
make_apo_design_units.py
and both the
1w50.pdb
(the input PDB file),
1w50.mtz
(the input MTZ file), and
spruce_bace.loop_db
(the input loopDB file)
#!/usr/bin/env python
# (C) 2022 Cadence Design Systems, Inc. (Cadence)
# All rights reserved.
# TERMS FOR USE OF SAMPLE CODE The software below ("Sample Code") is
# provided to current licensees or subscribers of Cadence products or
# SaaS offerings (each a "Customer").
# Customer is hereby permitted to use, copy, and modify the Sample Code,
# subject to these terms. Cadence claims no rights to Customer's
# modifications. Modification of Sample Code is at Customer's sole and
# exclusive risk. Sample Code may require Customer to have a then
# current license or subscription to the applicable Cadence offering.
# THE SAMPLE CODE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
# EXPRESS OR IMPLIED. OPENEYE DISCLAIMS ALL WARRANTIES, INCLUDING, BUT
# NOT LIMITED TO, WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
# PARTICULAR PURPOSE AND NONINFRINGEMENT. In no event shall Cadence be
# liable for any damages or liability in connection with the Sample Code
# or its use.
#############################################################################
# Script to prepare proteins into design units
#############################################################################
import sys
import os
from openeye import oechem
from openeye import oegrid
from openeye import oespruce
def main(argv=sys.argv):
if len(argv) < 3 or len(argv) > 5:
oechem.OEThrow.Usage(
"%s <infile> <site_residue> [<mtzfile>] [<loopdbfile>]" % argv[0]
)
ifs = oechem.oemolistream()
ifile = argv[1]
if not ifs.open(ifile):
oechem.OEThrow.Fatal("Unable to open %s for reading" % ifile)
site_residue = argv[2]
include_loop = False
include_ed = False
if len(argv) > 3:
if len(argv) == 5 or (len(argv) == 4 and "mtz" in argv[3]):
edfile = argv[3]
ed = oegrid.OESkewGrid()
if not oegrid.OEReadMTZ(edfile, ed, oegrid.OEMTZMapType_Fwt):
oechem.OEThrow.Fatal(
"Unable to read electron density file %s" % edfile
) # noqa
include_ed = True
if len(argv) == 5:
loopfile = argv[4]
include_loop = True
elif len(argv) == 4 and "mtz" not in argv[3]:
loopfile = argv[3]
include_loop = True
if ifs.GetFormat() not in [oechem.OEFormat_PDB, oechem.OEFormat_CIF]:
oechem.OEThrow.Fatal("Only works for .pdb or .cif input files")
ifs.SetFlavor(oechem.OEFormat_PDB, oechem.OEIFlavor_PDB_SpruceDefault)
ifs.SetFlavor(oechem.OEFormat_MMCIF, oechem.OEIFlavor_MMCIF_SpruceDefault)
mol = oechem.OEGraphMol()
if not oechem.OEReadMolecule(ifs, mol):
oechem.OEThrow.Fatal("Unable to read molecule from %s" % ifile)
metadata = oespruce.OEStructureMetadata()
opts = oespruce.OEMakeDesignUnitOptions()
opts.GetPrepOptions().GetBuildOptions().GetLoopBuilderOptions().SetBuildTails(False)
if include_loop:
opts.GetPrepOptions().GetBuildOptions().GetLoopBuilderOptions().SetLoopDBFilename(
loopfile
)
if include_ed:
design_units = oespruce.OEMakeDesignUnits(mol, ed, metadata, opts, site_residue)
else:
design_units = oespruce.OEMakeDesignUnits(mol, metadata, opts, site_residue)
base_name = os.path.basename(ifile)[:-4] + "_DU_{}.oedu"
for i, design_unit in enumerate(design_units):
oechem.OEWriteDesignUnit(base_name.format(i), design_unit)
if __name__ == "__main__":
sys.exit(main(sys.argv))
Example
make_apo_design_units.py 1w50.pdb "ASP:228: :A" 1w50.mtz spruce_bace.loop_db
will generate the following output (failing to add partial charges to iodine):
DPI: 0.13, RFree: 0.28, Resolution: 1.75
Processing BU # 0 with title: BETA-SECRETASE 1, chains A
Found gap between ALA 157 A 1 and ALA 168 A 1 , with sequence GFPLNQSEVL
Opened database spruce_bace.loop_db
LoopDatabase Info:
276 loops from RSCB last synced on 03-19-2020, were added to LoopTemplateDatabase on 03-19-2020 using Spruce Toolkit 1.0.0.a
The loop database was built with a max loop length of 22, a termini crop length of 2, and excluding regular secondary structures
Warning: Failed to charge miscellaneous components
Warning: Unable to add partial charges and radii to DesignUnit: BETA-SECRETASE 1(A)__DU__biounit
Iridium Category: NA, DPI: 0.13, RFree: 0.28, Resolution: 1.75
See also
OEMakeDesignUnitsfunctionOEReadDesignUnitandOEWriteDesignUnitfunctions in the OEChem TK manualOEDesignUnit class in the OEChem TK manual