SPRUCE - Import Prepared PDB Files
This floe uses Spruce to generate an OEDesignUnit object used by downstream OpenEye modelling applications in Orion, such as docking, posit, gameplan, or short-trajectory MD.
The required input for this floe is one or more PDB and/or MMCIF files that have already been prepared for modeling applications outside of Orion. Preparation steps required by downstream applications include enumerated or collapsed alternate locations, explicit and optimized hydrogen atoms, and properly treated terminal residues (either charged or capped).
If a ligand cannot be detected during the run, consider specifying the ligand residue name or increasing “max_residues”, which is available as an optional input to this floe. If this is a known apo structure, you can instead provide the definition of a residue in the binding site.
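As a rough illustration of why a ligand can be missed, the sketch below checks a candidate molecule against the size limits listed in the parameters (minimum atoms 8, maximum atoms 100, maximum residues 5 by default). It is not Spruce's detection logic: the class and function names are hypothetical, and treating the limits as inclusive is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LigandSizeLimits:
    """Hypothetical container; default values are copied from the floe's parameter list."""
    min_atoms: int = 8
    max_atoms: int = 100
    max_residues: int = 5


def within_ligand_limits(n_atoms: int, n_residues: int,
                         limits: LigandSizeLimits = LigandSizeLimits()) -> bool:
    """Return True if a molecule of this size falls inside the size limits."""
    # Assumption: the bounds are inclusive; the floe description does not say.
    return (limits.min_atoms <= n_atoms <= limits.max_atoms
            and n_residues <= limits.max_residues)


# Settings recommended in the parameter descriptions for peptides and fragments.
PEPTIDE_LIMITS = LigandSizeLimits(max_atoms=200, max_residues=20)
FRAGMENT_LIMITS = LigandSizeLimits(min_atoms=5)
```

Under these assumptions, a 12-residue peptide with 150 atoms fails the default limits but passes with the peptide settings, and a 6-atom fragment passes only with the fragment setting.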
You can read more about Spruce in the toolkit documentation.
Extra Required Parameters
Include density based depictions (boolean) : Include density based depictions. Default: True
Log Field (Field Type: String) : The field to store messages to the floe report. Default: Log Field
Input structure (PDB/MMCIF) files (file_in) :
Output Dataset (dataset_out) : Output dataset to write to. Default: Spruce_prep_dataset
Output Dataset (dataset_out) : Output dataset to write to. Default: Failed_Spruce_prep_dataset
Add interaction hints (boolean) : Option to add interactions to the design units. Default: True
Add style (boolean) : Option to add style to the design units. Default: True
Allow cap residue truncation (boolean) : Option to allow a terminal residue to be converted to a cap if the cap would otherwise clash. Default: True
Alternate location handling method (string) : Option to pick the method of handling alternate locations. Default: Default; Choices: Primary, Enumerate, Default
Loop backbone clash threshold (decimal) : Loops from the database where more than the threshold fraction of the backbone atoms clash are rejected. Default: 0.25
Build C-terminal caps (boolean) : Option to cap broken C-termini in protein chains. Default: True
Option to build disulfide bridges (boolean) : Allow the loop builder to build disulfide bridges during loop modeling (if possible). Default: True
Build missing loops (boolean) : Option to build missing loops (if information is available to do so). Default: True
Build N-terminal caps (boolean) : Option to cap broken N-termini in protein chains. Default: True
Build partial sidechains (boolean) : Option to build missing or partial protein sidechains. Default: True
Build missing tails (boolean) : Option to build missing tails (if information is available to do so). Default: False
Loop builder include crystal packing (boolean) : Include packing residues when building loops. Default: False
Assign charges and radii (boolean) : Option to assign partial charges and radii. Default: True
Collapse non-site alts (boolean) : Option to deduplicate structures with different alts, if the alt locations are not near the binding site. Default: True
Loop crop length (integer) : Anchor residues on the protein to crop back for a better fit; this results in longer loops being built. Default: 1
Delete clashing solvent (boolean) : Option to allow build steps to remove clashing solvent. Default: True
Duplicate removal (boolean) : Option to deduplicate identical structures resulting from symmetry operations. Default: True
Enumerate co-factor sites (boolean) : Option to generate individual design units based on the recognized co-factors. Default: False
Enumerate pockets (boolean) : Option to enumerate pockets when no ligand is found. Default: False
Fix backbone atom issues (boolean) : Option to fix backbone atom issues in protein chains. Default: True
Generate Tautomers (boolean) : Option to generate and use tautomers in the hydrogen network optimization. Default: True
Hetgroup cluster distance (decimal) : Distance between heterogens used to determine optimization clusters for protonation. Default: 3.5
Include SA term (boolean) : Include solvent accessible surface area term when ranking the loops. Default: True
Include solvation (boolean) : Include simple solvation model when building loops. Default: True
Include Binding Site Grids (boolean) : Include electron density and difference density maps around the binding site. Default: True
Loop clash threshold (decimal) : Loops from the database are rejected when more than the threshold fraction of the loop atoms, in addition to the clashing backbone atoms, clash. Default: 0.2
Loop anchor atom distance buffer (decimal) : Fuzzy matches in the loop database must have the correct distance between anchor atoms, +/- the buffer distance. Default: 1.0
Make packing residues (boolean) : Generate packing residues from an asymmetric unit. Default: True
Maximum atoms in biological unit (integer) : Option to limit the size of BUs processed based on number of atoms. Default: 50000
Maximum parts in biological unit (integer) : Option to limit the size of BUs processed based on number of parts (chains). Default: 24
Number of loops to minimize and evaluate (integer) : Maximum number of loops to connect and minimize. Default: 5
Max atoms for a ligand (integer) : Maximum number of atoms in a molecule to be detected as a ligand. For peptides we recommend 200. Default: 100
Max residues for a ligand (integer) : Maximum number of residues in a molecule to be detected as a ligand. For peptides we recommend 20. Default: 5
Max system atoms (integer) : Maximum number of atoms in the system. Default: 50000
Minimum alignment score for BU extraction (integer) : Option to specify minimum sequence alignment score for biounit extraction. Default: 200
Min atoms for a ligand (integer) : Minimum number of atoms in a molecule to be detected as a ligand. For fragments we recommend setting to 5. Default: 8
Optimize Experimental Protons (boolean) : Option to optimize hydrogens assigned in the experiment. Default: False
Loop optimization shell (decimal) : Include atoms within this distance in the loop optimization; a larger distance results in slower optimizations. Default: 15.0
Opt stage 1 step/residue multiplier (integer) : Number of steps per number of residues in the loop for the first stage optimizer. Default: 5
Opt stage 2 step/residue multiplier (integer) : Number of steps per number of residues in the loop for the second stage optimizer. Default: 10
Loop optimization tolerance (decimal) : Tolerance for the loop optimization; smaller numbers result in slower optimizations. Default: 0.001
Output BioDesignUnits (boolean) : Option to write intermediate work product bio design units. Default: False
Prefer author BIOMT records (boolean) : Option where the author BIOMT record is preferred over the software generated one. Default: True
Protonate (boolean) : Option to add and optimize protons in the system. Default: True
Restrict DUs to ref site removal (boolean) : Option to not generate design units with sites not matching the reference (if one is provided). Default: True
Rotamer Coverage % (decimal) : Coverage of the rotamers returned from the library in percent. Default: 100.0
Rotamer Library (string) : Rotamer library to use for side-chain building. Default: Richardson2016; Choices: Dunbrack, Richardson, Richardson2016
Size used to define binding site (decimal) : Distance used to determine the size of the site. Default: 5.0
Strict Ligand (boolean) : Option to only emit design units with ligands that match the ligand names (if any are provided). Default: True
Enforce proline positions in loop templates (boolean) : Fuzzy matches in the loop database have to have proline in exact locations of sequence. Default: True
Strict protonation mode (boolean) : Option to fail prep if protons could not be added. Default: False
Superpose design units (boolean) : Option to superpose DUs (if multiple), first onto the reference structure (if provided). Default: True
Superposition method (string) : Superposition method. Default: SiteSequence; Choices: GlobalSequence, SiteSequence, DDMatrix, SSE, SiteHopper
Target classification (string) : Option to pick whether target is protein or nucleic acid component. Default: Protein; Choices: Protein, Nucleic
Number to transform (integer) : Number of loops to allow through the sidechain clash checker. Regardless of this number, all loops with a sequence identical to the target are processed. Default: 25
Components to be part of the molecule (string) : Components to make part of the molecule. If set to ‘undefined’, will not be included in output. Default: [‘protein’]; Choices: protein, nucleic, ligand, solvent, metals, counter_ions, lipids, packing_residues, sugars, undefined, cofactors, excipients, polymers, post_translational, other_proteins, other_nucleics, other_ligands, other_cofactors
Discard liganded design units (boolean) : Option to discard liganded design units. Default: True
Generate surface (boolean) : Option to generate surface for pockets. Default: True
Local burial factor (decimal) : Option to set local burial factor. Default: 1.4
Max surface area (decimal) : Option to set maximum surface area for pocket finding. Default: 3000.0
Min surface area (decimal) : Option to set minimum surface area for pocket finding. Default: 150.0
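The “Loop backbone clash threshold” parameter is a fraction-based cutoff. The sketch below shows one way to read it, using the default of 0.25. It is an illustration only: the function names are hypothetical, the clash counts are assumed to come from some upstream check, and how Spruce actually counts clashing atoms is not described here.

```python
def backbone_clash_fraction(n_clashing: int, n_backbone: int) -> float:
    """Fraction of a candidate loop's backbone atoms that clash."""
    if n_backbone <= 0:
        raise ValueError("a loop must have at least one backbone atom")
    return n_clashing / n_backbone


def reject_loop(n_clashing: int, n_backbone: int, threshold: float = 0.25) -> bool:
    """Reject a database loop when its clash fraction exceeds the threshold."""
    # The description says "more than the threshold fraction", so the
    # comparison is strict: a loop exactly at the threshold is kept.
    return backbone_clash_fraction(n_clashing, n_backbone) > threshold
```

Under this reading, raising the threshold lets more database loops through to the later building and minimization stages, and lowering it filters candidates more aggressively.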