Spruce Prep
Spruce-prepped OEDesignUnits are generated from the input PDB/MTZ files on the input oechem.OERecord.
Main Parameters
Parameter Name |
---|
Add interaction hints |
Add style |
Allow cap residue truncation |
Alternate location handling method |
Loop backbone clash threshold |
Build C-terminal caps |
Option to build disulfide bridges |
Build missing loops |
Build N-terminal caps |
Build partial sidechains |
Build missing tails |
Loop builder include crystal packing |
Assign charges and radii |
Collapse non-site alts |
Loop crop length |
Delete clashing solvent |
Duplicate removal |
Enumerate co-factor sites |
Enumerate pockets |
Extended Log Field |
Fix backbone atom issues |
Generate Tautomers |
Hetgroup cluster distance |
Include SA term |
Include solvation |
Include Binding Site Grids |
Ligand Type |
Log Field |
Loop clash threshold |
Loop anchor atom distance buffer |
Make packing residues |
Maximum atoms in biological unit |
Maximum parts in biological unit |
Number of loops to minimize and evaluate |
Max system atoms |
Minimum alignment score for BU extraction |
Optimize Experimental Protons |
Loop optimization shell |
Opt stage 1 step/residue multiplier |
Opt stage 2 step/residue multiplier |
Loop optimization tolerance |
Output biological unit |
Prefer author BIOMT records |
Protonate |
Restrict DUs to ref site removal |
Rotamer Coverage % |
Rotamer Library |
Size used to define binding site |
Strict Ligand |
Enforce proline positions in loop templates |
Strict protonation mode |
Superpose design units |
Superposition method |
Target classification |
Number to transform |
Parameter Details
Calculation Parameters
- Add interaction hints (add_interactions) type: boolean: Option to add interactions to the design units. Default: True
- Add style (add_style) type: boolean: Option to add style to the design units. Default: True
- Allow cap residue truncation (allow_truncate) type: boolean: Option to allow a terminal residue to be converted to a cap if the cap would otherwise clash. Default: True
- Alternate location handling method (altloc) type: string: Option to pick the method of handling alternate locations. Default: Default. Choices: Primary, Enumerate, Default
- Loop backbone clash threshold (bb_clash_threshold) type: decimal: Loops from the database in which more than this fraction of the backbone atoms clash are rejected. Default: 0.25
- Build C-terminal caps (build_cterm_caps) type: boolean: Option to cap broken C-termini in protein chains. Default: True
- Option to build disulfide bridges (build_disulfidebridges) type: boolean: Allow the loop builder to build disulfide bridges during loop modeling (if possible). Default: True
- Build missing loops (build_loops) type: boolean: Option to build missing loops (if information is available to do so). Default: True
- Build N-terminal caps (build_nterm_caps) type: boolean: Option to cap broken N-termini in protein chains. Default: True
- Build partial sidechains (build_sidechains) type: boolean: Option to build missing or partial protein sidechains. Default: True
- Build missing tails (build_tails) type: boolean: Option to build missing tails (if information is available to do so). Default: False
- Loop builder include crystal packing (build_with_crystalpacking) type: boolean: Include packing residues when building loops. Default: False
- Assign charges and radii (charge_radii) type: boolean: Option to assign partial charges and radii. Default: True
- Add Cofactor code(s) (cofactor_codes) type: string: Add uncommon, or custom, cofactor 3-letter codes.
- Collapse non-site alts (collapse_nonsite_alts) type: boolean: Option to deduplicate structures with different alts, if the alt locations are not near the binding site. Default: True
- CPUs (cpu_count) type: integer: The number of CPUs to run this cube with. Default: 1, Min: 1, Max: 128
- Loop crop length (crop_length) type: integer: Anchor residues on the protein to crop back for a better fit; this results in longer loops being built. Default: 1
- Cube Metrics (cube_metrics) type: string: Set of metrics to be collected. Choices: cpu, disk, memory, network
- Delete clashing solvent (delete_clashing_solvent) type: boolean: Option to allow build steps to remove clashing solvent. Default: True
- Temporary Disk Space (MiB) (disk_space) type: decimal: The minimum amount of disk space in MiB (1048576 B) this cube requires. Due to overhead, request a couple hundred MiB more than required. Default: 5120.0, Min: 128.0, Max: 8589934592
- Duplicate removal (duplicate_removal) type: boolean: Option to deduplicate identical structures resulting from symmetry operations. Default: True
- Enumerate co-factor sites (enum_cofactors_sites) type: boolean: Option to generate individual design units based on the recognized co-factors. Default: False
- Enumerate pockets (enum_pocket) type: boolean: Option to enumerate pockets when no ligand is found. Default: False
- Add Excipient code(s) (excipient_codes) type: string: Add uncommon, or custom, excipient 3-letter codes.
- Fix backbone atom issues (fix_backbone) type: boolean: Option to fix backbone atom issues in protein chains. Default: True
- Generate Tautomers (generate_tautomers) type: boolean: Option to generate and use tautomers in the hydrogen network optimization. Default: True
- GPUs (gpu_count) type: integer: The number of GPUs to run this cube with. Default: 0, Max: 16
- Hetgroup cluster distance (het_group_nbr_dist) type: decimal: Distance between heterogens used to determine optimization clusters for protonation. Default: 3.5
- Include SA term (incl_SA_term) type: boolean: Include a solvent accessible surface area term when ranking the loops. Default: True
- Include solvation (incl_solvation) type: boolean: Include a simple solvation model when building loops. Default: True
- Include Binding Site Grids (include_bsite_edens_grids) type: boolean: Include electron density and difference density maps around the binding site. Default: True
- Instance Tags (instance_tags) type: string: Only run on machines with matching tags (comma separated). Default: “”
- Instance Type (instance_type) type: string: The type of instance that this cube needs to be run on.
- Ligand Type (lig_type) type: string: The type of ligand that is expected for the system. Affects the max/min atom counts and the max residue count (if applicable) for the ligand in the system. Overrides can be individually input. Defaults are as follows: Small Molecule: min_atoms=8, max_atoms=100, max_residues=5; Peptide: min_atoms=8, max_atoms=200, max_residues=2; Macrocycle: min_atoms=8, max_atoms=250, max_residues=20; Fragment: min_atoms=2, max_atoms=35, max_residues=5. Default: Small Molecule. Choices: Small Molecule, Peptide, Macrocycle, Fragment
- Add Ligand Smiles (ligand_metadata) type: string: Add ligand smiles and 3-letter codes, e.g. ‘c1ccccc1 BNZ’.
- Ligand name(s) (ligand_names) type: string: Format 3-letter codes, e.g. ‘LIG’; for peptides, separate codes with dashes (e.g. ‘SER-VAL-TPO-ALA’).
- Add Lipid codes(s) (lipid_codes) type: string: Add uncommon, or custom, lipid 3-letter codes.
- Loop clash threshold (loop_clash_threshold) type: decimal: Loops from the database in which more than this fraction of the loop atoms clash, in addition to the clashing backbone atoms, are rejected. Default: 0.2
- Loop anchor atom distance buffer (loop_distance_buffer) type: decimal: Fuzzy matches in the loop database must have the correct distance between anchor atoms, +/- the buffer distance. Default: 1.0
- A template loop database file (loop_input_file) type: file_in: (Optional) A template loop database file; if not specified, the built-in database will be used.
- Make packing residues (make_pack_res) type: boolean: Generate packing residues from an asymmetric unit. Default: True
- Maximum atoms in biological unit (max_bu_atoms) type: integer: Option to limit the size of BUs processed based on number of atoms. Default: 50000
- Maximum parts in biological unit (max_bu_parts) type: integer: Option to limit the size of BUs processed based on number of parts (chains). Default: 24
- Number of loops to minimize and evaluate (max_eval_loops) type: integer: Maximum number of loops to connect and minimize. Default: 5
- Max atoms for a ligand (max_lig_atoms) type: integer: Override for the maximum number of heavy atoms in a molecule to be detected as a ligand.
- Max residues for a ligand (max_lig_residues) type: integer: Override for the maximum number of residues in a molecule to be detected as a ligand.
- Max system atoms (max_system_atoms) type: integer: Maximum number of atoms in the system. Default: 50000
- Memory (MiB) (memory_mb) type: decimal: The minimum amount of memory in MiB (1048576 B) this cube requires. Due to overhead, request a couple hundred MiB more than required. Default: 1800, Min: 256.0, Max: 8589934592
- Metric Period (metric_period) type: decimal: How often to sample metrics, in seconds. Default: 60. Choices: 1, 5, 10, 30, 60, 120, 180, 240, 300. Min: 1, Max: 300
- Minimum alignment score for BU extraction (min_align_score) type: integer: Option to specify the minimum sequence alignment score for biounit extraction. Default: 200
- Min atoms for a ligand (min_lig_atoms) type: integer: Override for the minimum number of heavy atoms in a molecule to be detected as a ligand.
- Optimize Experimental Protons (opt_expt_protons) type: boolean: Option to optimize hydrogens assigned in the experiment. Default: False
- Loop optimization shell (opt_shell) type: decimal: Include atoms within this distance in the loop optimization; a larger distance results in slower optimizations. Default: 15.0
- Opt stage 1 step/residue multiplier (opt_stage1_iter_multiplier) type: integer: Number of steps per number of residues in the loop for the first stage optimizer. Default: 5
- Opt stage 2 step/residue multiplier (opt_stage2_iter_multiplier) type: integer: Number of steps per number of residues in the loop for the second stage optimizer. Default: 10
- Loop optimization tolerance (opt_tolerance) type: decimal: Tolerance for the loop optimization; smaller numbers result in slower optimizations. Default: 0.001
- Output biological unit (output_bio_designunits) type: boolean: Option to write biological design units. These are intermediaries and should not be used for other applications. Default: False
- Prefer author BIOMT records (pref_author_record) type: boolean: Option where the author BIOMT record is preferred over the software-generated one. Default: True
- Protonate (protonate) type: boolean: Option to add and optimize protons in the system. Default: True
- Restrict DUs to ref site removal (restrict_to_refsite) type: boolean: Option to not generate design units with sites not matching the reference (if one is provided). Default: True
- Rotamer Coverage % (rot_coverage) type: decimal: Coverage of the rotamers returned from the library, in percent. Default: 100.0
- Rotamer Library (rot_lib) type: string: Rotamer library to use for side-chain building. Default: Richardson2016. Choices: Dunbrack, Richardson, Richardson2016
- Shared Memory (MiB) (shared_memory_mb) type: decimal: The amount of shared memory to allow a container to address. Default: 64
- Site residue entry (site_residue) type: string: Single site residue specification for APO structures. Format ‘name:num:insert:chain[:fragno:altloc]’, e.g. ‘ALA:325: :A’ (note the blank/whitespace insert code). The regex ‘.*’ notation can be used as a wildcard.
- Size used to define binding site (site_size) type: decimal: Distance used to determine the size of the site. Default: 5.0
- Spot policy (spot_policy) type: string: Control cube placement on spot market instances. Default: Prohibited. Choices: Allowed, Preferred, NotPreferred, Prohibited, Required
- Strict Ligand (strict_ligand) type: boolean: Option to only emit design units with ligands that match the ligand names (if any are provided). Default: True
- Enforce proline positions in loop templates (strict_proline_match) type: boolean: Fuzzy matches in the loop database must have proline at the exact same positions in the sequence. Default: True
- Strict protonation mode (strict_protonate) type: boolean: Option to fail prep if protons could not be added. Default: True
- Superpose design units (superpose) type: boolean: Option to superpose DUs (if multiple), first onto the reference structure (if provided). Default: True
- Superposition method (superpose_method) type: string: Superposition method. Default: SiteSequence. Choices: GlobalSequence, SiteSequence, DDMatrix, SSE, SiteHopper
- Target classification (target) type: string: Option to pick whether the target is a protein or nucleic acid component. Default: Protein. Choices: Protein, Nucleic
- Number to transform (transform_threshold) type: integer: Number of loops to allow through the sidechain clash checker. Regardless of this number, all loops with a sequence identical to the target are processed. Default: 25
- output verbosity (verbosity) type: string: Verbosity level. Default: warning. Choices: info, warning, error, debug, ddebug
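The interaction between Ligand Type (lig_type) and its three override parameters can be easier to follow in code. The sketch below is illustrative only: `LigandLimits` and `resolve_ligand_limits` are hypothetical names, not part of the cube or of any OpenEye API, and the sanity check on the resulting range is an assumption. The per-type defaults are copied from the lig_type description above.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class LigandLimits:
    """Hypothetical container for the ligand detection limits."""
    min_atoms: int
    max_atoms: int
    max_residues: int


# Defaults as listed in the lig_type description.
LIG_TYPE_DEFAULTS = {
    "Small Molecule": LigandLimits(min_atoms=8, max_atoms=100, max_residues=5),
    "Peptide": LigandLimits(min_atoms=8, max_atoms=200, max_residues=2),
    "Macrocycle": LigandLimits(min_atoms=8, max_atoms=250, max_residues=20),
    "Fragment": LigandLimits(min_atoms=2, max_atoms=35, max_residues=5),
}


def resolve_ligand_limits(
    lig_type: str = "Small Molecule",
    min_lig_atoms: Optional[int] = None,
    max_lig_atoms: Optional[int] = None,
    max_lig_residues: Optional[int] = None,
) -> LigandLimits:
    """Start from the lig_type defaults, then apply any individual overrides."""
    try:
        limits = LIG_TYPE_DEFAULTS[lig_type]
    except KeyError:
        raise ValueError(f"unknown lig_type: {lig_type!r}") from None

    # Only overrides that were actually supplied replace a default.
    overrides = {}
    if min_lig_atoms is not None:
        overrides["min_atoms"] = min_lig_atoms
    if max_lig_atoms is not None:
        overrides["max_atoms"] = max_lig_atoms
    if max_lig_residues is not None:
        overrides["max_residues"] = max_lig_residues
    limits = replace(limits, **overrides)

    # Assumed sanity check; the cube's own validation is not documented here.
    if limits.min_atoms > limits.max_atoms:
        raise ValueError("min_lig_atoms must not exceed max_lig_atoms")
    return limits
```

For example, `resolve_ligand_limits("Macrocycle", max_lig_atoms=300)` keeps the Macrocycle values for min_atoms and max_residues and changes only max_atoms.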
Field parameters
- Extended Log Field (ext_log_field) type: Field Type: StringVec: Message extended log field. Default: Extended Log Field
- Log Field (log_field) type: Field Type: String: The field to store messages to floe report. Default: Log Field
Hardware Parameters
- Machine hardware requirements
- Memory (MiB) (memory_mb) type: decimal: The minimum amount of memory in MiB (1048576 B) this cube requires. Due to overhead, request a couple hundred MiB more than required. Default: 1800, Min: 256.0, Max: 8589934592
- Shared Memory (MiB) (shared_memory_mb) type: decimal: The amount of shared memory to allow a container to address. Default: 64
- Temporary Disk Space (MiB) (disk_space) type: decimal: The minimum amount of disk space in MiB (1048576 B) this cube requires. Due to overhead, request a couple hundred MiB more than required. Default: 5120.0, Min: 128.0, Max: 8589934592
- GPUs (gpu_count) type: integer: The number of GPUs to run this cube with. Default: 0, Max: 16
- CPUs (cpu_count) type: integer: The number of CPUs to run this cube with. Default: 1, Min: 1, Max: 128
- Instance Type (instance_type) type: string: The type of instance that this cube needs to be run on.
- Spot policy (spot_policy) type: string: Control cube placement on spot market instances. Default: Prohibited. Choices: Allowed, Preferred, NotPreferred, Prohibited, Required
- Instance Tags (instance_tags) type: string: Only run on machines with matching tags (comma separated). Default: “”
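As a rough illustration of how the documented defaults and bounds fit together, the sketch below checks a set of hardware overrides before they would be passed to a job. It is plain Python with no Orion dependency; `hardware_settings` is a hypothetical helper, and only the defaults, limits, and choices listed above are taken from this page.

```python
# Defaults as documented above (instance_type has no documented default).
HARDWARE_DEFAULTS = {
    "memory_mb": 1800,
    "shared_memory_mb": 64,
    "disk_space": 5120.0,
    "gpu_count": 0,
    "cpu_count": 1,
    "spot_policy": "Prohibited",
    "instance_tags": "",
}

# (min, max) pairs; None means the bound is not documented.
HARDWARE_BOUNDS = {
    "memory_mb": (256.0, 8589934592),
    "disk_space": (128.0, 8589934592),
    "cpu_count": (1, 128),
    "gpu_count": (None, 16),
}

SPOT_POLICIES = ("Allowed", "Preferred", "NotPreferred", "Prohibited", "Required")


def hardware_settings(**overrides):
    """Return the defaults merged with overrides, rejecting out-of-range values."""
    settings = dict(HARDWARE_DEFAULTS)
    for name, value in overrides.items():
        if name not in HARDWARE_DEFAULTS and name != "instance_type":
            raise KeyError(f"unknown hardware parameter: {name}")
        low, high = HARDWARE_BOUNDS.get(name, (None, None))
        if low is not None and value < low:
            raise ValueError(f"{name}={value} is below the minimum {low}")
        if high is not None and value > high:
            raise ValueError(f"{name}={value} is above the maximum {high}")
        if name == "spot_policy" and value not in SPOT_POLICIES:
            raise ValueError(f"spot_policy must be one of {SPOT_POLICIES}")
        settings[name] = value
    return settings
```

A call such as `hardware_settings(memory_mb=4096, cpu_count=4)` returns the defaults with those two values replaced, while `hardware_settings(cpu_count=0)` raises `ValueError` because the documented minimum is 1.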
Metrics Parameters
- Cube Metric Parameters
- Metric Period (metric_period) type: decimal: How often to sample metrics, in seconds. Default: 60. Choices: 1, 5, 10, 30, 60, 120, 180, 240, 300. Min: 1, Max: 300
- Cube Metrics (cube_metrics) type: string: Set of metrics to be collected. Choices: cpu, disk, memory, network
Parallel Spruce Prep
The parallel version adds these extra parameters.
- Number of messages to distribute at a time (item_count) type: integer: The maximum number of messages to bundle together for a parallel cube. Default: 1, Min: 1, Max: 65535
- Maximum Failures (max_failures) type: integer: The maximum number of times to attempt processing a work item. Default: 10, Min: 1, Max: 100
- Autoscale this Cube (autoscale) type: boolean: If True, let Orion manage the parallelism of this Cube. Default: True
- Maximum number of Cubes (max_parallel) type: integer: The maximum number of concurrently running copies of this Cube. Default: 1000, Min: 1
- Minimum number of Cubes (min_parallel) type: integer: The minimum number of concurrently running copies of this Cube. Default: 0
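To make the meaning of item_count and max_failures concrete, here is a minimal, self-contained sketch of bundling and bounded retries. It is not Orion's scheduler: `bundle` and `process_with_retries` are hypothetical helpers that only mirror the parameter descriptions above, assuming that max_failures counts total attempts per work item.

```python
from itertools import islice
from typing import Any, Callable, Iterable, Iterator, List, Tuple


def bundle(messages: Iterable[Any], item_count: int = 1) -> Iterator[List[Any]]:
    """Yield lists of at most item_count messages, preserving order."""
    if not 1 <= item_count <= 65535:
        raise ValueError("item_count must be between 1 and 65535")
    it = iter(messages)
    while True:
        batch = list(islice(it, item_count))
        if not batch:
            return
        yield batch


def process_with_retries(
    item: Any, work: Callable[[Any], Any], max_failures: int = 10
) -> Tuple[bool, Any]:
    """Attempt work(item) up to max_failures times.

    Returns (True, result) on the first success, or (False, last_error)
    once every attempt has failed.
    """
    if not 1 <= max_failures <= 100:
        raise ValueError("max_failures must be between 1 and 100")
    last_error = None
    for _ in range(max_failures):
        try:
            return True, work(item)
        except Exception as exc:  # a real worker would likely log this
            last_error = exc
    return False, last_error
```

With `item_count=2`, five messages are distributed as bundles of two, two, and one. A larger bundle reduces per-message dispatch overhead, at the cost of coarser load balancing across parallel copies of the cube.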