Make Bio DUs
Spruce-prepped Bio Design Units are generated from the input structure on the input oechem.OERecord.
Calculation Parameters
- Add Interaction Hints (add_interactions) type: boolean: Option to add interactions to the design units. Default: True
- Add Style (add_style) type: boolean: Option to add style to the design units. Default: True
- Allow Cap Residue Truncation (allow_truncate) type: boolean: Option to allow a terminal residue to be converted to a cap if the cap would otherwise clash. Default: True
- Alternate Location Handling Method (altloc) type: string: Option to pick the method of handling alternate locations. Default: Default, Choices: Primary, Enumerate, Default
- Loop Backbone Clash Threshold (bb_clash_threshold) type: decimal: Loops from the database are rejected when more than the threshold fraction of the backbone atoms clash. Default: 0.25
- Build C-Terminal Caps (build_cterm_caps) type: boolean: Option to cap broken C-termini in protein chains. Default: True
- Option to Build Disulfide Bridges (build_disulfidebridges) type: boolean: Allow the loop builder to build disulfide bridges during loop modeling (if possible). Default: True
- Build Missing Loops (build_loops) type: boolean: Option to build missing loops (if information is available to do so). Default: True
- Build N-Terminal Caps (build_nterm_caps) type: boolean: Option to cap broken N-termini in protein chains. Default: True
- Build Partial Sidechains (build_sidechains) type: boolean: Option to build missing or partial protein sidechains. Default: True
- Build Missing Tails (build_tails) type: boolean: Option to build missing tails (if information is available to do so). Default: False
- Loop Builder Include Crystal Packing (build_with_crystalpacking) type: boolean: Include packing residues when building loops. Default: False
- Assign Charges and Radii (charge_radii) type: boolean: Option to assign partial charges and radii. Default: True
- Add Cofactor Code(s) (cofactor_codes) type: string: Add uncommon or custom cofactor 3-letter codes.
- Collapse Non-Site Alternates (collapse_nonsite_alts) type: boolean: Option to deduplicate structures with different alts, if the alt locations are not near the binding site. Default: True
- CPUs (cpu_count) type: integer: The number of CPUs to run this cube with. Default: 1, Min: 1, Max: 128
- Loop Crop Length (crop_length) type: integer: Anchor residues on the protein to crop back for a better fit; this results in longer loops being built. Default: 1
- Cube Metrics (cube_metrics) type: string: Set of metrics to be collected. Choices: cpu, disk, memory, network
- Delete Clashing Solvent (delete_clashing_solvent) type: boolean: Option to allow build steps to remove clashing solvent. Default: True
- Temporary Disk Space (MiB) (disk_space) type: decimal: The minimum amount of disk space in MiB (1048576 B) this cube requires. Due to overhead, request a couple hundred MiB more than required. Default: 5120.0, Min: 128.0, Max: 8589934592
- Duplicate Removal (duplicate_removal) type: boolean: Option to deduplicate identical structures resulting from symmetry operations. Default: True
- Enumerate Cofactor Sites (enum_cofactors_sites) type: boolean: Option to generate individual design units based on the recognized cofactors. Default: False
- Add Excipient Code(s) (excipient_codes) type: string: Add uncommon or custom excipient 3-letter codes.
- Fix Backbone Atom Issues (fix_backbone) type: boolean: Option to fix backbone atom issues in protein chains. Default: True
- Generate Tautomers (generate_tautomers) type: boolean: Option to generate and use tautomers in the hydrogen network optimization. Default: True
- GPUs (gpu_count) type: integer: The number of GPUs to run this cube with. Default: 0, Max: 16
- Hetgroup Cluster Distance (het_group_nbr_dist) type: decimal: Distance between heterogens used to determine optimization clusters for protonation. Default: 3.5
- Include Solvent Accessible Surface Area Term (incl_SA_term) type: boolean: Include the solvent accessible surface area term when ranking the loops. Default: True
- Include Solvation (incl_solvation) type: boolean: Include a simple solvation model when building loops. Default: True
- Include Binding Site Grids (include_bsite_edens_grids) type: boolean: Include electron density and difference density maps around the binding site. Default: True
- Instance Tags (instance_tags) type: string: Only run on machines with matching tags (comma separated). Default: “”
- Instance Type (instance_type) type: string: The type of instance that this cube needs to be run on.
- Ligand Type (lig_type) type: string: The type of ligand that is expected for the system. Affects the max/min atom counts and the max residue count (if applicable) for the ligand in the system. Each value can be overridden individually. Defaults are as follows: Small Molecule: min_atoms=8, max_atoms=100, max_residues=5; Peptide: min_atoms=8, max_atoms=200, max_residues=2; Macrocycle: min_atoms=8, max_atoms=250, max_residues=20; Fragment: min_atoms=2, max_atoms=35, max_residues=5. Default: Small Molecule, Choices: Small Molecule, Peptide, Macrocycle, Fragment
- Add Ligand Smiles (ligand_metadata) type: string: Add ligand SMILES and 3-letter codes, e.g. ‘c1ccccc1 BNZ’.
- Ligand Name(s) (ligand_names) type: string: Format: 3-letter codes, e.g. ‘LIG’; for peptides, separate codes with dashes (e.g. ‘SER-VAL-TPO-ALA’).
- Add Lipid Code(s) (lipid_codes) type: string: Add uncommon or custom lipid 3-letter codes.
- Loop Clash Threshold (loop_clash_threshold) type: decimal: Loops from the database are rejected when more than the threshold fraction of the loop atoms, in addition to the clashing backbone atoms, clash. Default: 0.2
- Loop Anchor Atom Distance Buffer (loop_distance_buffer) type: decimal: Fuzzy matches in the loop database must have the correct distance between anchor atoms, +/- the buffer distance. Default: 1.0
- Loop Database File (loop_input_file) type: file_in: (Optional) A template loop database file; if not specified, the built-in database will be used.
- Make Packing Residues (make_pack_res) type: boolean: Generate packing residues from an asymmetric unit. Default: True
- Max Backlog Wait (max_backlog_wait) type: integer: The max time (in seconds) that a cube will be backlogged on a group before being re-evaluated. Default: 600, Min: 300
- Maximum Atoms in Biological Unit (max_bu_atoms) type: integer: Option to limit the size of BUs processed based on number of atoms. Default: 50000
- Maximum Parts in Biological Unit (max_bu_parts) type: integer: Option to limit the size of BUs processed based on number of parts (chains). Default: 24
- Number of Loops to Minimize and Evaluate (max_eval_loops) type: integer: Maximum number of loops to connect and minimize. Default: 5
- Max Atoms for a Ligand (max_lig_atoms) type: integer: Override for the maximum number of heavy atoms in a molecule to be detected as a ligand.
- Max Residues for a Ligand (max_lig_residues) type: integer: Override for the maximum number of residues in a molecule to be detected as a ligand.
- Max System Atoms (max_system_atoms) type: integer: Maximum number of atoms in the system. Default: 50000
- Memory (MiB) (memory_mb) type: decimal: The minimum amount of memory in MiB (1048576 B) this cube requires. Due to overhead, request a couple hundred MiB more than required. Default: 1800, Min: 256.0, Max: 8589934592
- Metric Period (metric_period) type: decimal: How often to sample metrics, in seconds. Default: 60, Choices: 1, 5, 10, 30, 60, 120, 180, 240, 300, Min: 1, Max: 300
- Minimum Alignment Score for Biological Unit Extraction (min_align_score) type: integer: Option to specify the minimum sequence alignment score for biological unit extraction. Default: 200
- Min Atoms for a Ligand (min_lig_atoms) type: integer: Override for the minimum number of heavy atoms in a molecule to be detected as a ligand.
- Optimize Experimental Protons (opt_expt_protons) type: boolean: Option to optimize hydrogens assigned in the experiment. Default: False
- Loop Optimization Shell (opt_shell) type: decimal: Include atoms within this distance in the loop optimization; a larger distance results in slower optimizations. Default: 15.0
- Optimize Stage 1 Step/Residue Multiplier (opt_stage1_iter_multiplier) type: integer: Number of steps per number of residues in the loop for the first stage optimizer. Default: 5
- Optimize Stage 2 Step/Residue Multiplier (opt_stage2_iter_multiplier) type: integer: Number of steps per number of residues in the loop for the second stage optimizer. Default: 10
- Loop Optimization Tolerance (opt_tolerance) type: decimal: Tolerance for the loop optimization; smaller numbers result in slower optimizations. Default: 0.001
- Output All Biological Units (output_bio_designunits) type: boolean: Option to write all biological design units. These are intermediaries and should not be used for other applications. Default: False
- Thread limit per CPU (pids_per_cpu_limit) type: integer: The number of threads per CPU. Default: 32
- Prefer Author BIOMT Records (pref_author_record) type: boolean: Option where the author BIOMT record is preferred over the software-generated one. Default: True
- Protonate (protonate) type: boolean: Option to add and optimize protons in the system. Default: True
- Restrict DUs to Reference Site Removal (restrict_to_refsite) type: boolean: Option to not generate design units with sites not matching the reference (if one is provided). Default: True
- Rotamer Coverage % (rot_coverage) type: decimal: Coverage of the rotamers returned from the library, in percent. Default: 100.0
- Rotamer Library (rot_lib) type: string: Rotamer library to use for side-chain building. Default: Richardson2016, Choices: Dunbrack, Richardson, Richardson2016
- Shared Memory (MiB) (shared_memory_mb) type: decimal: The amount of shared memory to allow a container to address. Default: 64
- Site Residue Entry (site_residue) type: string: Single site residue specification for APO structures. Format: ‘RESNAME:RESNUM:ICODE:CHAINID[:FRAGNO:ALTLOC]’, e.g. ‘ALA:325: :A’ (note the blank/whitespace insert code). The regex ‘.*’ notation can be used as a wildcard.
- Size Used to Define Binding Site (site_size) type: decimal: Distance used to determine the size of the site. Default: 5.0
- Spot policy (spot_policy) type: string: Control cube placement on spot market instances. Default: Prohibited, Choices: Allowed, Preferred, NotPreferred, Prohibited, Required
- Strict Ligand (strict_ligand) type: boolean: Option to only emit design units with ligands that match the ligand names (if any are provided). Default: True
- Enforce Proline Positions in Loop Templates (strict_proline_match) type: boolean: Fuzzy matches in the loop database must have proline at the exact same sequence positions. Default: True
- Strict Protonation Mode (strict_protonate) type: boolean: Option to fail prep if protons could not be added. Default: True
- Superpose Design Units (superpose) type: boolean: Option to superpose DUs (if multiple), first onto the reference structure (if provided). Default: True
- Superposition Method (superpose_method) type: string: Superposition method. Default: SiteSequence, Choices: GlobalSequence, SiteSequence, DDMatrix, SSE, SiteHopper
- Target Classification (target) type: string: Option to pick whether the target is a protein or a nucleic acid component. Default: Protein, Choices: Protein, Nucleic
- Number to Transform (transform_threshold) type: integer: Number of loops to allow through the sidechain clash checker. Regardless of this number, all loops with a sequence identical to the target are processed. Default: 25
- output verbosity (verbosity) type: string: Verbosity level. Default: warning, Choices: info, warning, error, debug, ddebug
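The interaction between lig_type and the three override parameters (min_lig_atoms, max_lig_atoms, max_lig_residues) can be pictured with a small Python sketch. The per-type values are copied from the lig_type description above; the names `LigandLimits`, `LIG_TYPE_DEFAULTS`, and `resolve_ligand_limits` are hypothetical helpers made up for illustration and are not part of Spruce or Orion. The sketch assumes that an override left unset falls back to the default for the chosen ligand type.

```python
from typing import NamedTuple, Optional


class LigandLimits(NamedTuple):
    """Hypothetical container for the three ligand detection limits."""
    min_atoms: int
    max_atoms: int
    max_residues: int


# Defaults per ligand type, copied from the lig_type parameter description.
LIG_TYPE_DEFAULTS = {
    "Small Molecule": LigandLimits(min_atoms=8, max_atoms=100, max_residues=5),
    "Peptide": LigandLimits(min_atoms=8, max_atoms=200, max_residues=2),
    "Macrocycle": LigandLimits(min_atoms=8, max_atoms=250, max_residues=20),
    "Fragment": LigandLimits(min_atoms=2, max_atoms=35, max_residues=5),
}


def resolve_ligand_limits(
    lig_type: str = "Small Molecule",
    min_lig_atoms: Optional[int] = None,
    max_lig_atoms: Optional[int] = None,
    max_lig_residues: Optional[int] = None,
) -> LigandLimits:
    """Start from the lig_type defaults and apply any override that is set.

    Assumption: an override of None means "use the type default".
    """
    base = LIG_TYPE_DEFAULTS[lig_type]
    return LigandLimits(
        min_atoms=base.min_atoms if min_lig_atoms is None else min_lig_atoms,
        max_atoms=base.max_atoms if max_lig_atoms is None else max_lig_atoms,
        max_residues=(
            base.max_residues if max_lig_residues is None else max_lig_residues
        ),
    )


if __name__ == "__main__":
    # A fragment screen that also tolerates slightly larger ligands.
    print(resolve_ligand_limits("Fragment", max_lig_atoms=45))
```

Under these assumptions, selecting Fragment and overriding only max_lig_atoms keeps the Fragment defaults for the other two limits.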
Field parameters
- Extended Log Field (ext_log_field) type: Field Type: StringVec: Message extended log field. Default: Extended Log Field
- Log Field (log_field) type: Field Type: String: The field in which to store messages for the floe report. Default: Log Field
Hardware Parameters
- Machine hardware requirements
- Memory (MiB) (memory_mb) type: decimal: The minimum amount of memory in MiB (1048576 B) this cube requires. Due to overhead, request a couple hundred MiB more than required. Default: 1800, Min: 256.0, Max: 8589934592
- Shared Memory (MiB) (shared_memory_mb) type: decimal: The amount of shared memory to allow a container to address. Default: 64
- Thread limit per CPU (pids_per_cpu_limit) type: integer: The number of threads per CPU. Default: 32
- Max Backlog Wait (max_backlog_wait) type: integer: The max time (in seconds) that a cube will be backlogged on a group before being re-evaluated. Default: 600, Min: 300
- Temporary Disk Space (MiB) (disk_space) type: decimal: The minimum amount of disk space in MiB (1048576 B) this cube requires. Due to overhead, request a couple hundred MiB more than required. Default: 5120.0, Min: 128.0, Max: 8589934592
- GPUs (gpu_count) type: integer: The number of GPUs to run this cube with. Default: 0, Max: 16
- CPUs (cpu_count) type: integer: The number of CPUs to run this cube with. Default: 1, Min: 1, Max: 128
- Instance Type (instance_type) type: string: The type of instance that this cube needs to be run on.
- Spot policy (spot_policy) type: string: Control cube placement on spot market instances. Default: Prohibited, Choices: Allowed, Preferred, NotPreferred, Prohibited, Required
- Instance Tags (instance_tags) type: string: Only run on machines with matching tags (comma separated). Default: “”
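Before submitting a job it can help to check requested hardware values against the documented Min/Max bounds. The following is a minimal Python sketch of such a check. The bounds are copied from the list above; `HARDWARE_BOUNDS` and `check_hardware` are hypothetical names used only for illustration, and Orion performs its own validation, which this sketch does not replace or reproduce.

```python
from typing import Dict, List, Optional, Tuple

# (min, max) per parameter, copied from the hardware parameter list.
# None means the documentation above states no bound on that side.
HARDWARE_BOUNDS: Dict[str, Tuple[Optional[float], Optional[float]]] = {
    "memory_mb": (256.0, 8589934592),
    "disk_space": (128.0, 8589934592),
    "gpu_count": (None, 16),
    "cpu_count": (1, 128),
    "max_backlog_wait": (300, None),
}


def check_hardware(params: Dict[str, float]) -> List[str]:
    """Return one message per out-of-range value; an empty list means all OK.

    Parameters that have no documented bounds are ignored.
    """
    problems = []
    for name, value in params.items():
        if name not in HARDWARE_BOUNDS:
            continue
        low, high = HARDWARE_BOUNDS[name]
        if low is not None and value < low:
            problems.append(f"{name}={value} is below the minimum {low}")
        if high is not None and value > high:
            problems.append(f"{name}={value} is above the maximum {high}")
    return problems


if __name__ == "__main__":
    # Example request: more memory and CPUs than the defaults, no GPU.
    request = {"memory_mb": 8192, "cpu_count": 4, "gpu_count": 0}
    print(check_hardware(request) or "all values within documented bounds")
```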
Metrics Parameters
- Cube Metric Parameters
- Metric Period (metric_period) type: decimal: How often to sample metrics, in seconds. Default: 60, Choices: 1, 5, 10, 30, 60, 120, 180, 240, 300, Min: 1, Max: 300
- Cube Metrics (cube_metrics) type: string: Set of metrics to be collected. Choices: cpu, disk, memory, network
Parallel Make Bio DUs
The parallel version adds these extra parameters.
- Number of messages to distribute at a time (item_count) type: integer: The maximum number of messages to bundle together for a parallel cube. Default: 1, Min: 1, Max: 65535
- Maximum Failures (max_failures) type: integer: The maximum number of times to attempt processing a work item. Default: 10, Min: 1, Max: 100
- Autoscale this Cube (autoscale) type: boolean: If True, let Orion manage the parallelism of this Cube. Default: True
- Maximum number of Cubes (max_parallel) type: integer: The maximum number of concurrently running copies of this Cube. Default: 1000, Min: 1
- Minimum number of Cubes (min_parallel) type: integer: The minimum number of concurrently running copies of this Cube. Default: 0
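To get a feel for how item_count, min_parallel, and max_parallel relate, the arithmetic can be sketched in Python. This is only a back-of-the-envelope illustration under two assumptions: that messages are grouped into bundles of at most item_count, and that the number of running copies stays between min_parallel and max_parallel. It is not a description of Orion's actual scheduler, and the function names `bundle_count` and `clamp_parallel` are hypothetical.

```python
import math


def bundle_count(n_messages: int, item_count: int = 1) -> int:
    """Number of bundles needed if each bundle holds at most item_count messages."""
    if not 1 <= item_count <= 65535:
        raise ValueError("item_count must be between 1 and 65535")
    return math.ceil(n_messages / item_count)


def clamp_parallel(requested: int, min_parallel: int = 0, max_parallel: int = 1000) -> int:
    """Keep a requested number of cube copies within [min_parallel, max_parallel]."""
    return max(min_parallel, min(requested, max_parallel))


if __name__ == "__main__":
    # 2500 input records, bundled 2 at a time, with the default parallel limits.
    bundles = bundle_count(2500, item_count=2)
    print(bundles, clamp_parallel(bundles))
```

With the default item_count of 1, every message forms its own bundle, so raising item_count mainly trades scheduling overhead against granularity of retries.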