SPRUCE - Protein Preparation from PDB Codes
Category Paths
Follow one of these paths in the Orion user interface to find the floe.
Role-based/Computational Chemist
Product-based/SPRUCE
Solution-based/Virtual-screening/Target Preparation
Solution-based/Hit to Lead/Target Preparation
Task-based/Target Prep & Analysis/Protein Preparation
Description
This floe uses Spruce to generate a design unit, which prepares biomolecules for downstream modeling applications in Orion such as docking, POSIT, Gameplan, or short-trajectory MD.
The required input for this floe is one or more PDB codes. The PDB file (or the mmCIF file if the PDB is not available) and, if available, the MTZ file containing the electron density maps are downloaded from the RCSB.
An optional reference for biological unit extraction may be provided. This reference can be a dataset from a previous Classic Spruce: Prep run, a PDB code, or a PDB file and MTZ map.
If a ligand cannot be detected during the run, consider specifying the ligand residue name or increasing the value of the input variable “max_residues”, which is an optional input to this floe. If this is a known apo structure, you can instead provide the definition of a residue in the binding site.
You can read more about Spruce in the toolkit documentation.
Promoted Parameters
Title in user interface (promoted name)
Reference Structure Inputs
Optional Reference DU Dataset (ref_dataset_in): Only the first design unit of a reference dataset will be read if multiple are present
Type: data_source
Optional PDB Code for reference DU (ref_code_cube_in): PDB code to generate a reference design unit from
Type: string
Optional PDB File for reference DU (ref_pdb_file_cube_in): PDB file to generate a reference design unit from
Type: file_in
Optional MTZ File for reference DU (ref_mtz_file_cube_in): MTZ file to generate a reference design unit from
Type: file_in
Reference Dataset (ref_data_out): Reference Dataset if generated as part of the Floe
Type: dataset_out
Default: Reference structure dataset
Reference Structure Prep Parameters
Build missing loops (ref_build_loops): Option to build missing loops (if information is available to do so)
Required
Type: boolean
Default: True
Choices: [True, False]
Build missing tails (ref_build_tails): Option to build missing tails (if information is available to do so)
Required
Type: boolean
Default: False
Choices: [True, False]
Ligand name(s) (ref_ligand_names): Use 3-letter codes, e.g. ‘LIG’. For peptides, separate codes with dashes, e.g. ‘SER-VAL-TPO-ALA’.
Type: string
Strict Ligand (ref_strict_ligand): Option to only emit design units with ligands that match the ligand names (if any are provided)
Required
Type: boolean
Default: True
Choices: [True, False]
Max atoms for a ligand (ref_max_lig_atoms): Maximum number of atoms in a molecule to be detected as a ligand. For peptides, we recommend 200.
Required
Type: integer
Default: 100
Max residues for a ligand (ref_max_lig_residues): Maximum number of residues in a molecule to be detected as a ligand. For peptides, we recommend 20.
Required
Type: integer
Default: 5
Loop Builder Parameters
Build missing loops (build_loops): Option to build missing loops (if information is available to do so)
Required
Type: boolean
Default: True
Choices: [True, False]
Build missing tails (build_tails): Option to build missing tails (if information is available to do so)
Required
Type: boolean
Default: False
Choices: [True, False]
Inputs
PDB code(s) to download (input_codes): Separate multiple codes with the delimiter (a comma by default), e.g. ‘1ABC, DEF2, G3HI’.
Required
Type: string
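The comma-delimited format of input_codes can be checked before a run is submitted. The following is a minimal sketch, not part of the floe: the function name parse_pdb_codes is hypothetical, and the check for exactly four alphanumeric characters is an assumption based on the classic PDB code format.

```python
def parse_pdb_codes(raw: str, delimiter: str = ",") -> list[str]:
    """Split a delimited string of PDB codes into a clean list.

    Assumption: each code is a classic 4-character alphanumeric PDB code.
    """
    codes = []
    for token in raw.split(delimiter):
        code = token.strip().upper()
        if not code:
            # Ignore empty entries such as a trailing delimiter.
            continue
        if len(code) != 4 or not code.isalnum():
            raise ValueError(f"Not a 4-character PDB code: {token!r}")
        codes.append(code)
    return codes


if __name__ == "__main__":
    print(parse_pdb_codes("1ABC, DEF2, G3HI"))
```

Whitespace around each code is stripped, so ‘1ABC, DEF2’ and ‘1ABC,DEF2’ give the same list.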
Outputs
Output Dataset (dataset_data_out): Output dataset to write to
Required
Type: dataset_out
Default: Spruce_prep_dataset
Output Dataset (failed_data_out): Output dataset to write to
Required
Type: dataset_out
Default: Failed_Spruce_prep_dataset
Ligand Parameters
Ligand name(s) (ligand_names): Use 3-letter codes, e.g. ‘LIG’. For peptides, separate codes with dashes, e.g. ‘SER-VAL-TPO-ALA’.
Type: string
Strict Ligand (strict_ligand): Option to only emit design units with ligands that match the ligand names (if any are provided)
Required
Type: boolean
Default: True
Choices: [True, False]
Max atoms for a ligand (max_lig_atoms): Maximum number of atoms in a molecule to be detected as a ligand. For peptides, we recommend 200.
Required
Type: integer
Default: 100
Max residues for a ligand (max_lig_residues): Maximum number of residues in a molecule to be detected as a ligand. For peptides, we recommend 20.
Required
Type: integer
Default: 5
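The ligand parameters above interact: a dash-separated peptide name usually calls for the larger recommended limits. The sketch below collects those values for a single ligand name. The helper name ligand_settings is hypothetical, and treating any name with more than one dash-separated code as a peptide is an assumption. The dictionary keys are the promoted parameter names listed above, and the numbers are the documented defaults and peptide recommendations.

```python
def ligand_settings(ligand_name: str) -> dict:
    """Suggest ligand parameter values for one ligand name.

    Assumption: a name with more than one dash-separated residue code
    is a peptide and should use the recommended peptide limits.
    """
    residues = [code.strip() for code in ligand_name.split("-") if code.strip()]
    if not residues:
        raise ValueError("Ligand name is empty")
    is_peptide = len(residues) > 1
    return {
        "ligand_names": "-".join(residues),
        # Documented default is 100; 200 is recommended for peptides.
        "max_lig_atoms": 200 if is_peptide else 100,
        # Documented default is 5; 20 is recommended for peptides.
        "max_lig_residues": 20 if is_peptide else 5,
    }


if __name__ == "__main__":
    print(ligand_settings("LIG"))
    print(ligand_settings("SER-VAL-TPO-ALA"))
```

The returned values are starting points only. A small-molecule ligand larger than the defaults still needs the limits raised by hand.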
Un-liganded Structure Parameters
Enumerate pockets (enum_pocket): Option to enumerate pockets when no ligand is found
Required
Type: boolean
Default: False
Choices: [True, False]
Site residue entry (site_residue): Single site residue specification for APO structures. Format ‘name:num:insert:chain[:fragno:altloc]’, e.g. ‘ALA:325: :A’ (note the blank/whitespace insert code). The regex ‘.*’ notation can be used as a wildcard.
Type: string
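The site residue format is easy to get wrong because of the blank insert code. As an illustration only, the sketch below splits a specification into its named fields. The names SiteResidue and parse_site_residue are hypothetical, and it assumes the optional fragno and altloc fields are given together or not at all. The residue number is kept as a string so that a ‘.*’ wildcard passes through unchanged.

```python
from typing import NamedTuple, Optional


class SiteResidue(NamedTuple):
    name: str
    num: str      # kept as text so a '.*' wildcard is preserved
    insert: str   # a single space means "no insert code"
    chain: str
    fragno: Optional[str] = None
    altloc: Optional[str] = None


def parse_site_residue(spec: str) -> SiteResidue:
    """Split 'name:num:insert:chain[:fragno:altloc]' into its fields.

    Assumption: the specification has either 4 or 6 colon-separated fields.
    """
    fields = spec.split(":")
    if len(fields) not in (4, 6):
        raise ValueError(
            f"Expected 4 or 6 colon-separated fields, got {len(fields)}: {spec!r}"
        )
    return SiteResidue(*fields)


if __name__ == "__main__":
    print(parse_site_residue("ALA:325: :A"))
```

The input is not stripped of whitespace, because the blank insert code in ‘ALA:325: :A’ is itself a significant field.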