Gigadock Warp Classic
Category Paths
Follow one of these paths in the Orion user interface to find the floe.
Product-based/Gigadock
Role-based/Computational Chemist
Solution-based/Virtual-screening/DB Search/Gigadock
Task-based/Virtual Screening - Structure-Based
Description
This floe approximates a full GigaDock run with a mixture of FastROCS and docking.
Dock a random subset of molecules.
Cluster the top docked molecules and select the top scoring cluster heads.
Run FastROCS on all input molecules, using the top scoring poses from the previous step as queries.
Re-dock the top scoring molecules from FastROCS.
Output a hit list of the top scoring docked molecules.
Promoted Parameters
Title in user interface (promoted name)
Inputs
Design Unit or Receptor Dataset(s) (init_input_dataset): Dataset with the design unit (DU) (or old format receptor) to dock to. Multiple design units are allowed, up to a limit of 10 for the Hybrid docking method (see the ‘Docking Method’ parameter) and two otherwise. The behavior with multiple design units depends on the docking method. For Fred or FastFred, each molecule will be docked to each design unit and the results from the best scoring design unit will be output, so docking time (and cost) will scale roughly linearly with the number of design units. For Hybrid, each molecule will be docked only into the design unit whose crystallographically bound ligand is most similar (by ROCS Tanimoto Combo) to the molecule being docked, and docking time (and cost) will increase by roughly 5% per additional design unit.
Required
Type: data_source
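The scaling rules above can be turned into a rough estimate. The following is a minimal sketch, not part of the floe: the function name relative_docking_cost and the MAX_DESIGN_UNITS table are hypothetical, and the sketch assumes the 5% Hybrid overhead is additive per extra design unit. The figures themselves (linear scaling for Fred and FastFred, roughly 5% per additional design unit for Hybrid, limits of 10 and two) come from the description above.

```python
# Hypothetical helper, not part of the floe: estimates docking cost relative
# to a run with a single design unit, using the approximate scaling rules
# from the parameter description. Method names follow the 'Docking Method'
# choices list.
MAX_DESIGN_UNITS = {"Fred": 2, "Fast Fred": 2, "Hybrid": 10}


def relative_docking_cost(method: str, n_design_units: int) -> float:
    """Return the approximate cost multiplier versus one design unit."""
    if method not in MAX_DESIGN_UNITS:
        raise ValueError(f"unknown docking method: {method!r}")
    limit = MAX_DESIGN_UNITS[method]
    if not 1 <= n_design_units <= limit:
        raise ValueError(f"{method} accepts 1 to {limit} design units")
    if method == "Hybrid":
        # Assumption: roughly 5% extra, added per additional design unit.
        return 1.0 + 0.05 * (n_design_units - 1)
    # Fred and Fast Fred dock every molecule to every design unit.
    return float(n_design_units)


print(relative_docking_cost("Fred", 2))
print(relative_docking_cost("Hybrid", 10))
```

Under these assumptions, two design units with Fred roughly double the docking cost, while ten design units with Hybrid add less than half again.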
Input Conformer Collection (molecule_input_collection): Input collection containing molecules to dock. The collection should have been created by the Prepare Giga Collections Floe. OpenEye has also prepared several large third-party vendor databases and collections. The Organization Data/OpenEye Data/Gigadocking Collections folder contains data curated and provided by OpenEye that is freely available to Orion customers.
Required
Type: collection_source
Outputs
Hit List Dataset (hit_list_output_dataset): Output dataset with the top scoring docked molecules.
Required
Type: dataset_out
Default: Gigadock Warp Hit List
Queries (queries): Output dataset with the queries used by FastROCS. The queries are the cluster heads of the top scoring poses from the initial docking of a random subset of molecules from the input collection.
Required
Type: dataset_out
Default: Gigadock Warp Queries
Output Design Unit(s) Dataset (output_design_units_dataset): Output dataset containing a copy of the design units docked to.
Required
Type: dataset_out
Default: Gigadock Warp Design Unit
Temporary Collections (temporary_collections): This temporary collection is used by the floe during the run and automatically deleted at the end of the run.
Required
Type: collection_sink
Default: Gigadock Warp Temporary Collection
Options
Hit List Size (hit_list_size): Size of the final hit list with the top scoring docked molecules.
Required
Type: integer
Default: 10000
Docking Method (docking_method): Docking method to use. Fred is the default structure-based scoring method. Hybrid biases the docking toward poses that overlay the crystallographic ligand (the design units must have a bound ligand). FastFred is a faster variant of Fred (typically ~2x faster for single design units) that samples less and uses a simpler scoring function in the initial stages of docking.
Type: string
Default: Fred
Choices: [‘Fred’, ‘Hybrid’, ‘Fast Fred’]
Options: Advanced
Random Dock Fraction (random_dock_fraction): Fraction of molecules from the input collections to select at random and dock. The top scoring poses from this docking will be clustered and the top cluster heads used as queries for FastROCS.
Required
Type: decimal
Default: 0.02
Final Dock Fraction (final_dock_fraction): The number of top scoring molecules from FastROCS that are passed to the final docking step is equal to this fraction of the size of the input collections.
Type: decimal
Default: 0.08
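To see what the two fractions mean in practice, the sketch below computes how many molecules each docking stage would process for a given collection size. The function name stage_counts is hypothetical, and the rounding and the assumption that the two stages do not overlap are illustrative only; the floe may count differently.

```python
# Hypothetical helper, not part of the floe: number of molecules docked in
# each stage, derived from the two fraction parameters and their defaults.
def stage_counts(collection_size: int,
                 random_dock_fraction: float = 0.02,
                 final_dock_fraction: float = 0.08) -> dict:
    # Molecules docked in the initial random-subset stage.
    random_docked = round(collection_size * random_dock_fraction)
    # Top FastROCS molecules passed to the final docking stage.
    final_docked = round(collection_size * final_dock_fraction)
    return {
        "random_dock": random_docked,
        "final_dock": final_docked,
        # Assumption: no overlap between the two stages (an upper bound).
        "total_docked": random_docked + final_docked,
    }


print(stage_counts(1_000_000_000))
```

With the default fractions, at most about 10% of the input collection is docked, which is where the saving relative to a full docking run comes from.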
Number of FastROCS Queries (number_of_fastrocs_queries): Number of top scoring molecules from the docking of the random subset of collection molecules to use as queries for FastROCS.
Required
Type: integer
Default: 50
Cluster FastROCS Queries (cluster_fastrocs_queries): If False, the queries for FastROCS will be the top scoring molecules from docking a random subset of the molecules. If True, the queries for FastROCS will be the top scoring cluster HEADS from docking a random subset of molecules.
Required
Type: boolean
Default: False
Choices: [True, False]
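The difference between the two settings can be illustrated with a small sketch. It is not the floe's implementation: the DockedPose type, the select_queries function, and the precomputed cluster_id values are all hypothetical, and the sketch assumes that a cluster head is the best scoring member of its cluster and that lower docking scores are better.

```python
from typing import List, NamedTuple


class DockedPose(NamedTuple):
    name: str
    score: float      # assumption: lower (more negative) is better
    cluster_id: int   # assumption: cluster membership is already known


def select_queries(poses: List[DockedPose],
                   number_of_queries: int = 50,
                   cluster: bool = False) -> List[DockedPose]:
    """Pick FastROCS queries from the docked random subset."""
    ranked = sorted(poses, key=lambda p: p.score)
    if cluster:
        # Keep only the first (best scoring) pose seen for each cluster.
        seen = set()
        heads = []
        for pose in ranked:
            if pose.cluster_id not in seen:
                seen.add(pose.cluster_id)
                heads.append(pose)
        ranked = heads
    return ranked[:number_of_queries]


poses = [
    DockedPose("a", -12.0, 1),
    DockedPose("b", -11.5, 1),
    DockedPose("c", -10.0, 2),
    DockedPose("d", -9.0, 3),
]
print([p.name for p in select_queries(poses, 2, cluster=False)])
print([p.name for p in select_queries(poses, 2, cluster=True)])
```

In this toy example the unclustered selection returns two poses from the same cluster, while the clustered selection returns one pose from each of the two best clusters, giving more diverse queries.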
GPU Hardware
FastROCS Instance Type (fastrocs_instance_type): The instances excluded by default are known to be not cost effective for FastROCS.
Type: string
Default: !cdns,!g4dn.metal,!g5.12xlarge,!g5.24xlarge,!g5.48xlarge,!g4dn.12xlarge,!g3s.,!p3.
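The default value is a comma-separated list of exclusions, each prefixed with "!". The sketch below shows one way to read such a string. It is only an illustration: the functions parse_exclusions and is_excluded are hypothetical, and the matching rule (an instance type is excluded if its name contains the pattern) is an assumption, since the exact matching rules Orion applies are not stated here.

```python
# Default exclusion string copied from the parameter above.
DEFAULT_FILTER = (
    "!cdns,!g4dn.metal,!g5.12xlarge,!g5.24xlarge,"
    "!g5.48xlarge,!g4dn.12xlarge,!g3s.,!p3."
)


def parse_exclusions(filter_string: str) -> list:
    """Return the patterns that are prefixed with '!' (the exclusions)."""
    tokens = [token.strip() for token in filter_string.split(",")]
    return [token[1:] for token in tokens if token.startswith("!")]


def is_excluded(instance_type: str, filter_string: str = DEFAULT_FILTER) -> bool:
    # ASSUMPTION: a pattern excludes any instance type whose name contains it.
    # The real matching behavior may differ.
    return any(p in instance_type for p in parse_exclusions(filter_string))


print(parse_exclusions(DEFAULT_FILTER))
print(is_excluded("g5.12xlarge"), is_excluded("g5.xlarge"))
```

Under this reading, patterns ending in a period such as "g3s." and "p3." exclude whole instance families, while the others exclude individual sizes.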
Spot instance policy for FastROCS GPU Instance. (spot_instance_policy_for_fastrocs_gpu_instance): To run on SPOT instances, use the default setting of Preferred. To run on ON-DEMAND instances, set the value to Prohibited. ON-DEMAND instances typically cost 3–4 times more than SPOT instances, but are more available than SPOT instances when overall demand for GPUs on AWS is high.
Type: string
Default: Required
Choices: [‘Allowed’, ‘Preferred’, ‘NotPreferred’, ‘Prohibited’, ‘Required’]
Output Fields
Docked Pose Field (docked_pose_field): Field on the output hit list containing the pose of the docked molecule. If unspecified, the primary molecule field will be used.
Type: field_parameter::mol
Docked Score Field (score_field): Field on the output record where the docked score will be placed.
Type: field_parameter::float
Default: Chemgauss4
Steric Score Field (steric_score_field): Output field with the steric score component of the docked molecule. This field will only be created on the output records if this parameter is specified.
Type: field_parameter::float
Clash Score Field (clash_score_field): Output field with the clash score component of the docked molecule. This field will only be created on the output records if this parameter is specified.
Type: field_parameter::float
Protein Desolv Score Field (protein_desolv_score_field): Output field with the protein desolvation score component of the docked molecule. This field will only be created on the output records if this parameter is specified.
Type: field_parameter::float
Ligand Desolv Score Field (ligand_desolv_score_field): Output field with the ligand desolvation score component of the docked molecule. This field will only be created on the output records if this parameter is specified.
Type: field_parameter::float
Ligand Desolv HB Score Field (ligand_desolv_hb_score_field): Output field with the ligand desolvation hydrogen bond score component of the docked molecule. This field will only be created on the output records if this parameter is specified.
Type: field_parameter::float
Hydrogen Bond Score Field (hydrogen_bond_score_field): Output field with the hydrogen bond score component of the docked molecule. This field will only be created on the output records if this parameter is specified.
Type: field_parameter::float
Design Unit ID Field (design_unit_id_field): Output field with the ID of the design unit the molecule scores best in.
Type: field_parameter::int
Default: Design Unit ID
Design Unit Link Field (design_unit_link_field): Output field with a link to the design unit the molecule scores best in.
Type: field_parameter::link
Default: Design Unit Link
FastROCS Overlay Field (fastrocs_overlay_field): Field on the output hit list containing the FastROCS overlay of the molecule onto the query pose that gave the highest Tanimoto among all query poses. The query poses are generated by the floe by docking a random subset of the initial collections and selecting the top scoring poses as queries for FastROCS.
Type: field_parameter::mol
Default: FastROCS Overlay
FastROCS Query Field (fastrocs_query_field): Field on the output hit list containing the query pose that the docked molecule overlaid onto best with FastROCS. The query poses are generated by the floe by docking a random subset of the initial collections and selecting the top scoring poses as queries for FastROCS.
Type: field_parameter::mol
Default: FastROCS Query
Combo Tanimoto Field (combo_tanimoto_field): Name of the field with the FastROCS Combo Tanimoto Score.
Required
Type: field_parameter::float
Default: FastROCS Combo Tanimoto
Shape Tanimoto Field (shape_tanimoto_field): Name of the field with the FastROCS Shape Tanimoto Score.
Type: field_parameter::float
Default: FastROCS Shape
Color Tanimoto Field (color_tanimoto_field): Name of the field with the FastROCS Color Tanimoto Score.
Type: field_parameter::float
Default: FastROCS Color
Bemis Murcko Field (bemis_murcko_field): Output field for the Bemis Murcko core SMILES.
Type: field_parameter::string
Default: Bemis Murcko SMILES
Bemis Murcko ID Field (bemis_murcko_id_field): Output field with an integer ID of the Bemis Murcko core. All molecules with the same Bemis Murcko core SMILES will have the same ID, and those with different Bemis Murcko core SMILES will have different IDs. The IDs start at 1 and increment by 1 each time a new Bemis Murcko core is seen. Thus, unlike the Bemis Murcko core SMILES itself, this integer ID depends on the order in which the records are passed.
Type: field_parameter::int
Default: Bemis Murcko ID
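The ID scheme described above, and its dependence on record order, can be sketched in a few lines. The function assign_core_ids is hypothetical and the core SMILES strings are illustrative inputs; the sketch assumes the cores have already been computed.

```python
# Hypothetical helper, not part of the floe: assigns integer IDs to core
# SMILES in the order they are first seen, starting at 1.
def assign_core_ids(core_smiles_in_order):
    ids = {}
    result = []
    for smiles in core_smiles_in_order:
        if smiles not in ids:
            # A new core gets the next unused integer.
            ids[smiles] = len(ids) + 1
        result.append(ids[smiles])
    return result


# The same three records in two different orders.
print(assign_core_ids(["c1ccccc1", "c1ccncc1", "c1ccccc1"]))  # [1, 2, 1]
print(assign_core_ids(["c1ccncc1", "c1ccccc1", "c1ccccc1"]))  # [1, 2, 2]
```

The benzene core receives ID 1 in the first ordering and ID 2 in the second, so IDs from different runs or differently ordered hit lists should not be compared directly; the core SMILES is the stable identifier.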
Bemis Murcko Rank Field (bemis_murcko_rank_field): Integer field with the rank of the molecule within its Bemis Murcko family (i.e., the rank the molecule would have if the hit list contained only the molecules with the same Bemis Murcko core SMILES).
Type: field_parameter::int
Default: Bemis Murcko Rank
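As an illustration of the rank within a family, the sketch below walks a hit list that is already ordered from best to worst and counts how many molecules with the same core have been seen so far. The function family_ranks and the input layout are hypothetical, and the sketch assumes ranks start at 1.

```python
from collections import defaultdict


# Hypothetical helper, not part of the floe.
def family_ranks(hit_list):
    """hit_list: (name, core_smiles) pairs, ordered best first.

    Returns a dict mapping each name to its rank among the molecules
    that share its core. Assumption: ranks start at 1.
    """
    seen_per_core = defaultdict(int)
    ranks = {}
    for name, core in hit_list:
        seen_per_core[core] += 1
        ranks[name] = seen_per_core[core]
    return ranks


hits = [("m1", "A"), ("m2", "B"), ("m3", "A"), ("m4", "A")]
print(family_ranks(hits))
```

Filtering the hit list to records with rank 1 would, under these assumptions, keep only the best scoring molecule of each family.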
Hetero Bemis Murcko Field (hetero_bemis_murcko_field): Output field for the Hetero Bemis Murcko core SMILES.
Type: field_parameter::string
Default: Hetero Bemis Murcko
Hetero Bemis Murcko ID Field (hetero_bemis_murcko_id_field): Output field with an integer ID of the Hetero Bemis Murcko core. All molecules with the same Hetero Bemis Murcko core SMILES will have the same ID, and those with different Hetero Bemis Murcko core SMILES will have different IDs. The IDs start at 1 and increment by 1 each time a new Hetero Bemis Murcko core is seen. Thus, unlike the Hetero Bemis Murcko core SMILES itself, this integer ID depends on the order in which the records are passed.
Type: field_parameter::int
Default: Hetero Bemis Murcko ID
Hetero Bemis Murcko Rank Field (hetero_bemis_murcko_rank_field): Integer field with the rank of the molecule within its Hetero Bemis Murcko family (i.e., the rank the molecule would have if the hit list contained only the molecules with the same Hetero Bemis Murcko core SMILES).
Type: field_parameter::int
Default: Hetero Bemis Murcko Rank