Gigadock Warp Classic

Category Paths

To find the floe, follow one of these paths in the Orion user interface.

  • Product-based/Gigadock

  • Role-based/Computational Chemist

  • Solution-based/Virtual-screening/DB Search/Gigadock

  • Task-based/Virtual Screening - Structure-Based

Description

This floe approximates a full GigaDock run with a mixture of FastROCS and docking.

  1. Dock a random subset of molecules

  2. Cluster top docked molecules and select top scoring cluster heads

  3. Run FastROCS on all input molecules using top scoring poses from the previous step as queries

  4. Re-dock the top scoring molecules from FastROCS

  5. Output a hit list of the top scoring docked molecules

Promoted Parameters

Title in user interface (promoted name)

Inputs

Design Unit or Receptor Dataset(s) (init_input_dataset): Dataset with the design unit (DU) (or old format receptor) to dock to. Multiple design units are allowed, up to a limit of 10 for the Hybrid docking method (see the ‘Docking Method’ parameter) and two otherwise. The behavior with multiple design units depends on the docking method. For Fred or FastFred, each molecule is docked to each design unit and the results from the best scoring design unit are output, so docking time (and cost) scales roughly linearly with the number of design units. For Hybrid, each molecule is docked only into the design unit whose crystallographically bound ligand is most similar (by ROCS Tanimoto Combo) to the molecule being docked, and docking time (and cost) increases by roughly 5% per additional design unit.

  • Required

  • Type: data_source
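
The scaling rules described for this parameter can be sketched as a small calculation. This is only an illustration: the function name is hypothetical, "roughly linearly" is modeled as exactly linear, and "roughly 5% per additional design unit" as a flat 5% increment.

```python
def relative_docking_cost(method: str, n_design_units: int) -> float:
    """Approximate docking cost relative to a run with one design unit.

    Assumptions (not from the floe itself): Fred/FastFred cost is modeled as
    exactly n, and Hybrid cost as 1 + 0.05 * (n - 1).
    """
    # Limits stated in the parameter description: 10 for Hybrid, two otherwise.
    limits = {"Fred": 2, "FastFred": 2, "Hybrid": 10}
    key = method.replace(" ", "")  # accept both "FastFred" and "Fast Fred"
    if key not in limits:
        raise ValueError(f"unknown docking method: {method!r}")
    if not 1 <= n_design_units <= limits[key]:
        raise ValueError(
            f"{method} accepts 1 to {limits[key]} design units, got {n_design_units}"
        )
    if key == "Hybrid":
        return 1.0 + 0.05 * (n_design_units - 1)
    return float(n_design_units)


if __name__ == "__main__":
    for method, n in [("Fred", 2), ("Hybrid", 2), ("Hybrid", 10)]:
        print(method, n, relative_docking_cost(method, n))
```

Under these assumptions, two design units double the cost for Fred but add only about 5% for Hybrid.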

Input Conformer Collection (molecule_input_collection): Input collection containing molecules to dock. The collection should have been created by the Prepare Giga Collections Floe. OpenEye has also prepared several large third-party vendor databases and collections. The Organization Data/OpenEye Data/Gigadocking Collections folder contains data curated and provided by OpenEye that is freely available to Orion customers.

  • Required

  • Type: collection_source

Outputs

Hit List Dataset (hit_list_output_dataset): Output dataset with the top scoring docked molecules.

  • Required

  • Type: dataset_out

  • Default: Gigadock Warp Hit List

Queries (queries): Output dataset with the queries used by FastROCS. The queries are the cluster heads of the top scoring poses from the initial docking of a random subset of molecules from the input collection.

  • Required

  • Type: dataset_out

  • Default: Gigadock Warp Queries

Output Design Unit(s) Dataset (output_design_units_dataset): Output dataset containing a copy of the design units docked to.

  • Required

  • Type: dataset_out

  • Default: Gigadock Warp Design Unit

Temporary Collections (temporary_collections): Temporary collection used by the floe during the run. It is deleted automatically at the end of the run.

  • Required

  • Type: collection_sink

  • Default: Gigadock Warp Temporary Collection

Options

Hit List Size (hit_list_size): Size of the final hit list with the top scoring docked molecules.

  • Required

  • Type: integer

  • Default: 10000

Docking Method (docking_method): Docking method to use. Fred is the default structure-based scoring method. Hybrid biases the docking toward poses that overlay the crystallographic ligand (the design units must have a bound ligand). FastFred is a faster variant of Fred (typically ~2x faster for single design units) that samples less and uses a simpler scoring function in the initial stages of docking.

  • Type: string

  • Default: Fred

  • Choices: [‘Fred’, ‘Hybrid’, ‘Fast Fred’]

Options: Advanced

Random Dock Fraction (random_dock_fraction): Fraction of molecules from the input collections to select at random and dock. The top scoring poses from this docking will be clustered and the top cluster heads used as queries for FastROCS.

  • Required

  • Type: decimal

  • Default: 0.02

Final Dock Fraction (final_dock_fraction): The number of top scoring molecules from FastROCS that are passed to the final docking step is equal to this fraction of the size of the input collections.

  • Type: decimal

  • Default: 0.08
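
To make the two fractions concrete, the sketch below estimates how many molecules are docked in each stage for a given collection size. The function name and the rounding are assumptions, and the total ignores any overlap between the random subset and the FastROCS hits, so it is an upper bound on distinct molecules docked.

```python
def docking_workload(collection_size: int,
                     random_dock_fraction: float = 0.02,
                     final_dock_fraction: float = 0.08) -> dict:
    """Estimate molecules docked per stage (defaults taken from this page)."""
    # Rounding to the nearest integer is an assumption made for this sketch.
    random_docked = round(collection_size * random_dock_fraction)
    final_docked = round(collection_size * final_dock_fraction)
    total = random_docked + final_docked
    return {
        "random_docked": random_docked,
        "final_docked": final_docked,
        "total_docked": total,
        "docked_fraction": total / collection_size,
    }


if __name__ == "__main__":
    # Hypothetical collection of one billion molecules.
    print(docking_workload(1_000_000_000))
```

With the default fractions, about 10% of the input collection is docked in total, which is where the savings over a full docking run come from.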

Number of FastROCS Queries (number_of_fastrocs_queries): Number of top scoring molecules from the docking of the random subset of collection molecules to use as queries for FastROCS.

  • Required

  • Type: integer

  • Default: 50

Cluster FastROCS Queries (cluster_fastrocs_queries): If False, the queries for FastROCS will be the top scoring molecules from docking a random subset of the molecules. If True, the queries for FastROCS will be the top scoring cluster heads from docking a random subset of the molecules.

  • Required

  • Type: boolean

  • Default: False

  • Choices: [True, False]
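
The difference between the two settings can be illustrated with a toy selection routine. This is a sketch, not the floe's implementation: the record layout is hypothetical, cluster membership is given as input rather than computed, a cluster head is taken to be the best scoring member of its cluster, and lower scores are treated as better.

```python
def select_queries(docked, n_queries=50, cluster=False):
    """Pick FastROCS queries from docked records.

    `docked` is a list of dicts with hypothetical keys "name", "score",
    and "cluster". Lower score is assumed to be better.
    """
    ranked = sorted(docked, key=lambda rec: rec["score"])
    if cluster:
        # Keep only the best scoring member of each cluster (the assumed "head").
        seen = set()
        heads = []
        for rec in ranked:
            if rec["cluster"] not in seen:
                seen.add(rec["cluster"])
                heads.append(rec)
        ranked = heads
    return ranked[:n_queries]


if __name__ == "__main__":
    docked = [
        {"name": "a", "score": -12.0, "cluster": 0},
        {"name": "b", "score": -11.0, "cluster": 0},
        {"name": "c", "score": -10.0, "cluster": 1},
        {"name": "d", "score": -9.0, "cluster": 2},
    ]
    print([r["name"] for r in select_queries(docked, 2, cluster=False)])
    print([r["name"] for r in select_queries(docked, 2, cluster=True)])
```

Without clustering, both queries come from the same cluster; with clustering, the second query comes from a different cluster, which gives more diverse queries for the same query count.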

GPU Hardware

FastROCS Instance Type (fastrocs_instance_type): The instances excluded by default are known not to be cost-effective for FastROCS.

  • Type: string

  • Default: !cdns,!g4dn.metal,!g5.12xlarge,!g5.24xlarge,!g5.48xlarge,!g4dn.12xlarge,!g3s.,!p3.

Spot instance policy for FastROCS GPU Instance. (spot_instance_policy_for_fastrocs_gpu_instance): To run on SPOT instances, use the default setting of Preferred. To run on ON-DEMAND instances, set the value to Prohibited. ON-DEMAND instances typically cost 3–4 times more than SPOT instances, but are more available than SPOT instances when overall demand for GPUs on AWS is high.

  • Type: string

  • Default: Required

  • Choices: [‘Allowed’, ‘Preferred’, ‘NotPreferred’, ‘Prohibited’, ‘Required’]
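
As a convenience when preparing a run, the option defaults and choices listed above can be gathered in a plain dictionary keyed by promoted name and checked before use. This is a sketch only: the helper name is hypothetical, the requirement that fractions lie in (0, 1] is an assumption, and how the values are passed to Orion is deliberately not shown.

```python
# Defaults and choices copied from the parameter listings on this page.
DEFAULTS = {
    "hit_list_size": 10000,
    "docking_method": "Fred",
    "random_dock_fraction": 0.02,
    "final_dock_fraction": 0.08,
    "number_of_fastrocs_queries": 50,
    "cluster_fastrocs_queries": False,
    "spot_instance_policy_for_fastrocs_gpu_instance": "Required",
}

CHOICES = {
    "docking_method": {"Fred", "Hybrid", "Fast Fred"},
    "spot_instance_policy_for_fastrocs_gpu_instance": {
        "Allowed", "Preferred", "NotPreferred", "Prohibited", "Required",
    },
}


def build_parameters(**overrides):
    """Merge overrides into the defaults and validate them (hypothetical helper)."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise KeyError(f"unknown promoted names: {sorted(unknown)}")
    params = {**DEFAULTS, **overrides}
    for name, allowed in CHOICES.items():
        if params[name] not in allowed:
            raise ValueError(f"{name} must be one of {sorted(allowed)}")
    # Assumption: a fraction only makes sense in the interval (0, 1].
    for name in ("random_dock_fraction", "final_dock_fraction"):
        if not 0 < params[name] <= 1:
            raise ValueError(f"{name} must be in (0, 1]")
    return params


if __name__ == "__main__":
    print(build_parameters(docking_method="Hybrid", hit_list_size=5000))
```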

Output Fields

Docked Pose Field (docked_pose_field): Field on the output hit list containing the pose of the docked molecule. If unspecified, the primary molecule field will be used.

  • Type: field_parameter::mol

Docked Score Field (score_field): Field on the output record where the docked score will be placed.

  • Type: field_parameter::float

  • Default: Chemgauss4

Steric Score Field (steric_score_field): Output field with the steric score component of the docked molecule. This field will only be created on the output records if this parameter is specified.

  • Type: field_parameter::float

Clash Score Field (clash_score_field): Output field with the clash score component of the docked molecule. This field will only be created on the output records if this parameter is specified.

  • Type: field_parameter::float

Protein Desolv Score Field (protein_desolv_score_field): Output field with the protein desolvation score component of the docked molecule. This field will only be created on the output records if this parameter is specified.

  • Type: field_parameter::float

Ligand Desolv Score Field (ligand_desolv_score_field): Output field with the ligand desolvation score component of the docked molecule. This field will only be created on the output records if this parameter is specified.

  • Type: field_parameter::float

Ligand Desolv HB Score Field (ligand_desolv_hb_score_field): Output field with the ligand desolvation hydrogen bond score component of the docked molecule. This field will only be created on the output records if this parameter is specified.

  • Type: field_parameter::float

Hydrogen Bond Score Field (hydrogen_bond_score_field): Output field with the hydrogen bond score component of the docked molecule. This field will only be created on the output records if this parameter is specified.

  • Type: field_parameter::float

Design Unit ID Field (design_unit_id_field): Output field with the ID of the design unit the molecule scores best in.

  • Type: field_parameter::int

  • Default: Design Unit ID

Design Unit Link Field (design_unit_link_field): Output field with a link to the design unit the molecule scores best in.

  • Type: field_parameter::link

  • Default: Design Unit Link

FastROCS Overlay Field (fastrocs_overlay_field): Field on the output hit list containing the molecule's best FastROCS overlay, that is, the overlay onto the query pose that gives the highest Tanimoto of any of the query poses. The query poses are generated by the floe by docking a random subset of the initial collections and selecting the top scoring poses as queries for FastROCS.

  • Type: field_parameter::mol

  • Default: FastROCS Overlay

FastROCS Query Field (fastrocs_query_field): Field on the output hit list containing the query pose that the docked pose best overlaid onto with FastROCS. The query poses are generated by the floe by docking a random subset of the initial collections and selecting the top scoring poses as queries for FastROCS.

  • Type: field_parameter::mol

  • Default: FastROCS Query

Combo Tanimoto Field (combo_tanimoto_field): Name of the field with the FastROCS Combo Tanimoto Score.

  • Required

  • Type: field_parameter::float

  • Default: FastROCS Combo Tanimoto

Shape Tanimoto Field (shape_tanimoto_field): Name of the field with the FastROCS Shape Tanimoto Score.

  • Type: field_parameter::float

  • Default: FastROCS Shape

Color Tanimoto Field (color_tanimoto_field): Name of the field with the FastROCS Color Tanimoto Score.

  • Type: field_parameter::float

  • Default: FastROCS Color

Bemis Murcko Field (bemis_murcko_field): Output field for the Bemis Murcko core SMILES.

  • Type: field_parameter::string

  • Default: Bemis Murcko SMILES

Bemis Murcko ID Field (bemis_murcko_id_field): Output field with an integer ID of the Bemis Murcko core. All molecules with the same Bemis Murcko core SMILES will have the same ID, and those with different Bemis Murcko core SMILES will have different IDs. The IDs start at 1 and increment by 1 each time a new Bemis Murcko core is seen. Thus, unlike the Bemis Murcko core SMILES itself, this integer ID depends on the order in which the records are passed.

  • Type: field_parameter::int

  • Default: Bemis Murcko ID
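
The ID scheme described for this field can be sketched in a few lines. The function name is hypothetical and the input is assumed to be a list of already computed core SMILES strings; the sketch only shows the first-seen numbering and its order dependence.

```python
def assign_core_ids(core_smiles):
    """Assign integer IDs to core SMILES in first-seen order, starting at 1."""
    ids = {}
    assigned = []
    for smiles in core_smiles:
        if smiles not in ids:
            # A new core gets the next unused integer.
            ids[smiles] = len(ids) + 1
        assigned.append(ids[smiles])
    return assigned


if __name__ == "__main__":
    benzene, pyridine = "c1ccccc1", "c1ccncc1"
    print(assign_core_ids([benzene, pyridine, benzene]))
    # The same cores in a different record order receive different IDs.
    print(assign_core_ids([pyridine, benzene, benzene]))
```

Because the numbering follows record order, IDs from two separate runs should not be compared directly; the core SMILES is the stable identifier.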

Bemis Murcko Rank Field (bemis_murcko_rank_field): Integer field with the rank of the molecule within its Bemis Murcko family (i.e., the rank the molecule would have if the hit list contained only the molecules with the same Bemis Murcko core SMILES).

  • Type: field_parameter::int

  • Default: Bemis Murcko Rank
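
The within-family rank can be illustrated as follows. This is a sketch under stated assumptions: the tuple layout and function name are hypothetical, and lower docking scores are treated as better.

```python
from collections import defaultdict


def rank_within_family(hits):
    """Return {name: rank} where rank counts only hits sharing the same core.

    `hits` is a list of (name, core_smiles, score) tuples; lower score is
    assumed to be better.
    """
    seen_per_core = defaultdict(int)
    ranks = {}
    for name, core, _score in sorted(hits, key=lambda hit: hit[2]):
        seen_per_core[core] += 1
        ranks[name] = seen_per_core[core]
    return ranks


if __name__ == "__main__":
    hits = [
        ("m1", "c1ccccc1", -14.0),
        ("m2", "c1ccncc1", -13.0),
        ("m3", "c1ccccc1", -12.0),
        ("m4", "c1ccccc1", -11.0),
    ]
    print(rank_within_family(hits))
```

Filtering the hit list to rank 1 keeps one representative per core, which is a common way to diversify a hit list by scaffold.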

Hetero Bemis Murcko Field (hetero_bemis_murcko_field): Output field for the Hetero Bemis Murcko core SMILES.

  • Type: field_parameter::string

  • Default: Hetero Bemis Murcko

Hetero Bemis Murcko ID Field (hetero_bemis_murcko_id_field): Output field with an integer ID of the Hetero Bemis Murcko core. All molecules with the same Hetero Bemis Murcko core SMILES will have the same ID, and those with different Hetero Bemis Murcko core SMILES will have different IDs. The IDs start at 1 and increment by 1 each time a new Hetero Bemis Murcko core is seen. Thus, unlike the Hetero Bemis Murcko core SMILES itself, this integer ID depends on the order in which the records are passed.

  • Type: field_parameter::int

  • Default: Hetero Bemis Murcko ID

Hetero Bemis Murcko Rank Field (hetero_bemis_murcko_rank_field): Integer field with the rank of the molecule within its Hetero Bemis Murcko family (i.e., the rank the molecule would have if the hit list contained only the molecules with the same Hetero Bemis Murcko core SMILES).

  • Type: field_parameter::int

  • Default: Hetero Bemis Murcko Rank