Gigadock Warp Classic
Description
Approximates a full Gigadock run with a mixture of FastROCS and docking. The floe runs the following steps:
1. Dock a random subset of the input molecules.
2. Cluster the top docked molecules and select the top scoring cluster heads.
3. Run FastROCS on all input molecules, using the top scoring poses from the previous step as queries.
4. Re-dock the top scoring molecules from FastROCS.
5. Output a hit list of the top scoring docked molecules.
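The control flow of the steps above can be sketched in plain Python. This is only an illustration under stated assumptions: `warp_funnel`, `dock_score`, and `similarity` are hypothetical stand-ins (not part of the floe or any OpenEye API), lower docking scores are treated as better, clustering is omitted, and the hit list is drawn from the final docking step only, which this page does not confirm.

```python
import random


def warp_funnel(molecules, dock_score, similarity,
                random_fraction=0.02, final_fraction=0.08,
                n_queries=50, hit_list_size=10000, seed=0):
    """Toy funnel: dock a sample, pick queries, similarity-search, re-dock.

    dock_score(mol) and similarity(mol, query) are caller-supplied stand-ins
    for docking and FastROCS. Lower dock_score is assumed to be better.
    """
    rng = random.Random(seed)
    n_total = len(molecules)

    # Step 1: dock a random subset of the input molecules.
    n_random = max(1, round(random_fraction * n_total))
    subset = rng.sample(molecules, n_random)
    docked = {mol: dock_score(mol) for mol in subset}

    # Step 2: choose queries (clustering omitted; take the best scorers).
    queries = sorted(docked, key=docked.get)[:n_queries]

    # Step 3: rank every input molecule by its best similarity to any query.
    best_sim = {mol: max(similarity(mol, q) for q in queries)
                for mol in molecules}
    n_final = max(1, round(final_fraction * n_total))
    candidates = sorted(molecules, key=best_sim.get, reverse=True)[:n_final]

    # Step 4: re-dock the molecules selected by similarity.
    final = {mol: dock_score(mol) for mol in candidates}

    # Step 5: return the best-scoring (molecule, score) pairs.
    return sorted(final.items(), key=lambda item: item[1])[:hit_list_size]
```

With integers standing in for molecules, a call such as `warp_funnel(list(range(1000)), lambda m: abs(m - 500), lambda a, b: -abs(a - b), n_queries=5, hit_list_size=10)` exercises all five stages without any chemistry toolkit.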
Details
Title : Gigadock Warp Classic
Tags : Large Scale Floes Giga-Docking FRED HYBRID Docking Chemgauss4 Virtual Screening
Python Name : #12_warp_dock_floe
Parameters
Inputs
Design Unit or Receptor Dataset(s)
Dataset with the design unit (DU), or old-format receptor, to dock to. Multiple design units are allowed, up to a limit of 10 for the Hybrid docking method (see the ‘Docking Method’ parameter) and 2 otherwise. The behavior with multiple design units depends on the docking method. For ‘Fred’ or ‘FastFred’, each molecule is docked to each design unit and the results from the best scoring design unit are output, so docking time (and cost) scales roughly linearly with the number of design units. For ‘Hybrid’, each molecule is docked only into the design unit whose crystallographic bound ligand is most similar (by ROCS Combo Tanimoto) to the molecule being docked, and docking time (and cost) increases by roughly 5% per additional design unit.
Type : data_source
Required : True
Python Name : init_input_dataset

Input Conformer Collection
Input collection containing the molecules to dock. The collection should have been created by the ‘Prepare Giga Collections’ floe. Several large pre-generated third-party vendor docking collections can be made available to your organization at no charge by e-mailing support@eyesopen.com (if your organization has already requested them, you will already have access to these pre-generated collections). The collections are located in the ‘Organization Data->OpenEye Data->Gigadocking Collections’ folder, which also automatically contains several smaller collections and collections containing random subsets of the larger vendor collections.
Type : collection_source
Required : True
Python Name : molecule_input_collection
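The design unit limits and scaling rules described for the Design Unit or Receptor Dataset(s) input can be turned into a rough estimate. The sketch below is a back-of-the-envelope helper, not floe code: `relative_docking_cost` is a hypothetical name, the result is relative to a single design unit run with the same method, and the 5% figure is the approximate value quoted above.

```python
# Maximum number of design units per docking method, as stated above.
MAX_DESIGN_UNITS = {"Hybrid": 10, "Fred": 2, "FastFred": 2}


def relative_docking_cost(method, n_design_units, hybrid_overhead=0.05):
    """Approximate docking cost relative to one design unit, same method."""
    if method not in MAX_DESIGN_UNITS:
        raise ValueError(f"unknown docking method: {method!r}")
    limit = MAX_DESIGN_UNITS[method]
    if not 1 <= n_design_units <= limit:
        raise ValueError(f"{method} accepts 1 to {limit} design units")
    if method == "Hybrid":
        # Each molecule is docked into one design unit only, so cost grows
        # by roughly 5% per additional design unit.
        return 1.0 + hybrid_overhead * (n_design_units - 1)
    # Fred and FastFred dock every molecule into every design unit.
    return float(n_design_units)
```

Under these assumptions, two design units roughly double the cost for ‘Fred’, while ten design units with ‘Hybrid’ cost about 1.45 times a single design unit run.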
Outputs
Hit List Dataset
Output dataset with the top scoring docked molecules.
Type : dataset_out
Required : True
Default : Gigadock Warp Hit List
Python Name : hit_list_output_dataset

Queries
Output dataset with the queries used by FastROCS. The queries are the cluster heads of the top scoring poses from the initial docking of a random subset of molecules from the input collection.
Type : dataset_out
Required : True
Default : Gigadock Warp Queries
Python Name : queries

Output Design Unit(s) Dataset
Output dataset containing a copy of the design unit(s) docked to.
Type : dataset_out
Required : True
Default : Gigadock Warp Design Unit
Python Name : output_design_units_dataset

Temporary Collections
Temporary collection used by the floe during the run and automatically deleted at the end of the run.
Type : collection_sink
Required : True
Default : Gigadock Warp Temporary Collection
Python Name : temporary_collections
Options
Hit List Size
Size of the final hit list with the top scoring docked molecules.
Type : integer
Required : True
Default : 10000
Range : 1 to 100000
Python Name : hit_list_size

Docking Method
Docking method to use. ‘Fred’ is the default structure-based scoring method. ‘Hybrid’ biases the docking towards poses that overlay the crystallographic ligand (the design unit(s) must have a bound ligand). ‘FastFred’ is a faster variant of ‘Fred’ (typically ~2x faster for single design units) that samples less and uses a simpler scoring function in the initial stages of docking.
Type : string
Required : False
Default : Fred
Choices : Fred, Hybrid, Fast Fred
Python Name : docking_method
Options: Advanced
Random Dock Fraction
Fraction of molecules from the input collection(s) to select at random and dock. The top scoring poses from this docking will be clustered and the top cluster heads used as queries for FastROCS.
Type : decimal
Required : True
Default : 0.02
Range : 0.0 to 0.1
Python Name : random_dock_fraction

Final Dock Fraction
The number of top scoring molecules from FastROCS that are passed to the final docking step is equal to this fraction of the size of the input collection(s).
Type : decimal
Required : False
Default : 0.08
Range : 0.0 to 0.1
Python Name : final_dock_fraction

Number of FastROCS Queries
Number of top scoring molecules from the docking of the random subset of collection molecules to use as queries for FastROCS.
Type : integer
Required : True
Default : 50
Min Value : 1
Python Name : number_of_fastrocs_queries

Cluster FastROCS Queries
If False, the queries for FastROCS will be the top scoring molecules from docking a random subset of the molecules. If True, the queries for FastROCS will be the top scoring cluster heads from docking a random subset of the molecules.
Type : boolean
Required : True
Default : False
Choices : True, False
Python Name : cluster_fastrocs_queries
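To see what the two fractions mean in terms of molecules actually docked, the arithmetic can be written out as below. This is a minimal sketch: `stage_sizes` is a hypothetical helper, rounding to the nearest integer is an assumption, and the combined figure is an upper bound because any overlap between the random subset and the FastROCS selection is ignored.

```python
def stage_sizes(n_input, random_dock_fraction=0.02, final_dock_fraction=0.08):
    """Estimate how many molecules reach each docking stage."""
    for name, value in (("random_dock_fraction", random_dock_fraction),
                        ("final_dock_fraction", final_dock_fraction)):
        # Both parameters are documented with a range of 0.0 to 0.1.
        if not 0.0 <= value <= 0.1:
            raise ValueError(f"{name} must be between 0.0 and 0.1")
    n_random = round(n_input * random_dock_fraction)
    n_final = round(n_input * final_dock_fraction)
    return {
        "random_dock": n_random,
        "final_dock": n_final,
        # Upper bound on the fraction of the collection that is docked.
        "docked_fraction": (n_random + n_final) / n_input,
    }
```

With the defaults and a collection of one million molecules, this gives 20,000 molecules in the initial random dock and 80,000 in the final dock, so at most about 10% of the collection is docked.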
GPU Hardware
These parameters control the AWS instance type the FastROCS Cube will use. There is in general no reason to adjust them. They are exposed because overall demand for GPU instances on AWS has occasionally been very high, and in some circumstances this has led to extremely long run times for this floe while it waits for GPU instances.
FastROCS Instance Type
The instances excluded by default are known not to be cost effective for FastROCS.
Type : string
Required : False
Default : !g4dn.metal,!g5.12xlarge,!g5.24xlarge,!g5.48xlarge,!g4dn.12xlarge,!g3s.,!p3.
Python Name : fastrocs_instance_type

Spot instance policy for FastROCS GPU Instance
To run on SPOT instances use the default setting of ‘preferred’. To run on ON-DEMAND instances set the value to ‘prohibited’. ON-DEMAND instances typically cost 3-4 times more than SPOT instances, but are more available than SPOT instances when overall demand for GPUs on AWS is high.
Type : string
Required : False
Default : Required
Choices : Allowed, Preferred, NotPreferred, Prohibited, Required
Python Name : spot_instance_policy_for_fastrocs_gpu_instance
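The default FastROCS Instance Type value reads as a list of exclusions. The sketch below shows one plausible interpretation only: it assumes the value is comma-separated, that a leading `!` marks an exclusion, and that each entry matches instance type names by prefix (so `!p3.` would exclude the whole `p3` family). None of these matching rules is confirmed by this page, and `excluded_prefixes` and `is_excluded` are hypothetical helpers.

```python
DEFAULT_SPEC = ("!g4dn.metal,!g5.12xlarge,!g5.24xlarge,!g5.48xlarge,"
                "!g4dn.12xlarge,!g3s.,!p3.")


def excluded_prefixes(spec):
    """Return the entries marked with '!' with the marker removed."""
    return [item[1:] for item in spec.split(",") if item.startswith("!")]


def is_excluded(instance_type, spec=DEFAULT_SPEC):
    """Assumed rule: an instance is excluded if any entry is a prefix of it."""
    return any(instance_type.startswith(prefix)
               for prefix in excluded_prefixes(spec))
```

Under this reading, `is_excluded("p3.2xlarge")` is true because of the `!p3.` entry, while `is_excluded("g4dn.xlarge")` is false because only the `metal` and `12xlarge` sizes of that family are listed.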
Output Fields
These parameters allow the user to change the default output fields this floe creates in the output datasets and/or collections. Note that parameters identifying a molecule field are special. If a molecule field is left empty, the floe writes the molecule to the primary (i.e., default) molecule field of the record. The primary molecule of a dataset can be identified in the UI by looking for the star on its field badge. CAUTION: If these parameters are modified, the modifications must also be applied to the input fields of downstream floes that read fields written by this floe. If a downstream floe does not support specifying the input field, it may not work properly with the output of this floe when these settings are modified.
Docked Pose Field
Field on the output hit list containing the pose of the docked molecule. If unspecified, the primary molecule field will be used.
Type : field_parameter::mol
Required : False
Python Name : docked_pose_field

Docked Score Field
Field on the output record where the docked score will be placed.
Type : field_parameter::float
Required : False
Default : Chemgauss4
Python Name : score_field

Steric Score Field
Output field with the steric score component of the docked molecule. This field will only be created on the output records if this parameter is specified.
Type : field_parameter::float
Required : False
Python Name : steric_score_field

Clash Score Field
Output field with the clash score component of the docked molecule. This field will only be created on the output records if this parameter is specified.
Type : field_parameter::float
Required : False
Python Name : clash_score_field

Protein Desolv Score Field
Output field with the protein desolvation score component of the docked molecule. This field will only be created on the output records if this parameter is specified.
Type : field_parameter::float
Required : False
Python Name : protein_desolv_score_field

Ligand Desolv Score Field
Output field with the ligand desolvation score component of the docked molecule. This field will only be created on the output records if this parameter is specified.
Type : field_parameter::float
Required : False
Python Name : ligand_desolv_score_field

Ligand Desolv HB Score Field
Output field with the ligand desolvation hydrogen bond score component of the docked molecule. This field will only be created on the output records if this parameter is specified.
Type : field_parameter::float
Required : False
Python Name : ligand_desolv_hb_score_field

Hydrogen Bond Score Field
Output field with the hydrogen bond score component of the docked molecule. This field will only be created on the output records if this parameter is specified.
Type : field_parameter::float
Required : False
Python Name : hydrogen_bond_score_field

Design Unit ID Field
Output field with the ID of the design unit the molecule scores best in.
Type : field_parameter::int
Required : False
Default : Design Unit ID
Python Name : design_unit_id_field

Design Unit Link Field
Output field with a link to the design unit the molecule scores best in.
Type : field_parameter::link
Required : False
Default : Design Unit Link
Python Name : design_unit_link_field

FastROCS Overlay Field
Field on the output hit list containing the molecule's best FastROCS overlay, i.e., the overlay onto whichever query pose gives the highest Tanimoto. The query poses are generated by the floe by docking a random subset of the initial collection(s) and selecting the top scoring poses as queries for FastROCS.
Type : field_parameter::mol
Required : False
Default : FastROCS Overlay
Python Name : fastrocs_overlay_field

FastROCS Query Field
Field on the output hit list containing the query pose that the docked molecule overlaid onto best with FastROCS. The query poses are generated by the floe by docking a random subset of the initial collection(s) and selecting the top scoring poses as queries for FastROCS.
Type : field_parameter::mol
Required : False
Default : FastROCS Query
Python Name : fastrocs_query_field

Combo Tanimoto Field
Name of the field with the FastROCS Combo Tanimoto score.
Type : field_parameter::float
Required : True
Default : FastROCS Combo Tanimoto
Python Name : combo_tanimoto_field

Shape Tanimoto Field
Name of the field with the FastROCS Shape Tanimoto score.
Type : field_parameter::float
Required : False
Default : FastROCS Shape
Python Name : shape_tanimoto_field

Color Tanimoto Field
Name of the field with the FastROCS Color Tanimoto score.
Type : field_parameter::float
Required : False
Default : FastROCS Color
Python Name : color_tanimoto_field

Bemis Murcko Field
Output field for the Bemis Murcko core SMILES.
Type : field_parameter::string
Required : False
Default : Bemis Murcko SMILES
Python Name : bemis_murcko_field

Bemis Murcko ID Field
Output field with an integer ID of the Bemis Murcko core. All molecules with the same Bemis Murcko core SMILES will have the same ID, and those with different Bemis Murcko core SMILES will have different IDs. The IDs start at 1 and increment by 1 each time a new Bemis Murcko core is seen. Thus, unlike the Bemis Murcko core SMILES itself, this integer ID depends on the order in which the records are passed.
Type : field_parameter::int
Required : False
Default : Bemis Murcko ID
Python Name : bemis_murcko_id_field

Bemis Murcko Rank Field
Integer field with the rank of the molecule within its Bemis Murcko family (i.e., the rank the molecule would have if the hit list contained only the molecules with the same Bemis Murcko core SMILES).
Type : field_parameter::int
Required : False
Default : Bemis Murcko Rank
Python Name : bemis_murcko_rank_field

Hetero Bemis Murcko Field
Output field for the Hetero Bemis Murcko core SMILES.
Type : field_parameter::string
Required : False
Default : Hetero Bemis Murcko
Python Name : hetero_bemis_murcko_field

Hetero Bemis Murcko ID Field
Output field with an integer ID of the Hetero Bemis Murcko core. All molecules with the same Hetero Bemis Murcko core SMILES will have the same ID, and those with different Hetero Bemis Murcko core SMILES will have different IDs. The IDs start at 1 and increment by 1 each time a new Hetero Bemis Murcko core is seen. Thus, unlike the Hetero Bemis Murcko core SMILES itself, this integer ID depends on the order in which the records are passed.
Type : field_parameter::int
Required : False
Default : Hetero Bemis Murcko ID
Python Name : hetero_bemis_murcko_id_field

Hetero Bemis Murcko Rank Field
Integer field with the rank of the molecule within its Hetero Bemis Murcko family (i.e., the rank the molecule would have if the hit list contained only the molecules with the same Hetero Bemis Murcko core SMILES).
Type : field_parameter::int
Required : False
Default : Hetero Bemis Murcko Rank
Python Name : hetero_bemis_murcko_rank_field
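The ID and rank fields described above can be illustrated with a small sketch. It is not the floe's implementation: `annotate_families` is a hypothetical helper, each record is reduced to a `(core_smiles, score)` pair, and lower scores are assumed to rank first. The same logic would apply to both the Bemis Murcko and Hetero Bemis Murcko fields.

```python
def annotate_families(records):
    """Return (core_id, family_rank) for each (core_smiles, score) record.

    IDs follow first-seen order, so they depend on the record order.
    Ranks are computed within each core family, best (lowest) score first.
    """
    # Assign IDs starting at 1, incrementing each time a new core is seen.
    core_ids = {}
    ids = []
    for core, _score in records:
        if core not in core_ids:
            core_ids[core] = len(core_ids) + 1
        ids.append(core_ids[core])

    # Walk the records from best to worst score, counting within each family.
    order = sorted(range(len(records)), key=lambda i: records[i][1])
    seen_in_family = {}
    ranks = [0] * len(records)
    for i in order:
        core = records[i][0]
        seen_in_family[core] = seen_in_family.get(core, 0) + 1
        ranks[i] = seen_in_family[core]

    return list(zip(ids, ranks))
```

For the records `[("c1ccccc1", -12.0), ("C1CCCCC1", -11.5), ("c1ccccc1", -13.0)]` the two benzene-core molecules share ID 1, and the one scoring -13.0 gets rank 1 within that family. Passing the same records in a different order can change which core receives which ID, but not the SMILES or the ranks.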