Batch FastROCS
Description
Overlays a FastROCS collection onto up to 10 shape or molecule queries and outputs a separate hit list for each query.
By default, the best molecules from FastROCS are re-overlaid using ROCS, with 25 random starts and up to 200 conformers of each molecule, before being sent to the output hit list.
Most users should use the ‘FastROCS’ floe rather than this floe, as it has more features. This floe is intended as a replacement for the deprecated ‘Multi-Query Ligand-Based Virtual Screening with FastROCS and SubROCS’ floe for users who want a separate hit list for each query.
Promoted Parameters
Title in user interface (promoted name)
Inputs
Query Dataset(s) (query_datasets): Dataset(s) with up to 10 query molecules and/or shape queries. A separate hit list will be created for each query.
Required
Type: data_source
FastROCS Input Collection (fastrocs_input_collection): FastROCS collection to screen against. OpenEye supplies several pre-generated vendor molecule collections in Organization Data. The ‘Prepare Giga Collections’ or ‘Giga Docking Collection to Hi-res FastROCS Collection’ floes can also be used to create suitable collections for this floe.
Required
Type: collection_source
Outputs
Prefix Output Dataset Names with Query Title (prefix_output_dataset_names_with_query_title): If true, the name/title of the associated query is prepended to the names of all the output datasets listed below.
Required
Type: boolean
Default: True
Choices: [True, False]
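As a rough illustration of this naming behavior, the sketch below builds an output dataset name from a query title. The function name and the single-space separator are assumptions made for this example; the separator the floe actually uses is not stated here.

```python
def output_dataset_name(base_name, query_title, prefix_with_title=True, sep=" "):
    """Return the dataset name, optionally prefixed with the query title.

    `sep` is a hypothetical separator chosen for this sketch only.
    """
    if prefix_with_title and query_title:
        return f"{query_title}{sep}{base_name}"
    return base_name


# Names for the first two hit lists, using made-up query titles.
names = [output_dataset_name(f"Hit List #{i}", f"query{i}") for i in (1, 2)]
```

With the prefix disabled, the function returns the dataset name unchanged.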
Hit List #1 Dataset (hit_list_1_dataset): Hit List for query 1. Note that if ‘Prefix Output Dataset Names with Query Title’ is True, the name of the output dataset specified here will be prefixed with the name/title of its associated query.
Required
Type: dataset_out
Default: Hit List #1
Hit List #2 Dataset (hit_list_2_dataset): Hit List for query 2. Note that if ‘Prefix Output Dataset Names with Query Title’ is True, the name of the output dataset specified here will be prefixed with the name/title of its associated query.
Required
Type: dataset_out
Default: Hit List #2
Hit List #3 Dataset (hit_list_3_dataset): Hit List for query 3. Note that if ‘Prefix Output Dataset Names with Query Title’ is True, the name of the output dataset specified here will be prefixed with the name/title of its associated query.
Required
Type: dataset_out
Default: Hit List #3
Hit List #4 Dataset (hit_list_4_dataset): Hit List for query 4. Note that if ‘Prefix Output Dataset Names with Query Title’ is True, the name of the output dataset specified here will be prefixed with the name/title of its associated query.
Required
Type: dataset_out
Default: Hit List #4
Hit List #5 Dataset (hit_list_5_dataset): Hit List for query 5. Note that if ‘Prefix Output Dataset Names with Query Title’ is True, the name of the output dataset specified here will be prefixed with the name/title of its associated query.
Required
Type: dataset_out
Default: Hit List #5
Hit List #6 Dataset (hit_list_6_dataset): Hit List for query 6. Note that if ‘Prefix Output Dataset Names with Query Title’ is True, the name of the output dataset specified here will be prefixed with the name/title of its associated query.
Required
Type: dataset_out
Default: Hit List #6
Hit List #7 Dataset (hit_list_7_dataset): Hit List for query 7. Note that if ‘Prefix Output Dataset Names with Query Title’ is True, the name of the output dataset specified here will be prefixed with the name/title of its associated query.
Required
Type: dataset_out
Default: Hit List #7
Hit List #8 Dataset (hit_list_8_dataset): Hit List for query 8. Note that if ‘Prefix Output Dataset Names with Query Title’ is True, the name of the output dataset specified here will be prefixed with the name/title of its associated query.
Required
Type: dataset_out
Default: Hit List #8
Hit List #9 Dataset (hit_list_9_dataset): Hit List for query 9. Note that if ‘Prefix Output Dataset Names with Query Title’ is True, the name of the output dataset specified here will be prefixed with the name/title of its associated query.
Required
Type: dataset_out
Default: Hit List #9
Hit List #10 Dataset (hit_list_10_dataset): Hit List for query 10. Note that if ‘Prefix Output Dataset Names with Query Title’ is True, the name of the output dataset specified here will be prefixed with the name/title of its associated query.
Required
Type: dataset_out
Default: Hit List #10
Cluster Heads #1 Dataset (cluster_heads_1_dataset): Hit list for query #1 that contains only the top-scoring representative of each Bemis-Murcko core. This hit list is a subset of the ‘Hit List #1 Dataset’ that also includes the clustering information. It is created by filtering the ‘Hit List #1 Dataset’ for ‘Bemis Murcko Rank’=1 (see the parameter ‘Output Fields -> Bemis Murcko Rank Field’).
Required
Type: dataset_out
Default: Cluster Heads #1
Cluster Heads #2 Dataset (cluster_heads_2_dataset): Hit list for query #2 that contains only the top-scoring representative of each Bemis-Murcko core. This hit list is a subset of the ‘Hit List #2 Dataset’ that also includes the clustering information. It is created by filtering the ‘Hit List #2 Dataset’ for ‘Bemis Murcko Rank’=1 (see the parameter ‘Output Fields -> Bemis Murcko Rank Field’).
Required
Type: dataset_out
Default: Cluster Heads #2
Cluster Heads #3 Dataset (cluster_heads_3_dataset): Hit list for query #3 that contains only the top-scoring representative of each Bemis-Murcko core. This hit list is a subset of the ‘Hit List #3 Dataset’ that also includes the clustering information. It is created by filtering the ‘Hit List #3 Dataset’ for ‘Bemis Murcko Rank’=1 (see the parameter ‘Output Fields -> Bemis Murcko Rank Field’).
Required
Type: dataset_out
Default: Cluster Heads #3
Cluster Heads #4 Dataset (cluster_heads_4_dataset): Hit list for query #4 that contains only the top-scoring representative of each Bemis-Murcko core. This hit list is a subset of the ‘Hit List #4 Dataset’ that also includes the clustering information. It is created by filtering the ‘Hit List #4 Dataset’ for ‘Bemis Murcko Rank’=1 (see the parameter ‘Output Fields -> Bemis Murcko Rank Field’).
Required
Type: dataset_out
Default: Cluster Heads #4
Cluster Heads #5 Dataset (cluster_heads_5_dataset): Hit list for query #5 that contains only the top-scoring representative of each Bemis-Murcko core. This hit list is a subset of the ‘Hit List #5 Dataset’ that also includes the clustering information. It is created by filtering the ‘Hit List #5 Dataset’ for ‘Bemis Murcko Rank’=1 (see the parameter ‘Output Fields -> Bemis Murcko Rank Field’).
Required
Type: dataset_out
Default: Cluster Heads #5
Cluster Heads #6 Dataset (cluster_heads_6_dataset): Hit list for query #6 that contains only the top-scoring representative of each Bemis-Murcko core. This hit list is a subset of the ‘Hit List #6 Dataset’ that also includes the clustering information. It is created by filtering the ‘Hit List #6 Dataset’ for ‘Bemis Murcko Rank’=1 (see the parameter ‘Output Fields -> Bemis Murcko Rank Field’).
Required
Type: dataset_out
Default: Cluster Heads #6
Cluster Heads #7 Dataset (cluster_heads_7_dataset): Hit list for query #7 that contains only the top-scoring representative of each Bemis-Murcko core. This hit list is a subset of the ‘Hit List #7 Dataset’ that also includes the clustering information. It is created by filtering the ‘Hit List #7 Dataset’ for ‘Bemis Murcko Rank’=1 (see the parameter ‘Output Fields -> Bemis Murcko Rank Field’).
Required
Type: dataset_out
Default: Cluster Heads #7
Cluster Heads #8 Dataset (cluster_heads_8_dataset): Hit list for query #8 that contains only the top-scoring representative of each Bemis-Murcko core. This hit list is a subset of the ‘Hit List #8 Dataset’ that also includes the clustering information. It is created by filtering the ‘Hit List #8 Dataset’ for ‘Bemis Murcko Rank’=1 (see the parameter ‘Output Fields -> Bemis Murcko Rank Field’).
Required
Type: dataset_out
Default: Cluster Heads #8
Cluster Heads #9 Dataset (cluster_heads_9_dataset): Hit list for query #9 that contains only the top-scoring representative of each Bemis-Murcko core. This hit list is a subset of the ‘Hit List #9 Dataset’ that also includes the clustering information. It is created by filtering the ‘Hit List #9 Dataset’ for ‘Bemis Murcko Rank’=1 (see the parameter ‘Output Fields -> Bemis Murcko Rank Field’).
Required
Type: dataset_out
Default: Cluster Heads #9
Cluster Heads #10 Dataset (cluster_heads_10_dataset): Hit list for query #10 that contains only the top-scoring representative of each Bemis-Murcko core. This hit list is a subset of the ‘Hit List #10 Dataset’ that also includes the clustering information. It is created by filtering the ‘Hit List #10 Dataset’ for ‘Bemis Murcko Rank’=1 (see the parameter ‘Output Fields -> Bemis Murcko Rank Field’).
Required
Type: dataset_out
Default: Cluster Heads #10
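The rank filter that produces each Cluster Heads dataset can be pictured with a minimal sketch. Here hit-list records are modeled as plain Python dictionaries, which is an assumption for illustration only; the function name and the sample molecules are hypothetical, and the field names follow the defaults listed under Output Fields.

```python
def cluster_heads(hit_list, rank_field="Bemis Murcko Rank"):
    """Keep only records whose rank within their Bemis-Murcko family is 1."""
    return [record for record in hit_list if record.get(rank_field) == 1]


# Made-up hit list: two molecules share the benzene core, one has a pyridine core.
hits = [
    {"Title": "mol-a", "Bemis Murcko": "c1ccccc1", "Bemis Murcko Rank": 1},
    {"Title": "mol-b", "Bemis Murcko": "c1ccccc1", "Bemis Murcko Rank": 2},
    {"Title": "mol-c", "Bemis Murcko": "c1ccncc1", "Bemis Murcko Rank": 1},
]
heads = cluster_heads(hits)
```

The result keeps one record per core (mol-a and mol-c) and retains the clustering fields, matching the description of the Cluster Heads datasets as subsets of the hit lists.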
Output Query Dataset #1 (output_query_dataset_1): This dataset holds a copy of the query used to generate Hit List 1. Note that if ‘Prefix Output Dataset Names with Query Title’ is True, the name of the output dataset specified here will be prefixed with the name/title of the query.
Required
Type: dataset_out
Default: Output Query #1
Output Query Dataset #2 (output_query_dataset_2): This dataset holds a copy of the query used to generate Hit List 2. Note that if ‘Prefix Output Dataset Names with Query Title’ is True, the name of the output dataset specified here will be prefixed with the name/title of the query.
Required
Type: dataset_out
Default: Output Query #2
Output Query Dataset #3 (output_query_dataset_3): This dataset holds a copy of the query used to generate Hit List 3. Note that if ‘Prefix Output Dataset Names with Query Title’ is True, the name of the output dataset specified here will be prefixed with the name/title of the query.
Required
Type: dataset_out
Default: Output Query #3
Output Query Dataset #4 (output_query_dataset_4): This dataset holds a copy of the query used to generate Hit List 4. Note that if ‘Prefix Output Dataset Names with Query Title’ is True, the name of the output dataset specified here will be prefixed with the name/title of the query.
Required
Type: dataset_out
Default: Output Query #4
Output Query Dataset #5 (output_query_dataset_5): This dataset holds a copy of the query used to generate Hit List 5. Note that if ‘Prefix Output Dataset Names with Query Title’ is True, the name of the output dataset specified here will be prefixed with the name/title of the query.
Required
Type: dataset_out
Default: Output Query #5
Output Query Dataset #6 (output_query_dataset_6): This dataset holds a copy of the query used to generate Hit List 6. Note that if ‘Prefix Output Dataset Names with Query Title’ is True, the name of the output dataset specified here will be prefixed with the name/title of the query.
Required
Type: dataset_out
Default: Output Query #6
Output Query Dataset #7 (output_query_dataset_7): This dataset holds a copy of the query used to generate Hit List 7. Note that if ‘Prefix Output Dataset Names with Query Title’ is True, the name of the output dataset specified here will be prefixed with the name/title of the query.
Required
Type: dataset_out
Default: Output Query #7
Output Query Dataset #8 (output_query_dataset_8): This dataset holds a copy of the query used to generate Hit List 8. Note that if ‘Prefix Output Dataset Names with Query Title’ is True, the name of the output dataset specified here will be prefixed with the name/title of the query.
Required
Type: dataset_out
Default: Output Query #8
Output Query Dataset #9 (output_query_dataset_9): This dataset holds a copy of the query used to generate Hit List 9. Note that if ‘Prefix Output Dataset Names with Query Title’ is True, the name of the output dataset specified here will be prefixed with the name/title of the query.
Required
Type: dataset_out
Default: Output Query #9
Output Query Dataset #10 (output_query_dataset_10): This dataset holds a copy of the query used to generate Hit List 10. Note that if ‘Prefix Output Dataset Names with Query Title’ is True, the name of the output dataset specified here will be prefixed with the name/title of the query.
Required
Type: dataset_out
Default: Output Query #10
Options
Hit List Size (hit_list_size): Number of top-scoring molecules to output. This applies to all output hit lists if multiple queries are supplied.
Type: integer
Default: 20000
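Conceptually, the hit list size is a top-N cut on the similarity score. The following sketch illustrates that idea on plain dictionaries; it is not the floe's implementation, and the function name, record layout, and sample scores are hypothetical.

```python
import heapq


def top_hits(records, hit_list_size=20000, score_field="Tanimoto Combo"):
    """Return the `hit_list_size` highest-scoring records, best first."""
    return heapq.nlargest(hit_list_size, records, key=lambda r: r[score_field])


# Made-up scores for three molecules.
scored = [
    {"Title": "mol-a", "Tanimoto Combo": 1.12},
    {"Title": "mol-b", "Tanimoto Combo": 1.47},
    {"Title": "mol-c", "Tanimoto Combo": 0.93},
]
best_two = top_hits(scored, hit_list_size=2)
```

`heapq.nlargest` avoids sorting the full input, which matters when the screened collection is far larger than the requested hit list.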
Similarity Type (similarity_type): Type of similarity to use for FastROCS (and ROCS if ‘Refine with ROCS’ is ‘On’).
Type: string
Default: Tanimoto Combo
Choices: [‘Tanimoto Combo’, ‘Ref Tversky’, ‘Fit Tversky’]
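For orientation, the three similarity types can be written in terms of overlap volumes. The sketch below uses the standard Tanimoto and Tversky forms; the function names are hypothetical, and the weights alpha = 0.95 and beta = 0.05 are assumed values for this example rather than values confirmed by this floe.

```python
def tanimoto(overlap, self_ref, self_fit):
    """Tanimoto: overlap divided by the union of the two self-overlaps."""
    return overlap / (self_ref + self_fit - overlap)


def ref_tversky(overlap, self_ref, self_fit, alpha=0.95, beta=0.05):
    """Tversky weighted toward the reference (query) self-overlap.

    alpha and beta are assumed weights for this sketch.
    """
    return overlap / (alpha * self_ref + beta * self_fit)


def fit_tversky(overlap, self_ref, self_fit, alpha=0.95, beta=0.05):
    """Tversky weighted toward the fit (database molecule) self-overlap."""
    return overlap / (beta * self_ref + alpha * self_fit)


def combo(shape_score, color_score):
    """A combo score is the sum of the shape and color terms."""
    return shape_score + color_score


# Made-up overlap volumes: query self-overlap 100, molecule self-overlap 120.
shape_tanimoto = tanimoto(80.0, 100.0, 120.0)
shape_ref_tversky = ref_tversky(80.0, 100.0, 120.0)
shape_fit_tversky = fit_tversky(80.0, 100.0, 120.0)
```

Because the Tversky forms weight one molecule's self-overlap more heavily, Ref Tversky favors hits that cover the query, while Fit Tversky favors hits that are covered by the query.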
Shape Only FastROCS Overlay (shape_only_fastrocs_overlay): If set to ‘On’, FastROCS will overlay molecules using shape only, ignoring color. If set to ‘Off’, FastROCS will overlay molecules using shape and color. Note that this parameter affects the overlay process but not the scoring (e.g., the overlay can be done with shape only while the scoring is done with shape and color). Also note that if either Fit Tversky or Ref Tversky scoring is selected (see the ‘Similarity Type’ parameter), the overlay will be done with shape only, regardless of the setting of this flag.
Type: boolean
Default: False
Choices: [True, False]
Number of Random Starts (FastROCS) (number_of_random_starts_fastrocs): If specified, FastROCS will use this number of random starts for each conformer overlay. If unspecified, the default inertial starts will be used.
Type: integer
Refine with ROCS (refine_with_rocs): If ‘On’, the top-scoring molecules from FastROCS will be re-overlaid and re-scored with ROCS, and these refined ROCS results will be output to the hit lists in place of the initial FastROCS results. If ‘Off’, the top-scoring FastROCS molecules will be output directly.
Required
Type: boolean
Default: True
Choices: [True, False]
Refinement Max Confs (refinement_max_confs): Maximum number of conformers Omega will generate for the ROCS re-scoring step. This applies only if ‘Refine with ROCS’ is ‘On’.
Type: integer
Default: 200
Number of Random Starts (ROCS) (rocs_number_of_random_starts): If ROCS rescoring is selected, ROCS uses this number of random starts for each conformer overlay. If unspecified, the default inertial starts will be used.
Type: integer
Default: 25
Color Force Field Options
OEColorFF Type (oecolorff_type): FastROCS collections are prepared using the ImplicitMillsDean color force field. ‘None’ indicates that the prepared collections will be used as is, without modification. Shape queries are used as is, and their color features will not be modified. If a different color force field is selected, it will be applied to the molecule queries; if ‘None’ is selected, ImplicitMillsDean will be used.
Required
Type: string
Default: None
Choices: [‘ExplicitMillsDean’, ‘ImplicitMillsDeanNoRings’, ‘ExplicitMillsDeanNoRings’, ‘None’]
Custom Color Force Field File (cff_in): Custom Color Force Field (CFF) file to use in FastROCS and ROCS rescoring.
Type: file_in
GPU Hardware
FastROCS Instance Type (fastrocs_instance_type): The instances excluded by default are known not to be cost-effective for FastROCS.
Type: string
Default: !cdns,!g4dn.metal,!g5.12xlarge,!g5.24xlarge,!g5.48xlarge,!g4dn.12xlarge,!g3s.,!p3.
Spot instance policy for FastROCS GPU Instance. (spot_instance_policy_for_fastrocs_gpu_instance): To run on SPOT instances, use the default setting of ‘Preferred’. To run on ON-DEMAND instances, set the value to ‘Prohibited’. ON-DEMAND instances typically cost 3-4 times more than SPOT instances, but are more readily available when overall demand for GPUs on AWS is high.
Type: string
Default: Preferred
Choices: [‘Allowed’, ‘Preferred’, ‘NotPreferred’, ‘Prohibited’, ‘Required’]
Output Fields
Tanimoto Combo Field (tanimoto_combo_field): Output field with the Tanimoto Combo. This field will only be created if the ‘Similarity Type’ is Tanimoto Combo. The value in this field is a duplicate of the value in Combo Similarity.
Type: field_parameter::float
Default: Tanimoto Combo
Tanimoto Color Field (tanimoto_color_field): Output field with the Color Tanimoto. This field will only be created if the ‘Similarity Type’ is Tanimoto Combo. The value in this field is a duplicate of the value in Color Similarity.
Type: field_parameter::float
Default: Color Tanimoto
Tanimoto Shape Field (tanimoto_shape_field): Output field with the Shape Tanimoto. This field will only be created if the ‘Similarity Type’ is Tanimoto Combo. The value in this field is a duplicate of the value in Shape Similarity.
Type: field_parameter::float
Default: Shape Tanimoto
Tversky Combo Field (tversky_combo_field): Output field with the Tversky Combo. This field will only be created if the ‘Similarity Type’ is Fit Tversky or Ref Tversky. The value in this field is a duplicate of the value in Combo Similarity.
Type: field_parameter::float
Default: Tversky Combo
Tversky Color Field (tversky_color_field): Output field with the Color Tversky. This field will only be created if the ‘Similarity Type’ is Fit Tversky or Ref Tversky. The value in this field is a duplicate of the value in Color Similarity.
Type: field_parameter::float
Default: Color Tversky
Tversky Shape Field (tversky_shape_field): Output field with the Shape Tversky. This field will only be created if the ‘Similarity Type’ is Fit Tversky or Ref Tversky. The value in this field is a duplicate of the value in Shape Similarity.
Type: field_parameter::float
Default: Shape Tversky
Bemis Murcko Field (bemis_murcko_field): Output field for the Bemis Murcko core SMILES.
Type: field_parameter::string
Default: Bemis Murcko
Bemis Murcko ID Field (bemis_murcko_id_field): Output field with an integer ID of the Bemis Murcko core. All molecules with the same Bemis Murcko core SMILES will have the same ID, and those with different Bemis Murcko core SMILES will have different IDs. The IDs start at 1 and increment by 1 each time a new Bemis Murcko core is seen. Thus, unlike the Bemis Murcko core SMILES itself, this integer ID depends on the order in which the records are passed.
Type: field_parameter::int
Default: Bemis Murcko ID
Bemis Murcko Rank Field (bemis_murcko_rank_field): Integer field with the rank of the molecule within its Bemis Murcko family (i.e., the rank the molecule would have if the hit list contained only the molecules with the same Bemis Murcko core SMILES).
Type: field_parameter::int
Default: Bemis Murcko Rank
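The ID and rank fields described above can be reproduced with a short sketch. It assumes the records arrive already sorted best-first and already carry a core SMILES; the function name and the dictionary record layout are hypothetical, and this is an illustration of the described behavior rather than the floe's code.

```python
def annotate_bemis_murcko(records, core_field="Bemis Murcko",
                          id_field="Bemis Murcko ID",
                          rank_field="Bemis Murcko Rank"):
    """Assign core IDs and within-family ranks to records sorted best-first."""
    core_ids = {}      # core SMILES -> integer ID, in order of first appearance
    family_sizes = {}  # core SMILES -> number of records seen so far
    for record in records:
        core = record[core_field]
        if core not in core_ids:
            core_ids[core] = len(core_ids) + 1  # IDs start at 1
        family_sizes[core] = family_sizes.get(core, 0) + 1
        record[id_field] = core_ids[core]
        record[rank_field] = family_sizes[core]
    return records


# Made-up cores, listed in hit-list order (best score first).
ranked = annotate_bemis_murcko(
    [{"Bemis Murcko": core} for core in ["A", "B", "A", "C", "B"]]
)
ids = [r["Bemis Murcko ID"] for r in ranked]
ranks = [r["Bemis Murcko Rank"] for r in ranked]
```

Reordering the input changes the IDs but not the core SMILES, which is why the ID is order-dependent. The same pattern would apply to the Hetero Bemis Murcko fields with different field names.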
Hetero Bemis Murcko Field (hetero_bemis_murcko_field): Output field for the Hetero Bemis Murcko core SMILES.
Type: field_parameter::string
Default: Hetero Bemis Murcko
Hetero Bemis Murcko ID Field (hetero_bemis_murcko_id_field): Output field with an integer ID of the Hetero Bemis Murcko core. All molecules with the same Hetero Bemis Murcko core SMILES will have the same ID, and those with different Hetero Bemis Murcko core SMILES will have different IDs. The IDs start at 1 and increment by 1 each time a new Hetero Bemis Murcko core is seen. Thus, unlike the Hetero Bemis Murcko core SMILES itself, this integer ID depends on the order in which the records are passed.
Type: field_parameter::int
Default: Hetero Bemis Murcko ID
Hetero Bemis Murcko Rank Field (hetero_bemis_murcko_rank_field): Integer field with the rank of the molecule within its Hetero Bemis Murcko family (i.e., the rank the molecule would have if the hit list contained only the molecules with the same Hetero Bemis Murcko core SMILES).
Type: field_parameter::int
Default: Hetero Bemis Murcko Rank