Batch FastROCS

Description

Overlays a FastROCS collection onto up to 10 shape or molecule queries and outputs a separate hit list for each query.

By default, the best molecules from FastROCS are re-overlaid using ROCS, with 25 random starts and up to 200 conformers of each molecule, before being sent to the output hit list.

Most users should use the ‘FastROCS’ floe rather than this floe, as it has more features. This floe is intended as a replacement for the deprecated ‘Multi-Query Ligand-Based Virtual Screening with FastROCS and SubROCS’ floe for users who want a separate hit list for each query at minimum cost per query.

Details

Title : Batch FastROCS

Tags : Large Scale Floes Screening Virtual fastrocs rocs batch

Python Name : #03_batch_fastrocs

Parameters

Inputs

Query Dataset(s) Dataset(s) with up to 10 query molecules and/or shape queries. A separate hit list will be created for each query.

Type : data_source

Required : True

Python Name : query_datasets

FastROCS Input Collection FastROCS collection to screen against. OpenEye supplies several pre-generated vendor molecule collections in Organization Data. The ‘Prepare Giga Collections’ or ‘Giga Docking Collection to Hi-res FastROCS Collection’ floes can also be used to create suitable collections for this floe.

Type : collection_source

Required : True

Python Name : fastrocs_input_collection

Outputs

Prefix Output Dataset Names with Query Title If true, the name/title of the associated query is prepended to the names of all the output datasets listed below.

Type : boolean

Required : True

Default : True

Choices : True, False

Python Name : prefix_output_dataset_names_with_query_title
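
As a rough illustration of this naming behavior, the sketch below composes an output dataset name from a query title. The helper `output_dataset_name` and the single-space separator are assumptions made for illustration only; the separator the floe actually uses is not stated here.

```python
def output_dataset_name(query_title: str,
                        dataset_name: str,
                        prefix_with_title: bool = True,
                        sep: str = " ") -> str:
    """Return the dataset name, optionally prefixed with the query title.

    Hypothetical helper: `sep` is an assumed separator, not the floe's
    documented behavior.
    """
    if prefix_with_title and query_title:
        return f"{query_title}{sep}{dataset_name}"
    return dataset_name


# With the prefix option on, the query title comes first.
print(output_dataset_name("aspirin", "Hit List #1"))
# With the prefix option off, the name is used unchanged.
print(output_dataset_name("aspirin", "Hit List #1", prefix_with_title=False))
```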

Hit List #1 Dataset Hit List for query 1. Note: if ‘Prefix Output Dataset Names with Query Title’ is True, the name of the output dataset specified here will be prefixed with the name/title of the query it is associated with.

Type : dataset_out

Required : True

Default : Hit List #1

Python Name : hit_list_1_dataset

Hit List #2 Dataset Hit List for query 2. Note: if ‘Prefix Output Dataset Names with Query Title’ is True, the name of the output dataset specified here will be prefixed with the name/title of the query it is associated with.

Type : dataset_out

Required : True

Default : Hit List #2

Python Name : hit_list_2_dataset

Hit List #3 Dataset Hit List for query 3. Note: if ‘Prefix Output Dataset Names with Query Title’ is True, the name of the output dataset specified here will be prefixed with the name/title of the query it is associated with.

Type : dataset_out

Required : True

Default : Hit List #3

Python Name : hit_list_3_dataset

Hit List #4 Dataset Hit List for query 4. Note: if ‘Prefix Output Dataset Names with Query Title’ is True, the name of the output dataset specified here will be prefixed with the name/title of the query it is associated with.

Type : dataset_out

Required : True

Default : Hit List #4

Python Name : hit_list_4_dataset

Hit List #5 Dataset Hit List for query 5. Note: if ‘Prefix Output Dataset Names with Query Title’ is True, the name of the output dataset specified here will be prefixed with the name/title of the query it is associated with.

Type : dataset_out

Required : True

Default : Hit List #5

Python Name : hit_list_5_dataset

Hit List #6 Dataset Hit List for query 6. Note: if ‘Prefix Output Dataset Names with Query Title’ is True, the name of the output dataset specified here will be prefixed with the name/title of the query it is associated with.

Type : dataset_out

Required : True

Default : Hit List #6

Python Name : hit_list_6_dataset

Hit List #7 Dataset Hit List for query 7. Note: if ‘Prefix Output Dataset Names with Query Title’ is True, the name of the output dataset specified here will be prefixed with the name/title of the query it is associated with.

Type : dataset_out

Required : True

Default : Hit List #7

Python Name : hit_list_7_dataset

Hit List #8 Dataset Hit List for query 8. Note: if ‘Prefix Output Dataset Names with Query Title’ is True, the name of the output dataset specified here will be prefixed with the name/title of the query it is associated with.

Type : dataset_out

Required : True

Default : Hit List #8

Python Name : hit_list_8_dataset

Hit List #9 Dataset Hit List for query 9. Note: if ‘Prefix Output Dataset Names with Query Title’ is True, the name of the output dataset specified here will be prefixed with the name/title of the query it is associated with.

Type : dataset_out

Required : True

Default : Hit List #9

Python Name : hit_list_9_dataset

Hit List #10 Dataset Hit List for query 10. Note: if ‘Prefix Output Dataset Names with Query Title’ is True, the name of the output dataset specified here will be prefixed with the name/title of the query it is associated with.

Type : dataset_out

Required : True

Default : Hit List #10

Python Name : hit_list_10_dataset

Cluster Heads #1 Dataset Hit list for query #1 that contains only one top-scoring representative of each Bemis-Murcko core. This hit list is a subset of the ‘Hit List #1 Dataset’ that also includes the clustering information. It is created by filtering the ‘Hit List #1 Dataset’ for ‘Bemis Murcko Rank’=1 (see the parameter ‘Output Fields -> Bemis Murcko Rank Field’).

Type : dataset_out

Required : True

Default : Cluster Heads #1

Python Name : cluster_heads_1_dataset
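
The filtering step described for the cluster heads datasets can be sketched in plain Python. This is a minimal illustration, assuming each hit record is represented as a dictionary keyed by field name; the function `cluster_heads` and the sample records are hypothetical and are not part of the floe.

```python
from typing import Any

RANK_FIELD = "Bemis Murcko Rank"  # default name of the rank output field


def cluster_heads(hit_list: list[dict[str, Any]],
                  rank_field: str = RANK_FIELD) -> list[dict[str, Any]]:
    """Keep only the records ranked first within their Bemis-Murcko family."""
    return [record for record in hit_list if record.get(rank_field) == 1]


# Made-up records standing in for a hit list.
hits = [
    {"title": "mol-a", "Bemis Murcko": "c1ccccc1", "Bemis Murcko Rank": 1},
    {"title": "mol-b", "Bemis Murcko": "c1ccccc1", "Bemis Murcko Rank": 2},
    {"title": "mol-c", "Bemis Murcko": "c1ccncc1", "Bemis Murcko Rank": 1},
]
heads = cluster_heads(hits)
print([record["title"] for record in heads])
```

If the rank field is renamed through the ‘Output Fields’ parameters, the same filter applies with the new field name passed as `rank_field`.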

Cluster Heads #2 Dataset Hit list for query #2 that contains only one top-scoring representative of each Bemis-Murcko core. This hit list is a subset of the ‘Hit List #2 Dataset’ that also includes the clustering information. It is created by filtering the ‘Hit List #2 Dataset’ for ‘Bemis Murcko Rank’=1 (see the parameter ‘Output Fields -> Bemis Murcko Rank Field’).

Type : dataset_out

Required : True

Default : Cluster Heads #2

Python Name : cluster_heads_2_dataset

Cluster Heads #3 Dataset Hit list for query #3 that contains only one top-scoring representative of each Bemis-Murcko core. This hit list is a subset of the ‘Hit List #3 Dataset’ that also includes the clustering information. It is created by filtering the ‘Hit List #3 Dataset’ for ‘Bemis Murcko Rank’=1 (see the parameter ‘Output Fields -> Bemis Murcko Rank Field’).

Type : dataset_out

Required : True

Default : Cluster Heads #3

Python Name : cluster_heads_3_dataset

Cluster Heads #4 Dataset Hit list for query #4 that contains only one top-scoring representative of each Bemis-Murcko core. This hit list is a subset of the ‘Hit List #4 Dataset’ that also includes the clustering information. It is created by filtering the ‘Hit List #4 Dataset’ for ‘Bemis Murcko Rank’=1 (see the parameter ‘Output Fields -> Bemis Murcko Rank Field’).

Type : dataset_out

Required : True

Default : Cluster Heads #4

Python Name : cluster_heads_4_dataset

Cluster Heads #5 Dataset Hit list for query #5 that contains only one top-scoring representative of each Bemis-Murcko core. This hit list is a subset of the ‘Hit List #5 Dataset’ that also includes the clustering information. It is created by filtering the ‘Hit List #5 Dataset’ for ‘Bemis Murcko Rank’=1 (see the parameter ‘Output Fields -> Bemis Murcko Rank Field’).

Type : dataset_out

Required : True

Default : Cluster Heads #5

Python Name : cluster_heads_5_dataset

Cluster Heads #6 Dataset Hit list for query #6 that contains only one top-scoring representative of each Bemis-Murcko core. This hit list is a subset of the ‘Hit List #6 Dataset’ that also includes the clustering information. It is created by filtering the ‘Hit List #6 Dataset’ for ‘Bemis Murcko Rank’=1 (see the parameter ‘Output Fields -> Bemis Murcko Rank Field’).

Type : dataset_out

Required : True

Default : Cluster Heads #6

Python Name : cluster_heads_6_dataset

Cluster Heads #7 Dataset Hit list for query #7 that contains only one top-scoring representative of each Bemis-Murcko core. This hit list is a subset of the ‘Hit List #7 Dataset’ that also includes the clustering information. It is created by filtering the ‘Hit List #7 Dataset’ for ‘Bemis Murcko Rank’=1 (see the parameter ‘Output Fields -> Bemis Murcko Rank Field’).

Type : dataset_out

Required : True

Default : Cluster Heads #7

Python Name : cluster_heads_7_dataset

Cluster Heads #8 Dataset Hit list for query #8 that contains only one top-scoring representative of each Bemis-Murcko core. This hit list is a subset of the ‘Hit List #8 Dataset’ that also includes the clustering information. It is created by filtering the ‘Hit List #8 Dataset’ for ‘Bemis Murcko Rank’=1 (see the parameter ‘Output Fields -> Bemis Murcko Rank Field’).

Type : dataset_out

Required : True

Default : Cluster Heads #8

Python Name : cluster_heads_8_dataset

Cluster Heads #9 Dataset Hit list for query #9 that contains only one top-scoring representative of each Bemis-Murcko core. This hit list is a subset of the ‘Hit List #9 Dataset’ that also includes the clustering information. It is created by filtering the ‘Hit List #9 Dataset’ for ‘Bemis Murcko Rank’=1 (see the parameter ‘Output Fields -> Bemis Murcko Rank Field’).

Type : dataset_out

Required : True

Default : Cluster Heads #9

Python Name : cluster_heads_9_dataset

Cluster Heads #10 Dataset Hit list for query #10 that contains only one top-scoring representative of each Bemis-Murcko core. This hit list is a subset of the ‘Hit List #10 Dataset’ that also includes the clustering information. It is created by filtering the ‘Hit List #10 Dataset’ for ‘Bemis Murcko Rank’=1 (see the parameter ‘Output Fields -> Bemis Murcko Rank Field’).

Type : dataset_out

Required : True

Default : Cluster Heads #10

Python Name : cluster_heads_10_dataset

Output Query Dataset #1 This dataset holds a copy of the query used to generate Hit List 1. Note: if ‘Prefix Output Dataset Names with Query Title’ is True, the name of the output dataset specified here will be prefixed with the name/title of the query.

Type : dataset_out

Required : True

Default : Output Query #1

Python Name : output_query_dataset_1

Output Query Dataset #2 This dataset holds a copy of the query used to generate Hit List 2. Note: if ‘Prefix Output Dataset Names with Query Title’ is True, the name of the output dataset specified here will be prefixed with the name/title of the query.

Type : dataset_out

Required : True

Default : Output Query #2

Python Name : output_query_dataset_2

Output Query Dataset #3 This dataset holds a copy of the query used to generate Hit List 3. Note: if ‘Prefix Output Dataset Names with Query Title’ is True, the name of the output dataset specified here will be prefixed with the name/title of the query.

Type : dataset_out

Required : True

Default : Output Query #3

Python Name : output_query_dataset_3

Output Query Dataset #4 This dataset holds a copy of the query used to generate Hit List 4. Note: if ‘Prefix Output Dataset Names with Query Title’ is True, the name of the output dataset specified here will be prefixed with the name/title of the query.

Type : dataset_out

Required : True

Default : Output Query #4

Python Name : output_query_dataset_4

Output Query Dataset #5 This dataset holds a copy of the query used to generate Hit List 5. Note: if ‘Prefix Output Dataset Names with Query Title’ is True, the name of the output dataset specified here will be prefixed with the name/title of the query.

Type : dataset_out

Required : True

Default : Output Query #5

Python Name : output_query_dataset_5

Output Query Dataset #6 This dataset holds a copy of the query used to generate Hit List 6. Note: if ‘Prefix Output Dataset Names with Query Title’ is True, the name of the output dataset specified here will be prefixed with the name/title of the query.

Type : dataset_out

Required : True

Default : Output Query #6

Python Name : output_query_dataset_6

Output Query Dataset #7 This dataset holds a copy of the query used to generate Hit List 7. Note: if ‘Prefix Output Dataset Names with Query Title’ is True, the name of the output dataset specified here will be prefixed with the name/title of the query.

Type : dataset_out

Required : True

Default : Output Query #7

Python Name : output_query_dataset_7

Output Query Dataset #8 This dataset holds a copy of the query used to generate Hit List 8. Note: if ‘Prefix Output Dataset Names with Query Title’ is True, the name of the output dataset specified here will be prefixed with the name/title of the query.

Type : dataset_out

Required : True

Default : Output Query #8

Python Name : output_query_dataset_8

Output Query Dataset #9 This dataset holds a copy of the query used to generate Hit List 9. Note: if ‘Prefix Output Dataset Names with Query Title’ is True, the name of the output dataset specified here will be prefixed with the name/title of the query.

Type : dataset_out

Required : True

Default : Output Query #9

Python Name : output_query_dataset_9

Output Query Dataset #10 This dataset holds a copy of the query used to generate Hit List 10. Note: if ‘Prefix Output Dataset Names with Query Title’ is True, the name of the output dataset specified here will be prefixed with the name/title of the query.

Type : dataset_out

Required : True

Default : Output Query #10

Python Name : output_query_dataset_10

Options

Hit List Size Number of top-scoring molecules to output. This applies to all output hit lists if multiple queries are supplied.

Type : integer

Required : True

Default : 20000

Range : 100 to 50000

Python Name : hit_list_size

Similarity Type Type of similarity to use for FastROCS (and ROCS if ‘Refine with ROCS’ is ‘On’).

Type : string

Required : False

Default : Tanimoto Combo

Choices : Tanimoto Combo, Ref Tversky, Fit Tversky, Shape Tanimoto, Shape Ref Tversky, Shape Fit Tversky

Python Name : similarity_type

Refine with ROCS If ‘On’, the top-scoring molecules from FastROCS will be re-overlaid and re-scored with ROCS, and these refined ROCS results will be output to the hit lists in place of the initial FastROCS results. If ‘Off’, the top-scoring FastROCS molecules will be output directly.

Type : boolean

Required : True

Default : True

Choices : True, False

Python Name : refine_with_rocs
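
The documented defaults, range, and choices for the three options above can be checked before a job is submitted. The sketch below is an illustration only: it assumes the options are collected in a dictionary keyed by the Python Names listed above, and `validate_options` is a hypothetical helper, not part of any OpenEye API.

```python
SIMILARITY_TYPES = {
    "Tanimoto Combo", "Ref Tversky", "Fit Tversky",
    "Shape Tanimoto", "Shape Ref Tversky", "Shape Fit Tversky",
}

# Defaults as listed in the Options section.
DEFAULTS = {
    "hit_list_size": 20000,
    "similarity_type": "Tanimoto Combo",
    "refine_with_rocs": True,
}


def validate_options(options: dict) -> dict:
    """Fill in documented defaults and check values against documented limits."""
    unknown = set(options) - set(DEFAULTS)
    if unknown:
        raise KeyError(f"unknown option(s): {sorted(unknown)}")
    checked = {**DEFAULTS, **options}

    size = checked["hit_list_size"]
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(size, bool) or not isinstance(size, int):
        raise ValueError("hit_list_size must be an integer")
    if not 100 <= size <= 50000:
        raise ValueError("hit_list_size must be between 100 and 50000")
    if checked["similarity_type"] not in SIMILARITY_TYPES:
        raise ValueError(f"unsupported similarity_type: {checked['similarity_type']!r}")
    if not isinstance(checked["refine_with_rocs"], bool):
        raise ValueError("refine_with_rocs must be True or False")
    return checked


print(validate_options({"hit_list_size": 500, "similarity_type": "Fit Tversky"}))
```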

GPU Hardware

These parameters control the AWS instance type the FastROCS Cube will use. There is in general no reason to adjust these. They are exposed because overall demand for GPU instances on AWS has occasionally been very high and this has led to extremely long run times for this floe as it waits for GPU instances in some circumstances.

FastROCS Instance Type The instances excluded by default are known not to be cost effective for FastROCS.

Type : string

Required : False

Default : !cdns,!g4dn.metal,!g5.12xlarge,!g5.24xlarge,!g5.48xlarge,!g4dn.12xlarge,!g3s.,!p3.

Python Name : fastrocs_instance_type
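
To make the default value easier to read, the sketch below splits it into its excluded patterns and tests a few instance type names against them. The matching rule is an assumption: this example treats each `!` entry as a prefix to exclude, which is only a guess at how the platform interprets the string. The helpers `excluded_patterns` and `is_excluded` are hypothetical.

```python
DEFAULT_SPEC = (
    "!cdns,!g4dn.metal,!g5.12xlarge,!g5.24xlarge,!g5.48xlarge,"
    "!g4dn.12xlarge,!g3s.,!p3."
)


def excluded_patterns(spec: str) -> list[str]:
    """Return the patterns marked for exclusion with a leading '!'."""
    patterns = []
    for token in spec.split(","):
        token = token.strip()
        # Skip entries without '!' and a bare '!' with no pattern after it.
        if token.startswith("!") and len(token) > 1:
            patterns.append(token[1:])
    return patterns


def is_excluded(instance_type: str, spec: str = DEFAULT_SPEC) -> bool:
    """ASSUMPTION: an instance type is excluded if it starts with any pattern."""
    return any(instance_type.startswith(p) for p in excluded_patterns(spec))


for name in ("g5.12xlarge", "p3.2xlarge", "g4dn.xlarge"):
    print(name, "excluded" if is_excluded(name) else "allowed")
```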

Spot instance policy for FastROCS GPU Instance. To run on SPOT instances, use the default setting of ‘preferred’. To run on ON-DEMAND instances, set the value to ‘prohibited’. ON-DEMAND instances typically cost 3-4 times more than SPOT instances, but are more available than SPOT instances when overall demand for GPUs on AWS is high.

Type : string

Required : False

Default : Preferred

Choices : Allowed, Preferred, NotPreferred, Prohibited, Required

Python Name : spot_instance_policy_for_fastrocs_gpu_instance

Output Fields

These parameters allow the user to change the default output fields this floe creates in the output datasets and/or collections. Note that parameters identifying a molecule field are special: if a molecule field is left empty, the floe writes the molecule to the primary (i.e., default) molecule field of the record. The primary molecule of a dataset can be identified in the UI by looking for a star on its field badge. CAUTION: If these parameters are modified, the modifications must also be applied to the input fields of downstream floes that read fields written by this floe. A downstream floe that does not support specifying the input field may not work properly with the output of this floe if these settings are modified.

Tanimoto Combo Field Output field with the Tanimoto Combo. This field will only be created if the FastROCS Similarity Type is Tanimoto Combo. The value in this field is a duplicate of the value in Combo Similarity.

Type : field_parameter::float

Required : False

Default : Tanimoto Combo

Python Name : tanimoto_combo_field

Tanimoto Color Field Output field with the Color Tanimoto. This field will only be created if the FastROCS Similarity Type is Tanimoto Combo. The value in this field is a duplicate of the value in Color Similarity.

Type : field_parameter::float

Required : False

Default : Color Tanimoto

Python Name : tanimoto_color_field

Tanimoto Shape Field Output field with the Shape Tanimoto. This field will only be created if the FastROCS Similarity Type is Tanimoto Combo. The value in this field is a duplicate of the value in Shape Similarity.

Type : field_parameter::float

Required : False

Default : Shape Tanimoto

Python Name : tanimoto_shape_field

Tversky Combo Field Output field with the Tversky Combo. This field will only be created if the FastROCS Similarity Type is Fit Tversky or Ref Tversky. The value in this field is a duplicate of the value in Combo Similarity.

Type : field_parameter::float

Required : False

Default : Tversky Combo

Python Name : tversky_combo_field

Tversky Color Field Output field with the Color Tversky. This field will only be created if the FastROCS Similarity Type is Fit Tversky or Ref Tversky. The value in this field is a duplicate of the value in Color Similarity.

Type : field_parameter::float

Required : False

Default : Color Tversky

Python Name : tversky_color_field

Tversky Shape Field Output field with the Shape Tversky. This field will only be created if the FastROCS Similarity Type is Fit Tversky or Ref Tversky. The value in this field is a duplicate of the value in Shape Similarity.

Type : field_parameter::float

Required : False

Default : Shape Tversky

Python Name : tversky_shape_field
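
The conditions under which the six score fields above are created can be summarized in a small lookup. This sketch covers only the cases stated in the field descriptions; the function `score_fields` is hypothetical, and the behavior for the shape-only similarity types is not described here, so the function returns `None` for them rather than guessing.

```python
from typing import Optional


def score_fields(similarity_type: str) -> Optional[list[str]]:
    """Return the default score field names created for a similarity type.

    Returns None when the field descriptions above do not cover the type.
    """
    if similarity_type == "Tanimoto Combo":
        return ["Tanimoto Combo", "Color Tanimoto", "Shape Tanimoto"]
    if similarity_type in ("Fit Tversky", "Ref Tversky"):
        return ["Tversky Combo", "Color Tversky", "Shape Tversky"]
    # Shape Tanimoto, Shape Ref Tversky, Shape Fit Tversky: not documented above.
    return None


print(score_fields("Tanimoto Combo"))
print(score_fields("Ref Tversky"))
```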

Bemis Murcko Field Output field for the Bemis Murcko core SMILES.

Type : field_parameter::string

Required : False

Default : Bemis Murcko

Python Name : bemis_murcko_field

Bemis Murcko ID Field Output field with an integer ID of the Bemis Murcko core. All molecules with the same Bemis Murcko core SMILES will have the same ID, and those with different Bemis Murcko core SMILES will have different IDs. The IDs start at 1 and increment by 1 each time a new Bemis Murcko core is seen. Thus, unlike the Bemis Murcko core SMILES itself, this integer ID depends on the order in which the records are passed.

Type : field_parameter::int

Required : False

Default : Bemis Murcko ID

Python Name : bemis_murcko_id_field

Bemis Murcko Rank Field Integer field with the rank of the molecule within its Bemis Murcko family (i.e., the rank the molecule would have if the hit list contained only the molecules with the same Bemis Murcko core SMILES).

Type : field_parameter::int

Required : True

Default : Bemis Murcko Rank

Python Name : bemis_murcko_rank_field
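
The ID and rank fields described above can be illustrated with a short sketch. It assumes the records arrive already sorted from best to worst score and that each record is a dictionary carrying its core SMILES; `annotate_bemis_murcko` is a hypothetical helper written for this example and is not the floe's implementation.

```python
def annotate_bemis_murcko(records: list[dict],
                          core_field: str = "Bemis Murcko",
                          id_field: str = "Bemis Murcko ID",
                          rank_field: str = "Bemis Murcko Rank") -> list[dict]:
    """Add an order-dependent core ID and a within-family rank to each record."""
    core_ids: dict[str, int] = {}      # core SMILES -> ID, assigned on first sight
    family_sizes: dict[str, int] = {}  # core SMILES -> members seen so far
    annotated = []
    for record in records:  # ASSUMPTION: records are sorted best score first
        core = record[core_field]
        if core not in core_ids:
            core_ids[core] = len(core_ids) + 1  # IDs start at 1
        family_sizes[core] = family_sizes.get(core, 0) + 1
        annotated.append({**record,
                          id_field: core_ids[core],
                          rank_field: family_sizes[core]})
    return annotated


hits = [
    {"title": "mol-a", "Bemis Murcko": "c1ccccc1"},
    {"title": "mol-b", "Bemis Murcko": "c1ccncc1"},
    {"title": "mol-c", "Bemis Murcko": "c1ccccc1"},
]
for row in annotate_bemis_murcko(hits):
    print(row["title"], row["Bemis Murcko ID"], row["Bemis Murcko Rank"])
```

Reordering the input changes the IDs but not the core SMILES, which is why the ID is order dependent. The same sketch applies to the Hetero Bemis Murcko fields by passing their field names.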

Hetero Bemis Murcko Field Output field for the Hetero Bemis Murcko core SMILES.

Type : field_parameter::string

Required : False

Default : Hetero Bemis Murcko

Python Name : hetero_bemis_murcko_field

Hetero Bemis Murcko ID Field Output field with an integer ID of the Hetero Bemis Murcko core. All molecules with the same Hetero Bemis Murcko core SMILES will have the same ID, and those with different Hetero Bemis Murcko core SMILES will have different IDs. The IDs start at 1 and increment by 1 each time a new Hetero Bemis Murcko core is seen. Thus, unlike the Hetero Bemis Murcko core SMILES itself, this integer ID depends on the order in which the records are passed.

Type : field_parameter::int

Required : False

Default : Hetero Bemis Murcko ID

Python Name : hetero_bemis_murcko_id_field

Hetero Bemis Murcko Rank Field Integer field with the rank of the molecule within its Hetero Bemis Murcko family (i.e., the rank the molecule would have if the hit list contained only the molecules with the same Hetero Bemis Murcko core SMILES).

Type : field_parameter::int

Required : False

Default : Hetero Bemis Murcko Rank

Python Name : hetero_bemis_murcko_rank_field