Collection to Hitlist Dataset

Category Paths

Follow one of these paths in the Orion user interface to find the floe.

  • Product-based/FastROCS

  • Product-based/Gigadock

  • Role-based/Computational Chemist

  • Solution-based/Virtual-screening/Analysis

  • Task-based/Data Science/Conversion

Description

This floe creates a dataset of the top-scoring compounds from a collection ranked by the value of a float field in the collection.

Promoted Parameters

Title in user interface (promoted name)

Inputs

Input Collection (input_collection): Collection from which to extract the top-scoring records.

  • Required

  • Type: collection_source

Sort Field (sort_field): Field in the collection to sort on. If you are processing the Raw Results collection from a Gigadocking run, the sort field is Chemgauss4.

  • Required

  • Type: field_parameter::float

Outputs

Output Dataset (output_dataset): Output dataset to write the hit list to.

  • Required

  • Type: dataset_out

  • Default: Collection Hit List

Options

Descending (descending): If On, scores will be sorted in descending order (i.e., high scores will appear at the top of the hit list). If Off, scores will be sorted in ascending order (i.e., low scores will appear at the top of the hit list).

Hint: Set this to On when processing ROCS/FastROCS results and Off when processing docking results.

  • Required

  • Type: boolean

  • Default: False

  • Choices: [True, False]
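
The effect of the Descending switch can be illustrated with a minimal Python sketch. This is not the floe's implementation; the function top_hits, the record layout (plain dictionaries), and the field name "Similarity" are assumptions made for illustration. Only the field name Chemgauss4 and the parameter names come from this page.

```python
import heapq

def top_hits(records, sort_field, hit_list_size, descending=False):
    """Return the best `hit_list_size` records ranked by `sort_field`.

    Hypothetical helper: descending=True puts high scores first,
    descending=False puts low scores first.
    """
    def key(rec):
        return rec[sort_field]

    if descending:
        return heapq.nlargest(hit_list_size, records, key=key)
    return heapq.nsmallest(hit_list_size, records, key=key)

# Docking results: lower Chemgauss4 is better, so Descending stays Off.
docking = [
    {"name": "mol_a", "Chemgauss4": -12.5},
    {"name": "mol_b", "Chemgauss4": -8.1},
    {"name": "mol_c", "Chemgauss4": -15.2},
]
docking_hits = top_hits(docking, "Chemgauss4", 2, descending=False)

# Similarity results: higher is better, so Descending is On.
# "Similarity" is a made-up field name for this example.
similarity = [
    {"name": "mol_a", "Similarity": 1.2},
    {"name": "mol_b", "Similarity": 1.7},
    {"name": "mol_c", "Similarity": 0.9},
]
similarity_hits = top_hits(similarity, "Similarity", 2, descending=True)
```

With these inputs, docking_hits starts with mol_c (the most negative score) and similarity_hits starts with mol_b (the highest score).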

Sort Hit List (sort_switch): If Off, the output dataset will still contain the top N molecules from the input collection, but the molecules within the dataset will not be sorted. Turning sorting off reduces the memory needed for the hit list cube.

  • Required

  • Type: boolean

  • Default: True

  • Choices: [True, False]
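
One way to picture an unsorted hit list is a two-pass selection: first find the score of the N-th best record, then keep every record that meets that cutoff, in input order. The sketch below is an assumption about how such a selection could work, not a description of the floe's cubes; the names find_score_cutoff and unsorted_hits are hypothetical.

```python
import heapq

def find_score_cutoff(scores, hit_list_size, descending=False):
    """Return (cutoff, ties): the N-th best score and how many of the
    N best scores are equal to it. Returns (None, 0) if nothing is selected."""
    pick = heapq.nlargest if descending else heapq.nsmallest
    best = pick(hit_list_size, scores)
    if not best:
        return None, 0
    cutoff = best[-1]
    return cutoff, best.count(cutoff)

def unsorted_hits(records, sort_field, hit_list_size, descending=False):
    """Keep the top N records without reordering them."""
    cutoff, ties_left = find_score_cutoff(
        (rec[sort_field] for rec in records), hit_list_size, descending
    )
    if cutoff is None:
        return []
    hits = []
    for rec in records:
        score = rec[sort_field]
        better = score > cutoff if descending else score < cutoff
        if better:
            hits.append(rec)
        elif score == cutoff and ties_left > 0:
            # Admit only as many tied records as fit in the top N.
            hits.append(rec)
            ties_left -= 1
    return hits

records = [
    {"name": "a", "score": 5.0},
    {"name": "b", "score": 5.0},
    {"name": "c", "score": 1.0},
    {"name": "d", "score": 7.0},
]
hits = unsorted_hits(records, "score", 2, descending=False)
```

Here hits contains records a and c in their original order: c has the best score, and only one of the two records tied at 5.0 fits in a hit list of size 2.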

Hit List Size (hit_list_size): Size of the output hit list. If this value is greater than 100,000 and ‘Sort Hit List’ is True, the amount of memory for the serial cubes may need to be increased. See the ‘Serial Cube Memory’ parameter.

  • Required

  • Type: integer

  • Default: 10000

Serial Cube Memory (serial_memory_mb): Memory (in MB) allocated to both the Hit List and Find Score Cutoff Cubes.

  • Type: decimal

  • Default: 30720
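
For scripting or record keeping, the promoted names and defaults listed above can be collected in a plain Python dictionary. This is only a sketch: build_parameters is a hypothetical helper, "my_raw_results" is a placeholder value, and how the resulting dictionary is passed to Orion is not covered here.

```python
# Promoted names and defaults copied from the parameter list on this page.
DEFAULTS = {
    "output_dataset": "Collection Hit List",
    "descending": False,
    "sort_switch": True,
    "hit_list_size": 10000,
    "serial_memory_mb": 30720,
}

# Parameters marked "Required" on this page.
REQUIRED = (
    "input_collection",
    "sort_field",
    "output_dataset",
    "descending",
    "sort_switch",
    "hit_list_size",
)

def build_parameters(**overrides):
    """Merge overrides into the defaults and check the promoted names."""
    known = set(DEFAULTS) | set(REQUIRED)
    unknown = sorted(set(overrides) - known)
    if unknown:
        raise ValueError(f"unknown parameters: {unknown}")
    params = {**DEFAULTS, **overrides}
    missing = [name for name in REQUIRED if name not in params]
    if missing:
        raise ValueError(f"missing required parameters: {missing}")
    return params

# Placeholder collection name; Chemgauss4 is the sort field this page
# names for Gigadocking Raw Results.
docking_run = build_parameters(
    input_collection="my_raw_results",
    sort_field="Chemgauss4",
)
```

The two parameters without defaults, input_collection and sort_field, must always be supplied; everything else falls back to the defaults shown above.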