Sample Collection

Category Paths

To find the floe, follow one of these paths in the Orion user interface.

  • Product-based/FastROCS

  • Product-based/Gigadock

  • Role-based/Computational Chemist

  • Solution-based/Virtual-screening/DB Search

  • Task-based/Data Science/Sampling and Subsetting

Description

Randomly samples the contents of a FastROCS or GigaDocking collection and writes the sample to a dataset.

If the total size of the collection is less than the sample size, the entire collection will be written to the dataset.
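The sampling rule described above can be sketched in a few lines of Python. This is only an in-memory illustration under stated assumptions: the page does not describe the floe's actual sampling algorithm, and sample_records and its seed argument are hypothetical names, not part of the floe or of any Orion API.

```python
import random


def sample_records(records, sample_size, seed=None):
    """Return a random sample of records without replacement.

    Hypothetical helper: if there are fewer records than sample_size,
    every record is returned, mirroring the documented behavior of
    writing the entire collection to the dataset.
    """
    records = list(records)
    if len(records) <= sample_size:
        # Collection is not larger than the requested sample: keep everything.
        return records
    rng = random.Random(seed)  # seed is an assumption, only for repeatability here
    return rng.sample(records, sample_size)
```

For example, sample_records(range(3), 5) returns all three records, while sample_records(range(10), 3) returns three distinct records chosen at random.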

Note: Some older externalized collections in Organization Data cannot be passed to this floe (an error will occur).

Promoted Parameters

Title in user interface (promoted name)

Inputs

Input Collection (input_collection): FastROCS or GigaDocking collection to sample. Requires exactly 1 collection.

  • Required

  • Type: collection_source

Outputs

Output Sample Dataset (output_sample_dataset): Dataset to which the sample is written.

  • Required

  • Type: dataset_out

  • Default: Sample Dataset

Options

Sample Size (sample_size): Number of records to sample from the collection. Max value 100,000.

  • Required

  • Type: integer

  • Default: 1000
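A value for sample_size could be checked before launching the floe with a small helper like the one below. This is a sketch, not the floe's own validation: check_sample_size is a hypothetical name, and the lower bound of 1 is an assumption, since the page states only the type, the default, and the maximum.

```python
MAX_SAMPLE_SIZE = 100_000    # maximum stated for sample_size
DEFAULT_SAMPLE_SIZE = 1000   # default stated for sample_size


def check_sample_size(value=DEFAULT_SAMPLE_SIZE):
    """Return value if it is an integer in the accepted range, else raise."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError("sample_size must be an integer")
    # Lower bound of 1 is an assumption; only the maximum is documented.
    if not 1 <= value <= MAX_SAMPLE_SIZE:
        raise ValueError(f"sample_size must be between 1 and {MAX_SAMPLE_SIZE}")
    return value
```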

Development

Min Shard Download Timeout (min_shard_download_timeout): Minimum timeout for the smart shard to records cubes

  • Required

  • Type: integer

  • Default: 2

Max Shard Download Timeout (max_shard_download_timeout): Maximum timeout for the smart shard to records cubes

  • Required

  • Type: integer

  • Default: 21600.0

Session Retry Dict for Shard Download (session_retry_dict_for_shard_download_): Session retry dict for the smart shard to records cubes

  • Type: string

  • Default: ['429:1000', '460:1000', '500:1000', '502:1000', '503:1000', '504:1000']
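Each default entry appears to pair an HTTP status code with a number, separated by a colon. The sketch below splits such entries into a mapping so they are easier to inspect. It assumes that "code:number" layout, and it treats the number as an opaque integer because this page does not say what it represents; parse_retry_entries is a hypothetical helper, not part of the floe.

```python
DEFAULT_ENTRIES = [
    "429:1000", "460:1000", "500:1000",
    "502:1000", "503:1000", "504:1000",
]


def parse_retry_entries(entries):
    """Map each status code to the integer that follows the colon."""
    parsed = {}
    for entry in entries:
        status, sep, value = entry.partition(":")
        if not sep:
            raise ValueError(f"expected 'code:number', got {entry!r}")
        parsed[int(status)] = int(value)
    return parsed


retry_map = parse_retry_entries(DEFAULT_ENTRIES)
```

With the defaults listed above, retry_map has six keys (429, 460, 500, 502, 503, 504), each mapped to 1000.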

Shard Download Attempts (shard_download_attempts): Download attempts for the smart shard to records cubes

  • Type: integer

  • Default: 1

Catch exceptions (catch_exceptions): If Off, exception handling will be disabled for this cube.

  • Type: boolean

  • Default: True

  • Choices: [True, False]

Catch exceptions (parallel_catch_exception_methods): Specifies the methods of a parallel cube in which an exception will be caught and emitted to the exception port, if the port is connected. If the exception port is connected to an exception handler, this will stop the floe.

  • Type: string

  • Default: ['begin']

  • Choices: ['begin', 'process', 'end']

Enable cube timing report (time_all_cubes): If true, this cube will emit timing information to the timing_data port.

  • Type: boolean

  • Default: True

  • Choices: [True, False]