Multi-Stage ROCS X Preparation

Description

This floe orchestrates the multi-stage preparation of a ROCS X 3D library starting from building block SMILES. It launches one “head” job that runs the following five “stage” floes in sequence:

  1. Reaction & Reagent Database - Multi-vendor - Create from BULK SMILES: Ingests building block SMILES from multiple vendors into a Reaction & Reagent database.

  2. Reaction & Reagent Database - Directory Listing: Generates a directory listing for the Reaction & Reagent database.

  3. Reaction & Reagent Database - Multi-vendor - Parallel Export Synthon Collection: Exports a 2D synthon collection from the Reaction & Reagent database.

  4. Reaction & Reagent Database - Synthon Collection Directory Listing: Generates a directory listing for the 2D synthon collection.

  5. ROCS X - Prepare 3D Library: Prepares a ROCS X 3D library for use in ROCS X 3D similarity searches.

The Multi-Stage Floe handles inputs and outputs for the stage floes. Outputs from completed stages are automatically passed to downstream stages. The Multi-Stage Floe starts at the first stage for which it receives input, so earlier stages can be skipped by leaving their inputs blank and supplying input to the intended starting stage. All outputs generated by the Multi-Stage Floe can be used in other contexts on Orion.
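
The start-stage rule described above can be sketched as follows. This is a minimal illustration, not the floe's actual implementation. The function name pick_start_stage is hypothetical, and the pairing of each parameter-group stage with a single promoted input (A_SMI, rxndb_in, reagcoll_in) is an assumption drawn from the parameter groups documented below.

```python
from typing import Optional

# Assumed pairing of parameter-group stages with the promoted input
# that would start each one (illustrative only).
STAGE_INPUTS = [
    (1, "A_SMI"),        # Stage 1: Building Block SMILES Ingestion
    (2, "rxndb_in"),     # Stage 2: Export 2D Synthon Library
    (3, "reagcoll_in"),  # Stage 3: Prepare 3D Library
]


def pick_start_stage(inputs: dict) -> Optional[int]:
    """Return the earliest stage whose input is non-blank, or None if all are blank."""
    for stage, name in STAGE_INPUTS:
        if inputs.get(name):
            return stage
    return None


# Skipping Stage 1 by leaving A_SMI blank and supplying an existing database.
start = pick_start_stage({"A_SMI": "", "rxndb_in": "my_rrdb"})
```

Under these assumptions, supplying only rxndb_in starts the run at Stage 2, and supplying only reagcoll_in starts it at Stage 3.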

Promoted Parameters

Title in user interface (promoted name)

Orchestration Settings

Unique Orchestration Job Tag (unique_job_tag): Tag to apply to this job and all launched jobs for easy identification. If the provided tag is not unique, a suffix is added to ensure uniqueness.

  • Type: string

  • Default: MS:ROCSX_Prep
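
One way to picture the uniqueness rule for the job tag is the sketch below. The documentation does not say what suffix is used, so the numeric "_1", "_2" form and the helper name make_unique_tag are assumptions made for illustration only.

```python
def make_unique_tag(tag: str, existing: set) -> str:
    """Return tag unchanged if unused; otherwise append a numeric suffix.

    The "_<n>" suffix format is an assumption, not documented behavior.
    """
    if tag not in existing:
        return tag
    n = 1
    while f"{tag}_{n}" in existing:
        n += 1
    return f"{tag}_{n}"


# The default tag collides with an existing job tag, so a suffix is added.
tag = make_unique_tag("MS:ROCSX_Prep", {"MS:ROCSX_Prep"})
```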

Exact Prep Floes Package Version (prep_floes_package_version): Exact version of the ROCS X Preparation Floes package to use.

  • Type: string

  • Default: 2.8.1

Exact Search Floes Package Version (floes_package_version): Exact version of the ROCS X Search Floes package to use.

  • Type: string

  • Default: 1.0.0

Generate Live Report (live_report_service): Generates an experimental live report for tracking orchestrated jobs, then replaces it with a static report when the job completes. If off, only a static report is generated at the end of the multi-stage orchestration.

  • Type: boolean

  • Default: False

  • Choices: [True, False]

Stage 1: Building Block SMILES Ingestion

Reaction Definition File (rxndefs): The file resource containing the reaction definitions. A sample is available from the OpenEye Organization Data resources.

  • Type: file_in

VendorA SMI File(s) (A_SMI): One or more previously uploaded SMILES (.smi) or CXSMILES (.cxsmiles) file resources. Each structure must have a SMILES and a unique ID. Gzipped input files are also supported, but the file resource name MUST end with the .gz suffix.

  • Type: file_in
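
Before uploading, the input requirements above (a SMILES and a unique ID per structure, and a .gz suffix on gzipped files) can be checked locally. The sketch below is a hypothetical pre-upload check and is not part of the floe. It assumes a plain .smi layout with whitespace-separated fields where the second field is the ID, and it does not handle CXSMILES extension blocks.

```python
import gzip
from pathlib import Path


def check_smi_file(path):
    """Return a list of problems found; an empty list means the file passed.

    Assumes one record per line: "<SMILES> <ID>" (plain SMILES only).
    """
    path = Path(path)
    # Gzipped input is recognized by the .gz suffix on the file name.
    opener = gzip.open if path.suffix == ".gz" else open
    problems = []
    seen = set()
    with opener(path, "rt", encoding="utf-8") as handle:
        for lineno, line in enumerate(handle, start=1):
            fields = line.split()
            if not fields:
                continue  # ignore blank lines
            if len(fields) < 2:
                problems.append(f"line {lineno}: missing ID")
                continue
            mol_id = fields[1]
            if mol_id in seen:
                problems.append(f"line {lineno}: duplicate ID {mol_id!r}")
            seen.add(mol_id)
    return problems
```

Calling check_smi_file on a file path returns the problems as strings, so a vendor file can be screened before it is uploaded as a file resource.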

VendorA ID (A_id): The ID or key to use for the vendor.

  • Type: string

  • Default: VendorA_ID

VendorA Name (A_name): The full or descriptive name for the vendor.

  • Type: string

  • Default: Vendor A

VendorA Version (A_ver): Optional vendor version information.

  • Type: string

  • Default: v1.0

Strip Salts (prep_strip_salts_switch): If ON, retains only the largest fragment from each input structure prior to indexing. If OFF, the additional input fragments are retained, which can reduce the number of classified reagents for some reagent classes because of competing fragment chemistries.

  • Type: boolean

  • Default: True

  • Choices: [True, False]

Record Batch Size (prep1_batchsize): The number of records to emit in each shard/block.

  • Type: integer

  • Default: 10000
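
To illustrate what a record batch size means, the sketch below groups a stream of records into blocks of at most a given size. It is a generic batching pattern, not the floe's sharding code, and the function name batched_records is hypothetical.

```python
from itertools import islice


def batched_records(records, size=10000):
    """Yield lists of at most `size` records; the last batch may be smaller."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(records)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


# 25 records with a batch size of 10 give blocks of 10, 10, and 5 records.
sizes = [len(batch) for batch in batched_records(range(25), size=10)]
```

A smaller batch size produces more, smaller blocks from the same input, and a larger one produces fewer, larger blocks.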

Generate RRDB Report (generate_rrdb_report): Also generates a report for the Reaction & Reagent Database.

  • Required

  • Type: boolean

  • Default: True

  • Choices: [True, False]

Stage 1: Additional Building Block Sources

VendorB SMI File(s) (B_SMI): The file resource containing the vendor B SMILES.

  • Type: file_in

VendorB ID (B_id): The ID or key to use for vendor B.

  • Type: string

  • Default: VendorB_ID

VendorB Name (B_name): The full or descriptive name for vendor B.

  • Type: string

  • Default: Vendor B

VendorB Version (B_ver): Optional vendor B version information.

  • Type: string

  • Default: v1.0

VendorC SMI File(s) (C_SMI): The file resource containing the vendor C SMILES.

  • Type: file_in

VendorC ID (C_id): The ID or key to use for vendor C.

  • Type: string

  • Default: VendorC_ID

VendorC Name (C_name): The full or descriptive name for vendor C.

  • Type: string

  • Default: Vendor C

VendorC Version (C_ver): Optional vendor C version information.

  • Type: string

  • Default: v1.0

Stage 2: Export 2D Synthon Library

Input Reaction & Reagent Database (rxndb_in): The RRDB file resource to extract synthons from.

  • Type: file_in

Generate Synthon Directory (generate_synthon_report): Also generates a directory for the 2D Synthon Collection.

  • Required

  • Type: boolean

  • Default: True

  • Choices: [True, False]

Deprotection Transforms (prep2_deprotxforms): Optional file resource defining synthon deprotection transformations.

  • Type: file_in

Stage 3: Prepare 3D Library

Input 2D Synthon Library (reagcoll_in): The input collection for the 2D synthon library.

  • Type: collection_source

Outputs

Stage 1 Output: Reaction & Reagent Database (rrdb_output): The name of the output Reaction & Reagent Database.

  • Type: string

  • Default: RRDB

Stage 2 Output: ROCS X 2D Synthon Library (synth_coll_output): The name of the output collection for the 2D Synthon Library.

  • Type: string

  • Default: ROCS X 2D Synthon Library

Stage 3 Output: ROCS X 3D Library (out_coll): The name of the output collection for the 3D library.

  • Type: string

  • Default: ROCS X 3D Library