Multi-Stage ROCS X Preparation
Description
This floe orchestrates the multi-stage preparation of a ROCS X 3D library starting from building block SMILES. It launches one “head” job that runs the following five “stage” floes in sequence:
Reaction & Reagent Database - Multi-vendor - Create from BULK SMILES: Ingests building block SMILES from multiple vendors into a Reaction & Reagent database.
Reaction & Reagent Database - Directory Listing: Generates a directory listing for the Reaction & Reagent database.
Reaction & Reagent Database - Multi-vendor - Parallel Export Synthon Collection: Exports a 2D synthon collection from the Reaction & Reagent database.
Reaction & Reagent Database - Synthon Collection Directory Listing: Generates a directory listing for the 2D synthon collection.
ROCS X - Prepare 3D Library: Prepares a ROCS X 3D library for use in ROCS X 3D similarity searches.
The Multi-Stage Floe handles inputs and outputs for the stage floes. Outputs from completed stages are automatically passed to downstream stages. The Multi-Stage Floe starts at the first stage for which it receives input, so early stages can be skipped by leaving their inputs blank and providing input to the designated starting stage. All outputs generated by the Multi-Stage Floe can be used in other contexts on Orion.
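The start-stage rule described above can be illustrated with a minimal sketch. The parameter names (A_SMI, B_SMI, C_SMI, rxndb_in, reagcoll_in) and stage titles are taken from this page. The function itself, and the choice of which inputs trigger which stage, are assumptions made for illustration and do not show how the floe is implemented.

```python
from typing import Mapping, Optional, Sequence, Tuple

# Assumed mapping from each starting stage to the promoted inputs that would
# trigger it. The names are from this page; the mapping is illustrative only.
STAGE_INPUTS: Sequence[Tuple[str, Tuple[str, ...]]] = (
    ("Stage 1: Building Block SMILES Ingestion", ("A_SMI", "B_SMI", "C_SMI")),
    ("Stage 2: Export 2D Synthon Library", ("rxndb_in",)),
    ("Stage 3: Prepare 3D Library", ("reagcoll_in",)),
)


def first_stage_with_input(params: Mapping[str, Optional[str]]) -> Optional[str]:
    """Return the earliest stage that has at least one non-blank input."""
    for stage, keys in STAGE_INPUTS:
        # None and "" both count as blank.
        if any(params.get(key) for key in keys):
            return stage
    return None
```

For example, supplying only rxndb_in would select Stage 2 in this sketch, which mirrors skipping the SMILES ingestion stage by leaving its inputs blank.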
Promoted Parameters
Title in user interface (promoted name)
Orchestration Settings
Unique Orchestration Job Tag (unique_job_tag): Tag applied to this job and all launched jobs for easy identification. If the provided tag is not unique, a suffix is added to ensure uniqueness.
Type: string
Default: MS:ROCSX_Prep
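The suffixing behavior can be pictured with the following sketch. This page does not specify the suffix format, so the "_1", "_2", ... scheme and the make_unique_tag helper are hypothetical and serve only to illustrate the idea.

```python
from typing import AbstractSet


def make_unique_tag(tag: str, existing: AbstractSet[str]) -> str:
    """Return tag unchanged if unused, else append a numeric suffix.

    The "_<n>" suffix format is an assumption, not the floe's documented format.
    """
    if tag not in existing:
        return tag
    n = 1
    while f"{tag}_{n}" in existing:
        n += 1
    return f"{tag}_{n}"
```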
Exact Prep Floes Package Version (prep_floes_package_version): Exact version of the ROCS X Preparation Floes package to use.
Type: string
Default: 2.8.1
Exact Search Floes Package Version (floes_package_version): Exact version of the ROCS X Search Floes package to use.
Type: string
Default: 1.0.0
Generate Live Report (live_report_service): Generates an experimental live report for tracking orchestrated jobs, then replaces it with a static report when the job completes. If off, only a static report is generated at the end of the multi-stage orchestration.
Type: boolean
Default: False
Choices: [True, False]
Stage 1: Building Block SMILES Ingestion
Reaction Definition File (rxndefs): The file resource containing the reaction definitions. A sample is available from the OpenEye Organization Data resources.
Type: file_in
VendorA SMI File(s) (A_SMI): One or more previously uploaded SMILES (.smi) or CXSMILES (.cxsmiles) file resources. Each structure must have a SMILES and a unique ID. Gzipped input files are also supported, but the file resource name MUST end with the .gz suffix.
Type: file_in
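Before uploading, the SMILES-plus-unique-ID requirement can be checked locally. The sketch below assumes a plain .smi layout in which each non-empty line is a SMILES string followed by whitespace and an ID, with the ID as the last token. CXSMILES files with extension blocks, or IDs containing spaces, would need more careful parsing. The check_smi_file helper is hypothetical and is not part of the floe.

```python
import gzip
from pathlib import Path
from typing import List


def check_smi_file(path: str) -> List[str]:
    """Return a list of problems found; an empty list means none were found."""
    p = Path(path)
    # Gzipped files are recognized by the .gz suffix, as the floe requires.
    opener = gzip.open if p.suffix == ".gz" else open
    problems: List[str] = []
    seen_ids = set()
    with opener(p, "rt", encoding="utf-8") as fh:
        for lineno, line in enumerate(fh, start=1):
            fields = line.split()
            if not fields:
                continue  # skip blank lines
            if len(fields) < 2:
                problems.append(f"line {lineno}: missing ID")
                continue
            mol_id = fields[-1]  # assumption: the ID is the last token
            if mol_id in seen_ids:
                problems.append(f"line {lineno}: duplicate ID {mol_id!r}")
            seen_ids.add(mol_id)
    return problems
```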
VendorA ID (A_id): The ID or key to use for the vendor.
Type: string
Default: VendorA_ID
VendorA Name (A_name): The full or descriptive name for the vendor.
Type: string
Default: Vendor A
VendorA Version (A_ver): Optional vendor version information.
Type: string
Default: v1.0
Strip Salts (prep_strip_salts_switch): If ON, retains only the largest fragment from each input structure prior to indexing. If OFF, the additional input fragments are retained, which can reduce the number of classified reagents for some reagent classes due to the presence of competing fragment chemistries.
Type: boolean
Default: True
Choices: [True, False]
Record Batch Size (prep1_batchsize): The number of records to emit in each shard/block.
Type: integer
Default: 10000
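To show what a batch size means in practice, the sketch below groups a stream of records into fixed-size batches, with a smaller final batch when the count does not divide evenly. It is a generic illustration with a hypothetical iter_batches helper and does not describe how the floe shards records internally.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def iter_batches(records: Iterable[T], batch_size: int = 10000) -> Iterator[List[T]]:
    """Yield consecutive lists of at most batch_size records."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    it = iter(records)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch
```

With the default of 10000, an input of 25000 records would be emitted as two full batches and one batch of 5000 under this scheme.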
Generate RRDB Report (generate_rrdb_report): Also generates a report for the Reaction & Reagent Database.
Required
Type: boolean
Default: True
Choices: [True, False]
Stage 1: Additional Building Block Sources
VendorB SMI File(s) (B_SMI): The file resource containing the vendor B SMILES.
Type: file_in
VendorB ID (B_id): The ID or key to use for vendor B.
Type: string
Default: VendorB_ID
VendorB Name (B_name): The full or descriptive name for vendor B.
Type: string
Default: Vendor B
VendorB Version (B_ver): Optional vendor B version information.
Type: string
Default: v1.0
VendorC SMI File(s) (C_SMI): The file resource containing the vendor C SMILES.
Type: file_in
VendorC ID (C_id): The ID or key to use for vendor C.
Type: string
Default: VendorC_ID
VendorC Name (C_name): The full or descriptive name for vendor C.
Type: string
Default: Vendor C
VendorC Version (C_ver): Optional vendor C version information.
Type: string
Default: v1.0
Stage 2: Export 2D Synthon Library
Input Reaction & Reagent Database (rxndb_in): The RRDB file resource to extract synthons from.
Type: file_in
Generate Synthon Directory (generate_synthon_report): Also generates a directory listing for the 2D Synthon Collection.
Required
Type: boolean
Default: True
Choices: [True, False]
Deprotection Transforms (prep2_deprotxforms): Optional file resource defining synthon deprotection transformations.
Type: file_in
Stage 3: Prepare 3D Library
Input 2D Synthon Library (reagcoll_in): The input collection for the 2D synthon library.
Type: collection_source
Outputs
Stage 1 Output: Reaction & Reagent Database (rrdb_output): The name of the output Reaction & Reagent Database.
Type: string
Default: RRDB
Stage 2 Output: ROCS X 2D Synthon Library (synth_coll_output): The name of the output collection for the 2D Synthon Library.
Type: string
Default: ROCS X 2D Synthon Library
Stage 3 Output: ROCS X 3D Library (out_coll): The name of the output collection for the 3D library.
Type: string
Default: ROCS X 3D Library