ROCS X - Prepare 3D Library

Description

This floe prepares a ROCS X 3D library for use in ROCS X 3D similarity searches. First, the floe reads in a 2D synthon library with reaction and reagent classifications. Then, it removes duplicate synthons within each reagent list. Next, it uses OMEGA to generate 3D conformers for the deduplicated synthons and for a sample of products. The product sample is obtained by pairing each synthon with a random synthon from its complementary reagent list.
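
The deduplication and product-sampling steps described above can be pictured with a minimal Python sketch. It is an illustration only, not the floe's implementation: the function names, the use of plain strings as synthons, and the use of exact string equality to detect duplicates are all assumptions made for this example, and a two-component reaction is assumed.

```python
import random


def dedupe(synthons):
    """Keep the first occurrence of each synthon, preserving order.

    Assumption: duplicates are detected by exact string equality.
    """
    return list(dict.fromkeys(synthons))


def sample_products(list_a, list_b, rng):
    """Pair every synthon with one random synthon from the other list.

    Each synthon appears in at least one pair, so the number of pairs
    equals the total number of synthons.
    """
    pairs = [(a, rng.choice(list_b)) for a in list_a]
    pairs += [(rng.choice(list_a), b) for b in list_b]
    return pairs


# Hypothetical reagent lists for one two-component reaction.
amines = dedupe(["CN", "CCN", "CN"])
acids = dedupe(["CC(=O)O", "OC(=O)c1ccccc1"])
pairs = sample_products(amines, acids, random.Random(0))
```

In this sketch, four unique synthons yield four sampled pairs, which mirrors the statement that the product sample is built from one pairing per synthon.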

Key Inputs and Outputs

The key input is a ROCS X 2D synthon library. This is typically output from the Reaction & Reagent Database - Multi-vendor - Parallel Export Synthon Collection Floe in the Generative Design Hit-to-Lead Floes package.

The key output is a ROCS X 3D library containing 3D conformers for the synthons and a product sample as well as duplicate information for the synthons. It is typically used as input for the ROCS X - Initialize 3D Search Floe.

Cost Considerations

The floe cost scales with the number of synthons in the input 2D synthon library. The computational bottleneck is typically conformer generation for the product sample, because products tend to be larger than synthons. For very large libraries, this bottleneck can be avoided by turning off product sampling (under Options: Sample Products, set Enable Sample Products to Off). Doing so has implications for downstream floes that use the library (see the Speed Up Preparation of a Very Large ROCS X 3D Library How-To Guide in the documentation).

Promoted Parameters

Title in user interface (promoted name)

Inputs

ROCS X 2D Synthon Library (reagcoll): The input collection for the 2D synthon library.

  • Required

  • Type: collection_source

Reaction Constraint (rxnlist): Select one or more reactions from the sample reaction database list. Select ‘All’ to use all reactions and ‘Custom’ to use the Custom Reaction Constraint below.

  • Required

  • Type: string

  • Default: [‘All’]

  • Choices: [‘All’, ‘Custom’, ‘3-nitrile-pyridine’, ‘Buchwald-Hartwig’, ‘Buchwald_cross_coupling1’, ‘Buchwald_cross_coupling2’, ‘Ester_hydrolysis-amide_synthesis1’, ‘Ester_hydrolysis-amide_synthesis2’, ‘Grignard_alcohol’, ‘Grignard_carbonyl’, ‘Heck_non-terminal_vinyl’, ‘Heck_terminal_vinyl’, ‘Huisgen_disubst-alkyne’, ‘Mitsunobu_imide’, ‘Mitsunobu_phenol’, ‘Mitsunobu_sulfonamide’, ‘Mitsunobu_tetrazole_1’, ‘Mitsunobu_tetrazole_2’, ‘N-alkylation1’, ‘N-alkylation2’, ‘N-arylation_heterocycles’, ‘Negishi’, ‘Niementowski_quinazoline’, ‘O-alkylation’, ‘O-biarylation’, ‘Pictet-Spengler’, ‘Reductive_amination1’, ‘Reductive_amination2’, ‘Schotten-Baumann_amide’, ‘SnAr1’, ‘SnAr2’, ‘Sonogashira’, ‘Stille’, ‘Suzuki_cross_coupling’, ‘Wittig’, ‘benzimidazole_derivatives_aldehyde’, ‘benzimidazole_derivatives_carboxylic-acid/ester’, ‘benzofuran’, ‘benzothiazole’, ‘benzothiophene’, ‘benzoxazole_arom-aldehyde’, ‘benzoxazole_carboxylic-acid’, ‘decarboxylative_coupling’, ‘heteroaromatic_nuc_sub’, ‘imidazole’, ‘indole’, ‘nucl_sub_aromatic_ortho_nitro’, ‘nucl_sub_aromatic_para_nitro’, ‘oxadiazole’, ‘phthalazinone’, ‘piperidine_indole’, ‘pyrazole’, ‘spiro-chromanone’, ‘sulfon_amide’, ‘tetrazole_connect_regioisomere_1’, ‘tetrazole_connect_regioisomere_2’, ‘tetrazole_terminal’, ‘thiazole’, ‘triaryl-imidazole’, ‘urea’]

Custom Reaction Constraint (customrxnlist): Custom input for a comma-delimited list of reaction names to process from the reaction database. If Reaction Constraint is not ‘Custom’, this field will be ignored.

  • Type: string
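
As a rough illustration of how the Reaction Constraint and Custom Reaction Constraint fields interact, the sketch below resolves the two values into a list of reaction names. The function name `resolve_reactions` is hypothetical. The text does not say what happens when ‘Custom’ is selected together with other reactions, so the sketch simply uses the custom list whenever ‘Custom’ is selected; that behavior is an assumption.

```python
def resolve_reactions(rxnlist, customrxnlist=""):
    """Return the reaction names to process.

    rxnlist: selections from the Reaction Constraint field.
    customrxnlist: comma-delimited names from the Custom Reaction
    Constraint field; ignored unless 'Custom' is selected.
    """
    if "Custom" in rxnlist:
        # Split on commas and drop surrounding whitespace and empty entries.
        return [name.strip() for name in customrxnlist.split(",") if name.strip()]
    return list(rxnlist)


custom = resolve_reactions(["Custom"], "Negishi, Stille ,Wittig")
ignored = resolve_reactions(["Suzuki_cross_coupling"], "Negishi")
```

Here `custom` holds the three names from the comma-delimited string, while `ignored` holds only the selected reaction because the custom field is not consulted.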

Filter Ring Forming Reactions (filter_ringforming_flag): Toggle On to filter out ring-forming reactions. Off is not currently supported.

  • Type: boolean

  • Default: True

  • Choices: [True, False]

Use Deduped Reagents (if available) (use_deduped_reagents_flag):

  • Type: boolean

  • Default: False

  • Choices: [True, False]

Outputs

Temporary Collection (temporary_collection): This collection is created by the floe for internal use during the run and is automatically deleted by the floe when it finishes.

  • Type: collection_sink

  • Default: Temporary Collection

ROCS X 3D Library Collection (out_coll): The name of the output collection for the 3D library.

  • Required

  • Type: collection_sink

  • Default: ROCS X 3D Library

Synthon Failures Collection (synthon_failures_out): The name of the output collection for synthon prep failures.

  • Type: collection_sink

  • Default: ROCS X Prepare 3D Library Failures (Synthons)

Product Failures Collection (product_failures_out): The name of the output collection for product prep failures.

  • Type: collection_sink

  • Default: ROCS X Prepared 3D Library Failures (Products)

Options: Filtering

Max Molecular Weight (mw_max): Molecules with molecular weight greater than this value will be filtered out. If unspecified, this cube will not filter out molecules with high molecular weight.

  • Type: decimal

  • Default: 500

Min Molecular Weight (mw_min): Molecules with molecular weight less than this value will be filtered out. If unspecified, this cube will not filter out molecules with low molecular weight.

  • Type: decimal

Max Rotatable Bond Count (rot_bond_max): Molecules with rotatable bond count greater than this value will be filtered out. If unspecified, this cube will not filter out molecules with high rotatable bond count.

  • Type: integer

  • Default: 15

Min Rotatable Bond Count (rot_bond_min): Molecules with rotatable bond count less than this value will be filtered out. If unspecified, this cube will not filter out molecules with low rotatable bond count.

  • Type: integer

Max Count Undefined Atom Stereo (atom_stereo_max): Molecules with more undefined atom stereocenters than this value will be filtered out. If unspecified, this cube will not filter out molecules based on their count of undefined atom stereocenters.

  • Type: integer

  • Default: 3

Max Count Undefined Bond Stereo (bond_stereo_max): Molecules with more undefined stereo bonds than this value will be filtered out. If unspecified, this cube will not filter out molecules based on their count of undefined stereo bonds.

  • Type: integer

  • Default: 3
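
The filtering options above share one pattern: a bound applies only when it is specified, and a molecule is removed only when its value is strictly beyond the bound. A minimal sketch of that logic follows. The dictionary keys, the function names, and the idea of passing precomputed properties are assumptions for illustration; the floe computes these properties itself.

```python
def in_range(value, minimum=None, maximum=None):
    """Return True if value is within the bounds that are specified."""
    if minimum is not None and value < minimum:
        return False
    if maximum is not None and value > maximum:
        return False
    return True


# (min, max) per property, using the defaults listed above.
# None means the bound is unspecified and is therefore not applied.
DEFAULT_LIMITS = {
    "mw": (None, 500),
    "rot_bonds": (None, 15),
    "atom_stereo": (None, 3),
    "bond_stereo": (None, 3),
}


def passes_filters(props, limits=DEFAULT_LIMITS):
    """Check every property of one molecule against its limits."""
    return all(in_range(props[key], lo, hi) for key, (lo, hi) in limits.items())


kept = passes_filters({"mw": 500.0, "rot_bonds": 4, "atom_stereo": 0, "bond_stereo": 0})
dropped = passes_filters({"mw": 500.1, "rot_bonds": 4, "atom_stereo": 0, "bond_stereo": 0})
```

A molecule whose value equals a bound is kept, since only values greater than a maximum or less than a minimum are filtered out.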

Options: Sample Products

Enable Generate Conformers (gen_confs_flag):

  • Type: boolean

  • Default: True

  • Choices: [True, False]

Enable Sample Products (sample_products_flag):

  • Type: boolean

  • Default: True

  • Choices: [True, False]

Product Normalization (protomer_prep_mode): Tautomer and ionization state normalization applied to products.

  • Type: string

  • Default: Get reasonable protomer and set neutral pH

  • Choices: [‘Get reasonable protomer and set neutral pH’, ‘Set neutral pH’, ‘None’]

Number of Synthon Pairs (num_rand_pairs): The number of random complementary components to pair with each synthon. The sampling method ensures all synthons are represented at least once. At the default value (1), the number of sampled products will be equal to the number of synthons. Warning: Increasing this from the default value (1) will significantly increase the cost of running the floe.

  • Type: integer

  • Default: 1

Pairing Attempts (while_max_count): The maximum number of times to attempt synthon pairing when sampling products. If this number of attempts is exceeded, the synthon will be skipped during product sampling.

  • Type: integer

  • Default: 10
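
The interplay of Number of Synthon Pairs and Pairing Attempts can be sketched as a bounded retry loop. This is a hedged illustration only. The text does not state what makes a pairing attempt fail, so the sketch takes a caller-supplied predicate, `is_acceptable`, which is hypothetical, as is the function name `pair_with_retries`.

```python
import random


def pair_with_retries(synthon, partners, is_acceptable, rng,
                      num_rand_pairs=1, while_max_count=10):
    """Try to build num_rand_pairs pairs for one synthon.

    Each pair gets at most while_max_count random draws from the
    complementary list; if none is acceptable, that pair is skipped.
    """
    pairs = []
    for _ in range(num_rand_pairs):
        for _attempt in range(while_max_count):
            partner = rng.choice(partners)
            if is_acceptable(synthon, partner):
                pairs.append((synthon, partner))
                break
    return pairs


rng = random.Random(0)
partners = ["B1", "B2", "B3"]
always_ok = pair_with_retries("A1", partners, lambda a, b: True, rng, num_rand_pairs=2)
never_ok = pair_with_retries("A1", partners, lambda a, b: False, rng)
```

Under these assumptions, the number of sampled products grows with the number of synthons multiplied by Number of Synthon Pairs, which is consistent with the warning that raising the value above 1 significantly increases cost.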

Options: Advanced

Validate Conformers (validate_conformers_flag):

  • Type: boolean

  • Default: True

  • Choices: [True, False]

Read Synthon Chunk Size (read_reagent_chunk_size): The number of 2D synthons to process in a chunk.

  • Type: integer

  • Default: 10000

Product Chunk Size (product_chunk_size): The number of 2D unenumerated products to process in a chunk. The maximum value is 2000.

  • Type: integer

  • Default: 500

Synthon Chunk Size (synthon_chunk_size): The number of 2D enumerated synthons to process in a chunk.

  • Type: integer

  • Default: 5000

FastROCS Chunk Size (records_per_shard): The target number of records in a FastROCS shard. The recommended default is 800,000.

  • Type: integer

  • Default: 800000
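
To give a feel for how the chunk-size options divide the work, the sketch below splits a sequence into fixed-size chunks and estimates a shard count from a target shard size. Both helper names are hypothetical, and the shard estimate is only approximate because records_per_shard is described as a target rather than an exact size.

```python
import math
from itertools import islice


def chunked(items, size):
    """Yield successive lists of at most `size` items."""
    iterator = iter(items)
    while chunk := list(islice(iterator, size)):
        yield chunk


def estimate_shards(num_records, records_per_shard=800_000):
    """Approximate the shard count for a given target shard size."""
    return math.ceil(num_records / records_per_shard)


chunks = list(chunked(range(12), 5))
shards = estimate_shards(2_000_000)
```

With these inputs, twelve items split into chunks of five, five, and two, and two million records at the default target size correspond to roughly three shards.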