Reaction & Reagent Database - Reaction Definition Validator
Category Paths
Follow one of these paths in the Orion user interface to find the floe.
Role-based/Cheminformatician/Medicinal Chemistry Support
Description
This floe runs a series of validations on the provided Reaction & Reagent Database to verify and diagnose issues with the reaction definition file used to create the database.
Required Inputs:
A Reaction & Reagent Database is required as input. It must have been generated by one of the following floes: Reaction & Reagent Database - Create from Dataset, Reaction & Reagent Database - Create from SMILES, or Reaction & Reagent Database - Create from ZINC Download.
The validation output is a floe report with validation status for each reaction from the reaction definition file and the classified reagents in the database.
Optional Activities:
Reagent Sampling: If “On”, reagents will be sampled from the full set of reagents in the Reaction & Reagent Database. If “Off”, a limited number of reagents are extracted, generally in reagent registration order. Sampling adds overhead because the full set of reagents must be extracted, but it can provide a better overview of the variations in the chemistry of the reagents.
Reaction Validations: This parameter can be used to provide a list of reaction IDs (space- or comma-delimited) for the validation activity, or left blank to process all the reactions from the Reaction & Reagent Database.
Reaction & Reagent Directory: If “On”, a floe report listing all the reactions in the database is generated.
Output Generated Products: If “On”, all the valid products generated during the validation will be output to the Generated Products Dataset. If “Off”, just a count of the products generated is provided.
Output Unreacted Reagents: If “On”, reagents that fail to transform to products will be output to the Unreacted Reagents Dataset. If “Off”, just a count of the unreacted reagents is provided.
Output Valence Errors: If “On”, all the products that fail valence validations will be output to the Valence Error Dataset. If “Off”, just a count of the valence errors is provided.
Max Reagents: For an N-component reaction, this value raised to the Nth power is the number of reagent combinations used for the validation. A value below 1,000 is generally recommended for 2-component reactions.
Strict Valences: This parameter modifies the behavior of the selected Check Valences setting by rejecting valence errors outright or by rejecting valence errors after a valence repair is attempted.
Max [Pass|Valence|Fail] Results: Caps the number of output structures for each category.
Sample Seed: A specific random seed (generally a 6-digit integer) can be provided to allow reproducibility of the floe when Reagent Sampling is employed.
Logging Verbosity: Generally only warning level verbosity is recommended, but specific problems with the floe may need information level reporting for support tickets.
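The Max Reagents description above implies that the validation workload grows as a power of the number of reaction components. The following is a minimal Python sketch of that arithmetic only. The function name validation_count is hypothetical and is not part of the floe; it simply evaluates "Max Reagents to the Nth power" as stated above.

```python
def validation_count(max_reagents: int, n_components: int) -> int:
    """Upper bound on reagent combinations tried for one reaction.

    Hypothetical helper: evaluates max_reagents ** n_components,
    following the Max Reagents description in this document.
    """
    if max_reagents < 1 or n_components < 1:
        raise ValueError("both arguments must be positive integers")
    return max_reagents ** n_components


# With the documented default of 10 reagents:
for n in (1, 2, 3):
    print(n, validation_count(10, n))
# prints:
# 1 10
# 2 100
# 3 1000
```

Under this reading, a Max Reagents value of 1,000 for a 2-component reaction already corresponds to 1,000,000 combinations, which is why smaller values are recommended.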
Promoted Parameters
Title in user interface (promoted name)
Inputs
Reaction & Reagent Database (rxndb): The name of the Reaction & Reagent Database to use.
Required
Type: file_in
Reaction Selection (reactions): One or more reaction selections from the sample reaction database list or All for every reaction.
Type: string
Default: [‘All’]
Choices: [‘All’, ‘3-nitrile-pyridine’, ‘Buchwald-Hartwig’, ‘Buchwald_cross_coupling1’, ‘Buchwald_cross_coupling2’, ‘Ester_hydrolysis-amide_synthesis1’, ‘Ester_hydrolysis-amide_synthesis2’, ‘Grignard_alcohol’, ‘Grignard_carbonyl’, ‘Heck_non-terminal_vinyl’, ‘Heck_terminal_vinyl’, ‘Huisgen_disubst-alkyne’, ‘Mitsunobu_imide’, ‘Mitsunobu_phenol’, ‘Mitsunobu_sulfonamide’, ‘Mitsunobu_tetrazole_1’, ‘Mitsunobu_tetrazole_2’, ‘N-alkylation1’, ‘N-alkylation2’, ‘N-arylation_heterocycles’, ‘Negishi’, ‘Niementowski_quinazoline’, ‘O-alkylation’, ‘O-biarylation’, ‘Pictet-Spengler’, ‘Reductive_amination1’, ‘Reductive_amination2’, ‘Schotten-Baumann_amide’, ‘SnAr1’, ‘SnAr2’, ‘Sonogashira’, ‘Stille’, ‘Suzuki_cross_coupling’, ‘Wittig’, ‘benzimidazole_derivatives_aldehyde’, ‘benzimidazole_derivatives_carboxylic-acid/ester’, ‘benzofuran’, ‘benzothiazole’, ‘benzothiophene’, ‘benzoxazole_arom-aldehyde’, ‘benzoxazole_carboxylic-acid’, ‘decarboxylative_coupling’, ‘heteroaromatic_nuc_sub’, ‘imidazole’, ‘indole’, ‘nucl_sub_aromatic_ortho_nitro’, ‘nucl_sub_aromatic_para_nitro’, ‘oxadiazole’, ‘phthalazinone’, ‘piperidine_indole’, ‘pyrazole’, ‘spiro-chromanone’, ‘sulfon_amide’, ‘tetrazole_connect_regioisomere_1’, ‘tetrazole_connect_regioisomere_2’, ‘tetrazole_terminal’, ‘thiazole’, ‘triaryl-imidazole’, ‘urea’]
Custom Reaction Names (customreactions): One or more reactions from the Reaction & Reagent Database (blank delimited). Any value for this parameter supersedes a Reaction Selection above.
Type: string
Max Reagents (maxreagents): Limit the number of validation reagents to this value. For a 2-component reaction, the number of reaction validations performed is the square of this value.
Type: integer
Default: 10
Reaction & Reagent Directory (rxndir): If ON, generates a directory listing for the Reaction & Reagent Database.
Type: boolean
Default: True
Choices: [True, False]
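The interaction between Reaction Selection and Custom Reaction Names described above can be sketched as follows. This is an illustration only, not the floe's implementation: the function resolve_reactions is hypothetical, and accepting both blanks and commas as delimiters is an assumption based on the delimiters mentioned in this document.

```python
import re


def resolve_reactions(selection, custom=""):
    """Return the reaction names to validate, or None for every reaction.

    Hypothetical helper. A non-empty custom string supersedes the
    selection list, mirroring the Custom Reaction Names description.
    """
    # Split on runs of whitespace and/or commas (assumed delimiters).
    names = [tok for tok in re.split(r"[\s,]+", custom.strip()) if tok]
    if names:
        return names
    if not selection or "All" in selection:
        return None  # None stands for "process all reactions"
    return list(selection)


print(resolve_reactions(["All"]))                        # None
print(resolve_reactions(["Negishi", "Stille"]))          # ['Negishi', 'Stille']
print(resolve_reactions(["Negishi"], "Wittig, urea"))    # ['Wittig', 'urea']
```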
Output Product Options
Output Generated Products (pass): If OFF, just counts product records, but does not output them.
Required
Type: boolean
Default: True
Choices: [True, False]
Max Pass Results (maxpass): Output limited to this number of passing validation results for each reaction validated, or 0 for all.
Type: integer
Default: 10
Generated Products Dataset (output): Output dataset containing generated products.
Required
Type: dataset_out
Default: Generated_products
Output Failure Options
Output Unreacted Reagents (fail): If OFF, just counts unreacted reagents, but does not output them.
Required
Type: boolean
Default: False
Choices: [True, False]
Max Fail Results (maxfail): Output limited to this number of failure results for each reaction validated, or 0 for all.
Type: integer
Default: 10
Unreacted Reagents Dataset (failures): Output dataset containing input failures and reagents that failed to react.
Required
Type: dataset_out
Default: Unreacted_reagents
Output Valence Failure Options
Output Valence Errors (valfail): If OFF, just counts valence errors, but does not output them.
Required
Type: boolean
Default: False
Choices: [True, False]
Max Valence Errors (maxvalerr): Output limited to this number of valence errors for each reaction validated, or 0 for all.
Type: integer
Default: 10
Valence Error Dataset (valfailures): Output dataset containing products with valence errors.
Required
Type: dataset_out
Default: Valence_errors
Output Half-reaction Failure Options
Output Half Rxn Errors (halfrxnfail): If OFF, just counts half reaction validation errors, but does not output them.
Required
Type: boolean
Default: False
Choices: [True, False]
Max Half Reaction Errors (maxhalfrxnerr): Output limited to this number of half reaction errors for each reaction validated, or 0 for all.
Type: integer
Default: 10
Half Rxn Error Dataset (halfrxnfailures): Output dataset with half reaction validation errors.
Required
Type: dataset_out
Default: Halfrxn_errors
Advanced Options
Strict Valences (strictval): If Check Valences is active, any valence issues found after the transformation is applied terminate further application.
Type: boolean
Default: True
Choices: [True, False]
Check Valences (valence): How to handle valence issues for the generated products.
Required
Type: string
Default: reject
Choices: [‘ignore’, ‘fix’, ‘reject’]
Sample Seed (seed): Uses the specified seed for any sampling activities.
Type: integer
Reagent Sampling (reagsampling): If ON (slower), will sample the number of reagents specified by the Max Reagents parameter. If OFF (faster), the first Max Reagents reagents will be used for the validation.
Type: boolean
Default: False
Choices: [True, False]
Classifier Memory Limit (classifiermem): The memory limit for the reaction classifier. It may need to be increased for large R&R Databases.
Required
Type: decimal
Default: 10240
Logging Verbosity (verbose): How much logging output to generate during validation activities.
Type: string
Default: warning
Choices: [‘info’, ‘warning’, ‘error’, ‘debug’, ‘ddebug’]
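The difference between Reagent Sampling ON and OFF, and the role of Sample Seed in reproducibility, can be illustrated with a short Python sketch. This is not the floe's code: select_reagents is a hypothetical function, and the use of Python's random module is an assumption chosen purely for illustration.

```python
import random


def select_reagents(reagents, max_reagents, sampling=False, seed=None):
    """Pick up to max_reagents reagents for validation (hypothetical helper).

    sampling=False: take the first max_reagents in their stored order.
    sampling=True:  draw a random sample; a fixed seed makes it repeatable.
    """
    reagents = list(reagents)
    if not sampling:
        return reagents[:max_reagents]
    rng = random.Random(seed)  # seed=None gives a non-reproducible draw
    k = min(max_reagents, len(reagents))
    return rng.sample(reagents, k)


pool = [f"reagent_{i}" for i in range(100)]

first_n = select_reagents(pool, 10)
run_a = select_reagents(pool, 10, sampling=True, seed=123456)
run_b = select_reagents(pool, 10, sampling=True, seed=123456)

print(first_n[:3])     # ['reagent_0', 'reagent_1', 'reagent_2']
print(run_a == run_b)  # True: the same seed reproduces the same sample
```

In this sketch, sampling has to see the whole pool before choosing, which matches the note that sampling is slower than taking the first Max Reagents entries.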