Reaction & Reagent Database - Reaction Definition Validator
Category Paths
Follow one of these paths in the Orion user interface to find the floe.
Role-based/Cheminformatician/Medicinal Chemistry Support
This floe will perform a series of validations on the provided Reaction & Reagent Database to verify and diagnose issues with the reaction definition file used in the database creation.
Required Inputs:
A Reaction & Reagent Database generated by one of the following floes is required as input: Reaction & Reagent Database - Create from Dataset, Reaction & Reagent Database - Create from SMILES, or Reaction & Reagent Database - Create from ZINC Download.
The validation output is a floe report with validation status for each reaction from the reaction definition file and the classified reagents in the database.
Optional Activities:
Reagent Sampling: If “On”, reagents will be sampled from the full set of reagents in the Reaction & Reagent Database. If “Off”, a limited number of reagents are extracted, generally in reagent registration order. Sampling adds overhead because the full set of reagents must be extracted, but it can give a better overview of the chemical variation among the reagents.
Reaction Validations: This parameter can be used to provide a list of reaction IDs (space- or comma-delimited) for the validation activity, or left blank to process all the reactions from the Reaction & Reagent Database.
Reaction & Reagent Directory: If “On”, generates a floe report listing all the reactions in the database.
Output Generated Products: If “On”, all the valid products generated during the validation will be output to the Generated Products Dataset. If “Off”, just a count of the products generated is provided.
Output Unreacted Reagents: If “On”, reagents that fail to transform to products will be output to the Unreacted Reagents Dataset. If “Off”, just a count of the unreacted reagents is provided.
Output Valence Errors: If “On”, all the products that fail valence validations will be output to the Valence Error Dataset. If “Off”, just a count of the valence errors is provided.
Max Reagents: For a reaction with N components, the number of validations performed is this value raised to the Nth power. Generally a value less than 1,000 is recommended for 2-component reactions.
Strict Valences: This parameter modifies the behavior of the selected Check Valences option: valence errors are either rejected outright or rejected only after a valence repair has been attempted.
Max [Pass|Valence|Fail] Results: Caps the number of output structures for each category.
Sample Seed: A specific random seed (generally a 6-digit integer) can be provided to allow reproducibility of the floe when Reagent Sampling is employed.
Logging Verbosity: Generally only warning level verbosity is recommended, but specific problems with the floe may need information level reporting for support tickets.
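The effect of Max Reagents on run size can be estimated with a short calculation. The following is a minimal Python sketch, not part of the floe: the function name validation_count is hypothetical, and it assumes the same Max Reagents value applies to every component of a reaction, as the description above suggests.

```python
def validation_count(max_reagents: int, n_components: int) -> int:
    """Estimate the reagent combinations validated for one reaction.

    Assumption: each of the N components contributes up to
    `max_reagents` reagents, giving max_reagents ** N combinations.
    """
    if max_reagents < 1 or n_components < 1:
        raise ValueError("both arguments must be positive integers")
    return max_reagents ** n_components


# With the default Max Reagents of 10:
print(validation_count(10, 2))  # 2-component reaction
print(validation_count(10, 3))  # 3-component reaction
```

Because the count grows as a power of the number of components, a modest increase in Max Reagents can multiply the work considerably for reactions with three or more components.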
Promoted Parameters
Title in user interface (promoted name)
Reaction & Reagent Database (rxndb): The name of the Reaction & Reagent Database to use.
Type: file_in
Reaction Selection (reactions): One or more reaction selections from the sample reaction database list or All for every reaction.
Type: string
Default: [‘All’]
Choices: [‘All’, ‘3-nitrile-pyridine’, ‘Buchwald-Hartwig’, ‘Buchwald_cross_coupling1’, ‘Buchwald_cross_coupling2’, ‘Ester_hydrolysis-amide_synthesis1’, ‘Ester_hydrolysis-amide_synthesis2’, ‘Grignard_alcohol’, ‘Grignard_carbonyl’, ‘Heck_non-terminal_vinyl’, ‘Heck_terminal_vinyl’, ‘Huisgen_disubst-alkyne’, ‘Mitsunobu_imide’, ‘Mitsunobu_phenol’, ‘Mitsunobu_sulfonamide’, ‘Mitsunobu_tetrazole_1’, ‘Mitsunobu_tetrazole_2’, ‘N-alkylation1’, ‘N-alkylation2’, ‘N-arylation_heterocycles’, ‘Negishi’, ‘Niementowski_quinazoline’, ‘O-alkylation’, ‘O-biarylation’, ‘Pictet-Spengler’, ‘Reductive_amination1’, ‘Reductive_amination2’, ‘Schotten-Baumann_amide’, ‘SnAr1’, ‘SnAr2’, ‘Sonogashira’, ‘Stille’, ‘Suzuki_cross_coupling’, ‘Wittig’, ‘benzimidazole_derivatives_aldehyde’, ‘benzimidazole_derivatives_carboxylic-acid/ester’, ‘benzofuran’, ‘benzothiazole’, ‘benzothiophene’, ‘benzoxazole_arom-aldehyde’, ‘benzoxazole_carboxylic-acid’, ‘decarboxylative_coupling’, ‘heteroaromatic_nuc_sub’, ‘imidazole’, ‘indole’, ‘nucl_sub_aromatic_ortho_nitro’, ‘nucl_sub_aromatic_para_nitro’, ‘oxadiazole’, ‘phthalazinone’, ‘piperidine_indole’, ‘pyrazole’, ‘spiro-chromanone’, ‘sulfon_amide’, ‘tetrazole_connect_regioisomere_1’, ‘tetrazole_connect_regioisomere_2’, ‘tetrazole_terminal’, ‘thiazole’, ‘triaryl-imidazole’, ‘urea’]
Custom Reaction Names (customreactions): One or more reactions from the Reaction & Reagent Database (blank delimited). Any value for this parameter supersedes a Reaction Selection above.
Type: string
Max Reagents (maxreagents): Limit the number of validation reagents to this value. For a 2-component reaction, the number of reaction validations performed is the square of this value.
Type: integer
Default: 10
Reaction & Reagent Directory (rxndir): If ON, generates a directory listing for the Reaction & Reagent Database.
Type: boolean
Default: True
Choices: [True, False]
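The precedence between Reaction Selection and Custom Reaction Names can be illustrated with a small Python sketch. It is only an illustration of the rule stated above (any custom value supersedes the selection), not the floe's code: the function resolve_reactions is hypothetical, and splitting on both blanks and commas is an assumption made for convenience.

```python
import re


def resolve_reactions(selection, custom=""):
    """Return the reaction names to validate, or None for every reaction.

    selection: list of names from the Reaction Selection choices.
    custom: free-text Custom Reaction Names value (may be empty).
    """
    # Assumption: names are separated by blanks and/or commas.
    names = [n for n in re.split(r"[,\s]+", custom.strip()) if n]
    if names:
        # Any custom value supersedes the Reaction Selection.
        return names
    if "All" in selection:
        return None  # None stands for "every reaction in the database"
    return list(selection)


print(resolve_reactions(["All"]))
print(resolve_reactions(["Negishi"], "Stille Wittig"))
```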
Output Product Options
Output Generated Products (pass): If OFF, just counts product records, but does not output them.
Type: boolean
Default: True
Choices: [True, False]
Max Pass Results (maxpass): Output limited to this number of passing validation results for each reaction validated, or 0 for all.
Type: integer
Default: 10
Generated Products Dataset (output): Output dataset containing generated products.
Type: dataset_out
Default: Generated_products
Output Failure Options
Output Unreacted Reagents (fail): If OFF, just counts unreacted reagents, but does not output them.
Type: boolean
Default: False
Choices: [True, False]
Max Fail Results (maxfail): Output limited to this number of failure results for each reaction validated, or 0 for all.
Type: integer
Default: 10
Unreacted Reagents Dataset (failures): Output dataset containing input failures and reagents that failed to react.
Type: dataset_out
Default: Unreacted_reagents
Output Valence Failure Options
Output Valence Errors (valfail): If OFF, just counts valence errors, but does not output them.
Type: boolean
Default: False
Choices: [True, False]
Max Valence Errors (maxvalerr): Output limited to this number of valence errors for each reaction validated, or 0 for all.
Type: integer
Default: 10
Valence Error Dataset (valfailures): Output dataset containing products with valence errors.
Type: dataset_out
Default: Valence_errors
Output Half-reaction Failure Options
Output Half Rxn Errors (halfrxnfail): If OFF, just counts half reaction validation errors, but does not output them.
Type: boolean
Default: False
Choices: [True, False]
Max Half Reaction Errors (maxhalfrxnerr): Output limited to this number of half reaction errors for each reaction validated, or 0 for all.
Type: integer
Default: 10
Half Rxn Error Dataset (halfrxnfailures): Output dataset with half reaction validation errors.
Type: dataset_out
Default: Halfrxn_errors
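Each Max ... Results and Max ... Errors parameter above follows the same convention: a positive value limits the output per reaction, and 0 means no limit. A minimal Python sketch of that convention follows; the helper cap_results is hypothetical, and keeping the first records encountered is an assumption.

```python
from itertools import islice


def cap_results(records, max_results):
    """Keep at most `max_results` records; 0 keeps all of them."""
    if max_results < 0:
        raise ValueError("max_results must be 0 or a positive integer")
    if max_results == 0:
        return list(records)
    # Assumption: the first records encountered are the ones kept.
    return list(islice(records, max_results))


products = [f"product_{i}" for i in range(25)]
print(len(cap_results(products, 10)))  # capped at the default of 10
print(len(cap_results(products, 0)))   # 0 keeps every record
```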
Advanced Options
Strict Valences (strictval): If Check Valences is active, any valence issues found after the transformation is applied terminate further application.
Type: boolean
Default: True
Choices: [True, False]
Check Valences (valence): How to handle valence issues for the generated products.
Type: string
Default: reject
Choices: [‘ignore’, ‘fix’, ‘reject’]
Sample Seed (seed): Uses the specified seed for any sampling activities.
Type: integer
Reagent Sampling (reagsampling): If ON (slower), will sample the number of reagents specified by the Max Reagents parameter from the full set of reagents. If OFF (faster), the first Max Reagents reagents will be used for the validation.
Type: boolean
Default: False
Choices: [True, False]
Classifier Memory Limit (classifiermem): The memory limit for the reaction classifier. It may need to be increased for large R&R Databases.
Type: decimal
Default: 10240
Logging Verbosity (verbose): How much logging output to generate during validation activities.
Type: string
Default: warning
Choices: [‘info’, ‘warning’, ‘error’, ‘debug’, ‘ddebug’]
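The interaction of Reagent Sampling, Max Reagents, and Sample Seed can be sketched in Python as follows. This is an illustration under stated assumptions rather than the floe's implementation: the function select_reagents is hypothetical, and uniform random sampling without replacement is assumed because the sampling method is not documented here.

```python
import random


def select_reagents(reagents, max_reagents, sampling=False, seed=None):
    """Pick the reagents used for validation.

    sampling=False: take the first `max_reagents` in their given order.
    sampling=True: draw `max_reagents` at random from the full set;
    passing the same `seed` reproduces the same draw.
    """
    reagents = list(reagents)
    if len(reagents) <= max_reagents:
        return reagents
    if not sampling:
        return reagents[:max_reagents]
    rng = random.Random(seed)  # private generator, global state untouched
    return rng.sample(reagents, max_reagents)


pool = [f"reagent_{i}" for i in range(1000)]
first = select_reagents(pool, 10)
draw_a = select_reagents(pool, 10, sampling=True, seed=123456)
draw_b = select_reagents(pool, 10, sampling=True, seed=123456)
print(first[:3])
print(draw_a == draw_b)  # the same seed gives the same sample
```

Recording the seed alongside the floe report makes it possible to rerun the validation on the same sampled reagents.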