Focused Library - Synthon Analogs
Category Paths
Follow one of these paths in the Orion user interface to find the floe.
Solution-based/Hit to Lead/Generative Design/Reaction-based Libraries
Task-based/Library Prep & Design/Reaction-based Enumeration
Task-based/Virtual Screening - Structure-Based
Role-based/Medicinal Chemist
Description
This floe performs a single-step retro-synthetic analysis of the input lead molecule(s) and applies the corresponding reaction transformations to generate analog libraries. All applied transforms are provided in the Reaction & Reagent Database.
Required Inputs:
Both the Reaction & Reagent Database and an input lead molecule dataset are required. Sample databases are available as File resources in the Organization Data/OpenEye Data/Generative Design Data folder as 2022_2_ZINC_5K_lowcomplexity.db and 2022_2_ZINC_5K_highinterest.db. The former samples ZINC reagents (for each reagent class in the database) with low molecular complexity values, while the latter contains ZINC reagents with high medchem interest scores.
Required Outputs:
The name of an output dataset should be specified, as the Output Data parameter is “On” by default. See the discussion of prospective runs below.
Optional Activities:
The Molecule ID Field should generally match the source of the input lead molecules for the Reaction & Reagent Database file. In the case of ZINC as the source, zinc_id is the standard structure ID field.
There is a small set of preselected properties, Compute Molecule Properties, that can be computed on the generated products, or this activity can be disabled by removing all the properties from the list.
For prospective and trial activities, setting the Output Data, Output Failures, and Output Specific Failures booleans to “Off” provides counts of the floe outputs without creating dataset(s). This is useful for validating the input options against a specific input lead molecule dataset prior to running a capture run that generates the output dataset(s).
The Check Valences and Strict Valences options control how valence issues in the generated products are handled. Check Valences selects whether such issues are ignored, fixed, or cause rejection, and Strict Valences, when “On”, outputs only products with valid valences.
The Strict Reagent Classification option controls whether lead molecules are classified according to both the required and disallowed chemical features (defined by the Reaction & Reagent Database reactions) or simply by the required features. Turning the strict option “Off” may generate alternate (or even surprising) products due to reactions at undesirable sites.
The Fragmentation Size option constrains the size of the reagents generated by applying the retro-reaction transformation(s). It is specified as a percentage of the heavy-atom count of the input molecule: a smaller value allows smaller reagents, and a larger value requires larger reagents.
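As a rough illustration of the Fragmentation Size constraint, the sketch below computes the smallest fragment, in heavy atoms, that would pass for a given lead molecule. The function name and the round-up behavior are assumptions made for illustration; the floe's exact rounding is not stated in this document.

```python
def min_fragment_heavy_atoms(lead_heavy_atoms: int, fragpercent: int = 40) -> int:
    """Smallest heavy-atom count a fragment needs to pass the size constraint.

    Assumes the threshold is rounded up to a whole atom (hypothetical choice).
    """
    if lead_heavy_atoms < 0 or not 0 <= fragpercent <= 100:
        raise ValueError("expected a non-negative atom count and a percentage in 0..100")
    # Integer ceiling of lead_heavy_atoms * fragpercent / 100, avoiding float error.
    return -(-lead_heavy_atoms * fragpercent // 100)


# A 30-heavy-atom lead with the default 40% needs fragments of at least 12 heavy atoms.
print(min_fragment_heavy_atoms(30))
# Lowering the value to 25% admits smaller fragments (7.5 rounded up to 8).
print(min_fragment_heavy_atoms(30, 25))
```

Under these assumptions, lowering the percentage admits smaller fragments from the retro-synthetic step, while raising it excludes them.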
General Considerations
If specific reagents or reactions are specified, the analysis of the input lead molecule will be restricted to those reactions.
If one or more reagent classes are specified and the retro-synthetic analysis of the input molecule is productive for that reaction, the unspecified reagent of the reaction is kept fixed, and the specified reagent is varied based on sampled reagents from the Reaction & Reagent Database.
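The entries accepted by the Reactions or Reagents parameter follow a `Reaction` or `Reaction:ReagentClass` pattern. The sketch below splits such entries and prints how each one would be interpreted according to the two considerations above. The helper name `split_query_class` is hypothetical and is not part of the floe.

```python
from typing import Optional, Tuple


def split_query_class(entry: str) -> Tuple[str, Optional[str]]:
    """Split 'Reaction' or 'Reaction:ReagentClass' into (reaction, reagent_class)."""
    reaction, sep, reagent_class = entry.partition(":")
    # When no ':' is present, the entry names a reaction only.
    return reaction, (reagent_class if sep else None)


# Example selection using names from the Choices list of this floe.
selection = ["Suzuki_cross_coupling", "Buchwald-Hartwig:Amines"]

for entry in selection:
    reaction, reagent_class = split_query_class(entry)
    if reagent_class is None:
        print(f"{reaction}: analysis restricted to this reaction")
    else:
        print(f"{reaction}: vary {reagent_class}, keep the other reagent fixed")
```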
Promoted Parameters
Title in user interface (promoted name)
Inputs
Lead Molecule Dataset (lead_molecule): A dataset containing the lead molecule(s) to be transformed by reactions from the Reaction & Reagent Database. By default, this dataset is expected to contain ONE lead molecule because of the amplification of product(s) from the floe, but the input limit can be altered in the Advanced Focused Library Options tab. Small input datasets are generally expected.
Required
Type: data_source
Reaction & Reagent Database (rxndb): The name of the Reaction & Reagent Database to use. Sample databases are available as file resources in the Organization Data/OpenEye Data/Generative Design Data folder.
Required
Type: file_in
Outputs
Output Dataset (output): Output dataset containing generated products.
Required
Type: dataset_out
Default: Reaction_products
Output Data (outdata): If OFF, just counts records, but does not output them.
Required
Type: boolean
Default: True
Choices: [True, False]
General Failures (failures): Output dataset containing input failures and reagents that failed to react.
Required
Type: dataset_out
Default: Input_failures
Output Failures (outfails): If OFF, just counts records, but does not output them.
Required
Type: boolean
Default: False
Choices: [True, False]
Specific Product Failures (prodfailures): Output dataset containing specific reagent combinations that failed to react.
Required
Type: dataset_out
Default: Product_failures
Output Specific Failures (outprodfails): If OFF, just counts records, but does not output them.
Required
Type: boolean
Default: False
Choices: [True, False]
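To make the difference between a prospective (count-only) run and a capture run concrete, the sketch below collects the promoted input and output parameters in plain Python dictionaries. The promoted names and defaults are taken from the parameter listing above; the lead dataset name is a hypothetical placeholder, and how these values are passed to Orion (user interface, command line, or API) is not covered here.

```python
import json

# Capture run: output dataset is written (Output Data is On by default).
capture_run = {
    "lead_molecule": "my_lead_dataset",          # hypothetical dataset name
    "rxndb": "2022_2_ZINC_5K_lowcomplexity.db",  # sample database named in the Description
    "output": "Reaction_products",
    "outdata": True,
    "failures": "Input_failures",
    "outfails": False,
    "prodfailures": "Product_failures",
    "outprodfails": False,
}

# Prospective run: same inputs, but every output switch is Off,
# so the floe only reports record counts.
prospective_run = {**capture_run, "outdata": False, "outfails": False, "outprodfails": False}

print(json.dumps(prospective_run, indent=2))
```

Keeping both variants side by side makes it easy to validate the options on a count-only run first and then switch `outdata` back to `True` for the capture run.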
Focused Library Options
Reactions or Reagents (queryclass): A list of reactions and/or reagents for selection of transforms. If this is a list of reagents, the input molecules will be verified against this reagent type, or presumed to be this reagent type if the Verify Classifications switch is OFF.
Type: string
Default: []
Choices: [‘3-nitrile-pyridine’, ‘3-nitrile-pyridine:Diones_2_4’, ‘Buchwald-Hartwig’, ‘Buchwald-Hartwig:Amines’, ‘Buchwald-Hartwig:Halides_aryl’, ‘Buchwald_cross_coupling1’, ‘Buchwald_cross_coupling1:Amines’, ‘Buchwald_cross_coupling1:Aryl_halides’, ‘Buchwald_cross_coupling2’, ‘Buchwald_cross_coupling2:Amines’, ‘Buchwald_cross_coupling2:Aryl_halides’, ‘Ester_hydrolysis-amide_synthesis1’, ‘Ester_hydrolysis-amide_synthesis1:Amines’, ‘Ester_hydrolysis-amide_synthesis1:Esters’, ‘Ester_hydrolysis-amide_synthesis2’, ‘Ester_hydrolysis-amide_synthesis2:Amines’, ‘Ester_hydrolysis-amide_synthesis2:Esters’, ‘Grignard_alcohol’, ‘Grignard_alcohol:Halides_alkyl’, ‘Grignard_alcohol:Ketones_aldehydes’, ‘Grignard_carbonyl’, ‘Grignard_carbonyl:Halides_alkyl_aryl’, ‘Grignard_carbonyl:Nitriles’, ‘Heck_non-terminal_vinyl’, ‘Heck_non-terminal_vinyl:Halide_vinyl_aryls’, ‘Heck_non-terminal_vinyl:Non_terminal_vinyls’, ‘Heck_terminal_vinyl’, ‘Heck_terminal_vinyl:Halide_vinyl_aryls’, ‘Heck_terminal_vinyl:Terminal_vinyls’, ‘Huisgen_disubst-alkyne’, ‘Huisgen_disubst-alkyne:Alkyl_halides_alcohols’, ‘Huisgen_disubst-alkyne:Alkynes_disubstituted’, ‘Mitsunobu_imide’, ‘Mitsunobu_imide:Acetylacetamides’, ‘Mitsunobu_imide:Alcohols_primary_secondary’, ‘Mitsunobu_phenol’, ‘Mitsunobu_phenol:Alcohols_primary_secondary’, ‘Mitsunobu_phenol:Phenols’, ‘Mitsunobu_sulfonamide’, ‘Mitsunobu_sulfonamide:Alcohols_primary_secondary’, ‘Mitsunobu_sulfonamide:Sulfonamides’, ‘Mitsunobu_tetrazole_1’, ‘Mitsunobu_tetrazole_1:Alcohols_primary_secondary’, ‘Mitsunobu_tetrazole_1:Tetrazoles’, ‘Mitsunobu_tetrazole_2’, ‘Mitsunobu_tetrazole_2:Alcohols_primary_secondary’, ‘Mitsunobu_tetrazole_2:Tetrazoles’, ‘N-alkylation1’, ‘N-alkylation1:Amines’, ‘N-alkylation1:Benzyl_halides’, ‘N-alkylation2’, ‘N-alkylation2:Amines’, ‘N-alkylation2:Benzyl_halides’, ‘N-arylation_heterocycles’, ‘N-arylation_heterocycles:Boronic_acids_aryl’, ‘N-arylation_heterocycles:Pyrrole_like_nitrogens’, ‘Negishi’, ‘Negishi:Alkyl_halides_primary1’, 
‘Negishi:Alkyl_halides_primary2’, ‘Niementowski_quinazoline’, ‘Niementowski_quinazoline:Amides_primary’, ‘Niementowski_quinazoline:Aminobenzoic_acids’, ‘O-alkylation’, ‘O-alkylation:Benzyl_halides’, ‘O-alkylation:Phenols’, ‘O-biarylation’, ‘O-biarylation:Aryl_bromides’, ‘O-biarylation:Phenols’, ‘Pictet-Spengler’, ‘Pictet-Spengler:Aldehydes’, ‘Pictet-Spengler:Beta_amino_benzenes’, ‘Reductive_amination1’, ‘Reductive_amination1:Aldehydes’, ‘Reductive_amination1:Amines’, ‘Reductive_amination2’, ‘Reductive_amination2:Aldehydes’, ‘Reductive_amination2:Amines’, ‘Schotten-Baumann_amide’, ‘Schotten-Baumann_amide:Amines’, ‘Schotten-Baumann_amide:Carboxylic_acids’, ‘SnAr1’, ‘SnAr1:Amines’, ‘SnAr1:Heterohalides’, ‘SnAr2’, ‘SnAr2:Amines’, ‘SnAr2:Heterohalides’, ‘Sonogashira’, ‘Sonogashira:Alkynes’, ‘Sonogashira:Bromo_iodo_vinyls_aryls’, ‘Stille’, ‘Stille:Bromo_iodo_vinyls_aryls’, ‘Stille:Halides_aryl’, ‘Suzuki_cross_coupling’, ‘Suzuki_cross_coupling:Aryl_bromides’, ‘Suzuki_cross_coupling:Suzuki_boronics’, ‘Wittig’, ‘Wittig:Alkyl_halides_primary’, ‘Wittig:Ketones_aldehydes’, ‘benzimidazole_derivatives_aldehyde’, ‘benzimidazole_derivatives_aldehyde:Aldehydes’, ‘benzimidazole_derivatives_aldehyde:Aro_6_diamines’, ‘benzimidazole_derivatives_carboxylic-acid/ester’, ‘benzimidazole_derivatives_carboxylic-acid/ester:Aro_6_diamines’, ‘benzimidazole_derivatives_carboxylic-acid/ester:Carboxylic_acids’, ‘benzofuran’, ‘benzofuran:Alkynes’, ‘benzofuran:Halophenols’, ‘benzothiazole’, ‘benzothiazole:Aldehydes’, ‘benzothiazole:Aro_6_thiamines’, ‘benzothiophene’, ‘benzothiophene:Alkynes’, ‘benzothiophene:Halomethiols’, ‘benzoxazole_arom-aldehyde’, ‘benzoxazole_arom-aldehyde:Aminophenols’, ‘benzoxazole_arom-aldehyde:Benzaldehydes’, ‘benzoxazole_carboxylic-acid’, ‘benzoxazole_carboxylic-acid:Aminophenols’, ‘benzoxazole_carboxylic-acid:Carboxylic_acids’, ‘decarboxylative_coupling’, ‘decarboxylative_coupling:Carbonyl_benzoic_acids’, ‘decarboxylative_coupling:Halides_aryl’, ‘heteroaromatic_nuc_sub’, 
‘heteroaromatic_nuc_sub:Amines’, ‘heteroaromatic_nuc_sub:Halo_aryls_activated’, ‘imidazole’, ‘imidazole:Alpha_halo_ketones’, ‘imidazole:Aryl_amidines_guanidines’, ‘indole’, ‘indole:Alkynes’, ‘indole:Haloanilines’, ‘nucl_sub_aromatic_ortho_nitro’, ‘nucl_sub_aromatic_ortho_nitro:Amines’, ‘nucl_sub_aromatic_ortho_nitro:Ortho_nitro_halides’, ‘nucl_sub_aromatic_para_nitro’, ‘nucl_sub_aromatic_para_nitro:Amines’, ‘nucl_sub_aromatic_para_nitro:Para_nitro_halides’, ‘oxadiazole’, ‘oxadiazole:Carboxylic_acids’, ‘oxadiazole:Nitriles’, ‘phthalazinone’, ‘phthalazinone:Hydrazines’, ‘phthalazinone:Ketobenzoic_acids’, ‘piperidine_indole’, ‘piperidine_indole:Indoles’, ‘piperidine_indole:Piperidines’, ‘pyrazole’, ‘pyrazole:Diones_2_4’, ‘pyrazole:Hydrazines’, ‘spiro-chromanone’, ‘spiro-chromanone:Ketophenols’, ‘spiro-chromanone:Piperadone_ketones’, ‘sulfon_amide’, ‘sulfon_amide:Amines’, ‘sulfon_amide:Sulfonyl_chlorides’, ‘tetrazole_connect_regioisomere_1’, ‘tetrazole_connect_regioisomere_1:Alkyl_bromides’, ‘tetrazole_connect_regioisomere_1:Nitriles’, ‘tetrazole_connect_regioisomere_2’, ‘tetrazole_connect_regioisomere_2:Alkyl_bromides’, ‘tetrazole_connect_regioisomere_2:Nitriles’, ‘tetrazole_terminal’, ‘tetrazole_terminal:Nitriles’, ‘thiazole’, ‘thiazole:Alpha_halo_ketones’, ‘thiazole:Thioamides’, ‘triaryl-imidazole’, ‘triaryl-imidazole:Aro_ethane_diones’, ‘triaryl-imidazole:Aroaldehydes’, ‘urea’, ‘urea:Amines’, ‘urea:Isocyanates’]
Custom Reactions or Reagents (customqueryclass): A list of custom reactions and/or reagents for selection of transforms. If this is a list of reagents, the input molecules will be verified against this reagent type, or presumed to be this reagent type if the Verify Classifications switch is OFF. Any specification here supersedes any selection specified by the Reactions or Reagents above.
Type: string
Reaction Applied (rxnid): Name of the string field to identify the reaction.
Type: string
Default: ReactionId
Output Mol Field (outmol): Output molecule field.
Type: field_parameter::mol
Annotate Mol (outmolsmi): Name of the string field for the input molecule SMILES or blank to suppress.
Type: string
Default: OriginalMol
Annotate Rxn (outrxnsmi): Name of the string field for the reaction SMILES or blank to suppress.
Type: string
Default: Reaction
Strict Valences (strictval): If ON, only outputs products with valid valences.
Type: boolean
Default: True
Choices: [True, False]
Strict Reagent Classification (strictreagents): If ON, uses strict reagent classifications. Otherwise relaxes validation to only required chemical features and suppresses validations based on disallowed reagent chemistry features.
Type: boolean
Default: True
Choices: [True, False]
Check Valences (checkval): How to handle valence issues for the generated products.
Required
Type: string
Default: ignore
Choices: [‘ignore’, ‘fix’, ‘reject’]
SMILES Dedupe (dedupesmi): If ON, performs a deduplication of the product SMILES.
Required
Type: boolean
Default: True
Choices: [True, False]
SMILES Dedupe Memory (dedupesmimem): Product deduplication may require significant memory resources. Specify the desired amount in MB.
Type: decimal
Default: 10240
Fragmentation Size (fragpercent): For the retro reaction products, requires generated fragments to be at least this percentage of the heavy atom count of the input.
Type: integer
Default: 40
Focused Library Filtering Options
Filter Products (mol_filtering): Enables molecule filtering of the generated products (see type specified by Product Filter).
Required
Type: boolean
Default: True
Choices: [True, False]
Product Filter (mol_filter_type): Type of molecule filter to apply to the generated analogs.
Type: string
Default: BlockBuster
Choices: [‘Lead’, ‘Drug’, ‘BlockBuster’, ‘BlockBuster+PAINS’, ‘PAINS’, ‘Custom’]
Product Filter Summary Report (mol_filter_summary): If ON, will generate a summary report of the rules that filtered molecules.
Type: boolean
Default: False
Choices: [True, False]
Focused Library Property Generation
Compute Molecule Properties (mol_props): Which molecule properties to calculate.
Type: string
Default: [‘HeavyAtoms’, ‘MedChemInterest’, ‘MolComplexity’, ‘MolWeight’, ‘TPSA’, ‘XLogP’]
Choices: [‘HeavyAtoms’, ‘MedChemInterest’, ‘MolComplexity’, ‘MolWeight’, ‘TPSA’, ‘XLogP’]
Reagent Functional Group Conversions
Allow Functional Group Conversions (funcgroups): If ON, allows functional group translations during input molecule classifications and reagent retrievals.
Type: boolean
Default: True
Choices: [True, False]
Advanced Focused Library Options
Lead Molecule Minimum Records (rec_min): The minimum number of lead molecule records allowed (default: 1). Input lead molecule datasets that do not meet this threshold will terminate the floe. Use 0 to suppress validation.
Type: integer
Default: 1
Lead Molecule Maximum Records (rec_max): The maximum number of lead molecule records allowed (default: 1). Input lead molecule datasets that exceed this threshold will terminate the floe. Use 0 to suppress validation.
Type: integer
Default: 1
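The record-count validation described by the two parameters above can be sketched as follows. This is a minimal illustration of the documented rules (a value of 0 suppresses the corresponding check); the function name is hypothetical and the floe's actual implementation may differ.

```python
def check_record_count(n_records: int, rec_min: int = 1, rec_max: int = 1) -> bool:
    """Return True if the lead dataset size passes both thresholds.

    A threshold of 0 disables that check.
    """
    if rec_min and n_records < rec_min:
        return False  # too few lead molecules: the floe would terminate
    if rec_max and n_records > rec_max:
        return False  # too many lead molecules: the floe would terminate
    return True


print(check_record_count(1))             # defaults accept exactly one lead molecule
print(check_record_count(5))             # five leads exceed the default maximum
print(check_record_count(5, rec_max=0))  # maximum check suppressed
```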
Maximum Reagents (maxreagents): Maximum number of reagents to process.
Type: integer
Default: 100
Sample Reagents (samplereagents): Samples this percentage of the total reagents available. Limited by the (optional) Maximum Reagents total.
Type: integer
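One plausible reading of how Sample Reagents and Maximum Reagents interact is sketched below: the percentage is applied first, and the result is then capped by the maximum. The order of operations, the round-down behavior, the random sampling, and the function name are all assumptions for illustration and are not confirmed by this document.

```python
import random
from typing import List, Optional, Sequence


def sample_reagents(
    reagents: Sequence,
    samplereagents: Optional[int],
    maxreagents: Optional[int] = 100,
    seed: int = 0,
) -> List:
    """Pick a subset of reagents by percentage, then cap it (assumed order)."""
    pool = list(reagents)
    n = len(pool)
    if samplereagents is not None:
        # Assumed: percentage of the total, rounded down.
        n = len(pool) * samplereagents // 100
    if maxreagents:
        # Assumed: Maximum Reagents acts as an upper bound on the sample.
        n = min(n, maxreagents)
    return random.Random(seed).sample(pool, n)


# 10% of 5000 reagents is 500, which the default cap of 100 then limits.
print(len(sample_reagents(range(5000), 10)))
# Without a cap, the full 10% sample is kept.
print(len(sample_reagents(range(5000), 10, maxreagents=None)))
```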
Molecule ID Field (molid): Name of the string field for the molecule ID.
Type: field_parameter::string
Molecule SMILES Field (molsmi): Name of the string field for the molecule SMILES.
Type: field_parameter::string
Classifier Memory Limit (classifiermem): The memory limit for the reaction classifier. It may need to be increased for large R&R Databases.
Required
Type: decimal
Default: 10240
Verbosity (verbosity): Sets the output logging verbosity.
Type: string
Default: warning
Choices: [‘info’, ‘warning’, ‘error’, ‘debug’, ‘ddebug’]