Tutorial: Focused Library – Core Input with Multiple Attachment Points
The Focused Library – Core Input Floe is one of several floes that can be used for generative design. The input is a molecule core that can have up to four attachment points. This floe enumerates a focused library by processing only a half-reaction on qualified reagents in the database for a particular reagent and then joining at a defined site on a core.
Enumerate Reagents for an Input Core Using a Built Database: Multiple Site Enumeration
This section works with two attachment points and assumes that you already have the prepared database from the Create and Inspect a Reaction and Reagent Database from a SMILES File tutorial. If you want to follow this section without preparing your own database, you can use the OpenEye databases provided in the Organization Data folder.
Choose the Focused Library – Core Input Floe. Click “Launch Floe” to bring up the Job Form. The parameters can be specified as below.
Input Parameters
Library Core Input (Library Core):
Click “Choose Input” to bring up the Sketcher window. Paste the SMILES string c1cc(ccc1C(=O)O)S(=O)(=O)Nc2ccc(c(c2)C#N)Oc3ccc(c(c3)C(F)(F)F)Cl into the Sketcher.
To define the enumeration sites, attach the R groups by hovering over OH and Cl and pressing R on the keyboard until the atoms are replaced by R1 and R2, respectively. See the core with the attachment points defined in Figure 1.
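To make the two sites concrete, the following minimal sketch marks the same atoms in the SMILES string directly. It is illustrative only: the Sketcher sets R1 and R2 interactively, and the atom-mapped wildcard notation ([*:1], [*:2]) used here is an assumption for illustration, not necessarily the representation the floe uses internally. The helper name mark_sites is hypothetical.

```python
# Illustrative only: shows which atoms of the tutorial core become R1 and R2.
# The [*:n] wildcard notation is an assumption, not the floe's documented format.
CORE = "c1cc(ccc1C(=O)O)S(=O)(=O)Nc2ccc(c(c2)C#N)Oc3ccc(c(c3)C(F)(F)F)Cl"


def mark_sites(smiles: str) -> str:
    """Replace the acid OH with R1 and the terminal Cl with R2 (hypothetical helper)."""
    acid = "C(=O)O)"  # carboxylic acid as written in this particular SMILES
    if smiles.count(acid) != 1 or not smiles.endswith("Cl"):
        raise ValueError("expected exactly one acid group and a terminal Cl")
    # R1: the hydroxyl oxygen of the acid becomes an attachment point.
    marked = smiles.replace(acid, "C(=O)[*:1])")
    # R2: the aryl chloride at the end of the string becomes an attachment point.
    return marked[:-2] + "[*:2]"


MARKED_CORE = mark_sites(CORE)
print(MARKED_CORE)
```

The string edit works only because of how this particular SMILES is written; a cheminformatics toolkit would be the appropriate tool for general cores.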
Reaction & Reagent Database (Append R4): Select the Mcule reaction and reagent database that you created previously as the Reaction & Reagent database.
Product Dataset (Products Dataset): Title the output dataset “Core_Input_MultipleSites.”
Figure 1. Molecule core sketch with enumeration sites identified as R1 and R2.
R1 Site: These are the enumeration-related parameters for R1. The reagents that will react and join with our core can be selected here. For this tutorial, we will make amides at the selected site, with the core being a carboxylic acid as one component of the Schotten-Baumann reaction, and the building blocks being the amine that will join with the core.
R1 Reagent Selection: Select Schotten-Baumann_amide:Amines.
R2 Site: We will be making aryl amines by reacting an aryl halide (our core) with an amine building block from our database. The functional group named alongside the reaction in the reagent specification (here, Amines) is the functional group that must be present in the screened building blocks.
R2 Reagent Name: Select Buchwald_cross_coupling1:Amines.
Maximum Reagents: The number of records expected in the output dataset is the product of the numbers of Schotten-Baumann_amide:Amines and Buchwald_cross_coupling1:Amines reagents used (R1 Maximum Reagents × R2 Maximum Reagents). To avoid a combinatorial explosion, set the Maximum Reagents parameters so that they generate at most one million compounds, or 100K if you intend to visualize the results in the Analyze page. In this tutorial, 100K results will be generated by entering 500 for R1 Maximum Reagents and 200 for R2 Maximum Reagents. See the input parameters in Figure 2.
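The arithmetic behind these settings can be checked with a short sketch. The function names and the messages are hypothetical; the 100K and 1M figures come from this tutorial. The result is an upper bound, since fewer products are possible if the database holds fewer qualifying reagents than the cap.

```python
# Sketch of the product-count arithmetic; names here are hypothetical.
ANALYZE_LIMIT = 100_000    # suggested ceiling for viewing in the Analyze page
OVERALL_LIMIT = 1_000_000  # suggested overall ceiling from this tutorial


def expected_products(*site_caps: int) -> int:
    """Upper bound on enumerated products: the product of the per-site caps."""
    total = 1
    for cap in site_caps:
        if cap < 1:
            raise ValueError("each Maximum Reagents value must be at least 1")
        total *= cap
    return total


def describe(total: int) -> str:
    """Classify a product count against the two ceilings."""
    if total <= ANALYZE_LIMIT:
        return "suitable for the Analyze page"
    if total <= OVERALL_LIMIT:
        return "within 1M; filter before visualizing"
    return "exceeds 1M; lower the Maximum Reagents values"


total = expected_products(500, 200)  # R1 cap, R2 cap used in this tutorial
print(total, "-", describe(total))
```

With the tutorial values, 500 × 200 gives 100,000 products, which sits exactly at the Analyze page ceiling.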
Figure 2. Input parameters for the R1 Site and R2 Site parameters from the Focused Library – Core Input Floe.
Enumeration Options: Enumeration-related parameters.
Max Products: This parameter should always be equal to or greater than the maximum possible number of enumerated products. Given that the default is 1M and a maximum of 100K products will be enumerated with the settings we chose, it is not necessary to change the default input number. This parameter is a threshold, not a limit. If the input value is lower than the possible number of enumerated products, the floe will fail.
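The difference between a threshold and a limit can be sketched as follows. This is a model of the behavior described above, not the floe's actual implementation, and check_max_products is a hypothetical name: a limit would truncate the output, whereas a threshold rejects the job outright.

```python
# Model of the described threshold behavior; not the floe's actual code.
def check_max_products(possible: int, max_products: int = 1_000_000) -> None:
    """Fail, rather than truncate, when the possible count exceeds Max Products."""
    if possible > max_products:
        raise RuntimeError(
            f"{possible} possible products exceed Max Products ({max_products})"
        )


check_max_products(500 * 200)  # 100K is under the 1M default, so this passes

try:
    check_max_products(500 * 200, max_products=50_000)
except RuntimeError as err:
    print("floe would fail:", err)
```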
Focused Library Filtering Options: Parameters controlling filtering of the products for the focused library generation.
Filter Products: Toggle the switch to Off. Given the high molecular weight that results from enumerating two sites simultaneously, many products will not pass the Blockbuster filter. If no products pass the filter, the floe will fail. For more information about the OpenEye default filters, see the OpenEye filter documentation. Input parameters are shown in Figure 3.
Focused Library Property Generation: These parameters specify the computed molecular properties for the products from the focused library generation.
Compute Product Properties: To save effort in postprocessing, select XlogP, TPSA, and MolWeight.
Click “Start Job” to begin the floe.
Figure 3. Input parameters for the Focused Library – Core Input Floe.
Once the job is complete, activate the generated dataset and load it in the Analyze page (Figure 4). If you have chosen to generate between 100K and 1M records, you can use the Dataset Filtering – Custom or Built-in Filter Types Floe, which allows the use of OpenEye filters and/or custom filters.
Figure 4. The 3D & Analyze page showing the scatterplot (XLogP vs Molecular Weight) and spreadsheet with the products enumerated.