How to Generate Conformers with a Custom Torsion Library

Disclaimer: Running both Floes in this tutorial costs about $2, with some variation based on current AWS pricing. Changing the QM method, the basis set, or the molecule size can change the cost significantly. If you wish to run this tutorial with a much larger molecule than the one provided, or with a non-default method and basis set, you may wish to review How to Run Benchmark Floes for Cost Estimation.

This tutorial uses the following Floes:

  • Psi4 QM Fragmentation and Torsion Scanning

  • Psi4 QM Conformer Ensemble

Create an Input Dataset

You can copy a SMILES string from anywhere and paste it into the Orion Sketcher. The fragmentation and torsion scanning Floe accepts any molecule from the Sketcher or from an existing dataset. This tutorial uses a molecule from the Crystallography Open Database (COD identification number 2216060) with SMILES COC(=O)c1cc([nH]n1)c2ccccc2. You can paste this SMILES into the Orion Sketcher or follow along with your own molecule.

../../../../../_images/images_custom_2216060.png

From the Orion Data page, find the Sketcher and paste the SMILES or draw your own molecule. Keep in mind that Floe costs can change dramatically with molecule size and with the number of rotatable bonds.
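
Before submitting a molecule, it can help to check its size against the heavy-atom guidance given for the default settings later in this tutorial. The sketch below is a rough, regex-based heavy-atom counter for simple SMILES strings. It is an illustration only, not part of Orion or any OpenEye toolkit, and the function name count_heavy_atoms is hypothetical. A cheminformatics toolkit is the more reliable choice for unusual SMILES.

```python
import re

# Rough heavy-atom counter for simple SMILES strings (illustration only).
# Bracket atoms such as [nH] are matched first, then two-letter organic-subset
# symbols (Cl, Br), then single-letter aliphatic and aromatic symbols.
_ATOM = re.compile(r"\[([^\]]+)\]|Cl|Br|[BCNOSPFI]|[bcnosp]")


def count_heavy_atoms(smiles: str) -> int:
    """Count non-hydrogen atoms in a SMILES string (hypothetical helper)."""
    count = 0
    for match in _ATOM.finditer(smiles):
        bracket = match.group(1)
        if bracket is not None:
            # Skip an optional isotope number, then read the element symbol.
            symbol = re.match(r"\d*([A-Za-z][a-z]?)", bracket)
            if symbol and symbol.group(1) == "H":
                continue  # explicit hydrogen such as [H] or [2H]
        count += 1
    return count


tutorial_smiles = "COC(=O)c1cc([nH]n1)c2ccccc2"
print(count_heavy_atoms(tutorial_smiles))  # 15 heavy atoms (C11 N2 O2)
```

At 15 heavy atoms, the tutorial molecule is well inside the range the default memory and CPU settings are described as handling.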

Run the Fragmentation and Torsion Scanning Floe

Locate the Psi4 QM Fragmentation and Torsion Scanning Floe by:

  1. Show “OpenEye QM Psi4 Floes”

  2. Search for “Frag” to isolate the Floe

  3. Click on “Psi4 QM Fragmentation and Torsion Scanning”

../../../../../_images/images_search_frag_floe.png

Parameters to change

Select Input Dataset

For this tutorial we will use default parameters. Therefore, all you need to do is to select your input dataset and rename your outputs (if you desire).

../../../../../_images/images_custom_input.png

Select “Choose input…” and find the dataset you created above, or use the sketch tab at the top right to draw a new molecule.

Specify Output Datasets

../../../../../_images/images_custom_output_names.png

Default output dataset names are always available. One way to keep track of Floe output is to add a prefix to these so you know which job they came from. In this case, the prefix psi4_torsion_tutorial was added to the beginning of each of the outputs generated by this Floe:

  • Torsion Rules Output: Saves the input molecule with the custom torsion rules from the fragment scans as a new field on the record.

  • Fragment Output: The fragments generated around each rotatable bond in the input molecule (3 in this case). Each molecule in the output has a conformer record for each angle in the torsion scan, about 73 in total.

  • Failure Output: Only appears if any calculations fail. Note that during QM torsion scans, very high energy torsions sometimes fail to converge during optimization. Generally, we do not worry about these failures. A report summarizing any failures is generated at the end, which we will investigate.

  • Fragmentation Report Title: This is a Floe Report (HTML file) with the torsion scan results.

Parameters to leave unchanged for tutorial

  • Torsion Increment: Separation between points in the torsion scan, in degrees. Setting this value above 10 degrees is not recommended when you plan to use the custom torsion library.

  • Psi4 Calculation Parameters: This collapsed parameter group includes the parameters for changing the method, basis set, memory, and number of CPUs used by the QM optimization. For each torsion increment in each fragment, a geometry optimization is performed that constrains that torsion while allowing all other degrees of freedom to relax. We have chosen HF-3c (MINIX basis set) as the default method for this protocol because it gives a reliable sense of the energy landscape for drug-like molecules. The memory and CPU defaults should be more than sufficient for most molecules under 30 heavy atoms. If you are working with a much larger molecule (or a different method or basis set), you may need to perform some benchmark calculations first. See Benchmark Torsion Scanning Floes.
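
The number of constrained optimizations grows with both the number of fragments and the fineness of the scan. The sketch below works through that arithmetic. It assumes a full rotation from -180 to 180 degrees with both endpoints kept, and a 5 degree increment. Neither value is stated in this tutorial. Both are inferred from the figure of about 73 conformer records per fragment (360 / 5 + 1 = 73). The function names are hypothetical.

```python
def scan_angles(increment_deg: float, start_deg: float = -180.0):
    """Angles visited by a full 360 degree scan, both endpoints included.

    The start angle and the inclusion of both endpoints are assumptions.
    """
    if increment_deg <= 0 or 360 % increment_deg != 0:
        raise ValueError("increment must be a positive divisor of 360")
    n_steps = int(360 // increment_deg)
    return [start_deg + i * increment_deg for i in range(n_steps + 1)]


def n_constrained_optimizations(n_fragments: int, increment_deg: float) -> int:
    """One constrained geometry optimization per angle, per fragment."""
    return n_fragments * len(scan_angles(increment_deg))


# Assumed 5 degree increment, 3 fragments as in this tutorial.
print(len(scan_angles(5)))                 # 73 scan points per fragment
print(n_constrained_optimizations(3, 5))   # 219 optimizations in total
print(n_constrained_optimizations(3, 10))  # 111 at the 10 degree upper limit
```

Doubling the increment roughly halves the number of optimizations, which is the trade-off behind the 10 degree recommendation above.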

Understand the Torsion Scan Floe Report

Now that the Floe has finished you can see the torsion scanning results in the Floe report. Find the report by:

  1. Navigate to the completed job from the Floes page

  2. Click on the “Floe Report” tab above the Floe diagram

  3. Click on the name of the report you specified above. In this example we used “psi4_torsion_tutorial_Fragmentation_Report.” (If you did not change the default output name, it will just be called “Fragmentation Report”.)

  4. Optionally: click the square with an arrow to open the report in a new tab.

../../../../../_images/images_search_frag_report.png

In the report, you navigate by molecule. The first page is a grid of all molecules used as input to the Floe (only one for this tutorial). You can examine the results by:

  1. Click on the molecule you want to see fragments for

  2. This takes you to a page showing the fragments created around rotatable bonds in this molecule (3 fragments in this example). The sampling rules generated for the molecule are also shown.

    • These sampling rules are not especially human-readable. They are stored on your output dataset so that you can use it as an input to the conformer generating Floes. If you want to understand the rules, each one is a SMARTS pattern for the torsion followed by the angles to be sampled. The rules for different torsions are separated by $$$.

  3. Click on any fragment to see the torsion scan for that bond

../../../../../_images/images_frag_report_list_fragments.png
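
If you want to inspect the sampling rules outside the report, the description above (a SMARTS pattern followed by angles, with rules separated by $$$) suggests a simple parser. The sketch below assumes that the SMARTS and the angles within one rule are separated by whitespace. That detail is an assumption, and the example rule string is made up for illustration rather than taken from real Floe output.

```python
def parse_torsion_rules(rule_text: str):
    """Split a rule string into (smarts, [angles]) pairs.

    Assumes rules are separated by '$$$' and that each rule is a SMARTS
    pattern followed by whitespace-separated angles (an assumed layout).
    """
    rules = []
    for chunk in rule_text.split("$$$"):
        tokens = chunk.split()
        if not tokens:
            continue  # ignore empty pieces, e.g. after a trailing '$$$'
        smarts = tokens[0]
        angles = [float(token) for token in tokens[1:]]
        rules.append((smarts, angles))
    return rules


# Hypothetical rule text, not real Floe output.
example = "[*:1]~[#6:2]-[#6:3]~[*:4] 0 180$$$[*:1]~[#8:2]-[#6:3]~[*:4] 0 60 180"
for smarts, angles in parse_torsion_rules(example):
    print(smarts, angles)
```

If the real field uses a different delimiter inside each rule, only the chunk.split() step needs to change.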

Each fragment page shows the torsion in three ways:

  1. On the fragment, which is the molecule used for the torsion scan,

  2. The same torsion highlighted on the full molecule for context, and

  3. The actual torsion scan, showing the relative QM energy at each angle increment and highlighting the angles chosen for the custom torsion rule.

../../../../../_images/images_frag_report_torsion_scan.png

The browser back button can be used to navigate back to the previous pages.
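
The torsion scan plot reports relative energies, meaning each point is measured against the lowest-energy point in the scan. As a minimal sketch of that conversion, the code below takes absolute energies in Hartree (the unit Psi4 reports) and returns energies relative to the scan minimum in kcal/mol. The choice of kcal/mol for the plot and the input values are assumptions. The numbers are placeholders, not results from this tutorial.

```python
HARTREE_TO_KCAL_PER_MOL = 627.5095  # standard conversion factor


def relative_energies(scan):
    """Map {angle: energy in Hartree} to {angle: kcal/mol above the minimum}."""
    e_min = min(scan.values())
    return {
        angle: (energy - e_min) * HARTREE_TO_KCAL_PER_MOL
        for angle, energy in sorted(scan.items())
    }


# Placeholder energies for illustration only.
scan = {-180: -1.000, -90: -0.996, 0: -1.002, 90: -0.996, 180: -1.000}
for angle, energy in relative_energies(scan).items():
    print(f"{angle:>5} deg  {energy:6.2f} kcal/mol")
```

The lowest point always reads 0.00, so barrier heights can be read directly from the remaining values.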

Generate Conformers with Your Custom Torsion Rules

Locate the Psi4 QM Conformer Ensemble Floe by:

  1. Show “OpenEye QM Psi4 Floes”

  2. Search for “Conf” to isolate the Floe

  3. Click on “Psi4 QM Conformer Ensemble”

../../../../../_images/images_search_conf_floe.png

The conformers generated in this Floe are optimized while constraining torsions around all rotatable bonds. This allows for reasonable energies at the specified level of theory for a very diverse set of conformers. If you would prefer full conformer optimizations, the Psi4 QM Local Minima Search includes unconstrained geometry optimizations and the parameters are nearly identical.

Parameters to change

Select Input Dataset

To use the torsion library created in the last Floe, select the Torsion Rules Output as your input dataset. The default name for that output is torsion_rule_output, which becomes psi4_torsion_tutorial_torsion_rule_output if you used the output names specified in this tutorial.

Specify Output Datasets

As above, we will prefix each output with psi4_torsion_tutorial.

../../../../../_images/images_custom_output_names_conf.png

  1. Intermediate Optimization Output: These records contain molecules that have completed torsion-constrained optimizations but whose single point energy calculations have not yet completed. We save this output as an intermediate. If you need to quit the Floe early, or something goes wrong during the single point energy calculations, this intermediate output preserves the results of the optimization calculations.

  2. Psi4 Conformer Ensemble Output: This is the final conformer ensemble with conformers which fall within your specified energy window.

  3. Failure Output: These are records for calculations that failed during the Floe. This includes conformers that are too high in energy to be included in the final ensemble. Another common failure is for high energy conformers to fail during the constrained optimizations. During conformer generation, we use a force field from the OpenEye Omega Toolkit to evaluate conformer energies, but allow fairly high energy conformers to ensure sufficient sampling.

  4. Conformer Floe Report Name: This is a Floe report which summarizes the conformer ensemble results including a distribution of energies in the final ensemble.

Parameters to leave unchanged for tutorial

  • Conformer Parameters:
    • RMSD Threshold for Conformer Generation: Dense conformer generation is performed with this as the maximum RMSD threshold between conformers.

    • Maximum Conformers for Geometry Optimization: This parameter limits the number of geometry optimizations performed. If the number of conformers generated is greater than this number then only 1 conformer is optimized. That single conformer allows you to predict how expensive the full floe would be. See Benchmark Conformer Generating Floes for more details.

    • Psi4 Energy Window (kcal/mol): Only conformers within this energy threshold of the lowest energy conformer will be included in the final ensemble.

    • Constrain Torsions with Polar Hydrogens: When On, torsions terminating in a polar hydrogen (such as hydroxyls) are constrained with other torsions around rotatable bonds.

  • Psi4 Calculation Parameters:
    • The biggest concern here is the accuracy/cost trade-off. If you are going to change these settings, you may want to run a benchmark calculation with a strict limit on the number of geometry optimizations performed (see Benchmark Conformer Generating Floes).

    • Psi4 Hamiltonian (Geometry Optimization) and Psi4 Basis Set (Geometry Optimization): Remember that all torsions are constrained during the geometry optimization. This lets the bonds and angles relax, which reduces the noise in your final energies while maintaining the diversity of the conformers generated. In our experience, for atoms up to Ar, HF-3c does a good job of getting these small optimizations correct. If you change the method and basis set for the optimizations, keep the constraints in mind when weighing the cost/accuracy trade-off.

    • Psi4 Hamiltonian (Single Point Energy) and Psi4 Basis Set (Single Point Energy): If you are looking for a higher level of theory for the relative energy of your conformers, this is where you might want to make changes.

    • Psi4 Memory and Psi4 #Threads: These should be more than sufficient for our default method and basis sets with molecules up to 40 heavy atoms. If you have extremely large molecules or are going to change the method or basis set you may want to perform a benchmark calculation and check the relevant cube metrics (How do I check the metrics (memory, CPU usage) in a QM calculation?).
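
Two of the conformer parameters above act as simple filters, and the sketch below spells out that logic as described in this tutorial. It is not the Floe's implementation. The function names are hypothetical, the energies are placeholder values, and treating the energy window boundary as inclusive is an assumption.

```python
def conformers_to_optimize(n_generated: int, max_conformers: int) -> int:
    """Maximum Conformers for Geometry Optimization, as described above:
    if more conformers are generated than the limit, only 1 is optimized."""
    return n_generated if n_generated <= max_conformers else 1


def within_energy_window(energies_kcal, window_kcal):
    """Indices of conformers within the window of the lowest energy conformer.

    The inclusive comparison (<=) is an assumption.
    """
    e_min = min(energies_kcal)
    return [i for i, e in enumerate(energies_kcal) if e - e_min <= window_kcal]


# Placeholder numbers for illustration only.
print(conformers_to_optimize(n_generated=50, max_conformers=100))   # 50
print(conformers_to_optimize(n_generated=150, max_conformers=100))  # 1
print(within_energy_window([0.0, 1.2, 7.9, 3.0], window_kcal=5.0))  # [0, 1, 3]
```

The second call shows why a low limit is useful for benchmarking: the Floe optimizes a single conformer, which gives a per-conformer cost estimate before you commit to the full run.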

Understand the Conformer Ensemble Floe Report

The Floe report for the Conformer Ensemble can be found in the same way as above, under the “Floe Report” tab.

../../../../../_images/images_conf_report_full.png