Psi4 Combined Tautomer and Torsion Sampling Conformer Floe

Category Paths

  • Solution-based/Formulation

  • Role-based/Formulation Scientist

  • Product-based/Crystal Structure Prediction

  • Task-based/Crystal Structure Prediction

Description

This Floe is a first step for small-molecule crystal structure prediction. Reasonable tautomers are first enumerated for the input molecule. Custom torsion sampling rules are then generated for each molecule (and tautomer). Finally, a low-resolution conformer ensemble is generated. Details for each step are provided below.

Tautomer Enumeration

All reasonable tautomers of each input molecule are enumerated. Note that all molecules (and tautomers) in this Floe are included in the same energy landscape. Providing multiple input molecules is therefore not recommended unless multiple forms of the same molecule are being considered for a crystal structure prediction (e.g. multiple stereoisomers of the same molecule).

Custom Torsion Rules

To generate custom torsion rules, scanning is performed with the OETorsionScan function from the Szybki Toolkit at a specified resolution (5 degrees by default). This function includes a force field minimization of all internal degrees of freedom except for the rotatable torsion in each fragment. Then, a QM optimization is performed with the torsion constrained while all other degrees of freedom are relaxed.

The torsion scans are used to determine sampling rules for each rotatable bond. The ‘Torsion Rules Output’ comes from this step. To generate more conformers with these custom rules, use the ‘Torsion Rules Output’ as input to the Psi4 QM Conformer Ensemble (Part I of CSP Protocol) Floe.
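As a minimal sketch of how a scan could be turned into a sampling rule, the snippet below keeps the scan angles whose energy relative to the scan minimum is within a cutoff. This is an assumption based on the ‘Energy Cutoff’ parameter description; the Floe's actual rule-generation algorithm is not described here, and the function name and scan values are hypothetical.

```python
def select_sampling_angles(scan, max_sample_energy=5.0):
    """Return the scan angles whose strain energy is within the cutoff.

    scan: list of (angle_degrees, energy_kcal_per_mol) pairs from one torsion scan.
    Strain energy is taken relative to the lowest energy in the scan (assumption).
    """
    lowest = min(energy for _, energy in scan)
    return [angle for angle, energy in scan if energy - lowest <= max_sample_energy]


# Hypothetical scan values for illustration only (not Floe output).
scan = [(-180.0, 0.0), (-120.0, 3.2), (-60.0, 0.8),
        (0.0, 7.5), (60.0, 0.8), (120.0, 3.2)]
allowed = select_sampling_angles(scan)
print(allowed)
```

In this illustration the 0 degree point is excluded because its relative energy exceeds the 5 kcal/mol default.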

Conformer Ensemble

In this step, conformers are generated at the resolution specified in the ‘RMSD Threshold for conformer generation’ parameter. These conformers are QM optimized with all torsions constrained, and the single point energy is computed at a higher level of theory. Finally, all conformers above the specified energy window are removed. The conformers saved in the ‘Psi4 Conformer Ensemble Output’ can be used as input for the Polymorph Search with IEFF Crystal Force Field (Part II of CSP Protocol: Generation and Filtering) Floe if a multi-stage approach is being performed.
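The energy-window filter can be illustrated with a short sketch. It assumes the window is measured relative to the lowest single point energy in the ensemble and that a window of -1 disables filtering, as the ‘Psi4 Energy Window’ parameter states; the function name and energies are hypothetical and this is not the Floe's implementation.

```python
def filter_by_energy_window(energies, energy_window=10.0):
    """Return the indices of conformers kept by the energy window.

    energies: single point energies in kcal/mol, one per conformer.
    energy_window: window in kcal/mol; -1 keeps every conformer.
    """
    if not energies or energy_window == -1:
        return list(range(len(energies)))
    lowest = min(energies)  # reference point for strain energies (assumption)
    return [i for i, e in enumerate(energies) if e - lowest <= energy_window]


# Hypothetical energies for illustration only.
energies = [-500.0, -497.5, -488.0, -499.2]
kept = filter_by_energy_window(energies)
kept_all = filter_by_energy_window(energies, energy_window=-1)
print(kept, kept_all)
```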

Analysis and Report

The results include a Floe Report that summarizes each step. For each molecule, the fragments and their corresponding torsion scans are available. A histogram of conformer energies from the generated ensemble helps in understanding the relative energies of the tautomers of the input molecule. Finally, a table summarizes the number of expected conformers at each RMSD. This table is useful in deciding whether the multi-stage approach is necessary for the next crystal structure prediction steps.

Promoted Parameters

Title in user interface (promoted name)

Inputs

Input Dataset (in): The dataset(s) to read records from

  • Required

  • Type: data_source

Outputs

Torsion Rules Output (out): Dataset to store records with custom torsion rules. These records can be used as input in the ‘Psi4 QM Conformer Ensemble’ or ‘Psi4 QM Local Minima Search’ Floes to apply the custom torsion rules during conformer generation.

  • Required

  • Type: dataset_out

  • Default: torsion_rule_output

Fragment Output (frag_out): Dataset to store fragment records with torsion scans.

  • Required

  • Type: dataset_out

  • Default: fragment_output

Intermediate Optimization Output (confgen_data_out): Dataset to store QM optimized conformers before deduplication. If a job is cancelled early (either by the user or because a cost threshold is reached), these intermediate optimized conformers will still be saved.

  • Required

  • Type: dataset_out

  • Default: all_conf_gopt

Psi4 Conformer Ensemble Output (psi4_gopt_data_out): Dataset to store optimized conformers in the specified energy window. Each record has a single conformer.

  • Required

  • Type: dataset_out

  • Default: confs_psi4_gopt_spe

Failure Output (failure): Dataset to store records which fail during this Floe.

  • Required

  • Type: dataset_out

  • Default: psi4_fragmentation_failures

CSP Conformer Report Title (frag_floe_report_name):

  • Type: string

  • Default: taut_torsion_conf_report

Conformer Ensemble Parameters

RMSD Threshold for conformer generation (confgen_rmsd_threshold):

  • Type: decimal

  • Default: 0.75

First Test RMSD (test_rmsd2): RMSD threshold for conformer duplicate removal

  • Type: decimal

  • Default: 0.5

Second Test RMSD (test_rmsd3):

  • Type: decimal

  • Default: 0.25
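To illustrate how the three RMSD values (0.75, 0.5, 0.25) could yield different conformer counts, here is a sketch of greedy duplicate removal on a precomputed pairwise RMSD matrix. The greedy scheme, the function name, and the matrix values are assumptions for illustration; the Floe's actual deduplication method is not specified here.

```python
def deduplicate(rmsd, threshold):
    """Greedy duplicate removal: keep a conformer only if it is at least
    `threshold` away from every conformer already kept.

    rmsd: symmetric matrix (list of lists) of pairwise RMSD values.
    """
    kept = []
    for i in range(len(rmsd)):
        if all(rmsd[i][j] >= threshold for j in kept):
            kept.append(i)
    return kept


# Hypothetical pairwise RMSD values for four conformers.
rmsd = [
    [0.0, 0.3, 0.6, 0.9],
    [0.3, 0.0, 0.4, 0.8],
    [0.6, 0.4, 0.0, 0.2],
    [0.9, 0.8, 0.2, 0.0],
]
counts = {t: len(deduplicate(rmsd, t)) for t in (0.75, 0.5, 0.25)}
print(counts)
```

A tighter threshold treats fewer conformers as duplicates, so the count can only stay the same or grow as the threshold decreases in this scheme.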

Maximum Conformers for Geometry Optimization (limit_confs): This parameter limits the number of conformers optimized, to prevent accidentally spending more than expected on a single Floe. If more than this number of conformers are generated, only one conformer is optimized, which gives an estimate of the cost per conformer for this Floe. If the maximum number of conformers is set to 0, ALL generated conformers are optimized.

  • Type: integer

  • Default: 100
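The limit logic described for this parameter can be sketched as follows. Which single conformer is optimized when the limit is exceeded is not stated, so taking the first one is an assumption, and the function name is hypothetical.

```python
def conformers_to_optimize(conformers, limit_confs=100):
    """Select conformers for QM optimization under the limit_confs rule.

    limit_confs == 0 means no limit. If the limit is exceeded, only one
    conformer is returned (the first one here, which is an assumption).
    """
    if limit_confs == 0 or len(conformers) <= limit_confs:
        return list(conformers)
    return list(conformers[:1])


generated = [f"conf_{i}" for i in range(150)]  # placeholder conformer labels
print(len(conformers_to_optimize(generated)))                  # over the limit
print(len(conformers_to_optimize(generated, limit_confs=0)))   # no limit
```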

Psi4 Energy Window (kcal/mol) (psi4_energy_window): Psi4 energy window for filtering high-strain conformers. When the window is set to -1, all conformers are included in the output.

  • Type: decimal

  • Default: 10.0

Advanced Parameters

Generate Reasonable Tautomers Only (reasonable_tautomers): Choice of whether to generate only reasonable tautomers (On) or all possible tautomers (Off).

  • Type: boolean

  • Default: True

  • Choices: [True, False]

Remove Extended Ring Atoms (remove_extended): When On, heuristics are used to reduce the size of the fragments by removing atoms from extended ring systems where possible. Turn Off to keep all atoms in complex ring systems. In very rare cases (with obscure chemistries), turning this Off may reduce problems in fragmentation.

  • Type: boolean

  • Default: True

  • Choices: [True, False]

Torsion Increment (resolution): Torsion angle increment in degrees

  • Type: decimal

  • Default: 5.0
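As a rough guide to cost, the increment sets how many constrained geometries are evaluated per rotatable bond. The sketch below assumes the scan covers a full 360 degrees starting at -180; the actual range and any symmetry reduction used by the scan are not documented here, and the function name is hypothetical.

```python
def scan_angles(resolution=5.0):
    """Return torsion angles for a full-turn scan at the given increment.

    Assumes a 360 degree scan starting at -180 degrees (an assumption).
    """
    count = round(360.0 / resolution)
    return [-180.0 + i * resolution for i in range(count)]


angles = scan_angles()
print(len(angles), angles[0], angles[-1])
```

Under this assumption the default increment gives 72 scan points per rotatable bond, and doubling the increment to 10 degrees halves that number.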

Energy Cutoff (max_sample_energy): Energy cutoff for choosing angles in custom torsion rules. The default value of 5 kcal/mol is chosen on the assumption that these rules will later be used to generate conformers in a 10 kcal/mol energy window. If you plan to generate a QM conformer ensemble in a larger energy window, you may want to increase this cutoff accordingly.

  • Type: decimal

  • Default: 5

Constrain Torsions with Polar Hydrogens (constrain_polar_hydrogens): When On, torsions terminating in a polar hydrogen (e.g. hydroxyl groups) are constrained along with all other rotatable bonds. When Off, only rotatable bonds with heavy atoms are constrained.

  • Type: boolean

  • Default: True

  • Choices: [True, False]

Psi4 Calculation Parameters

Psi4 Hamiltonian (Torsion Scan) (torsion_method): Method used for geometry optimizations in torsion scans.

  • Type: string

  • Default: HF3c

  • Choices: [‘HF3c’, ‘PBEh3c’, ‘HF’, ‘HF-D3’, ‘B3LYP’, ‘B3LYP-D3BJ’, ‘B3LYP-D3MBJ’, ‘B2PLYP-D3BJ’, ‘M06’, ‘M06-2X’, ‘M06-L’, ‘MN15-D3BJ’, ‘MN15-L’, ‘PW6B95-D3BJ’, ‘CAM-B3LYP’, ‘CAM-B3LYP-D3BJ’, ‘WB97X’, ‘WB97X-D’, ‘PBE’, ‘PBE0’]

Psi4 Basis Set (Torsion Scan) (torsion_basis): Basis set used for geometry optimizations in torsion scans.

  • Type: string

  • Default:

  • Choices: [‘’, ‘minix’, ‘6-31G’, ‘6-31G*’, ‘6-31+G*’, ‘6-31G**’, ‘6-31+G**’, ‘6-311G**’, ‘6-311+G**’, ‘6-311G(2d,2p)’, ‘def2-SVP’, ‘def2-SVPD’, ‘def2-TZVP’, ‘def2-TZVPD’, ‘def2-TZVPP’, ‘def2-TZVPPD’, ‘cc-pVDZ’, ‘aug-cc-pVDZ’, ‘cc-pVTZ’, ‘aug-cc-pVTZ’, ‘LanL2DZ’]

Psi4 Memory (Torsion Scan) (torsion_memory): Memory for Psi4 calculations, in MB.

  • Type: decimal

  • Default: 14400

Psi4 #Threads (Torsion Scan) (torsion_nthreads): Number of CPUs for Psi4 calculations.

  • Type: integer

  • Default: 8

Psi4 Hamiltonian (conf. geometry optimization) (psi4_gopt_method): Method used for Psi4 geometry optimization.

  • Type: string

  • Default: HF3c

  • Choices: [‘HF3c’, ‘PBEh3c’, ‘HF’, ‘HF-D3’, ‘B3LYP’, ‘B3LYP-D3BJ’, ‘B3LYP-D3MBJ’, ‘B2PLYP-D3BJ’, ‘M06’, ‘M06-2X’, ‘M06-L’, ‘MN15-D3BJ’, ‘MN15-L’, ‘PW6B95-D3BJ’, ‘CAM-B3LYP’, ‘CAM-B3LYP-D3BJ’, ‘WB97X’, ‘WB97X-D’, ‘PBE’, ‘PBE0’]

Psi4 Basis set (conf. geometry optimization) (psi4_gopt_basis): Basis set for Psi4 geometry optimization. Default empty basis set (‘’) goes with HF3c which has one built in.

  • Type: string

  • Default:

  • Choices: [‘’, ‘minix’, ‘6-31G’, ‘6-31G*’, ‘6-31+G*’, ‘6-31G**’, ‘6-31+G**’, ‘6-311G**’, ‘6-311+G**’, ‘6-311G(2d,2p)’, ‘def2-SVP’, ‘def2-SVPD’, ‘def2-TZVP’, ‘def2-TZVPD’, ‘def2-TZVPP’, ‘def2-TZVPPD’, ‘cc-pVDZ’, ‘aug-cc-pVDZ’, ‘cc-pVTZ’, ‘aug-cc-pVTZ’, ‘LanL2DZ’]
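The pairing of a method with an optional basis set can be sketched as a small helper. Psi4's Python API accepts level-of-theory strings of the form "method/basis", and composite methods such as HF-3c carry their own basis; how the Floe builds and passes these strings internally is not documented here, so the helper below is a hypothetical illustration only.

```python
def level_of_theory(method, basis=""):
    """Combine a method and an optional basis set into one string.

    An empty basis returns the method alone, matching the documented
    default where HF3c is used with an empty basis set.
    """
    return f"{method}/{basis}" if basis else method


# Documented defaults: optimization with HF3c, single point with B3LYP-D3MBJ/6-31G*.
opt_level = level_of_theory("HF3c")
spe_level = level_of_theory("B3LYP-D3MBJ", "6-31G*")
print(opt_level, spe_level)
```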

Psi4 Hamiltonian (Conf. Single Point Energy) (psi4_spe_method): Method used for single point energy calculation.

  • Type: string

  • Default: B3LYP-D3MBJ

  • Choices: [‘HF3c’, ‘PBEh3c’, ‘HF’, ‘HF-D3’, ‘B3LYP’, ‘B3LYP-D3BJ’, ‘B3LYP-D3MBJ’, ‘B2PLYP-D3BJ’, ‘M06’, ‘M06-2X’, ‘M06-L’, ‘MN15-D3BJ’, ‘MN15-L’, ‘PW6B95-D3BJ’, ‘CAM-B3LYP’, ‘CAM-B3LYP-D3BJ’, ‘WB97X’, ‘WB97X-D’, ‘PBE’, ‘PBE0’]

Psi4 Basis Set (Conf. Single Point Energy) (psi4_spe_basis): Basis set used for single point energy calculation.

  • Type: string

  • Default: 6-31G*

  • Choices: [‘’, ‘minix’, ‘6-31G’, ‘6-31G*’, ‘6-31+G*’, ‘6-31G**’, ‘6-31+G**’, ‘6-311G**’, ‘6-311+G**’, ‘6-311G(2d,2p)’, ‘def2-SVP’, ‘def2-SVPD’, ‘def2-TZVP’, ‘def2-TZVPD’, ‘def2-TZVPP’, ‘def2-TZVPPD’, ‘cc-pVDZ’, ‘aug-cc-pVDZ’, ‘cc-pVTZ’, ‘aug-cc-pVTZ’, ‘LanL2DZ’]

Psi4 Memory (Conf. Opt. and SPE) (psi4_memory): Memory for Psi4 calculations, in MB.

  • Type: decimal

  • Default: 14400

Psi4 #Threads (Conf. Opt. and SPE) (psi4_nthreads): Number of CPUs for Psi4 calculations.

  • Type: integer

  • Default: 8

Fields Generated during Floe

Torsion Energy Field (frag_energy_field): New field created on torsion scan output to store QM energy at each torsion angle.

  • Required

  • Type: field_parameter::float

  • Default: Psi4 Energy (kcal/mol)

Torsion Strain Energy Field (frag_strain_energy_field): New field created on torsion scan output to store relative QM energies at each torsion angle, for each fragment.

  • Required

  • Type: field_parameter::float

  • Default: Psi4 Strain Energy (kcal/mol)

Conformer Optimization Energy Field (gopt_energy_field): Field to store the absolute QM energy at the end of the conformer optimization.

  • Required

  • Type: field_parameter::float

  • Default: Psi4 Opt Energies (kcal/mol)

Conformer Single Point Energy Field (conf_energy_field): Field to store the absolute QM energy of each conformer after the single point energy calculation.

  • Type: field_parameter::float

  • Default: Psi4 Energy (kcal/mol)

Conformer Single Point Strain Energy Field (conf_strain_energy_field): Field to store the relative QM energy after the single point energy calculation.

  • Required

  • Type: field_parameter::float

  • Default: Psi4 Strain Energy (kcal/mol)