Equilibration and Nonequilibrium Switching [MDPrep] [MDRun] [FECalc]

Description

  • Purpose:

    • This Floe performs relative binding free energy (RBFE) calculations using the nonequilibrium switching (NES) method refined by the de Groot lab (Gapsys et al., Chem. Sci., 2020, 11, 1140-1152). It also carries out the equilibration MD runs which must precede NES.

  • Method Recommendations/Requirements:

    • Three primary inputs are required; one additional primary input is optional. All other fields are supplementary:

      • A protein prepared to MD standards: protein chains must be capped, all atoms in protein residues (including hydrogens) must be present, and missing protein loops resolved or capped. Crystallographic internal waters should be retained where possible.

        Note: The protein input is optional if the dataset of posed ligands also contains the protein.

      • A dataset of posed ligands. The ligands need to have reasonable 3D coordinates, all atoms, and correct chemistry (in particular, bond orders and formal charges). The starting poses should not have very high gradients; in particular, there should be no bad clashes with the protein.

      • A dataset of edges or, alternatively, a text file (explained below) with one line per edge, of the form “ligA_name >> ligB_name”.

      • [Optional] Select experimental binding affinities from the Ligand Input Dataset, or supply a text file containing ligand names, affinity values, optional uncertainties, and optional units.

  • Limitations:

    • If no experimental binding affinities are given, predicted affinities will lack an absolute reference and will be shifted so that their mean is 0 kcal/mol.

    • Currently, there is no mitigation for the effects of changes in buried waters, protein sidechain flips, or large protein movements between ligA and ligB.

  • Expertise Level:

    • Regular

  • Compute Resource:

    • High

  • Keywords:

    • MDPrep, MD, FECalc

  • Related Floes:

    • Ligand Bound and Unbound Equilibration for NES [MDPrep] [MD]

    • Nonequilibrium Switching [MD] [FECalc]

    • Compare Experimental Affinity with NES Results [Utility] [FECalc]

    • Nonequilibrium Switching Recovery [Utility] [FECalc]

This Floe combines Equilibration MD calculations of the bound and unbound ligands with subsequent Relative Binding Free Energy calculations using Nonequilibrium Switching. Given the protein and posed ligands as inputs, the complex is formed with each ligand/conformer separately, and the bound and unbound simulations are then carried out. Each ligand can have multiple conformers, but each conformer is run separately as a different ligand. Currently only one of the conformers is used in the NES calculations. A minimization stage is performed on the system, followed by a warm-up (NVT ensemble) and several equilibration stages (NPT ensemble). In the minimization, warm-up, and equilibration stages, positional harmonic restraints are applied to the ligand and protein. At the end of the equilibration stages, a production run (by default 6 ns) is performed on the unrestrained system. Separate datasets are written for the bound and unbound ligands.

Then, Relative Binding Free Energy (RBFE) calculations are performed using the Nonequilibrium Switching (NES) method. This step uses the third input mentioned above: the dataset or text file of edges, which describes the map of desired alchemical transformations of one ligand into another. Each transformation forms an edge of a connected graph of ligands. The text file must have one line per transformation, of format

ligA_name >> ligB_name

where “ligA_name” and “ligB_name” are the ligand names for the ligands to be transformed. These ligand names must correspond exactly to those in the bound and unbound ligand equilibration datasets.
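
The edge file format above can be checked before launching a job. The following is a minimal sketch, not the Floe's own parser: the function names `parse_edges`, `is_connected`, and `unknown_ligands` are hypothetical, and skipping blank lines is an assumption about how lenient the real reader is.

```python
from collections import defaultdict


def parse_edges(text):
    """Parse lines of the form 'ligA_name >> ligB_name' into (ligA, ligB) tuples."""
    edges = []
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue  # assumption: blank lines are ignored
        parts = [p.strip() for p in line.split(">>")]
        if len(parts) != 2 or not all(parts):
            raise ValueError(
                f"line {lineno}: expected 'ligA_name >> ligB_name', got {raw!r}"
            )
        edges.append((parts[0], parts[1]))
    return edges


def is_connected(edges):
    """Return True if the edges, treated as undirected, form one connected graph."""
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    if not neighbors:
        return False
    start = next(iter(neighbors))
    seen = {start}
    stack = [start]
    while stack:  # depth-first traversal from an arbitrary ligand
        node = stack.pop()
        for nxt in neighbors[node] - seen:
            seen.add(nxt)
            stack.append(nxt)
    return len(seen) == len(neighbors)


def unknown_ligands(edges, known_names):
    """List edge ligand names that are absent from the equilibration datasets."""
    return sorted({name for edge in edges for name in edge} - set(known_names))
```

For example, `unknown_ligands(parse_edges(text), names_in_datasets)` returns an empty list only when every name in the file matches a ligand name exactly, which is the requirement stated above.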

The Floe will draw a number of starting snapshots from the bound and unbound trajectories of the ligands. Then for each edge in the edge file, it will generate an RBFE alchemical transformation from ligA into ligB, and carry out the NES fast transformation of ligA into ligB, and vice versa, for each of the snapshots. The resulting relative free energy change, or DeltaDeltaG, for each edge is the primary output of this method. A maximum likelihood estimator is then used to derive a predicted binding affinity (free energy, or DeltaG) for each ligand. The mean value of the input experimental binding free energies is used as the reference value for the computed ones.
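
To illustrate how forward and reverse switching work values can yield a DeltaDeltaG for one edge, here is a minimal sketch using the Jarzynski equality on each leg of the thermodynamic cycle. This is an assumption made for illustration: this section does not state which estimator the Floe applies to the work values, and simply averaging the two directional estimates is a crude stand-in for a bidirectional estimator. The temperature of 298.15 K and all function names are likewise hypothetical.

```python
import math

# kT in kcal/mol at an assumed temperature of 298.15 K
KT = 0.0019872041 * 298.15


def _log_mean_exp(values):
    """Numerically stable log of the mean of exp(values)."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values) / len(values))


def jarzynski_dg(work_values, kt=KT):
    """Jarzynski estimate: dG = -kT ln < exp(-W / kT) >."""
    return -kt * _log_mean_exp([-w / kt for w in work_values])


def leg_dg(forward_work, reverse_work, kt=KT):
    """dG(ligA -> ligB) for one leg (bound or unbound).

    forward_work: work values for ligA -> ligB switches
    reverse_work: work values for ligB -> ligA switches
    The reverse estimate is negated and averaged with the forward one.
    """
    return 0.5 * (jarzynski_dg(forward_work, kt) - jarzynski_dg(reverse_work, kt))


def edge_ddg(bound_fwd, bound_rev, unbound_fwd, unbound_rev, kt=KT):
    """DeltaDeltaG of binding for the edge: bound leg minus unbound leg."""
    return leg_dg(bound_fwd, bound_rev, kt) - leg_dg(unbound_fwd, unbound_rev, kt)
```

With one work value per starting snapshot, `edge_ddg` takes four lists: forward and reverse work for the bound leg, and forward and reverse work for the unbound leg.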

The speed of the NES transformation and the number of snapshots transformed can be adjusted from default values by the user at runtime. The Floe outputs two floe report/dataset pairs, one for the calculated RBFE edges (DeltaDeltaGs), and one for the derived affinity predictions (DeltaGs) of the ligands.
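
The step from per-edge DeltaDeltaGs to per-ligand DeltaGs can be sketched as an ordinary least-squares fit over the edge graph. This coincides with a maximum likelihood estimate only under the assumption of independent Gaussian edge errors of equal variance; the Floe's actual estimator may weight edges differently. The sign convention (DeltaDeltaG of "ligA >> ligB" equals DeltaG of ligB minus DeltaG of ligA) and the names `solve_linear` and `ligand_dgs` are assumptions for this sketch.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular system: is the edge graph connected?")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x


def ligand_dgs(edge_ddgs, reference_mean=0.0):
    """Least-squares DeltaG per ligand from {(ligA, ligB): ddG}.

    Assumed sign convention: ddG = dG(ligB) - dG(ligA).
    reference_mean: mean experimental DeltaG, or 0.0 when none is available.
    """
    if not edge_ddgs:
        raise ValueError("at least one edge is required")
    names = sorted({name for edge in edge_ddgs for name in edge})
    index = {name: i for i, name in enumerate(names)}
    n = len(names)
    # Normal equations of the least-squares problem: graph Laplacian L, L x = rhs.
    L = [[0.0] * n for _ in range(n)]
    rhs = [0.0] * n
    for (a, b), ddg in edge_ddgs.items():
        i, j = index[a], index[b]
        L[i][i] += 1.0
        L[j][j] += 1.0
        L[i][j] -= 1.0
        L[j][i] -= 1.0
        rhs[j] += ddg
        rhs[i] -= ddg
    # L is singular (any constant shift is a solution): pin the first ligand to 0.
    reduced = [row[1:] for row in L[1:]]
    x = [0.0] + solve_linear(reduced, rhs[1:])
    # Shift so the mean of the predictions equals the reference mean.
    shift = reference_mean - sum(x) / n
    return {name: x[index[name]] + shift for name in names}
```

Calling `ligand_dgs(edges)` without a reference reproduces the behavior described under Limitations, where predictions are shifted so that their mean is 0 kcal/mol; passing the mean of the experimental values as `reference_mean` anchors them to experiment instead.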

If protein tumbling restraints are requested, they are applied during the equilibration MD as well as the NES part of the Floe that computes the RBFE.

Promoted Parameters

Title in user interface (promoted name)

NES Run Parameter

time (nes_time): NPT simulation time in nanoseconds.

  • Type: decimal

  • Default: 0.05

trajectory_frames (trajectory_frames): The total number of trajectory frames to run NES.

  • Type: integer

  • Default: 80

Optimize NES costs (nes_switch): Optimize NES costs by running the bound switching on cost-effective instances

  • Required

  • Type: boolean

  • Default: True

  • Choices: [True, False]

timestep_in_fs (timestep_in_fs): Time step (in femtoseconds) for MD integration.

  • Type: decimal

  • Default: 2.0

  • Choices: [1.0, 2.0]

Max number of Mapper edges allowed (max_mapper_edges): The max number of mapper edges allowed

  • Type: integer

  • Default: 500

Inputs

Protein Input Dataset (protein): Protein Input Dataset

  • Type: data_source

Ligand Input Dataset (ligands): Ligands-only input dataset or protein-ligand input dataset containing Design Unit prepared by SPRUCE

  • Required

  • Type: data_source

Mapper Dataset (mapper): Mapper Input Dataset

  • Required

  • Type: data_source

Use Rapid FE-NES Settings (use_rapid_fenes): Override the MD sampling time and number of NES switches with Rapid FE-NES values (1.5 ns per ligand and 20 switches per edge). Setting this to On will ignore both default and user‑specified values.

  • Type: boolean

  • Default: False

  • Choices: [True, False]

Experimental Affinities (From Ligand Input Dataset)

Column in the dataset with experimental affinity values (affinity_column): Populated after selecting the dataset.

  • Type: field_parameter::float

  • Default: —

Units for affinity values (units):

Use ‘log’ for pIC50, pKi, etc.

  • Type: string

  • Default: Not selected

  • Choices: [‘Not selected’, ‘kcal/mol’, ‘kJ/mol’, ‘pM’, ‘nM’, ‘uM’, ‘mM’, ‘M’, ‘log’]

Column in the dataset with experimental affinity uncertainties (affinity_error_column): Populated after selecting the dataset.

  • Type: field_parameter::float

  • Default: —

Experimental Affinities (From Text File)

Text file containing experimental affinities (exp):

Syntax for ASCII file: [Ligand] [Affinity] [Error {optional}] [units {optional}]. Allowed units: kcal/mol, kJ/mol, log, M, mM, uM, nM, pM. Use ‘log’ for pIC50, pKi, etc. Use ‘M’, ‘mM’, ‘uM’, ‘nM’ or ‘pM’ for IC50, Ki, etc.

  • Type: file_in

Delimiter (field_separator): Whitespace (including tabs) or comma. Delimiters cannot be mixed.

  • Type: string

  • Default: whitespace(s)

  • Choices: [‘whitespace(s)’, ‘,’]

Units for affinity values (units_expt_file):

Use ‘log’ for pIC50, pKi, etc. Units present in the experimental file override this selection.

  • Type: string

  • Default: Not selected

  • Choices: [‘Not selected’, ‘kcal/mol’, ‘kJ/mol’, ‘pM’, ‘nM’, ‘uM’, ‘mM’, ‘M’, ‘log’]
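
As an illustration of the text file syntax and the unit choices above, the sketch below parses one line and converts the affinity to kcal/mol. It is not the Floe's implementation: the temperature of 298.15 K, the treatment of IC50 or Ki values as dissociation constants relative to a 1 M standard state, and the names `parse_affinity_line` and `to_kcal_per_mol` are assumptions. Uncertainties are returned unconverted.

```python
import math

R_KCAL = 0.0019872041   # gas constant in kcal/(mol K)
TEMPERATURE = 298.15    # K; assumed here, not stated in this section
MOLAR_SCALE = {"M": 1.0, "mM": 1e-3, "uM": 1e-6, "nM": 1e-9, "pM": 1e-12}
ALLOWED_UNITS = set(MOLAR_SCALE) | {"kcal/mol", "kJ/mol", "log"}


def parse_affinity_line(line, separator=None):
    """Parse '[Ligand] [Affinity] [Error {optional}] [units {optional}]'.

    separator=None splits on whitespace (including tabs); pass ',' for commas.
    Returns (name, value, error_or_None, units_or_None).
    """
    tokens = [t.strip() for t in line.strip().split(separator) if t.strip()]
    units = tokens.pop() if tokens and tokens[-1] in ALLOWED_UNITS else None
    if len(tokens) not in (2, 3):
        raise ValueError(f"cannot parse affinity line: {line!r}")
    name = tokens[0]
    value = float(tokens[1])
    error = float(tokens[2]) if len(tokens) == 3 else None
    return name, value, error, units


def to_kcal_per_mol(value, units):
    """Convert an affinity value to a binding free energy in kcal/mol."""
    rt = R_KCAL * TEMPERATURE
    if units == "kcal/mol":
        return value
    if units == "kJ/mol":
        return value / 4.184
    if units == "log":
        # pIC50, pKi, etc.: p = -log10(K / 1 M), so dG = -RT ln(10) * p
        return -rt * math.log(10.0) * value
    if units in MOLAR_SCALE:
        # IC50, Ki, etc.: dG = RT ln(K / 1 M)
        return rt * math.log(value * MOLAR_SCALE[units])
    raise ValueError(f"unsupported units: {units!r}")
```

Under these assumptions, a 1 nM value and a 'log' value of 9 map to the same free energy, which is a quick consistency check when mixing unit styles across files.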

Outputs

Bound Equilibration Output Dataset (out_bound): Output dataset of bound MD.

  • Required

  • Type: dataset_out

  • Default: MD_Bnd

MD Recovery Dataset (md_recovery): MD Recovery Dataset.

  • Required

  • Type: dataset_out

  • Default: recovery_dataset

Unbound Equilibration Output Dataset (out_unbound): Output dataset of unbound MD.

  • Required

  • Type: dataset_out

  • Default: MD_Unb

Affinity Output Dataset (DG): Output dataset of binding affinity calculations.

  • Required

  • Type: dataset_out

  • Default: binding_affinity_output_dataset

NES Output Dataset (out): Output dataset of NES.

  • Required

  • Type: dataset_out

  • Default: nes_output_dataset

Failed Dataset (fail): Output dataset of failed calculations.

  • Required

  • Type: dataset_out

  • Default: failed_dataset

Recovery File (recovery): Recovery Output File.

  • Required

  • Type: file_out

  • Default: recovery_file

Bound and Unbound Production Parameters

Bound States NPT Production Runtime (prod_ns): NPT simulation production time in nanoseconds

  • Type: decimal

  • Default: 6.0

Unbound State NPT Production Time (prod_unb_us_ns): NPT simulation production time for each starting pose in nanoseconds

  • Type: decimal

  • Default: 6.0

Number of MD starts (n_md_starts): The number of MD starts for each ligand/conformer

  • Type: integer

  • Default: 1

Complex Setup Parameters

Protein Name (flask_title): Prefix name used to identify the protein. If not specified, the title of the input protein is used.

  • Type: string

  • Default:

Restrain protein tumbling (restraint_protein_tumbling): Restraining protein tumbling allows for a smaller flask

  • Type: boolean

  • Default: False

  • Choices: [True, False]

Restrain protein tumbling wt (restraint_protein_tumbling_Wt): Restraint weight for pre-defined xyz atom restraints in kcal/(mol A^2)

  • Type: decimal

  • Default: 0.1

Assign Ligand Partial Charges (charge_ligands): Whether to assign ligand partial charges

  • Type: boolean

  • Default: True

  • Choices: [True, False]

Equilibration Setup Parameters

Ligand Force Field (ligand_ff): Force field to be applied to the ligand. The OpenFF >=1.3.1 and Custom force fields may be augmented with bespoke force field parameters by turning on ‘Use Bespoke Parameters When Available’ and providing SMIRNOFF format parameters on the input record.

  • Required

  • Type: string

  • Default: OpenFF_2.2.0

  • Choices: [‘Gaff_1.81’, ‘Gaff_2.2.20’, ‘OpenFF_1.1.1’, ‘OpenFF_1.2.1’, ‘OpenFF_1.3.1’, ‘OpenFF_2.0.0’, ‘OpenFF_2.2.0’, ‘Smirnoff99Frosst’, ‘Custom’]

Custom Ligand Force Field File (custom_offxml_file_in): One or more SMIRNOFF XML files defining the force field to be applied to the ligand. This input is required when ‘Ligand Force Field’ is set to ‘Custom’.

  • Type: file_in

Protein Force Field (protein_ff): Force field to be applied to the protein.

  • Required

  • Type: string

  • Default: Amber14SB

  • Choices: [‘Amber14SB’, ‘Amber99SB’, ‘Amber99SBildn’, ‘AmberFB15’]

Cube max run time (cube_max_run_time): Max Cube Running Time in hrs

  • Type: decimal

  • Default: 1

MD Engine (md_engine): Select the available MD engine

  • Type: string

  • Default: OpenMM

  • Choices: [‘OpenMM’, ‘Gromacs’]

Hydrogen Mass Repartitioning (HMR): Give hydrogens more mass and increase the MD integration time step from 2 to 4 fs

  • Type: boolean

  • Default: True

  • Choices: [True, False]

Trajectory Interval (prod_trajectory_interval):

Trajectory saving interval in nanoseconds. The default trajectory saving interval is 0.025 ns. For denser sampling, set the interval to 0.004 ns. For longer simulations, consider using an interval larger than the default setting, as smaller intervals substantially increase trajectory file sizes and associated AWS storage costs.

  • Type: decimal

  • Default: 0.025
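
The effect of the saving interval on trajectory size is simple arithmetic, sketched below for the default 6.0 ns production run. The helper name `saved_frames` is hypothetical, and whether the Floe also stores the initial frame is not stated here, so the count is approximate.

```python
def saved_frames(production_ns, interval_ns):
    """Approximate number of frames written during a production run."""
    if production_ns <= 0 or interval_ns <= 0:
        raise ValueError("production time and interval must be positive")
    return round(production_ns / interval_ns)


# Compare the default interval with the denser 0.004 ns setting.
for interval in (0.025, 0.004):
    print(f"{interval} ns interval: {saved_frames(6.0, interval)} frames")
```

At the default interval a 6.0 ns run yields about 240 frames per ligand; at 0.004 ns it yields about 1500, roughly six times as many, which is the storage trade-off described above.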

CPU GPU Spot Policy Selection

CPUs (cpu_count_md): The number of CPUs to run this cube with

  • Type: integer

  • Default: 12

GPUs (gpu_count_md): The number of GPUs to run this cube with

  • Type: decimal

  • Default: 1

Spot policy (spot_policy_md): Control cube placement on spot market instances

  • Type: string

  • Default: Allowed

  • Choices: [‘Allowed’, ‘Preferred’, ‘NotPreferred’, ‘Prohibited’, ‘Required’]