Short Trajectory MD with Analysis [MDPrep] [MDRun] [MDAnalysis]

Category Paths

Follow one of these paths in the Orion user interface to find the floe.

Product-based/Molecular Dynamics/GROMACS

Product-based/Molecular Dynamics/OpenMM

Role-based/Computational Chemist

Role-based/Medicinal Chemist

Task-based/Molecular Dynamics

Task-based/Affinity Prediction

Solution-based/Hit to Lead/Target Preparation/Generic MD simulation

Solution-based/Hit to Lead/Affinity Prediction/STMD

Solution-based/Small Molecule Lead-opt/Affinity

Description

Purpose:
- This Floe performs MD simulations given a prepared protein and a set of posed and prepared ligands, running both bound and unbound simulations of each ligand, then analyzes the bound trajectory for pose stability. The output bound/unbound datasets can be used as input datasets of nonequilibrium switching floes.
Method Recommendations/Requirements:
- The ligands need to have reasonable 3D coordinates, all atoms, and correct chemistry (in particular bond orders and formal charges).
- Each ligand can have multiple conformers, but each conformer is run separately as a distinct ligand.
- The starting poses should not have very high forces, in particular no bad clashes with the protein.
- The protein needs to be prepared to MD standards: protein chains must be capped, all atoms in protein residues (including hydrogens) must be present, and missing protein loops resolved or capped. Typically a Spruce floe is used.
- Crystallographic internal waters should be retained where possible.
Limitations:
- Currently this Floe cannot handle covalent bonds between different components such as ligand, protein, and cofactors.
- Glycosylation on proteins is truncated and the amino acid is capped with H.
Expertise Level:
- Regular/Intermediate/Advanced
Compute Resource:
- Depends on simulation length.
Keywords:
- MD, MDPrep, MDAnalysis
Related Floes:
- Bound Protein-Ligand MD [MDPrep] [MD]
- Analyze Protein-Ligand MD [MDAnalysis]
- Convert MD Analysis results to Cluster-Centric Dataset [Utility]
  - Convert ligand-centric output from this Floe into cluster-centric output to select clusters for further work

For bound states, given the protein and posed ligands as inputs, a complex is formed with each ligand/conformer separately, then solvated and parametrized according to the selected force fields. The system is minimized, then warmed up (NVT ensemble) and equilibrated in three stages (NPT ensemble). During the minimization, warm-up, and equilibration stages, positional harmonic restraints are applied to the ligand and protein. After equilibration, a short (default 6 ns) production run is performed on the unrestrained system.

The production run is then analyzed. Trajectories from different starting poses of the same ligand are combined and analyzed collectively. One analysis characterizes the interactions between the ligand and the active site. Another clusters the ligand positions in the protein active site after fitting the trajectory on the active-site C-alpha atoms. Ensemble MMPBSA (single-trajectory and dual-trajectory) and ensemble BintScore calculations are carried out on the trajectory and are localized to the ligand clusters. An HTML Floe report is generated for the 100 top-scoring ligands by single-trajectory ensemble MMPBSA score. When the analysis finishes, the Floe writes a downloadable tarball file to Amazon S3, which includes the analysis results in CSV files, the HTML Floe report, ligand trajectories, and molecular structure files of cluster medians and averages.
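As a reading aid, the bound-state stage sequence described above can be written down as a small data structure. This is only a sketch of the order, ensemble, and restraint status stated in the text. The class and stage names are hypothetical and are not Orion or Cube identifiers.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Stage:
    name: str                 # hypothetical label, not a Cube name
    ensemble: Optional[str]   # None for minimization
    restrained: bool          # positional harmonic restraints on ligand and protein


# Order and restraint status as stated in the description above.
BOUND_PROTOCOL = [
    Stage("minimization", None, True),
    Stage("warm_up", "NVT", True),
    Stage("equilibration_1", "NPT", True),
    Stage("equilibration_2", "NPT", True),
    Stage("equilibration_3", "NPT", True),
    Stage("production", "NPT", False),  # default 6 ns, unrestrained
]


def restrained_stages(protocol):
    """Return the names of the stages that apply positional restraints."""
    return [stage.name for stage in protocol if stage.restrained]
```

Calling `restrained_stages(BOUND_PROTOCOL)` lists every stage except the production run, which is the only unrestrained one.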

For unbound states, OpenEye’s Freeform is used to generate an ensemble of unique unbound conformations. By default, the input bioactive conformation is used as the starting pose for a single 6 ns unbound simulation. Alternatively, users can run independent m ns unbound simulations (“Unbound State Equilibration Production Time”) starting from the n most probable conformations in the ensemble. To do so, set the “Sampling Scheme” parameter to “State Probability” and “Number Of Starting Confs” to n on the “FreeForm Output Ligand Setting” Cube.
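The "n most probable conformations" selection can be illustrated with a few lines of plain Python. This is a minimal sketch that assumes each conformation in the ensemble carries a state probability. The function names and the probability values are made up for illustration and do not reflect how Freeform or the Cube implements the selection.

```python
def pick_starting_confs(state_probabilities, n):
    """Return indices of the n most probable conformations, most probable first."""
    if n < 1:
        raise ValueError("n must be at least 1")
    ranked = sorted(
        range(len(state_probabilities)),
        key=lambda i: state_probabilities[i],
        reverse=True,
    )
    return ranked[:n]


def total_unbound_time_ns(n, m_ns):
    """Aggregate simulated time for n independent unbound runs of m_ns each."""
    return n * m_ns


# Hypothetical ensemble of four conformations.
probs = [0.10, 0.45, 0.05, 0.40]
starts = pick_starting_confs(probs, n=2)
```

With n = 2 and m = 6 ns, the two selected conformations would each seed an independent run, for 12 ns of unbound simulation per ligand.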

Three datasets are written: a Freeform output dataset and unbound/bound MD analysis datasets. The latter two datasets can be used as bound/unbound input datasets of nonequilibrium switching floes.

Promoted Parameters

Title in user interface (promoted name)

Inputs

Protein Input Dataset (protein): Protein Input Dataset

Type: data_source

Ligand Input Dataset (ligands): Ligands-only input dataset or protein-ligand input dataset containing a Design Unit prepared by SPRUCE

Required

Type: data_source

CPU GPU Spot Policy Selection

CPUs (cpu_count_md): The number of CPUs to run this cube with

Type: integer

Default: 16

GPUs (gpu_count_md): The number of GPUs to run this cube with

Type: integer

Default: 1

Spot policy (spot_policy_md): Control cube placement on spot market instances

Type: string

Default: Allowed

Choices: [‘Allowed’, ‘Preferred’, ‘NotPreferred’, ‘Prohibited’, ‘Required’]

Bound and Unbound Production Parameters

Number of Bound State MD Starts (n_md_starts): The number of Bound MD starts for each ligand/conformer

Type: integer

Default: 1
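Because each conformer is run as a separate ligand and `n_md_starts` applies per ligand/conformer, the number of bound simulations grows multiplicatively. The sketch below shows that arithmetic only. The function names are hypothetical, and the 6.0 ns default is the bound production time stated in the description.

```python
def count_bound_simulations(conformers_per_ligand, n_md_starts=1):
    """Number of bound MD runs: one per conformer per start."""
    return sum(conformers_per_ligand) * n_md_starts


def total_bound_production_ns(conformers_per_ligand, n_md_starts=1, prod_ns=6.0):
    """Aggregate bound production time across all runs, in nanoseconds."""
    return count_bound_simulations(conformers_per_ligand, n_md_starts) * prod_ns


# Hypothetical input: three ligands with 1, 3, and 2 conformers.
runs = count_bound_simulations([1, 3, 2], n_md_starts=2)
```

For this hypothetical input, six conformers with two starts each give twelve bound runs, which is worth estimating before launching a large ligand set.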

Unbound State NPT Production Time (prod_unb_us_ns): NPT simulation production time for each starting pose in nanoseconds

Type: decimal

Default: 6.0

Bound States NPT Production Runtime (prod_ns): NPT simulation production time in nanoseconds

Type: decimal

Default: 6.0

Complex Setup Parameters

Protein Name (flask_title): Prefix name used to identify the protein. If not specified, the title of the input protein is used.

Type: string

Default:

Assign Ligand Partial Charges (charge_ligands): Whether to assign ligand partial charges

Type: boolean

Default: True

Choices: [True, False]

Equilibration Setup Parameters

Cube max run time (cube_max_run_time): Maximum Cube running time in hours

Type: decimal

Default: 1

Protein Force Field (protein_ff): Force field to be applied to the protein

Type: string

Default: Amber14SB

Choices: [‘Amber14SB’, ‘Amber99SB’, ‘Amber99SBildn’, ‘AmberFB15’]

Ligand Force Field (ligand_ff): Force field to be applied to the ligand

Type: string

Default: OpenFF_2.0.0

Choices: [‘Gaff_1.81’, ‘Gaff_2.11’, ‘OpenFF_1.1.1’, ‘OpenFF_1.2.1’, ‘OpenFF_1.3.1’, ‘OpenFF_2.0.0’, ‘Smirnoff99Frosst’]

MD Engine (md_engine): Select the MD engine to use

Type: string

Default: OpenMM

Choices: [‘OpenMM’, ‘Gromacs’]
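When preparing settings outside the user interface, it can help to check string parameters against the allowed choices before submitting a job. The sketch below copies the promoted names, defaults, and choices listed on this page into plain dictionaries. The `resolve_parameters` helper is hypothetical and is not part of any Orion client API.

```python
# Choices and defaults copied from the promoted parameters on this page.
CHOICES = {
    "spot_policy_md": ["Allowed", "Preferred", "NotPreferred", "Prohibited", "Required"],
    "protein_ff": ["Amber14SB", "Amber99SB", "Amber99SBildn", "AmberFB15"],
    "ligand_ff": ["Gaff_1.81", "Gaff_2.11", "OpenFF_1.1.1", "OpenFF_1.2.1",
                  "OpenFF_1.3.1", "OpenFF_2.0.0", "Smirnoff99Frosst"],
    "md_engine": ["OpenMM", "Gromacs"],
}

DEFAULTS = {
    "spot_policy_md": "Allowed",
    "protein_ff": "Amber14SB",
    "ligand_ff": "OpenFF_2.0.0",
    "md_engine": "OpenMM",
}


def resolve_parameters(overrides):
    """Merge overrides into the defaults, rejecting unknown names or values."""
    params = dict(DEFAULTS)
    for name, value in overrides.items():
        if name not in CHOICES:
            raise KeyError(f"unknown parameter: {name}")
        if value not in CHOICES[name]:
            raise ValueError(f"{value!r} is not a valid choice for {name}")
        params[name] = value
    return params


params = resolve_parameters({"md_engine": "Gromacs", "ligand_ff": "Gaff_2.11"})
```

Choice strings are case-sensitive as listed, so a value such as "GROMACS" would be rejected by this check.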

Hydrogen Mass Repartitioning (HMR): Give hydrogens more mass to speed up the MD

Type: boolean

Default: True

Choices: [True, False]
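The general idea behind hydrogen mass repartitioning is to move mass from a heavy atom onto its bonded hydrogens, so the total mass is unchanged while the fastest bond vibrations slow down and a longer time step becomes feasible. The sketch below illustrates that idea on a toy fragment. The target hydrogen mass of 3.0 amu and the function name are assumptions for illustration, since this page does not state the scheme the Floe uses.

```python
def repartition_hydrogen_mass(masses, elements, bonds, h_mass=3.0):
    """Return new masses with each bonded hydrogen set to h_mass (assumed value).

    The added mass is subtracted from the heavy atom the hydrogen is bonded
    to, so the total mass of the system is conserved.
    """
    new = list(masses)
    for i, j in bonds:
        if elements[i] == "H" and elements[j] != "H":
            h, heavy = i, j
        elif elements[j] == "H" and elements[i] != "H":
            h, heavy = j, i
        else:
            continue  # heavy-heavy (or H-H) bond: nothing to move
        delta = h_mass - masses[h]
        new[h] = h_mass
        new[heavy] -= delta
    return new


# Toy fragment: a carbon bonded to a methyl group (C-CH3).
elements = ["C", "C", "H", "H", "H"]
masses = [12.011, 12.011, 1.008, 1.008, 1.008]
bonds = [(0, 1), (1, 2), (1, 3), (1, 4)]
new_masses = repartition_hydrogen_mass(masses, elements, bonds)
```

In this toy case the methyl carbon gives up mass to its three hydrogens while the first carbon is untouched, and the sum of the masses stays the same.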

Trajectory Interval (prod_trajectory_interval): Unbound Trajectory saving interval in nanoseconds

Type: decimal

Default: 0.004
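The trajectory interval and the production time together determine how many frames are saved. As a worked example under the defaults listed on this page (6.0 ns unbound production, 0.004 ns interval), the sketch below computes the frame count. The function name is hypothetical, and whether the initial frame is also stored is not stated here, so the result should be read as approximate.

```python
def n_saved_frames(production_ns, interval_ns):
    """Approximate number of saved frames for a production run."""
    if interval_ns <= 0:
        raise ValueError("interval_ns must be positive")
    # round() guards against floating-point noise in the division
    return round(production_ns / interval_ns)


frames = n_saved_frames(6.0, 0.004)  # defaults from this page
```

With the defaults this gives 1500 frames per unbound run. Halving the interval doubles the frame count and the storage needed.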