Short Trajectory MD with Analysis [MDPrep] [MDRun] [MDAnalysis] [STMD]

Category Paths

Follow one of these paths in the Orion user interface to find the floe.

Product-based/Molecular Dynamics/GROMACS

Product-based/Molecular Dynamics/OpenMM

Role-based/Computational Chemist

Role-based/Medicinal Chemist

Task-based/Molecular Dynamics

Task-based/Affinity Prediction

Solution-based/Hit to Lead/Target Preparation/Generic MD simulation

Solution-based/Hit to Lead/Affinity Prediction/STMD

Solution-based/Small Molecule Lead-opt/Affinity

Description

Tutorials and Further Reading:
- Short Trajectory MD with Analysis (to avoid leaving this page, right-click and open link in new tab)
- Bespoke and Custom Force Fields (to avoid leaving this page, right-click and open link in new tab)
Purpose:
- This Floe performs MD simulations given a prepared protein and a set of posed and prepared ligands, running both bound and unbound simulations of each ligand, then analyzes the bound trajectory for pose stability. The output bound/unbound datasets can be used as input datasets of nonequilibrium switching floes.
Method Recommendations/Requirements:
- The ligands need to have reasonable 3D coordinates, all atoms, and correct chemistry (in particular, bond orders and formal charges).
- Each ligand can have multiple conformers but each conformer will be run separately as a different ligand.
- The starting poses should not have very high forces, in particular no bad clashes with the protein.
- The protein needs to be prepared to MD standards: protein chains must be capped, all atoms in protein residues (including hydrogens) must be present, and missing protein loops resolved or capped. Typically, a Spruce floe is used.
- Crystallographic internal waters should be retained where possible.
Limitations:
- Currently this Floe cannot handle covalent bonds between different components such as ligand, protein, and cofactors.
- Glycosylation on proteins is truncated and the amino acid is capped with H.
Expertise Level:
- Regular/Intermediate/Advanced
Compute Resource:
- Depends on simulation length.
Keywords:
- MD, MDPrep, MDAnalysis, STMD
Related Floes:
- Bound Protein-Ligand MD [MDPrep] [MD]
- Analyze Protein-Ligand MD [MDAnalysis]
- Convert MD Analysis results to Cluster-Centric Dataset [Utility]
  - Convert ligand-centric output from this Floe into cluster-centric output to select clusters for further work

For bound states, given the protein and posed ligands as inputs, a complex is formed with each ligand/conformer separately, then solvated and parametrized according to the selected force fields. The system is minimized, then warmed up (NVT ensemble) and equilibrated in three stages (NPT ensemble). During the minimization, warm up, and equilibration stages, positional harmonic restraints are applied to the ligand and protein. After equilibration, a short (default 6 ns) production run is performed on the unrestrained system.

The production run is then analyzed. Trajectories from different starting poses of the same ligand are combined and analyzed collectively. One analysis characterizes the interactions between the ligand and the active site. Another clusters the ligand positions in the protein active site after fitting the trajectory on the active site C-alpha atoms. Ensemble MMPBSA (single-trajectory and dual-trajectory) and ensemble BintScore calculations are carried out on the trajectory and are localized to the ligand clusters.

An HTML Floe report is generated for the 100 top-scoring ligands by single-trajectory ensemble MMPBSA score. Once the analysis is complete, the Floe generates a downloadable tarball in Amazon S3 containing the analysis results as CSV files, the HTML Floe report, ligand trajectories, and molecular structure files of the cluster medians and averages.
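As a reading aid, the bound-state stage sequence described above can be written down as plain data. This is a minimal sketch, not the Floe's implementation: the `Stage` class, the stage names, and the `bound_protocol` helper are hypothetical, and only the production length is filled in because the text gives no lengths for the earlier stages.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Stage:
    """One step of the bound-state protocol (hypothetical container)."""
    name: str
    ensemble: Optional[str]            # None for minimization
    restrained: bool                   # positional harmonic restraints on ligand and protein
    length_ns: Optional[float] = None  # only the production length is documented


def bound_protocol(prod_ns: float = 6.0) -> List[Stage]:
    """Return the stage order described in the text: minimization,
    NVT warm up, three NPT equilibrations, unrestrained NPT production."""
    stages = [
        Stage("minimization", None, True),
        Stage("warm_up", "NVT", True),
    ]
    stages += [Stage(f"equilibration_{i}", "NPT", True) for i in (1, 2, 3)]
    stages.append(Stage("production", "NPT", False, prod_ns))
    return stages


for stage in bound_protocol():
    print(stage)
```

Only the final production stage is unrestrained, which is why the analysis is run on that stage alone.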

For unbound states, OpenEye’s Freeform is used to generate an ensemble of unique unbound conformations. By default, the input bioactive conformation is used as the starting pose for a single 6 ns unbound simulation. Alternatively, users can run independent m ns unbound simulations (“Unbound State Equilibration Production Time”) starting from the n most probable conformations in the ensemble, by setting the “Sampling Scheme” parameter to “State Probability” and “Number Of Starting Confs” to n in the “FreeForm Output Ligand Setting” Cube.
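The "n most probable conformations" selection can be illustrated with a few lines of Python. This is only a sketch of the idea under stated assumptions: the conformer IDs and probabilities are made up, and `pick_starting_confs` is a hypothetical helper, not part of Freeform or the Floe.

```python
from typing import List, Tuple


def pick_starting_confs(ensemble: List[Tuple[str, float]], n: int) -> List[str]:
    """Return the IDs of the n most probable conformations.

    `ensemble` is a list of (conformer_id, probability) pairs; this layout
    is an assumption made for the example.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    ranked = sorted(ensemble, key=lambda conf: conf[1], reverse=True)
    return [conf_id for conf_id, _ in ranked[:n]]


# Made-up ensemble for illustration only.
ensemble = [("conf_a", 0.15), ("conf_b", 0.55), ("conf_c", 0.30)]
starts = pick_starting_confs(ensemble, n=2)
print(starts)  # ['conf_b', 'conf_c']
```

Each selected conformation would then seed its own independent m ns unbound simulation.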

Three datasets are written: the unbound and bound MD analysis datasets, and a Bound MD output dataset. The bound and unbound MD analysis datasets can be used as bound/unbound input datasets of the nonequilibrium switching floes. A fourth, optional dataset can also be written, a FreeForm output dataset.

Promoted Parameters

Title in user interface (promoted name)

Inputs

Protein Input Dataset (protein): Protein Input Dataset

Type: data_source

Ligand Input Dataset (ligands): Ligands-only input dataset or protein-ligand input dataset containing Design Units prepared by Spruce

Required

Type: data_source

CPU GPU Spot Policy Selection

CPUs (cpu_count_md): The number of CPUs to run this cube with

Type: integer

Default: 12

GPUs (gpu_count_md): The number of GPUs to run this cube with

Type: integer

Default: 1

Spot policy (spot_policy_md): Control cube placement on spot market instances

Type: string

Default: Allowed

Choices: [‘Allowed’, ‘Preferred’, ‘NotPreferred’, ‘Prohibited’, ‘Required’]

Bound and Unbound Production Parameters

Number of Bound State MD Starts (n_md_starts): The number of Bound MD starts for each ligand/conformer

Type: integer

Default: 1

Bound States NPT Production Runtime (prod_ns): NPT simulation production time in nanoseconds

Type: decimal

Default: 6.0

Unbound State NPT Production Time (prod_unb_us_ns): NPT simulation production time for each starting pose in nanoseconds

Type: decimal

Default: 6.0
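Because compute cost depends on simulation length, it can help to estimate the total production time before launching. The sketch below multiplies out the three parameters above. It assumes that every conformer counts as a separate ligand (as stated in the method recommendations), that each one gets one unbound leg with a configurable number of starting poses, and that minimization, warm up, and equilibration time are ignored. The function name and the `n_unbound_starts` argument are hypothetical.

```python
def production_budget_ns(
    n_ligand_conformers: int,
    n_md_starts: int = 1,       # Number of Bound State MD Starts
    prod_ns: float = 6.0,       # Bound States NPT Production Runtime
    prod_unb_us_ns: float = 6.0,  # Unbound State NPT Production Time
    n_unbound_starts: int = 1,  # hypothetical: starting poses per unbound leg
) -> dict:
    """Estimate total production time in nanoseconds for one Floe run."""
    bound = n_ligand_conformers * n_md_starts * prod_ns
    unbound = n_ligand_conformers * n_unbound_starts * prod_unb_us_ns
    return {"bound_ns": bound, "unbound_ns": unbound, "total_ns": bound + unbound}


# Example: 10 ligands with 3 conformers each, 2 bound starts per conformer.
print(production_budget_ns(n_ligand_conformers=30, n_md_starts=2))
```

For this example the estimate is 360 ns of bound and 180 ns of unbound production time.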

Complex Setup Parameters

Protein Name (flask_title): Prefix name used to identify the Protein. If not specified, it will use the title of the input protein.

Type: string

Default:

Restrain protein tumbling (restraint_protein_tumbling): Restraining protein tumbling allows for a smaller flask

Type: boolean

Default: False

Choices: [True, False]

Restrain protein tumbling wt (restraint_protein_tumbling_Wt): Restraint weight for pre-defined xyz atom restraints in kcal/(mol Å^2)

Type: decimal

Default: 0.1

Assign Ligand Partial Charges (charge_ligands): Assign Ligand Partial Charges or not

Type: boolean

Default: True

Choices: [True, False]

Equilibration Setup Parameters

Ligand Force Field (ligand_ff): Force field to be applied to the ligand. The OpenFF >=1.3.1 and Custom force fields may be augmented with bespoke force field parameters by turning on ‘Use Bespoke Parameters When Available’ and providing SMIRNOFF format parameters on the input record.

Required

Type: string

Default: OpenFF_2.2.0

Choices: [‘Gaff_1.81’, ‘Gaff_2.11’, ‘OpenFF_1.1.1’, ‘OpenFF_1.2.1’, ‘OpenFF_1.3.1’, ‘OpenFF_2.0.0’, ‘OpenFF_2.2.0’, ‘Smirnoff99Frosst’, ‘Custom’]

Custom Ligand Force Field File (custom_offxml_file_in): One or more SMIRNOFF XML files defining the force field to be applied to the ligand. This input is required when ‘Ligand Force Field’ is set to ‘Custom’.

Type: file_in
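The dependency between 'Ligand Force Field' and 'Custom Ligand Force Field File' can be checked before submitting a job. The following is a minimal pre-flight sketch in plain Python. It does not call any Orion API; `check_ligand_ff` is a hypothetical helper, and the choices are copied from the parameter list above.

```python
from typing import Optional, Sequence

# Choices as listed for the 'Ligand Force Field' parameter.
LIGAND_FF_CHOICES = (
    "Gaff_1.81", "Gaff_2.11", "OpenFF_1.1.1", "OpenFF_1.2.1", "OpenFF_1.3.1",
    "OpenFF_2.0.0", "OpenFF_2.2.0", "Smirnoff99Frosst", "Custom",
)


def check_ligand_ff(
    ligand_ff: str = "OpenFF_2.2.0",
    custom_offxml_files: Optional[Sequence[str]] = None,
) -> None:
    """Raise ValueError if the force field settings are inconsistent."""
    if ligand_ff not in LIGAND_FF_CHOICES:
        raise ValueError(f"Unknown ligand force field: {ligand_ff!r}")
    if ligand_ff == "Custom" and not custom_offxml_files:
        raise ValueError(
            "'Custom' requires at least one SMIRNOFF XML file "
            "(custom_offxml_file_in)"
        )


check_ligand_ff()                                    # default passes
check_ligand_ff("Custom", ["my_forcefield.offxml"])  # hypothetical file name
```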

Protein Force Field (protein_ff): Force field to be applied to the protein.

Required

Type: string

Default: Amber14SB

Choices: [‘Amber14SB’, ‘Amber99SB’, ‘Amber99SBildn’, ‘AmberFB15’]

Cube max run time (cube_max_run_time): Maximum Cube run time in hours

Type: decimal

Default: 1

MD Engine (md_engine): Select the available MD engine

Type: string

Default: OpenMM

Choices: [‘OpenMM’, ‘Gromacs’]

Hydrogen Mass Repartitioning (HMR): Give hydrogens more mass and increase the MD integration time step from 2 to 4 fs.

Type: boolean

Default: True

Choices: [True, False]

Trajectory Interval (prod_trajectory_interval): Trajectory saving interval in nanoseconds

Type: decimal

Default: 0.004
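The production time, trajectory interval, and Hydrogen Mass Repartitioning settings together determine how many integration steps are run and how many frames are saved. The arithmetic is sketched below with the documented defaults. The function name is hypothetical, and the frame count assumes one frame per interval with no extra frame at time zero.

```python
def production_counts(
    prod_ns: float = 6.0,        # prod_ns default
    interval_ns: float = 0.004,  # prod_trajectory_interval default
    hmr: bool = True,            # HMR default
) -> tuple:
    """Return (integration steps, saved frames, steps between frames)."""
    timestep_fs = 4.0 if hmr else 2.0  # 4 fs with HMR, 2 fs without
    fs_per_ns = 1e6
    n_steps = round(prod_ns * fs_per_ns / timestep_fs)
    n_frames = round(prod_ns / interval_ns)
    steps_per_frame = round(interval_ns * fs_per_ns / timestep_fs)
    return n_steps, n_frames, steps_per_frame


print(production_counts())           # (1500000, 1500, 1000)
print(production_counts(hmr=False))  # (3000000, 1500, 2000)
```

With the defaults, a 6 ns production run saves 1500 frames, and turning HMR off doubles the number of integration steps for the same simulated time.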