Short Trajectory MD with Analysis [MDPrep] [MDRun] [MDAnalysis] [STMD]

Category Paths

Follow one of these paths in the Orion user interface to find the floe.

  • Product-based/Molecular Dynamics/GROMACS

  • Product-based/Molecular Dynamics/OpenMM

  • Role-based/Computational Chemist

  • Role-based/Medicinal Chemist

  • Task-based/Molecular Dynamics

  • Task-based/Affinity Prediction

  • Solution-based/Hit to Lead/Target Preparation/Generic MD simulation

  • Solution-based/Hit to Lead/Affinity Prediction/STMD

  • Solution-based/Small Molecule Lead-opt/Affinity

Description

  • Tutorials and Further Reading:

  • Purpose:

    • This Floe performs MD simulations given a prepared protein and a set of posed and prepared ligands, running both bound and unbound simulations of each ligand, then analyzes the bound trajectory for pose stability. The output bound/unbound datasets can be used as input datasets of nonequilibrium switching floes.

  • Method Recommendations/Requirements:

    • The ligands need to have reasonable 3D coordinates, all atoms, and correct chemistry (in particular bond orders and formal charges).

    • Each ligand can have multiple conformers but each conformer will be run separately as a different ligand.

    • The starting poses should not have very high forces, in particular no bad clashes with the protein.

    • The protein needs to be prepared to MD standards: protein chains must be capped, all atoms in protein residues (including hydrogens) must be present, and missing protein loops resolved or capped. Typically a Spruce floe is used.

    • Crystallographic internal waters should be retained where possible.

  • Limitations:

    • Currently this Floe cannot handle covalent bonds between different components such as ligand, protein, and cofactors.

    • Glycosylation on proteins is truncated and the amino acid is capped with H.

  • Expertise Level:

    • Regular/Intermediate/Advanced

  • Compute Resource:

    • Depends on simulation length.

  • Keywords:

    • MD, MDPrep, MDAnalysis, STMD

  • Related Floes:

    • Bound Protein-Ligand MD [MDPrep] [MD]

    • Analyze Protein-Ligand MD [MDAnalysis]

    • Convert MD Analysis results to Cluster-Centric Dataset [Utility]

      • Convert ligand-centric output from this Floe into cluster-centric output to select clusters for further work

For bound states, given the input protein and posed ligands, a complex is formed with each ligand/conformer separately, then solvated and parametrized according to the selected force fields. The system is minimized, then warmed up (NVT ensemble) and equilibrated in three stages (NPT ensemble). During the minimization, warm up, and equilibration stages, positional harmonic restraints are applied to the ligand and protein. After equilibration, a short (default 6 ns) production run is performed on the unrestrained system.

The production run is then analyzed. Trajectories from different starting poses of the same ligand are combined and analyzed collectively. One analysis characterizes the interactions between the ligand and the active site. Another clusters the ligand positions in the protein active site after fitting the trajectory on the active site C-alpha atoms. Ensemble MMPBSA (single-trajectory and dual-trajectory) and ensemble BintScore calculations are carried out on the trajectory and are localized to the ligand clusters. An HTML Floe report is generated for the 100 top-scoring ligands by single-trajectory ensemble MMPBSA score.

Once the analysis is done, the Floe generates a ready-to-download tarball file in Amazon S3, which includes the analysis results in CSV files, the HTML Floe report, ligand trajectories, and molecular structure files of cluster medians and averages.
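The bound-state stage sequence described above can be summarized as a small data structure. This is only an illustrative sketch: the `Stage` class and the stage labels are hypothetical names chosen here, not identifiers used by the Floe.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Stage:
    """One step of the bound-state protocol (hypothetical helper type)."""
    name: str
    ensemble: Optional[str]  # None for minimization, which has no ensemble
    restrained: bool         # positional harmonic restraints on ligand and protein


# Order follows the description: minimization, NVT warm up,
# three NPT equilibration stages, then unrestrained production.
BOUND_PROTOCOL = [
    Stage("minimization", None, True),
    Stage("warm_up", "NVT", True),
    Stage("equilibration_1", "NPT", True),
    Stage("equilibration_2", "NPT", True),
    Stage("equilibration_3", "NPT", True),
    Stage("production", "NPT", False),
]

for stage in BOUND_PROTOCOL:
    label = "restrained" if stage.restrained else "unrestrained"
    print(f"{stage.name:16s} {stage.ensemble or '-':4s} {label}")
```

Only the production stage runs without restraints, which is why it is the stage used for the subsequent analysis.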

For unbound states, OpenEye’s Freeform is used to generate an ensemble of unique unbound conformations. By default, the input bioactive conformation is used as the starting pose for a single 6 ns unbound simulation. Alternatively, users can run independent m ns (“Unbound State Equilibration Production Time”) unbound simulations starting from the n most probable conformations of the ensemble. To do so, set the parameters “Sampling Scheme” to “State Probability” and “Number Of Starting Confs” to n in the “FreeForm Output Ligand Setting” Cube.
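As a rough illustration of how the two unbound sampling options differ in cost, the sketch below counts the unbound simulations and their total length per ligand. The function name and its arguments are hypothetical; only the default of one 6 ns run and the n-by-m scheme come from the description above.

```python
from typing import Tuple


def unbound_simulation_plan(m_ns: float = 6.0,
                            n_starting_confs: int = 1,
                            state_probability: bool = False) -> Tuple[int, float]:
    """Return (number of unbound simulations, total unbound ns) for one ligand.

    With state_probability=False, a single run starts from the input
    conformation. With state_probability=True, one independent run of
    m_ns is started from each of the n most probable conformations.
    """
    if n_starting_confs < 1:
        raise ValueError("n_starting_confs must be at least 1")
    n_sims = n_starting_confs if state_probability else 1
    return n_sims, n_sims * m_ns


print(unbound_simulation_plan())              # default scheme
print(unbound_simulation_plan(2.0, 5, True))  # 5 starting conformations, 2 ns each
```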

Three datasets are written: a Freeform output dataset and unbound/bound MD analysis datasets. The latter two datasets can be used as bound/unbound input datasets of nonequilibrium switching floes.

Promoted Parameters

Title in user interface (promoted name)

Inputs

Protein Input Dataset (protein): Protein Input Dataset

  • Type: data_source

Ligand Input Dataset (ligands): Ligands-only input dataset or protein-ligand input dataset containing a Design Unit prepared by Spruce

  • Required

  • Type: data_source

CPU GPU Spot Policy Selection

CPU Count (cpu_count_md): The number of CPUs to run this cube with

  • Type: integer

  • Default: 12

GPU Count (gpu_count_md): The number of GPUs to run this cube with

  • Type: integer

  • Default: 1

AWS Spot Instances For MD Cubes (spot_policy_md): Control cube placement on spot market instances

  • Type: string

  • Default: Allowed

  • Choices: [‘Allowed’, ‘Preferred’, ‘NotPreferred’, ‘Prohibited’, ‘Required’]

Bound and Unbound Production Parameters

Number of Bound State MD Starts (n_md_starts): The number of Bound MD starts for each ligand/conformer

  • Type: integer

  • Default: 1

Bound States NPT Production Runtime (prod_ns): NPT simulation production time in nanoseconds

  • Type: decimal

  • Default: 6.0

Unbound State NPT Production Time (prod_unb_us_ns): NPT simulation production time for each starting pose in nanoseconds

  • Type: decimal

  • Default: 6.0
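Because each conformer is run as a separate ligand and each ligand/conformer gets `n_md_starts` bound starts, the aggregate bound production time grows multiplicatively. The sketch below estimates it, assuming every start runs the full `prod_ns`; the function name is hypothetical and the estimate ignores minimization, warm up, and equilibration.

```python
from typing import Sequence, Tuple


def bound_production_total(conformers_per_ligand: Sequence[int],
                           n_md_starts: int = 1,
                           prod_ns: float = 6.0) -> Tuple[int, float]:
    """Return (number of bound production runs, total production ns)."""
    # Each conformer counts as its own ligand, so sum over all conformers.
    n_runs = sum(conformers_per_ligand) * n_md_starts
    return n_runs, n_runs * prod_ns


# Three ligands with 1, 3, and 2 conformers, two starts each.
runs, total_ns = bound_production_total([1, 3, 2], n_md_starts=2)
print(runs, total_ns)
```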

Complex Setup Parameters

Protein Name (flask_title): Prefix name used to identify the Protein. If not specified, it will use the title of the input protein.

  • Type: string

  • Default:

Restrain protein tumbling (restraint_protein_tumbling): Restraining protein tumbling allows for a smaller flask

  • Type: boolean

  • Default: False

  • Choices: [True, False]

Restrain protein tumbling wt (restraint_protein_tumbling_Wt): Restraint weight for pre-defined xyz atom restraints in kcal/(mol Å^2)

  • Type: decimal

  • Default: 0.1

Assign Ligand Partial Charges (charge_ligands): Whether to assign ligand partial charges

  • Type: boolean

  • Default: True

  • Choices: [True, False]

Equilibration Setup Parameters

Ligand Force Field (ligand_ff): Force field to be applied to the ligand. The OpenFF >=1.3.1 and Custom force fields may be augmented with bespoke force field parameters by turning on ‘Use Bespoke Parameters When Available’ and providing SMIRNOFF format parameters on the input record.

  • Required

  • Type: string

  • Default: OpenFF_2.2.0

  • Choices: [‘Gaff_1.81’, ‘Gaff_2.11’, ‘OpenFF_1.1.1’, ‘OpenFF_1.2.1’, ‘OpenFF_1.3.1’, ‘OpenFF_2.0.0’, ‘OpenFF_2.2.0’, ‘Smirnoff99Frosst’, ‘Custom’]

Custom Ligand Force Field File (custom_offxml_file_in): One or more SMIRNOFF XML files defining the force field to be applied to the ligand. This input is required when ‘Ligand Force Field’ is set to ‘Custom’.

  • Type: file_in
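The dependency between ‘Ligand Force Field’ and ‘Custom Ligand Force Field File’ can be checked before launching a job. The following is a minimal sketch of such a pre-flight check; the function `check_ligand_ff` is hypothetical and is not part of Orion or the Floe, and the file name in the usage lines is a placeholder.

```python
from typing import Optional, Sequence

# Choices as listed for the ligand_ff promoted parameter.
LIGAND_FF_CHOICES = (
    "Gaff_1.81", "Gaff_2.11", "OpenFF_1.1.1", "OpenFF_1.2.1", "OpenFF_1.3.1",
    "OpenFF_2.0.0", "OpenFF_2.2.0", "Smirnoff99Frosst", "Custom",
)


def check_ligand_ff(ligand_ff: str = "OpenFF_2.2.0",
                    custom_offxml_files: Optional[Sequence[str]] = None) -> None:
    """Raise ValueError if the force field selection is inconsistent."""
    if ligand_ff not in LIGAND_FF_CHOICES:
        raise ValueError(f"Unknown ligand force field: {ligand_ff!r}")
    # 'Custom' requires at least one SMIRNOFF XML file.
    if ligand_ff == "Custom" and not custom_offxml_files:
        raise ValueError("'Custom' requires one or more SMIRNOFF XML files")


check_ligand_ff()                              # default selection passes
check_ligand_ff("Custom", ["my_ff.offxml"])    # placeholder file name
```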

Protein Force Field (protein_ff): Force field to be applied to the protein.

  • Required

  • Type: string

  • Default: Amber14SB

  • Choices: [‘Amber14SB’, ‘Amber99SB’, ‘Amber99SBildn’, ‘AmberFB15’]

Cube max run time (cube_max_run_time): Maximum Cube running time in hours

  • Type: decimal

  • Default: 1

MD Engine (md_engine): Select the available MD engine

  • Type: string

  • Default: OpenMM

  • Choices: [‘OpenMM’, ‘Gromacs’]

Hydrogen Mass Repartitioning (HMR): Give hydrogens more mass and increase the MD integration time step from 2 to 4 fs

  • Type: boolean

  • Default: True

  • Choices: [True, False]

Trajectory Interval (prod_trajectory_interval): Trajectory saving interval in nanoseconds

  • Type: decimal

  • Default: 0.004
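The production length, time step, and trajectory interval together determine how many integration steps are run and how many frames are saved. The sketch below works through that arithmetic for the defaults, assuming a 4 fs time step with Hydrogen Mass Repartitioning and 2 fs without it, as stated above. The function name is hypothetical, and whether the MD engine also writes an initial frame is not covered here.

```python
def production_counts(prod_ns: float = 6.0,
                      interval_ns: float = 0.004,
                      hmr: bool = True) -> dict:
    """Estimate integration steps and saved frames for one production run."""
    timestep_fs = 4 if hmr else 2
    # Work in integer femtoseconds to avoid floating-point rounding surprises.
    total_fs = round(prod_ns * 1_000_000)
    interval_fs = round(interval_ns * 1_000_000)
    return {
        "steps": total_fs // timestep_fs,
        "steps_per_frame": interval_fs // timestep_fs,
        "frames": total_fs // interval_fs,
    }


print(production_counts())           # defaults: 6 ns, 0.004 ns interval, HMR on
print(production_counts(hmr=False))  # same run with a 2 fs time step
```

With the defaults this gives 1,500,000 steps, 1,000 steps between saved frames, and 1,500 frames. Turning HMR off doubles the step count but leaves the number of frames unchanged.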