Short Trajectory MD with Analysis [MDPrep] [MDRun] [MDAnalysis] [STMD]
Category Paths
Follow one of these paths in the Orion user interface to find the Floe.
Product-based/Molecular Dynamics/GROMACS
Product-based/Molecular Dynamics/OpenMM
Role-based/Computational Chemist
Role-based/Medicinal Chemist
Task-based/Molecular Dynamics
Task-based/Affinity Prediction
Solution-based/Hit to Lead/Target Preparation/Generic MD simulation
Solution-based/Hit to Lead/Affinity Prediction/STMD
Solution-based/Small Molecule Lead-opt/Affinity
Description
Tutorials and Further Reading:
Short Trajectory MD with Analysis
Bespoke and Custom Force Fields
Purpose:
This Floe performs MD simulations given a prepared protein and a set of posed and prepared ligands, running both bound and unbound simulations of each ligand, then analyzes the bound trajectory for pose stability. The output bound/unbound datasets can be used as input datasets of nonequilibrium switching floes.
Method Recommendations/Requirements:
The ligands need reasonable 3D coordinates, all atoms present, and correct chemistry (in particular, bond orders and formal charges).
Each ligand can have multiple conformers, but each conformer is run separately as a different ligand.
The starting poses should not have very high forces, in particular no bad clashes with the protein.
The protein needs to be prepared to MD standards: protein chains must be capped, all atoms in protein residues (including hydrogens) must be present, and missing protein loops must be resolved or capped. Typically a Spruce floe is used.
Crystallographic internal waters should be retained where possible.
Limitations
Currently this Floe cannot handle covalent bonds between different components such as ligand, protein, and cofactors.
Glycosylation on proteins is truncated and the amino acid is capped with H.
Expertise Level:
Regular/Intermediate/Advanced
Compute Resource:
Depends on simulation length.
Keywords:
MD, MDPrep, MDAnalysis, STMD
Related Floes:
Bound Protein-Ligand MD [MDPrep] [MD]
Analyze Protein-Ligand MD [MDAnalysis]
Convert MD Analysis results to Cluster-Centric Dataset [Utility]
Convert ligand-centric output from this Floe into cluster-centric output to select clusters for further work
For bound states, the protein and posed ligands are used to form a complex with each ligand/conformer separately, and each complex is solvated and parametrized according to the selected force fields. The system is minimized, then warmed up (NVT ensemble) and equilibrated in three stages (NPT ensemble). Positional harmonic restraints are applied to the ligand and protein during the minimization, warm-up, and equilibration stages. After equilibration, a short (default 6 ns) production run is performed on the unrestrained system.
The production run is then analyzed. Trajectories from different starting poses of the same ligand are combined and analyzed collectively. One analysis characterizes the interactions between the ligand and the active site. Another clusters the ligand positions in the protein active site after fitting the trajectory on the active-site C-alpha atoms. Ensemble MMPBSA (single-trajectory and dual-trajectory) and ensemble BintScore calculations are carried out on the trajectory and are localized to the ligand clusters.
An HTML Floe report is generated for the 100 top-scoring ligands by single-trajectory ensemble MMPBSA score. When the analysis is complete, the Floe generates a downloadable tarball in Amazon S3 containing the analysis results as CSV files, the HTML Floe report, ligand trajectories, and molecular structure files of cluster medians and averages.
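As a reading aid, the bound-state simulation stages described in this section can be written down as a small table in Python. This is only a sketch of the sequence stated in the text: the `Stage` class and the stage names are hypothetical labels chosen for illustration, not identifiers used by the Floe. The production stage is marked NPT because the corresponding promoted parameter describes an NPT production run.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Stage:
    """One step of the bound-state protocol (illustrative names only)."""
    name: str
    ensemble: Optional[str]  # None for minimization, which has no ensemble
    restrained: bool         # positional harmonic restraints on ligand and protein


# Order follows the description: minimization, NVT warm-up,
# three NPT equilibration stages, then unrestrained production.
BOUND_PROTOCOL = [
    Stage("minimization", None, True),
    Stage("warm_up", "NVT", True),
    Stage("equilibration_1", "NPT", True),
    Stage("equilibration_2", "NPT", True),
    Stage("equilibration_3", "NPT", True),
    Stage("production", "NPT", False),
]

for stage in BOUND_PROTOCOL:
    restraints = "restrained" if stage.restrained else "unrestrained"
    print(f"{stage.name:<16} {stage.ensemble or '-':<4} {restraints}")
```

Only the production stage runs without restraints, which is why the analysis is performed on that stage alone.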
For unbound states, OpenEye’s Freeform is used to generate an ensemble of unique unbound conformations. By default, the input bioactive conformation is used as the starting pose for a single 6 ns unbound simulation. Alternatively, users can use the n most probable conformations from the ensemble as starting poses for independent m ns (“Unbound State Equilibration Production Time”) unbound simulations by setting, in the “FreeForm Output Ligand Setting” Cube, the parameter “Sampling Scheme” to “State Probability” and “Number Of Starting Confs” to n.
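The effect of the two unbound sampling options on the amount of simulation can be sketched as follows. The function `unbound_plan` and its argument names are hypothetical. The sketch assumes that any scheme other than “State Probability” behaves like the default single-start case, because the name of the default scheme is not given here.

```python
def unbound_plan(sampling_scheme, number_of_starting_confs=1, prod_unb_ns=6.0):
    """Return (number of unbound runs, total production ns) for one input ligand.

    Assumption: only "State Probability" starts from several conformations;
    every other value is treated as the default single-start behaviour.
    """
    if sampling_scheme == "State Probability":
        n_runs = number_of_starting_confs  # n most probable conformations
    else:
        n_runs = 1  # the input conformation only
    return n_runs, n_runs * prod_unb_ns


# Five starting conformations at 6 ns each give five independent runs.
runs, total_ns = unbound_plan("State Probability", number_of_starting_confs=5)
print(runs, total_ns)
```

The totals count production time only and do not include any equilibration.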
Three datasets are written: a Freeform output dataset and unbound/bound MD analysis datasets. The latter two datasets can be used as bound/unbound input datasets of nonequilibrium switching floes.
Promoted Parameters
Title in user interface (promoted name)
Inputs
Protein Input Dataset (protein): Protein Input Dataset
Type: data_source
Ligand Input Dataset (ligands): Ligands-only input dataset or protein-ligand input dataset containing a Design Unit prepared by Spruce
Required
Type: data_source
CPU GPU Spot Policy Selection
CPU Count (cpu_count_md): The number of CPUs to run this cube with
Type: integer
Default: 12
GPU Count (gpu_count_md): The number of GPUs to run this cube with
Type: integer
Default: 1
AWS Spot Instances For MD Cubes (spot_policy_md): Control cube placement on spot market instances
Type: string
Default: Allowed
Choices: [‘Allowed’, ‘Preferred’, ‘NotPreferred’, ‘Prohibited’, ‘Required’]
Bound and Unbound Production Parameters
Number of Bound State MD Starts (n_md_starts): The number of Bound MD starts for each ligand/conformer
Type: integer
Default: 1
Bound States NPT Production Runtime (prod_ns): NPT simulation production time in nanoseconds
Type: decimal
Default: 6.0
Unbound State NPT Production Time (prod_unb_us_ns): NPT simulation production time for each starting pose in nanoseconds
Type: decimal
Default: 6.0
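Because each conformer is run as a separate ligand and each ligand/conformer gets `n_md_starts` bound starts, the bound production workload grows multiplicatively. The sketch below estimates it. The function name and the `conformers_per_ligand` list are hypothetical, and the estimate covers production time only, not minimization, warm-up, or equilibration.

```python
def bound_run_budget(conformers_per_ligand, n_md_starts=1, prod_ns=6.0):
    """Return (number of bound production runs, total production ns).

    conformers_per_ligand: one integer per input ligand (illustrative input).
    """
    n_systems = sum(conformers_per_ligand)  # each conformer is its own system
    n_runs = n_systems * n_md_starts
    return n_runs, n_runs * prod_ns


# Three ligands with 1, 3, and 2 conformers, two starts each, 6 ns per run.
n_runs, total_ns = bound_run_budget([1, 3, 2], n_md_starts=2, prod_ns=6.0)
print(f"{n_runs} bound runs, {total_ns} ns of production")
```

An estimate like this can help when choosing the number of conformers and starts, since the compute cost depends on simulation length.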
Complex Setup Parameters
Protein Name (flask_title): Prefix name used to identify the protein. If not specified, the title of the input protein is used.
Type: string
Default:
Restrain protein tumbling (restraint_protein_tumbling): Restraining protein tumbling allows for a smaller flask
Type: boolean
Default: False
Choices: [True, False]
Restrain protein tumbling wt (restraint_protein_tumbling_Wt): Restraint weight for pre-defined xyz atom restraints in kcal/(mol Å^2)
Type: decimal
Default: 0.1
Assign Ligand Partial Charges (charge_ligands): Whether to assign ligand partial charges
Type: boolean
Default: True
Choices: [True, False]
Equilibration Setup Parameters
Ligand Force Field (ligand_ff): Force field to be applied to the ligand. The OpenFF >=1.3.1 and Custom force fields may be augmented with bespoke force field parameters by turning on ‘Use Bespoke Parameters When Available’ and providing SMIRNOFF format parameters on the input record.
Required
Type: string
Default: OpenFF_2.2.0
Choices: [‘Gaff_1.81’, ‘Gaff_2.11’, ‘OpenFF_1.1.1’, ‘OpenFF_1.2.1’, ‘OpenFF_1.3.1’, ‘OpenFF_2.0.0’, ‘OpenFF_2.2.0’, ‘Smirnoff99Frosst’, ‘Custom’]
Custom Ligand Force Field File (custom_offxml_file_in): One or more SMIRNOFF XML files defining the force field to be applied to the ligand. This input is required when ‘Ligand Force Field’ is set to ‘Custom’.
Type: file_in
Protein Force Field (protein_ff): Force field to be applied to the protein.
Required
Type: string
Default: Amber14SB
Choices: [‘Amber14SB’, ‘Amber99SB’, ‘Amber99SBildn’, ‘AmberFB15’]
Cube max run time (cube_max_run_time): Maximum Cube running time in hours
Type: decimal
Default: 1
MD Engine (md_engine): Select the available MD engine
Type: string
Default: OpenMM
Choices: [‘OpenMM’, ‘Gromacs’]
Hydrogen Mass Repartitioning (HMR): Give hydrogens more mass and increase the MD integration time step from 2 to 4 fs
Type: boolean
Default: True
Choices: [True, False]
Trajectory Interval (prod_trajectory_interval): Trajectory saving interval in nanoseconds
Type: decimal
Default: 0.004
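The production time, the trajectory interval, and the HMR setting together determine how many frames are saved and how many integration steps are taken. The arithmetic is sketched below with hypothetical helper names. It assumes the time step is exactly 4 fs with HMR and 2 fs without, as stated for the HMR parameter. Whether the starting frame is also written is not specified here, so the frame count should be read as approximate.

```python
def timestep_fs(hmr=True):
    """Integration time step in femtoseconds: 4 fs with HMR, 2 fs without."""
    return 4.0 if hmr else 2.0


def integration_steps(prod_ns, hmr=True):
    """Number of MD steps in a production run (1 ns = 1e6 fs)."""
    return round(prod_ns * 1e6 / timestep_fs(hmr))


def saved_frames(prod_ns, interval_ns):
    """Approximate number of trajectory frames written during production."""
    return round(prod_ns / interval_ns)


# Defaults from the parameter list: 6.0 ns production, 0.004 ns interval, HMR on.
print(integration_steps(6.0, hmr=True))    # 1500000 steps
print(integration_steps(6.0, hmr=False))   # 3000000 steps
print(saved_frames(6.0, 0.004))            # 1500 frames
print(integration_steps(0.004, hmr=True))  # 1000 steps between saved frames
```

With the defaults, turning HMR off doubles the number of integration steps for the same production time, while the number of saved frames is unchanged.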