Short Trajectory MD with Analysis [MDPrep] [MDRun] [MDAnalysis]
Category Paths
Follow one of these paths in the Orion user interface to find the floe.
Product-based/Molecular Dynamics/GROMACS
Product-based/Molecular Dynamics/OpenMM
Role-based/Computational Chemist
Role-based/Medicinal Chemist
Task-based/Molecular Dynamics
Task-based/Affinity Prediction
Solution-based/Hit to Lead/Target Preparation/Generic MD simulation
Solution-based/Hit to Lead/Affinity Prediction/STMD
Solution-based/Small Molecule Lead-opt/Affinity
Description
Purpose:
This Floe performs MD simulations given a prepared protein and a set of posed and prepared ligands, running both bound and unbound simulations of each ligand, then analyzes the bound trajectory for pose stability. The output bound/unbound datasets can be used as input datasets of nonequilibrium switching floes.
Method Recommendations/Requirements:
The ligands need reasonable 3D coordinates, all atoms present, and correct chemistry (in particular, bond orders and formal charges).
Each ligand can have multiple conformers but each conformer will be run separately as a different ligand.
The starting poses should not have very high forces, in particular no bad clashes with the protein.
The protein needs to be prepared to MD standards: protein chains must be capped, all atoms in protein residues (including hydrogens) must be present, and missing protein loops must be resolved or capped. Typically a Spruce floe is used.
Crystallographic internal waters should be retained where possible.
Limitations
Currently this Floe cannot handle covalent bonds between different components such as ligand, protein, and cofactors.
Glycosylation on proteins is truncated and the amino acid is capped with H.
Expertise Level:
Regular/Intermediate/Advanced
Compute Resource:
Depends on simulation length.
Keywords:
MD, MDPrep, MDAnalysis
Related Floes:
Bound Protein-Ligand MD [MDPrep] [MD]
Analyze Protein-Ligand MD [MDAnalysis]
Convert MD Analysis results to Cluster-Centric Dataset [Utility]
Convert ligand-centric output from this Floe into cluster-centric output to select clusters for further work
For bound states, given the protein and posed ligands as inputs, a complex is formed with each ligand/conformer separately, then solvated and parametrized according to the selected force fields. A minimization stage is performed on the system, followed by a warm-up (NVT ensemble) and three equilibration stages (NPT ensemble). During the minimization, warm-up, and equilibration stages, harmonic positional restraints are applied to the ligand and protein. After equilibration, a short (default 6 ns) production run is performed on the unrestrained system.
The production run is then analyzed. Trajectories from different starting poses of the same ligand are combined and analyzed collectively. One analysis characterizes interactions between the ligand and the active site. Another clusters the ligand positions in the protein active site after fitting the trajectory on the active-site C-alpha atoms. Ensemble MMPBSA (single-trajectory and dual-trajectory) and ensemble BintScore calculations are carried out on the trajectory and are localized to the ligand clusters.
An HTML Floe report is generated for the top-scoring 100 ligands by single-trajectory ensemble MMPBSA score. Once the analysis is done, the Floe generates a ready-to-download tarball file in Amazon S3, which includes the analysis results in CSV files, the HTML Floe report, ligand trajectories, and molecular structure files of cluster medians and averages.
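The order of bound-state stages described above can be summarized as a small data structure. This is only an illustrative sketch in plain Python: the `Stage` class, the stage names, and the `restrained_stages` helper are hypothetical and are not part of the Floe or of any Orion API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Stage:
    """One step of the bound-state protocol (hypothetical representation)."""
    name: str
    ensemble: Optional[str]  # None for minimization, which has no ensemble
    restrained: bool         # harmonic positional restraints on ligand and protein


# Stage order as described in the text: minimization, warm-up (NVT),
# three equilibration stages (NPT), then unrestrained NPT production.
BOUND_PROTOCOL: List[Stage] = [
    Stage("minimization", None, True),
    Stage("warm_up", "NVT", True),
    Stage("equilibration_1", "NPT", True),
    Stage("equilibration_2", "NPT", True),
    Stage("equilibration_3", "NPT", True),
    Stage("production", "NPT", False),
]


def restrained_stages(protocol: List[Stage]) -> List[str]:
    """Return the names of the stages that apply positional restraints."""
    return [stage.name for stage in protocol if stage.restrained]


if __name__ == "__main__":
    for stage in BOUND_PROTOCOL:
        print(stage.name, stage.ensemble, "restrained" if stage.restrained else "free")
```

Only the production stage runs without restraints, which is why the analysis is performed on that part of the trajectory.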
For unbound states, OpenEye’s Freeform is used to generate an ensemble of unique unbound conformations. By default, the input bioactive conformation is used as the starting pose for a single 6 ns unbound simulation. Alternatively, the n most probable conformations from the ensemble can be used as starting poses for independent m ns unbound simulations, where m is set by “Unbound State Equilibration Production Time”. To do this, set the “Sampling Scheme” parameter to “State Probability” and “Number Of Starting Confs” to n in the “FreeForm Output Ligand Setting” Cube.
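As a rough illustration of how the two unbound sampling options differ in cost, the sketch below counts the unbound simulations and the total production time for one ligand. The function name and its arguments are hypothetical; only the default of a single 6 ns run and the n-by-m scheme come from the description above.

```python
from typing import Tuple


def unbound_plan(use_state_probability: bool = False,
                 n_starting_confs: int = 1,
                 prod_ns: float = 6.0) -> Tuple[int, float]:
    """Return (number of unbound runs, total production time in ns) for one ligand.

    use_state_probability=False models the default: one run started from the
    input conformation. True models the "State Probability" scheme, where the
    n most probable Freeform conformations each start an independent run.
    """
    if n_starting_confs < 1:
        raise ValueError("n_starting_confs must be at least 1")
    if prod_ns <= 0:
        raise ValueError("prod_ns must be positive")
    n_runs = n_starting_confs if use_state_probability else 1
    return n_runs, n_runs * prod_ns


if __name__ == "__main__":
    print(unbound_plan())                   # default: single run from the input pose
    print(unbound_plan(True, 3, prod_ns=2.0))  # three independent 2 ns runs
```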
Three datasets are written: a Freeform output dataset and unbound/bound MD analysis datasets. The latter two datasets can be used as bound/unbound input datasets of nonequilibrium switching floes.
Promoted Parameters
Title in user interface (promoted name)
Inputs
Protein Input Dataset (protein): Protein Input Dataset
Type: data_source
Ligand Input Dataset (ligands): Ligands-only input dataset or protein-ligand input dataset containing a Design Unit prepared by Spruce
Required
Type: data_source
CPU GPU Spot Policy Selection
CPUs (cpu_count_md): The number of CPUs to run this cube with
Type: integer
Default: 16
GPUs (gpu_count_md): The number of GPUs to run this cube with
Type: integer
Default: 1
Spot policy (spot_policy_md): Control cube placement on spot market instances
Type: string
Default: Allowed
Choices: [‘Allowed’, ‘Preferred’, ‘NotPreferred’, ‘Prohibited’, ‘Required’]
Bound and Unbound Production Parameters
Number of Bound State MD Starts (n_md_starts): The number of Bound MD starts for each ligand/conformer
Type: integer
Default: 1
Unbound State NPT Production Time (prod_unb_us_ns): NPT simulation production time for each starting pose in nanoseconds
Type: decimal
Default: 6.0
Bound States NPT Production Runtime (prod_ns)*: NPT simulation production time in nanoseconds
Type: decimal
Default: 6.0
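Because each conformer is run as a separate ligand and each ligand/conformer can have several MD starts, the bound-state production workload grows multiplicatively. The sketch below estimates it under those stated rules. The function name and the example conformer counts are hypothetical, and the estimate ignores the minimization, warm-up, and equilibration stages.

```python
from typing import Sequence, Tuple


def bound_production_workload(conformers_per_ligand: Sequence[int],
                              n_md_starts: int = 1,
                              prod_ns: float = 6.0) -> Tuple[int, float]:
    """Return (number of bound production runs, total production time in ns).

    conformers_per_ligand holds the conformer count of each input ligand;
    every conformer is treated as its own ligand, and each gets n_md_starts runs.
    """
    if n_md_starts < 1:
        raise ValueError("n_md_starts must be at least 1")
    if any(count < 1 for count in conformers_per_ligand):
        raise ValueError("each ligand needs at least one conformer")
    n_runs = sum(conformers_per_ligand) * n_md_starts
    return n_runs, n_runs * prod_ns


if __name__ == "__main__":
    # Three ligands with 1, 3, and 2 conformers, default settings.
    runs, total_ns = bound_production_workload([1, 3, 2])
    print(f"{runs} runs, {total_ns} ns of production MD")
```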
Complex Setup Parameters
Protein Name (flask_title): Prefix name used to identify the protein. If not specified, the title of the input protein is used.
Type: string
Default:
Assign Ligand Partial Charges (charge_ligands): Assign Ligand Partial Charges or not
Type: boolean
Default: True
Choices: [True, False]
Equilibration Setup Parameters
Cube max run time (cube_max_run_time): Maximum Cube run time in hours
Type: decimal
Default: 1
Protein Force Field (protein_ff): Force field to be applied to the protein
Type: string
Default: Amber14SB
Choices: [‘Amber14SB’, ‘Amber99SB’, ‘Amber99SBildn’, ‘AmberFB15’]
Ligand Force Field (ligand_ff): Force field to be applied to the ligand
Type: string
Default: OpenFF_2.0.0
Choices: [‘Gaff_1.81’, ‘Gaff_2.11’, ‘OpenFF_1.1.1’, ‘OpenFF_1.2.1’, ‘OpenFF_1.3.1’, ‘OpenFF_2.0.0’, ‘Smirnoff99Frosst’]
MD Engine (md_engine): Select the available MD engine
Type: string
Default: OpenMM
Choices: [‘OpenMM’, ‘Gromacs’]
Hydrogen Mass Repartitioning (HMR): Give hydrogens more mass to speed up the MD
Type: boolean
Default: True
Choices: [True, False]
Trajectory Interval (prod_trajectory_interval): Unbound Trajectory saving interval in nanoseconds
Type: decimal
Default: 0.004
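When preparing a set of parameter overrides, it can help to check them against the documented defaults and choices before launching the Floe. The following is a minimal sketch in plain Python, not an Orion client call: the dictionaries simply transcribe the promoted names, defaults, and choices listed above, and the helper functions `resolve_parameters` and `approx_frame_count` are hypothetical.

```python
from typing import Any, Dict

# Defaults transcribed from the promoted parameters listed above.
DEFAULTS: Dict[str, Any] = {
    "cpu_count_md": 16,
    "gpu_count_md": 1,
    "spot_policy_md": "Allowed",
    "n_md_starts": 1,
    "prod_unb_us_ns": 6.0,
    "prod_ns": 6.0,
    "charge_ligands": True,
    "cube_max_run_time": 1,
    "protein_ff": "Amber14SB",
    "ligand_ff": "OpenFF_2.0.0",
    "md_engine": "OpenMM",
    "HMR": True,
    "prod_trajectory_interval": 0.004,
}

# Allowed values for the string parameters that document a choice list.
CHOICES: Dict[str, tuple] = {
    "spot_policy_md": ("Allowed", "Preferred", "NotPreferred", "Prohibited", "Required"),
    "protein_ff": ("Amber14SB", "Amber99SB", "Amber99SBildn", "AmberFB15"),
    "ligand_ff": ("Gaff_1.81", "Gaff_2.11", "OpenFF_1.1.1", "OpenFF_1.2.1",
                  "OpenFF_1.3.1", "OpenFF_2.0.0", "Smirnoff99Frosst"),
    "md_engine": ("OpenMM", "Gromacs"),
}


def resolve_parameters(overrides: Dict[str, Any]) -> Dict[str, Any]:
    """Merge overrides into the defaults, rejecting unknown names or choices."""
    params = dict(DEFAULTS)
    for name, value in overrides.items():
        if name not in DEFAULTS:
            raise KeyError(f"unknown promoted parameter: {name}")
        if name in CHOICES and value not in CHOICES[name]:
            raise ValueError(f"{name}={value!r} is not one of {CHOICES[name]}")
        params[name] = value
    return params


def approx_frame_count(prod_ns: float, interval_ns: float) -> int:
    """Approximate number of saved frames for a production run."""
    if interval_ns <= 0:
        raise ValueError("interval_ns must be positive")
    return round(prod_ns / interval_ns)


if __name__ == "__main__":
    params = resolve_parameters({"md_engine": "Gromacs", "prod_ns": 10.0})
    print(params["md_engine"], params["prod_ns"])
    print(approx_frame_count(params["prod_ns"], params["prod_trajectory_interval"]))
```

With the default 6.0 ns production time and 0.004 ns trajectory interval, this gives roughly 1500 saved frames per run; whether the initial frame is also written is not stated here, so the count is approximate.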