How-To Guide

What this Floe does

Given a protein and a set of posed ligands as input, the Floe forms a separate complex with each ligand/conformer, then solvates and parametrizes that complex according to the selected force fields. We refer to this ready-to-run molecular assembly as a “flask” by analogy to experiment: all the components are combined in the flask, and the experiment is then run on it.

For the bound state, the flask is first minimized, then warmed up (NVT ensemble) and equilibrated in several stages (NPT ensemble). During the minimization, warm-up, and equilibration stages, harmonic positional restraints are applied to the ligand and protein. After equilibration, a short (default 6 ns) production run is performed on the unrestrained flask. The production run is then analyzed in terms of interactions between the ligand and the active site and in terms of ligand RMSD, after fitting the trajectory on the active-site C-alpha atoms.
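To make the RMSD analysis concrete, here is a minimal sketch of computing a ligand RMSD after fitting a frame on active-site C-alpha atoms. It assumes NumPy is available and that the coordinates have already been extracted as N x 3 arrays. The function names and the choice of the Kabsch least-squares fit are assumptions made for illustration; this is not a description of how the Floe implements its analysis.

```python
import numpy as np


def fit_transform(mobile_ca, ref_ca):
    """Least-squares (Kabsch) fit of mobile C-alpha atoms onto reference ones.

    Returns (rotation, mobile_centroid, ref_centroid).
    """
    mobile_center = mobile_ca.mean(axis=0)
    ref_center = ref_ca.mean(axis=0)
    covariance = (mobile_ca - mobile_center).T @ (ref_ca - ref_center)
    u, _, vt = np.linalg.svd(covariance)
    # Guard against an improper rotation (a reflection).
    sign = 1.0 if np.linalg.det(vt.T @ u.T) > 0 else -1.0
    rotation = vt.T @ np.diag([1.0, 1.0, sign]) @ u.T
    return rotation, mobile_center, ref_center


def ligand_rmsd_after_fit(frame_ca, frame_lig, ref_ca, ref_lig):
    """Ligand RMSD of one frame after fitting that frame on the C-alphas."""
    rotation, mobile_center, ref_center = fit_transform(frame_ca, ref_ca)
    # Apply the protein-derived transform to the ligand; do not refit the ligand.
    fitted_lig = (frame_lig - mobile_center) @ rotation.T + ref_center
    return float(np.sqrt(((fitted_lig - ref_lig) ** 2).sum(axis=1).mean()))
```

Because the fit uses only the protein atoms, the resulting RMSD reflects ligand motion relative to the active site rather than overall tumbling of the complex.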

Ligand input

To run at all, this Floe requires ligands with reasonable 3D coordinates, all atoms present, and correct chemistry (in particular bond orders and formal charges). If the ligands already have good atomic partial charges (RESP or AM1-BCC_ELF10 charges are recommended), we recommend using these for STMD rather than re-charging the ligands in the STMD Floe.
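As a rough pre-check before deciding whether to keep existing charges, the sketch below tests whether a ligand's partial charges are present and sum to its net formal charge. The function name, the plain-list inputs, and the tolerance are hypothetical choices for illustration. Passing this check does not mean the charges are of good quality; it only catches missing or inconsistent charges.

```python
def charges_look_assigned(partial_charges, formal_charges, tol=0.01):
    """Return True if partial charges exist and sum to the net formal charge.

    partial_charges, formal_charges: per-atom values in the same atom order.
    tol: allowed deviation in units of e (an assumed, illustrative value).
    """
    if not partial_charges or len(partial_charges) != len(formal_charges):
        return False
    # All-zero partial charges usually mean that no charges were assigned.
    if all(abs(q) < 1e-6 for q in partial_charges):
        return False
    return abs(sum(partial_charges) - sum(formal_charges)) <= tol
```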

Because this Floe runs only a very short trajectory by default (6 ns), the input pose should be well refined.

Although bad clashes (or poor positioning for interactions that you know are important) can be, and often are, cleaned up by even this short trajectory, a poor starting pose undermines the “evaluation” purpose of the Floe because it provides a poor comparator.

Poor initial poses might even be considered outside the scope of this Floe, given how short the default timescale is. This is why we strongly recommend that docked poses be minimized in the active site before input to STMD. This resolves high gradients (usually clashes) with the protein and allows protein-ligand interactions to optimize in the context of a good force field. Even after this pre-MD refinement, it may be worth reevaluating and triaging the docked-pose starting points before committing to the extra effort and expense of STMD.

Protein Input

All the MD Floes require a protein that is correctly prepared to “MD ready” standards. This begins with the normal prerequisites for physics-based modeling:

  • Protein chains must be capped,

  • All atoms in protein residues (including hydrogens) must be present, and

  • Missing protein loops must be resolved or capped.

Note that the formal charges and protonation of the protein side chains at this point determine their tautomeric states.

Cofactors and structured internal waters are also important to include, not only those in the immediate vicinity of the ligand and active site but also distal ones, because they can have an important effect on the protein structure and dynamics over the course of the MD.

We strongly recommend using Spruce for protein preparation.
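As a quick pre-screen before full preparation, the sketch below scans PDB-format ATOM records for jumps in C-alpha residue numbering, which often indicate missing loops. The function name is hypothetical, and the check is only a heuristic: numbering jumps can also come from the numbering scheme itself, so it does not replace proper preparation and inspection.

```python
def find_numbering_gaps(pdb_lines):
    """Return (chain, last_residue_before_gap, first_residue_after_gap) tuples.

    Only ATOM records for C-alpha atoms are considered, using the fixed
    columns of the PDB format (atom name 13-16, chain 22, residue number 23-26).
    """
    last_seen = {}
    gaps = []
    for line in pdb_lines:
        if not line.startswith("ATOM") or line[12:16].strip() != "CA":
            continue
        chain = line[21]
        resseq = int(line[22:26])
        previous = last_seen.get(chain)
        # A jump of more than one residue number suggests missing residues.
        if previous is not None and resseq - previous > 1:
            gaps.append((chain, previous, resseq))
        last_seen[chain] = resseq
    return gaps
```

Any gap reported this way should be either modeled in or capped before the protein is used as input.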

Warning

Unfortunately, proteins with covalently bound ligands or covalently bound cofactors are currently not tractable.

How to run this Floe

After selecting the Short Trajectory MD with Analysis Floe in the Orion UI, you will be presented with a job form with parameters to select. The figure “Key fields of STMD Job Form” shows the key fields of that form.

[Figure: Key fields of STMD Job Form]

Aside from the essential user-defined parameters relating to job name, input (protein and ligand datasets as described above), and output (output and failure dataset name), all other parameters except Protein Name have reasonable defaults.

The top-level parameters are:

  • Protein Name (no default): A short, convenient name for the protein, used in molecule titles (e.g. “Bace” instead of “beta-secretase”).

  • Number Of Bound State MD Starts (default 1): This requests N independent starts for each ligand/pose, giving rise to N independent MD runs; this gives more sampling while keeping the simulation closer to the starting pose.

  • Bound State Equilibration Production Time (default 6 ns): The default of 6 ns matches the De Groot protocol, so that the output ligand bound/unbound datasets can be used directly in the Nonequilibrium Switching (NES) Floe. For fast, high-throughput pose validation, users can reduce this to 2 ns, but the output datasets will then not be suitable for NES.

  • Restrain protein tumbling (default Off): Enabling this parameter automatically selects the stable region of the protein and applies gentle restraints to it to prevent protein tumbling, which allows a smaller water box to be used.

  • Assign Ligand Partial Charges (default On): If your input ligands already have good atomic partial charges (e.g. RESP or AM1-BCC_ELF10), set this to Off to have the Floe use the existing ligand charges.

  • Custom Ligand Force Field File: One or more SMIRNOFF XML files defining the force field to be applied to the ligand. This input is required when Ligand Force Field is set to Custom.

  • Ligand Force Field (default OpenFF2.0.0): This force field choice has a strong impact on the results. We recommend the most recent version of the OpenFF force field from the Open Force Field Initiative.

  • MD Engine (default OpenMM): The alternative is Gromacs.

  • Hydrogen Mass Repartitioning: Hydrogen Mass Repartitioning (HMR) gives a two-fold speedup and reduces cost. We recommend leaving it on.
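The parameters above can be summarized as a small record with their documented defaults. This is a sketch for planning and sanity-checking only: the class, field, and method names are hypothetical and are not part of any Orion API, and Hydrogen Mass Repartitioning is left out because its default is not stated here.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class STMDJobForm:
    """Hypothetical mirror of the key job form fields and their defaults."""
    protein_name: str                      # no default; must be supplied
    n_md_starts: int = 1                   # Number Of Bound State MD Starts
    production_time_ns: float = 6.0        # use 2 for fast pose validation
    restrain_protein_tumbling: bool = False
    assign_ligand_charges: bool = True     # turn off to keep existing charges
    ligand_force_field: str = "OpenFF2.0.0"
    custom_ff_files: List[str] = field(default_factory=list)
    md_engine: str = "OpenMM"              # or "Gromacs"

    def problems(self):
        """Return a list of human-readable problems with the settings."""
        issues = []
        if not self.protein_name:
            issues.append("Protein Name has no default and must be set")
        if self.n_md_starts < 1:
            issues.append("Number Of Bound State MD Starts must be at least 1")
        if self.ligand_force_field == "Custom" and not self.custom_ff_files:
            issues.append("Custom force field requires a SMIRNOFF XML file")
        if self.md_engine not in ("OpenMM", "Gromacs"):
            issues.append("MD Engine must be OpenMM or Gromacs")
        return issues

    def total_production_ns(self, n_poses):
        """Bound-state production time only; equilibration is not counted."""
        return n_poses * self.n_md_starts * self.production_time_ns
```

For example, 10 poses with 3 starts each at the default 6 ns amount to 180 ns of bound-state production sampling, which is a useful number to have in mind before launching a large job.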

The remaining parameters are available to expert users: turn on Show Cube Parameters at the bottom of the input form, then drill down into the parameters of the desired Cube in the list below.

Note

By default, the input bio-active (bound) conformation is used as the starting point for the unbound state of the ligand. To support non-bio-active starting conformations for the unbound state, OpenEye’s Freeform is used to generate an ensemble of unique unbound conformations. By setting the Cube parameters Sampling Scheme to State Probability and Number Of Starting Confs to n in the FreeForm Output Ligand Setting Cube, users can use the n most probable conformations from the generated Freeform ensemble as starting points for independent unbound simulations of m ns each. This feature is still experimental, so please use it at your own risk.
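The idea of taking the n most probable conformations can be sketched as follows. The function name and the (identifier, probability) input format are hypothetical and chosen only for illustration; the actual selection is performed inside the Cube.

```python
def pick_starting_confs(conformers, n):
    """Return the ids of the n most probable conformers, most probable first.

    conformers: list of (conf_id, probability) pairs from an ensemble.
    """
    ranked = sorted(conformers, key=lambda conf: conf[1], reverse=True)
    return [conf_id for conf_id, _ in ranked[:n]]
```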

How to check results

The results from the STMD Floe are accessed via two main avenues: through the job output in the Jobs tab in Orion’s Floe pages, and through Orion’s Analyze page.