The goal of the Plain MD floe is to run long Molecular Dynamics (MD) simulations in the NPT ensemble while working around the AWS limitation that parallel cubes cannot run for more than 12 hrs. The floe uses a cycle in its topology to address the issue: it saves checkpoints along the runs and restarts the MD simulations before the AWS running time limit is reached. The floe figure is shown below:
The cubes to highlight are the Recovery Restart, the NPT, and the MD Proxy cubes. The Recovery Restart cube detects whether a new start is needed, a restart attempt is in progress, or a recovery restart is required. In addition, this cube examines the input dataset, checking whether it was produced by one of the MD floes, such as the Protein-ligand MD floe, or by one of the MD Analysis floes, so that the correct start or restart is performed. The NPT cube runs the MD simulations and allows selection of the MD engine and other MD parameter options. Finally, the MD Proxy cube coordinates the different MD runs: it generates the MD schedule for each flask and checks for termination, i.e. whether a given run is complete or needs to run further until the selected MD running time is reached.
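The checkpoint-and-restart cycle described above can be pictured with a small toy model. This is only a sketch and not the floe's implementation: the function name `run_cycle` and the parameter `ns_per_segment` (how much simulation time one pass through the cycle completes) are hypothetical, introduced purely for illustration.

```python
def run_cycle(target_ns, ns_per_segment):
    """Toy model of a run/checkpoint/restart cycle (not the Orion code).

    Each pass through the loop stands for one MD segment that is stopped
    before the wall-clock limit, checkpointed, and then restarted.
    Returns the simulated time (ns) recorded at each checkpoint.
    """
    if target_ns <= 0 or ns_per_segment <= 0:
        raise ValueError("target_ns and ns_per_segment must be positive")

    completed_ns = 0.0
    checkpoints = []
    # Termination check, analogous to the role played by the MD Proxy cube.
    while completed_ns < target_ns:
        # Never run past the requested total time.
        step = min(ns_per_segment, target_ns - completed_ns)
        completed_ns += step
        # A checkpoint is saved at the end of every segment.
        checkpoints.append(completed_ns)
    return checkpoints


if __name__ == "__main__":
    print(run_cycle(10, 4))
```

With a 10 ns target and 4 ns completed per segment, the model needs three segments, the last one shorter than the others.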
Floe Input and Outputs
The Plain MD floe requires just one input MD dataset to run. This type of dataset is produced by one of the MD Affinity floes, such as the Bound Protein-Ligand MD, Ligand Bound and Unbound Equilibration for NES, or Solvate and Run MD floes. It can also accept datasets generated by the Analyze Protein-Ligand MD or the Short Trajectory MD with Analysis floes. The input Orion window is shown below:
The floe also requires names for the output dataset, the failure dataset, and the recovery dataset.
How to use the floe
After filling out the input and output dataset names, you will be presented with the promoted job parameters. The floe parameter selection window is shown below:
The important parameters in the form are Time, which defines the total MD running time in ns; Trajectory Interval, which defines how often trajectory snapshots are saved (#frames ~ Time/Trajectory Interval); and Cube Max Run Time, in hrs. This last parameter must be <= 10 hrs and defines when an MD run is stopped and a new restart is attempted along the cycle. For example, if Cube Max Run Time is set to 5 hrs, each MD run is stopped and restarted roughly every 5 hrs until the selected running Time is reached.
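As a rough planning aid, the relations between these parameters can be written down in a few lines. The sketch below is not part of the floe. It assumes Trajectory Interval is given in the same unit as Time (ns), and the `ns_per_hour` throughput is a hypothetical number you would have to measure for your own system and hardware. All function names are made up for illustration.

```python
import math

MAX_CUBE_RUN_TIME_HRS = 10.0  # upper bound stated for Cube Max Run Time


def estimate_frames(time_ns, trajectory_interval_ns):
    """Approximate frame count: #frames ~ Time / Trajectory Interval."""
    if time_ns <= 0 or trajectory_interval_ns <= 0:
        raise ValueError("time and trajectory interval must be positive")
    return round(time_ns / trajectory_interval_ns)


def check_cube_max_run_time(hours):
    """Reject Cube Max Run Time values outside the (0, 10] hrs range."""
    if not 0 < hours <= MAX_CUBE_RUN_TIME_HRS:
        raise ValueError("Cube Max Run Time must be > 0 and <= 10 hrs")
    return hours


def estimate_segments(time_ns, ns_per_hour, cube_max_run_time_hrs):
    """Rough number of run segments needed to reach Time.

    ns_per_hour is an assumed, user-supplied throughput; the real value
    depends on system size, MD engine, and hardware.
    """
    check_cube_max_run_time(cube_max_run_time_hrs)
    if time_ns <= 0 or ns_per_hour <= 0:
        raise ValueError("time and throughput must be positive")
    wall_hours = time_ns / ns_per_hour
    return math.ceil(wall_hours / cube_max_run_time_hrs)


if __name__ == "__main__":
    print(estimate_frames(10, 0.5))
    print(estimate_segments(100, 4.0, 5.0))
```

For instance, at an assumed 4 ns/hr, a 100 ns run takes about 25 hrs of wall-clock time, so a 5 hrs Cube Max Run Time corresponds to roughly five segments along the cycle.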
After the floe has completed, two datasets are produced: the output and the recovery datasets. The output dataset can be used as input to the Analysis floe, or to extend the MD running time by re-running the Plain MD floe. For example, after completing 10 ns of Plain MD, another 15 ns can be added by feeding the 10 ns output dataset back into the floe and selecting 15 ns as the new running Time, for a total running time of 25 ns. In this case the Plain MD floe should detect the restart and run for a further 15 ns.
The recovery dataset should be used if something goes wrong during the runs. For example, if 7 out of 10 MD simulations succeeded and 3 failed, the recovery dataset produced during the runs can be fed back into the Plain MD floe, which will attempt to restart the 3 failed simulations from their last available checkpoints.
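The recovery scenario can be illustrated with a short sketch. The record layout used here (a dictionary with `status` and `last_checkpoint_ns` keys) is entirely hypothetical and does not reflect how Orion stores recovery datasets. It only shows the idea of picking out the failed runs and resuming each from its last checkpoint.

```python
def runs_to_restart(records):
    """Map each failed run to the simulated time (ns) of its last checkpoint.

    records is a hypothetical dict: run name -> {"status": ..., 
    "last_checkpoint_ns": ...}. Successful runs are left out.
    """
    return {
        name: rec["last_checkpoint_ns"]
        for name, rec in records.items()
        if rec["status"] == "failed"
    }


if __name__ == "__main__":
    # Ten illustrative runs: seven succeeded, three failed part-way.
    records = {
        f"run_{i}": {"status": "ok", "last_checkpoint_ns": 10.0}
        for i in range(7)
    }
    records["run_7"] = {"status": "failed", "last_checkpoint_ns": 4.0}
    records["run_8"] = {"status": "failed", "last_checkpoint_ns": 6.0}
    records["run_9"] = {"status": "failed", "last_checkpoint_ns": 0.0}

    for name, resume_ns in runs_to_restart(records).items():
        print(f"{name}: resume from {resume_ns} ns")
```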