Generate Most Probable Path from WEMD Simulation

Category Paths

Follow one of these paths in the Orion user interface to find the floe.

  • Role-based/Structural Biologist

  • Solution-based/Target Identification/Target Preparation

  • Task-based/Target Prep & Analysis/Protein Preparation

  • Product-based/Molecular Dynamics/WESTPA

Description

This floe generates the most probable path from the starting structure to a selected minimum or region of a 1D or 2D weighted ensemble MD (WEMD) simulation. The target region is specified by the minimum and maximum input values for each progress coordinate. The floe examines the weights of all walkers within the selected region at the final state, finds the walker with the largest weight, and traces its history back to the starting state. For 2D simulations, it also maps the path onto the estimated 2D free energy surface of the progress coordinates. Note that the resulting path is a transition path between the starting structure and the final structure in the selected range; the waiting time is neglected. It is not a minimum free energy path derived from rigorous statistical theory.
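The trace-back described above can be illustrated with a minimal sketch. It is not the floe's actual implementation: the Walker record, its field names, and the most_probable_path function are hypothetical, and the sketch assumes that each walker stores the ID of its parent in the previous iteration and that the region bounds are inclusive.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Walker:
    """Hypothetical record for one WEMD walker (segment)."""
    iteration: int
    seg_id: int
    parent_id: Optional[int]   # seg_id in the previous iteration; None at the start
    weight: float
    pcoord: Tuple[float, ...]  # final progress coordinate(s) of the segment


def most_probable_path(
    walkers: Dict[Tuple[int, int], Walker],
    bounds: List[Tuple[float, float]],
) -> List[Walker]:
    """Return the lineage of the heaviest walker inside the selected region.

    walkers maps (iteration, seg_id) to a Walker; bounds holds one
    (min, max) pair per progress coordinate (assumed inclusive).
    """
    last = max(w.iteration for w in walkers.values())
    candidates = [
        w for w in walkers.values()
        if w.iteration == last
        and all(lo <= x <= hi for x, (lo, hi) in zip(w.pcoord, bounds))
    ]
    if not candidates:
        return []  # no walker ended in the selected region

    # Pick the walker with the largest weight, then follow parent links back.
    current = max(candidates, key=lambda w: w.weight)
    path = [current]
    while current.parent_id is not None:
        current = walkers[(current.iteration - 1, current.parent_id)]
        path.append(current)
    path.reverse()  # order from starting state to final state
    return path
```

The returned list runs from the first iteration to the final one, so the structures of its walkers, taken in order, form the path from the starting structure to the selected region.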

Promoted Parameters

Title in user interface (promoted name)

Inputs

Collection (collection): Collection of protein sampling data generated by previous weighted ensemble MD simulations. Only one WEMD collection is created for a series of WEMD simulations of the same system with the same settings.

  • Required

  • Type: collection_source

  • Default: wemd_simulation_collection

Superpose Control (superpose_ctrl): Toggle on to superpose the trajectory onto the input reference structure.

  • Required

  • Type: boolean

  • Default: True

  • Choices: [True, False]

Reference Protein Dataset after SPRUCE Preparation (ref_dataset): Input dataset containing the reference protein onto which the output structures are superposed.

  • Required

  • Type: data_source

Superposition method (superpose_method): Superposition method used to fit the simulation structures to the reference structure.

  • Required

  • Type: string

  • Default: GlobalSequence

  • Choices: [‘GlobalSequence’, ‘SiteSequence’, ‘DDMatrix’, ‘SSE’, ‘SiteHopper’]

Reference Cryo-EM Map (ref_map): Input file for the reference cryo-EM map (in .mrc file format).

  • Required

  • Type: file_in

Cryo-EM Map Resolution (resolution): Resolution of output cryo-EM maps (angstroms).

  • Required

  • Type: decimal

  • Default: 2.0

Selected Components to Output (mask_components): Components to include in the output structural candidates.

  • Required

  • Type: string

  • Default: [‘protein’, ‘nucleic’, ‘ligand’]

  • Choices: [‘protein’, ‘nucleic’, ‘ligand’, ‘solvent’, ‘metals’, ‘counter_ions’, ‘lipids’, ‘packing_residues’, ‘sugars’, ‘undefined’, ‘cofactors’, ‘excipients’, ‘polymers’, ‘post_translational’, ‘other_proteins’, ‘other_nucleics’, ‘other_ligands’, ‘other_cofactors’]

Resize Cryo-EM Map(s) (resize_map): Toggle on to resize the cryo-EM map(s) using the reference structure(s) with 10 angstroms of padding.

  • Type: boolean

  • Default: True

  • Choices: [True, False]

Selection Ranges for Final States

Minimum Value of First Dimension (min_value1): Minimum value of the first dimension used to select structures in the final states.

  • Required

  • Type: decimal

Maximum Value for First Dimension (max_value1): Maximum value of the first dimension used to select structures in the final states.

  • Required

  • Type: decimal

Minimum Value of Second Dimension (optional) (min_value2): Minimum value of the second dimension used to select structures in the final states. Not required when using a 1D progress coordinate.

  • Type: decimal

Maximum Value of Second Dimension (optional) (max_value2): Maximum value of the second dimension used to select structures in the final states. Not required when using a 1D progress coordinate.

  • Type: decimal
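As an illustration of how the four range parameters above could define the selection region, here is a minimal sketch. The helper names build_bounds and in_region are hypothetical, and the sketch assumes inclusive bounds and that the second-dimension values are given either both or not at all; the floe itself may handle these cases differently.

```python
from typing import List, Optional, Sequence, Tuple


def build_bounds(
    min_value1: float,
    max_value1: float,
    min_value2: Optional[float] = None,
    max_value2: Optional[float] = None,
) -> List[Tuple[float, float]]:
    """Turn the range parameters into one (min, max) pair per dimension."""
    if min_value1 > max_value1:
        raise ValueError("min_value1 must not exceed max_value1")
    bounds = [(min_value1, max_value1)]

    # Assumption: a 2D selection needs both second-dimension values.
    if (min_value2 is None) != (max_value2 is None):
        raise ValueError("give both min_value2 and max_value2, or neither")
    if min_value2 is not None and max_value2 is not None:
        if min_value2 > max_value2:
            raise ValueError("min_value2 must not exceed max_value2")
        bounds.append((min_value2, max_value2))
    return bounds


def in_region(pcoord: Sequence[float], bounds: List[Tuple[float, float]]) -> bool:
    """True if every progress coordinate lies within its (inclusive) range."""
    return len(pcoord) == len(bounds) and all(
        lo <= x <= hi for x, (lo, hi) in zip(pcoord, bounds)
    )
```

For a 1D progress coordinate only the first pair is supplied, for example build_bounds(1.0, 2.0); a 2D selection adds the second pair.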

Selection Ranges for Trajectories

First Iteration Number (start_iteration): First iteration number used for estimating the 2D free energy map of the progress coordinates.

  • Type: integer

  • Default: 1

Last Iteration Number (end_iteration): Last iteration number used for estimating the 2D free energy map of the progress coordinates.

  • Type: integer

  • Default: 100

Stride Number (N) (stride): Save every Nth frame when writing the path trajectory file. Not valid if the MD segments were extracted with the endpoint-only option.

  • Type: integer

  • Default: 1

Parameters for Free Energy Maps/Surfaces (optional)

Number of Bins for Histogram (number_bins): Number of bins used for plotting the density distribution.

  • Type: integer

  • Default: 100

Maximum Value of Estimated Free Energy (max_free_energy): Maximum free energy value shown on the color scale of the 2D free energy map, in units of kT.

  • Type: decimal

  • Default: 50.0
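To show how number_bins and max_free_energy might enter a free energy estimate, the following sketch builds a weighted 2D histogram of progress coordinates and converts it to free energies in units of kT with F = ln(P_max / P). This is a common convention and an assumption here, not the floe's confirmed method; the function name free_energy_map and its arguments other than number_bins and max_free_energy are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def free_energy_map(
    points: Sequence[Tuple[float, float]],
    weights: Sequence[float],
    number_bins: int,
    x_range: Tuple[float, float],
    y_range: Tuple[float, float],
    max_free_energy: float,
) -> List[List[float]]:
    """Estimate a 2D free energy map (in kT) from weighted samples."""
    (x_lo, x_hi), (y_lo, y_hi) = x_range, y_range
    hist = [[0.0] * number_bins for _ in range(number_bins)]

    # Accumulate walker weights into a number_bins x number_bins grid.
    for (x, y), w in zip(points, weights):
        if not (x_lo <= x <= x_hi and y_lo <= y <= y_hi):
            continue  # ignore samples outside the plotted range
        i = min(int((x - x_lo) / (x_hi - x_lo) * number_bins), number_bins - 1)
        j = min(int((y - y_lo) / (y_hi - y_lo) * number_bins), number_bins - 1)
        hist[i][j] += w

    p_max = max(max(row) for row in hist)
    if p_max <= 0.0:
        raise ValueError("no weight falls inside the histogram range")

    # F = ln(P_max / P), so the most populated bin sits at 0 kT.
    # Empty bins and values above the cap are set to max_free_energy.
    return [
        [min(math.log(p_max / p), max_free_energy) if p > 0.0 else max_free_energy
         for p in row]
        for row in hist
    ]
```

Capping at max_free_energy keeps sparsely sampled bins from dominating the color scale; in this sketch, unsampled bins receive the cap value as well.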

Outputs

None (analysis_report_title): Title for the path report.

  • Type: string

  • Default: Path on Free Energy Map Report

Output Structures for Most Probable Path (data_out): Output dataset that saves simulation structures along the most probable path.

  • Required

  • Type: dataset_out

  • Default: path_structures_dataset