Generate Most Probable Path from WEMD Simulation
Category Paths
Follow one of these paths in the Orion user interface to find the floe.
Role-based/Structural Biologist
Solution-based/Target Identification/Target Preparation
Task-based/Target Prep & Analysis/Protein Preparation
Product-based/Molecular Dynamics/WESTPA
Description
This floe generates the most probable path from the starting structure to a selected minimum or region of a 1D or 2D weighted ensemble MD (WEMD) simulation. The region is specified by the minimum and maximum input values of each progress coordinate. The floe examines the weights of all walkers within the selected region at the final state, finds the walker with the largest weight, and traces it back to the starting state. For 2D simulations, it also maps the path onto the estimated 2D free energy surface of the progress coordinates. Note that the resulting path is a transition path between the starting structure and the final structure in the selected range, with the waiting time neglected. It is not a minimum free energy path derived from rigorous statistical theory.
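The trace-back described above can be illustrated with a minimal sketch. It is not the floe's actual implementation: the Walker container, its field names, and the most_probable_path function are hypothetical, and the sketch assumes that "final state" means the last iteration and that each walker records the index of its parent in the previous iteration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Walker:
    """One trajectory segment in one iteration (hypothetical container)."""
    weight: float
    pcoord: Tuple[float, ...]   # final progress coordinate value(s), 1 or 2 entries
    parent: Optional[int]       # index of the parent walker in the previous iteration


def in_range(pcoord, ranges):
    """True if every progress-coordinate dimension lies inside its (min, max) range."""
    return all(lo <= x <= hi for x, (lo, hi) in zip(pcoord, ranges))


def most_probable_path(iterations, ranges):
    """Return [(iteration_index, walker_index), ...] ordered from start to end."""
    last = len(iterations) - 1
    # Walkers of the last iteration that fall inside the selected region.
    candidates = [i for i, w in enumerate(iterations[last]) if in_range(w.pcoord, ranges)]
    if not candidates:
        raise ValueError("no walker in the last iteration lies in the selected range")
    # Pick the candidate with the largest weight.
    current = max(candidates, key=lambda i: iterations[last][i].weight)
    # Follow parent links back to the first iteration.
    path = []
    for it in range(last, -1, -1):
        path.append((it, current))
        current = iterations[it][current].parent
    path.reverse()
    return path


# Toy 1D example with three iterations (made-up numbers).
iterations = [
    [Walker(1.0, (10.0,), None)],
    [Walker(0.6, (8.0,), 0), Walker(0.4, (6.0,), 0)],
    [Walker(0.6, (7.5,), 0), Walker(0.3, (3.0,), 1), Walker(0.1, (2.5,), 1)],
]
path = most_probable_path(iterations, ranges=[(2.0, 4.0)])
print(path)
```

In this toy data, two walkers end inside the range 2.0 to 4.0, and the one with weight 0.3 is chosen over the one with weight 0.1 before its lineage is traced back.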
Promoted Parameters
Title in user interface (promoted name)
Inputs
Collection (collection): Protein sampling data generated by previous weighted ensemble MD simulations. Only one WEMD collection is created for a series of WEMD simulations of the same system and settings.
Required
Type: collection_source
Default: wemd_simulation_collection
Superpose Control (superpose_ctrl): Toggle on to superpose the trajectory onto the input reference structure.
Required
Type: boolean
Default: True
Choices: [True, False]
Reference Protein Dataset after SPRUCE Preparation (ref_dataset): Input dataset containing the reference protein onto which the output structures are superposed.
Required
Type: data_source
Superposition method (superpose_method): Superposition method used to fit the simulation structures to the reference structure.
Required
Type: string
Default: GlobalSequence
Choices: ['GlobalSequence', 'SiteSequence', 'DDMatrix', 'SSE', 'SiteHopper']
Reference Cryo-EM Map (ref_map): Input file for the reference cryo-EM map (in .mrc file format).
Required
Type: file_in
Cryo-EM Map Resolution (resolution): Resolution of output cryo-EM maps (angstroms).
Required
Type: decimal
Default: 2.0
Selected Components to Output (mask_components): Components to include in the output structural candidates.
Required
Type: string
Default: ['protein', 'nucleic', 'ligand']
Choices: ['protein', 'nucleic', 'ligand', 'solvent', 'metals', 'counter_ions', 'lipids', 'packing_residues', 'sugars', 'undefined', 'cofactors', 'excipients', 'polymers', 'post_translational', 'other_proteins', 'other_nucleics', 'other_ligands', 'other_cofactors']
Resize Cryo-EM Map(s) (resize_map): Toggle on to resize cryo-EM map(s) using reference structure(s) with 10 angstroms of padding.
Type: boolean
Default: True
Choices: [True, False]
Selection Ranges for Final States
Minimum Value of First Dimension (min_value1): Minimum value of the first dimension used to select structures in the final state.
Required
Type: decimal
Maximum Value for First Dimension (max_value1): Maximum value of the first dimension used to select structures in the final state.
Required
Type: decimal
Minimum Value of Second Dimension (optional) (min_value2): Minimum value of the second dimension used to select structures in the final state. Not required for a 1D progress coordinate.
Type: decimal
Maximum Value of Second Dimension (optional) (max_value2): Maximum value of the second dimension used to select structures in the final state. Not required for a 1D progress coordinate.
Type: decimal
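As an illustration of how the four range parameters combine, the sketch below assembles them into a list of per-dimension (min, max) pairs, treating the second dimension as present only when both of its values are given. The build_ranges helper and its validation rules are assumptions made for this example and do not describe how the floe itself checks its inputs.

```python
def build_ranges(min_value1, max_value1, min_value2=None, max_value2=None):
    """Assemble per-dimension (min, max) selection ranges (hypothetical helper)."""
    ranges = [(min_value1, max_value1)]
    # The second dimension is optional, but its two bounds must come together.
    if (min_value2 is None) != (max_value2 is None):
        raise ValueError("give both min_value2 and max_value2, or neither")
    if min_value2 is not None:
        ranges.append((min_value2, max_value2))
    # Each range must be ordered so that min <= max.
    for lo, hi in ranges:
        if lo > hi:
            raise ValueError(f"minimum {lo} exceeds maximum {hi}")
    return ranges


ranges_1d = build_ranges(2.0, 4.0)             # 1D progress coordinate
ranges_2d = build_ranges(2.0, 4.0, 0.0, 1.5)   # 2D progress coordinate
print(ranges_1d, ranges_2d)
```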
Selection Ranges for Trajectories
First Iteration Number (start_iteration): First iteration number used to estimate the 2D free energy map of the progress coordinates.
Type: integer
Default: 1
Last Iteration Number (end_iteration): Last iteration number used to estimate the 2D free energy map of the progress coordinates.
Type: integer
Default: 100
Stride Number (N) (stride): Save only every Nth frame in the path trajectory file (the default of 1 keeps every frame). Not applicable if MD segments were extracted with the endpoint-only option.
Type: integer
Default: 1
Parameters for Free Energy Maps/Surfaces (optional)
Number of Bins for Histogram (number_bins): Number of bins used to plot the density distribution.
Type: integer
Default: 100
Maximum Value of Estimated Free Energy (max_free_energy): Maximum free energy value shown in color on the 2D free energy map, in units of kT.
Type: decimal
Default: 50.0
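To show how the two parameters above could enter a free energy estimate, here is a minimal sketch that bins weighted 2D progress-coordinate samples and converts the resulting probabilities to free energies in units of kT using F = -ln(P / P_max). The free_energy_map function, its bounds argument, and the choice to cap empty or high-energy bins at max_free_energy are assumptions for illustration. The floe's own estimator is not documented here and may differ.

```python
import math


def free_energy_map(points, weights, bounds, number_bins=100, max_free_energy=50.0):
    """Weighted 2D histogram converted to free energy in kT (hypothetical sketch)."""
    (x_lo, x_hi), (y_lo, y_hi) = bounds
    hist = [[0.0] * number_bins for _ in range(number_bins)]
    for (x, y), w in zip(points, weights):
        if not (x_lo <= x <= x_hi and y_lo <= y <= y_hi):
            continue  # ignore samples outside the plotted window
        # Map each coordinate to a bin index; the upper edge goes into the last bin.
        i = min(int((x - x_lo) / (x_hi - x_lo) * number_bins), number_bins - 1)
        j = min(int((y - y_lo) / (y_hi - y_lo) * number_bins), number_bins - 1)
        hist[i][j] += w
    p_max = max(max(row) for row in hist)
    if p_max <= 0.0:
        raise ValueError("no weight falls inside the histogram bounds")
    # F = -ln(P / P_max) = ln(P_max / P), so the most populated bin has F = 0.
    # Empty bins and bins above the cap are set to max_free_energy.
    return [
        [min(math.log(p_max / p), max_free_energy) if p > 0.0 else max_free_energy
         for p in row]
        for row in hist
    ]


# Toy example with made-up samples and a coarse 2 x 2 grid.
points = [(0.10, 0.10), (0.15, 0.12), (0.90, 0.90)]
weights = [0.5, 0.25, 0.25]
fe = free_energy_map(points, weights, bounds=((0.0, 1.0), (0.0, 1.0)), number_bins=2)
print(fe)
```

In practice a library routine such as numpy.histogram2d, which accepts a weights argument, would replace the explicit binning loop.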
Outputs
None (analysis_report_title): Title for the path report.
Type: string
Default: Path on Free Energy Map Report
Output Structures for Most Probable Path (data_out): Output dataset that saves simulation structures along the most probable path.
Required
Type: dataset_out
Default: path_structures_dataset