Basic Tutorial 1: Automated WEMD Simulation and Best Structure Search Guided by Target Cryo-EM Map
In this tutorial, you will run the Automated WEMD Simulation and Best Structure Search Guided by Target CryoEM Map Floe, which includes three major components:
Set up and perform a WEMD simulation using a target cryo-EM map. Use the starting structure from the open state (4AKE) and a synthetic cryo-EM map from the closed state (1AKE) as the consensus map. The system gradually explores conformations consistent with the closed state as the simulation approaches an equilibrium distribution.
For real applications, the starting structure might be obtained using ab initio structural modeling tools, such as AlphaFold, or derived from other experimental data.
For the progress coordinate, use the real-space correlation coefficient (RSCC) between the simulated maps created from conformational snapshots and the experimental map of the closed state.
During the simulation, a protein complex might drift or rotate in solution, so the simulated conformations must first be aligned with a structure fitted into or refined against the experimental map before the correlation coefficients are calculated. Generally, if the initial structure was derived from the related experimental map, it should already be aligned to the map.
Generate and review the WEMD simulation report. For example, you can check the evolution plot of RSCC as a function of iteration number, as well as the simulation convergence plot showing Kullback-Leibler (KL) divergence of the probability density distribution.
Find the top candidates from the saved simulation trajectories for a series of input cryo-EM maps and generate a search report of the best structures. You will use two synthetic maps from the open and closed states as input and find the top N structures from the simulation trajectories.
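The alignment and RSCC steps described above can be sketched in a few lines of NumPy. This is only an illustration of the idea, not the floe's implementation: the function names, the Gaussian-blob map model, the grid settings, and the sigma value are all assumptions made for this example.

```python
import numpy as np


def kabsch_align(mobile, reference):
    """Rigidly superimpose `mobile` (N x 3) onto `reference` (N x 3)."""
    mob_center = mobile.mean(axis=0)
    ref_center = reference.mean(axis=0)
    p = mobile - mob_center
    q = reference - ref_center
    # Optimal rotation from the SVD of the covariance matrix (Kabsch algorithm).
    u, _, vt = np.linalg.svd(p.T @ q)
    d = np.sign(np.linalg.det(u @ vt))  # guard against improper rotations
    rot = u @ np.diag([1.0, 1.0, d]) @ vt
    return p @ rot + ref_center


def simulate_map(coords, grid_shape, voxel_size, origin, sigma):
    """Hypothetical map model: one isotropic Gaussian per atom on a grid."""
    axes = [origin[i] + voxel_size * np.arange(grid_shape[i]) for i in range(3)]
    gx, gy, gz = np.meshgrid(*axes, indexing="ij")
    density = np.zeros(grid_shape)
    for x, y, z in coords:
        r2 = (gx - x) ** 2 + (gy - y) ** 2 + (gz - z) ** 2
        density += np.exp(-r2 / (2.0 * sigma**2))
    return density


def rscc(map_a, map_b, mask=None):
    """Pearson correlation of two maps over all voxels or a boolean mask."""
    a = map_a[mask] if mask is not None else map_a.ravel()
    b = map_b[mask] if mask is not None else map_b.ravel()
    a = a - a.mean()
    b = b - b.mean()
    return float((a * b).sum() / np.sqrt((a * a).sum() * (b * b).sum()))


# Toy data: a random "structure" and a drifted (rotated + translated) copy.
rng = np.random.default_rng(0)
reference = rng.uniform(5.0, 15.0, size=(30, 3))
angle = np.deg2rad(40.0)
rot_z = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                  [np.sin(angle), np.cos(angle), 0.0],
                  [0.0, 0.0, 1.0]])
drifted = reference @ rot_z.T + np.array([3.0, -2.0, 1.0])

grid = dict(grid_shape=(20, 20, 20), voxel_size=1.0,
            origin=(0.0, 0.0, 0.0), sigma=1.5)
target_map = simulate_map(reference, **grid)
rscc_raw = rscc(simulate_map(drifted, **grid), target_map)
aligned = kabsch_align(drifted, reference)
rscc_aligned = rscc(simulate_map(aligned, **grid), target_map)
print(f"RSCC without alignment: {rscc_raw:.3f}")
print(f"RSCC after alignment:   {rscc_aligned:.3f}")
```

In this toy case the drifted copy has the same conformation as the reference, so the RSCC returns to essentially 1 once the rigid-body motion is removed. Without alignment, the coefficient would reflect the drift rather than the conformational difference.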
Tip
After submitting a simulation job, it is good practice to check approximately 30 minutes later to ensure the floe is still running and has reached the Propagate WE Segments (MD) Cube.
The simulation floes are set to minimize the usage of resources. If the simulation fails for large systems, the memory and disk storage space may need to be increased for related cubes.
For real applications, the biological systems might be very complicated, and the selection of progress coordinates and other simulation parameters might not be obvious. We suggest running the simulation for 10 to 20 iterations to experiment with input settings and determine if the system is exploring the progress coordinates to your liking. Once a desired setup is identified, restart the full-length automated floes (Automated WEMD Simulation and Best Structure Search Guided by Target CryoEM Map Floe or Automated WEMD Simulation and Best Structure Search Guided By Eigen CryoEM Maps Floe) or use the continuation floe (Continue WEMD Simulation Guided by CryoEM Maps Floe) to perform the rest of the simulation.
It is good practice to set a cost limit to prevent a simulation floe from running for an unexpectedly long time and incurring a large cost.
Download Initial Files and Prepare the Protein to Run WEMD Simulations
Before starting the simulation, download all_init_files.tgz and upload the files under the adk folder to a folder you have created in Orion, such as Team Data/ADK/init_files. For directions on how to upload, see the Data page documentation in the User Guide. Next, prepare the protein using the SPRUCE - Protein Preparation and Solvate and Equilibrate Target Protein Floes as described in the Soup to Nuts tutorial. Use 4ake_open_solute_fit2map.pdb as the initial structure to obtain a dataset for the simulation input. The requisite parameters for these floes are shown in Figures 1 and 2. Remember that the SPRUCE - Protein Preparation Floe will run first and then provide input to the Solvate and Equilibrate Target Protein Floe.

Figure 1. User inputs for the SPRUCE - Protein Preparation Floe.

Figure 2. User input for the Solvate and Equilibrate Target Protein Floe.
Create Output Folder and Locate the Floe in Orion
To create an output folder, navigate to the Data page from the navigation bar, click on the Team Data folder, and select ADK. Then click the “Add Folder” icon to create a WEMDCC1D folder for the output datasets and collections.
Navigate to the Floe page to find the floe and select the Automated WEMD Simulation and Best Structure Search Guided by Target CryoEM Map Floe. Click the “Launch Floe” button to bring up the Job Form, as shown in Figure 3.

Figure 3. The Job Form for the Automated WEMD Simulation and Best Structure Search Guided by Target CryoEM Map Floe.
Provide Input and Parameters to Run the Floe
The default values for most input and output parameters can be used for an initial run. The most notable parameters are listed below:
Output path: On the Job Form, specify where your output data should be directed. For this tutorial, use Team Data/ADK/WEMDCC1D.
Input Data for Simulation: Choose the Solvated and Equilibrated Protein output dataset from the Solvate and Equilibrate Target Protein Floe.
Output Data for Simulation: Customize the output dataset and collection names here for future identification.
Progress Coordinate Inputs:
Use the uploaded 1ake_close_4A.mrc file for the consensus map and specify a resolution of 4 Å.
Reference Protein Dataset: To align simulation structures to the consensus map, use the SPRUCE - Protein Preparation Floe output dataset created from 4AKE. Specify which components are used to compute the simulated map for the RSCC calculation that serves as the progress coordinate. There are several methods to set the mask and calculate the RSCC; for this tutorial, use the default. More details on customizing these parameters are in the FAQs for Cryo-EM Maps.
When Resize Cryo-EM Map(s) is toggled on, both simulation maps and target maps will be resized based on the reference structure(s) with 10 Å of padding in order to save time in calculating the RSCC.
Weighted Ensemble Parameters: For a quick run, set 20 iterations and 10 ps for each MD segment. The FAQs include details for customization of these parameters.
Input Cryo-EM Maps and Options for Best Structures Search:
Cryo-EM Map Files: For this tutorial, use the uploaded 1ake_close_4A.mrc and 4ake_open_4A.mrc files.
Reference Protein Dataset: Use the SPRUCE - Protein Preparation Floe output dataset.
Selections for Output Best Structures: Save the top five best candidates for each input cryo-EM map.
Enter the requisite parameters as shown in Figures 4 and 5 and click “Start Job” to begin the floe.

Figure 4. User inputs for the Progress Coordinate Inputs and Weighted Ensemble Parameters.

Figure 5. User inputs for the Input Cryo-EM Maps and Options for Best Structures Search parameters.
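To illustrate why the Resize Cryo-EM Map(s) option saves time, the following sketch crops a map to the bounding box of a reference structure plus 10 Å of padding. It is a minimal example under stated assumptions, not the floe's own code: the function name `crop_map_to_structure` is hypothetical, and the array axes are assumed to be ordered x, y, z with a cubic voxel, which may not match the axis order stored in a given MRC file.

```python
import numpy as np


def crop_map_to_structure(density, origin, voxel_size, coords, padding=10.0):
    """Crop a 3D map to the bounding box of `coords` plus `padding` (in Å).

    Hypothetical helper: assumes array axes are ordered x, y, z and that
    `origin` is the Cartesian position of voxel (0, 0, 0).
    """
    origin = np.asarray(origin, dtype=float)
    shape = np.array(density.shape)
    lo = coords.min(axis=0) - padding
    hi = coords.max(axis=0) + padding
    # Convert the padded box to voxel indices, clipped to the map bounds.
    start = np.clip(np.floor((lo - origin) / voxel_size).astype(int), 0, shape)
    stop = np.clip(np.ceil((hi - origin) / voxel_size).astype(int) + 1, 0, shape)
    cropped = density[start[0]:stop[0], start[1]:stop[1], start[2]:stop[2]]
    new_origin = origin + start * voxel_size
    return cropped, new_origin


# Toy example: a 100^3 map at 1 Å per voxel and a structure spanning 40-60 Å.
density = np.zeros((100, 100, 100))
coords = np.array([[40.0, 40.0, 40.0], [60.0, 60.0, 60.0]])
cropped, new_origin = crop_map_to_structure(density, (0.0, 0.0, 0.0), 1.0, coords)
print("cropped shape:", cropped.shape)
print("new origin:", new_origin)
print("voxel fraction kept:", cropped.size / density.size)
```

In this toy case the cropped map has 41 voxels per axis, so each RSCC evaluation touches roughly 7% of the original voxels.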
Check Outputs and Floe Reports
When the floe is finished, navigate to the Jobs tab of the Floe page. Click on the finished job to bring up the Job Details panel. Under Reports, there will be two Floe Reports: the WEMD Simulation Report and the Cryo-EM Map Match Report. Click the “Open Report” icon to open each Floe Report in a new tab. Under Results, you will find two datasets, best_structures_dataset and wemd_simulation_dataset, and one collection, wemd_simulation_collection. You can view the datasets and the collection in the specified output folder on the Data page by clicking the “View in Project Data” button. Activate the best_structures_dataset to visualize the structures on the 3D & Analyze page.

Figure 6. The Start WEMD Simulation and Structure Search Report. Upper left: main page. Upper right: progress coordinate evolution, based on the real-space correlation coefficient (RSCC) to the reference map, as the weighted ensemble (WE) simulation progresses. Free energy, conditional on the starting state, is plotted by iteration (y-axis), with dark blue to light yellow corresponding to free energy minima and maxima. Lower left: final probability distribution of the RSCC with the reference map. Lower right: final free energy conditional on the starting structure.

Figure 7. The CryoEM Map Match Report. Upper: Probability distribution and free energy conditional on the starting state, for each best structure search target map. Lower: Free energy landscape conditional on the starting state for the two target maps, plotted simultaneously on each axis.
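The quantities plotted in these reports can be related to one another with a short sketch. It assumes that the probability distribution is a weight-averaged histogram of RSCC values, that free energy is computed as F = -kT ln P, and that convergence is judged by the KL divergence between distributions from different iterations. The reports' exact definitions, binning, and temperature are not given here, so the function names, the 300 K default, and the synthetic data below are illustrative assumptions.

```python
import numpy as np

KB_KCAL = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def weighted_histogram(values, weights, bins):
    """Normalized probability per bin from walker values and WE weights."""
    hist, _ = np.histogram(values, bins=bins, weights=weights)
    return hist / hist.sum()


def free_energy(prob, temperature=300.0):
    """F = -kT ln(P), shifted so the minimum is zero; empty bins give inf."""
    with np.errstate(divide="ignore"):
        f = -KB_KCAL * temperature * np.log(prob)
    return f - f[np.isfinite(f)].min()


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) with a small pseudocount so empty bins stay finite."""
    p = (p + eps) / (p + eps).sum()
    q = (q + eps) / (q + eps).sum()
    return float(np.sum(p * np.log(p / q)))


# Synthetic RSCC samples standing in for an early and a late iteration.
rng = np.random.default_rng(1)
bins = np.linspace(0.0, 1.0, 21)
rscc_early = rng.normal(0.55, 0.05, 500).clip(0.0, 1.0)
rscc_late = rng.normal(0.75, 0.05, 500).clip(0.0, 1.0)
weights = np.full(500, 1.0 / 500)  # equal weights for simplicity

p_early = weighted_histogram(rscc_early, weights, bins)
p_late = weighted_histogram(rscc_late, weights, bins)
print("KL(late || early):", kl_divergence(p_late, p_early))
print("Free energy (kcal/mol):", free_energy(p_late).round(2))
```

A KL divergence that shrinks toward zero between successive iteration windows indicates that the distribution has stopped changing, which is the sense in which the convergence plot is read.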
Predict Cryptic Pockets
The output dataset and collection are consistent with those obtained from the simulation floes using normal modes as progress coordinates in the OpenEye Cryptic Pocket Detection Floes package. Thus, all pocket detection floes from that package can be used to predict cryptic pockets for the simulation collections generated in this package. The Cryptic Pocket Detection advanced tutorials offer more insight into the floes from that package.