A Small Molecule Membrane Permeability Calculation

The OpenEye Permeability Floes are designed to be a user-friendly way to understand the mechanism of passive membrane permeability. The starting input is a small molecule (2D or 3D) on a record in a dataset, which is turned into a set of input systems (basis states) containing the molecule, a pre-equilibrated POPC membrane, and a layer of water to solvate the entire system. These basis states are used in weighted ensemble (WE) molecular dynamics (MD) simulations to predict the mechanism of passive permeation and to provide an estimate of the permeability coefficient.

A permeability estimate using this floe can be useful for the following reasons:

  • Permeability liabilities can hinder progress of a promising lead candidate, and this tool will provide insight into the mechanistic process that prevents or permits acceptable permeation.

  • Detailed knowledge and analysis of the permeation pathways provide insight into the bottleneck regions for a common lead series, which could help guide a medicinal chemist toward rational drug design for permeability.

The WE-MD method applies MD simulations and the WE algorithm in an iterative fashion. In this tutorial, a small molecule (tacrine) will be prepared and run for a short simulation (10 iterations of 100 ps each in terms of molecular time). This short simulation will be analyzed using the analysis floe, and the resulting data will be compared to a larger, pre-generated output dataset of the same molecule.

This tutorial uses the following Floes:

  • Run Permeability Simulation from the OpenEye Permeability Floes package.

  • Analyze Permeability Simulation from the OpenEye Permeability Floes package.

  • Calculate Auxiliary Coordinates from the OpenEye Permeability Floes package.

Create a Tutorial Project

Note

If you have already created a Tutorial project you can re-use the existing one.

Log into Orion and click the home button at the top of the blue ribbon on the left of the Orion Interface. Then click on the ‘Create New Project’ button, enter Tutorial as the name of the project in the pop-up window, and click ‘Save’.

create_project_ui

Orion home page

Run the Permeability Simulation Floe

Note

If you have already created a dataset for use in the Permeability Simulation Floe you can re-use it.

In the Permeability Simulation Floe, any 2D or 3D molecule can be used. However, for sampling concerns, users are encouraged to use molecules that fall within Lipinski’s Rule of Five. In this tutorial, a rigid small molecule acetylcholinesterase inhibitor, tacrine, will be used. The user will be instructed on how to prepare a molecule using the 2D sketcher, and will run the simulation for a short period (10 WE iterations).
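As a quick pre-check before building an input dataset, the Rule of Five criteria can be tested against precomputed descriptors. The sketch below is not part of the Permeability Floes; the `Descriptors` container and the function name are hypothetical, and the tacrine values are approximate, illustrative numbers rather than toolkit output.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Precomputed descriptors for one molecule (hypothetical container)."""
    mol_weight: float       # g/mol
    logp: float             # log10 of the octanol-water partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int


def rule_of_five_violations(d: Descriptors) -> list:
    """Return the names of the Rule of Five criteria that the molecule violates."""
    violations = []
    if d.mol_weight > 500:
        violations.append("molecular weight > 500")
    if d.logp > 5:
        violations.append("logP > 5")
    if d.h_bond_donors > 5:
        violations.append("H-bond donors > 5")
    if d.h_bond_acceptors > 10:
        violations.append("H-bond acceptors > 10")
    return violations


# Approximate, illustrative values for tacrine (C13H14N2). Compute real
# descriptors with a cheminformatics toolkit before relying on them.
tacrine = Descriptors(mol_weight=198.3, logp=2.7, h_bond_donors=2, h_bond_acceptors=2)
print(rule_of_five_violations(tacrine))
```

An empty list means no criterion is violated; a molecule with several violations may be harder to sample adequately in the simulation.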

The steps to prepare the dataset and run the permeability simulation are detailed below.

Select the Permeability Simulation Floe

  1. Click on the ‘Floes’ button in the left menu bar.

  2. Click on the ‘Floes’ tab.

  3. Set the ‘Browse Workfloes’ drop down menu to ‘Show all packages’.

  4. Select ‘All’ under Browse Workfloes.

  5. In the search bar enter Run Permeability Simulation.

A list of two Permeability Floes will now be visible to the right (see below).

find_run_perm_floe

Click on the Permeability - Run Permeability Simulation floe, and a Job Form will pop up. Specify the following parameter settings in the Job Form.

Create Initial Tacrine Structure

  1. An input dataset must either be selected or generated on-the-fly. Here, we will generate tacrine on-the-fly using the 2D sketcher.

  2. Click on the Input Dataset button (Choose Input…), and click on the 2D Sketcher tab.

  3. Copy and paste the SMILES for tacrine into the 2D Sketcher: Nc1c2c(nc3ccccc13)CCCC2

create_tacrine_input

In the ‘Name this Molecule’ field type ‘tacrine’, and then click the ‘Use dataset as input’ button. Note that all the Permeability Floes can only run one molecule at a time (for now), so if your dataset contains multiple records, only the last one will be taken as the input by the floe.

Modify Floe Parameters

  1. Back on the job form, add the name ‘tacrine_output’ to the ‘Output Dataset’ field.

  2. Under the ‘Weighted Ensemble Parameters’ section, change the number of iterations from 500 to 10.

Warning

If you leave the value at the default setting (500 iterations), this run may cost several thousand dollars.

Note

For a real permeability job run, running 500 iterations is recommended for the simulation to converge.

set_number_iters

Execute the Floe

Click the ‘Start Job’ button to launch the floe. Wait for the Floe status to be complete before moving on to the next step in the tutorial (including system preparation, a set of 10 iterations will take approximately 1 hour).

Resume/Restart a Previous Job (Optional)

The same Permeability - Run Permeability Simulation floe can be used to resume a simulation (either finished or unfinished). If a dataset generated by a previous job is provided as the input to the floe, the floe will automatically continue the simulation from the last iteration. Typically the output dataset would contain multiple records (one per iteration and containing data from all previous iterations), but only the last record (usually corresponding to the last iteration) will be used to resume the simulation. This behavior of only taking the last record from the input dataset is consistent with running a new job.

When resuming a job, most of the parameters are immutable and keep the values used in the initial simulation, except for the following:

  • Iterations. The simulation will be resumed to run up to the specified total number of iterations. If the specified number is smaller than the current number of iterations, then the floe will still run one iteration.

  • Reweighting, Reweighting Period, and Reweighting Window Size (WESS). Since the WESS reweighting procedure can be applied at any point of the simulation, the floe allows the user to turn on/off the reweighting feature when resuming a job.

  • Restart Simulation. The switch for restarting a simulation. If this is turned on, all the previous iterations will be discarded, and the simulation will run from iteration 1. Note that the simulation will still use the same prepared system, i.e., the system will not be prepared again.
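The interaction between ‘Iterations’ and ‘Restart Simulation’ described above can be sketched as a small function. This is only an illustration of the stated rules, not the floe's code; the function name is hypothetical, and the behavior when the requested total equals the number of completed iterations is an assumption.

```python
def planned_iterations(completed: int, requested_total: int, restart: bool = False) -> range:
    """Return the iteration numbers a resumed or restarted job would run."""
    if requested_total < 1:
        raise ValueError("requested_total must be at least 1")
    if restart:
        # Previous iterations are discarded; the run starts again at iteration 1.
        return range(1, requested_total + 1)
    # A resumed job still runs one iteration when the requested total is
    # smaller than what is already completed. Treating the equal case the
    # same way is an assumption, not documented behavior.
    last = max(requested_total, completed + 1)
    return range(completed + 1, last + 1)


print(list(planned_iterations(10, 20)))                     # resume 10 -> 20
print(list(planned_iterations(10, 5)))                      # requested total too small
print(len(planned_iterations(10, 20, restart=True)))        # restart from iteration 1
```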

Warning

If a simulation needs to be resumed with a different reweighting setup, please set the reweighting parameters to desired values. Otherwise, the simulation will be resumed to run using the default values.

For either resuming or restarting, a new output dataset will be created for accumulating the new records generated by the new job, but the trajectory data will still be added to the original collection used by the previous job.

In this example, simply find the output dataset from the job above and follow the steps below:

  1. Click on the Input Dataset button (Choose Input…), and click on the output dataset from the previous job.

  2. Change the value for ‘Iterations’ in ‘Weighted Ensemble Parameters’ to 20.

  3. Click the ‘Start Job’ button to launch the floe.

Then the simulation will be extended to a total of 20 iterations.

Perform a calculation using CPUs instead of GPUs (Optional)

In the case of a severe GPU shortage on an Orion stack, one may want to run the permeability calculation using CPUs only. While this is possible, it is not advised because the cost per compound increases steeply. With that caveat in mind, the steps below demonstrate how to run the Permeability Floe using CPUs.

At the bottom of the floe UI, slide the bar across so that “Show cube parameters” is in the “Yes” position.

cube_parameters

Under the Cube Parameters section, scroll down to the “Run WESTPA Segments (MD)” cube and click on the name to expand the cube parameters. Click on the “System” tab, and change the number of CPUs to 48, and change the number of GPUs to 0.

number_cpus_gpus

With these changes, the Permeability Floe will now run with CPUs for the weighted ensemble simulation.

Run the Permeability Analysis Floe

Note

A sample 500 iteration dataset of tacrine has been provided by OpenEye for use in this tutorial. This sample dataset can be found at tacrine_mab_wess_iter500.oedb.

In the Permeability Analysis Floe tutorial, a pre-generated dataset of tacrine will be analyzed and a floe report will be generated. The set of steps needed to analyze any general output from a previously run Permeability Simulation will be presented here.

Select the Permeability Analysis Floe

  1. Click on the ‘Floes’ button in the left menu bar.

  2. Click on the ‘Floes’ tab.

  3. Set the ‘Browse Workfloes’ drop down menu to ‘Show all packages’.

  4. Select ‘All’ under Browse Workfloes.

  5. In the search bar enter Analyze.

A single floe will now be visible to the right (see below).

find_perm_analysis_floe

Set the Input Analysis Parameters

While the example dataset provided in this tutorial consists of a single record, a typical permeability simulation job yields a dataset with multiple records. However, only the last record (i.e., the last iteration) is needed for performing the analysis and it is more efficient to run the floe with a single record as the input. While there are many ways to get the last record in a dataset, the following steps provide a way to navigate to the last record quickly and select it as the input for the floe:

  1. Click on the Input Dataset button (Choose Input…), and click on the ‘Existing Data’ tab.

  2. Navigate to the example tacrine dataset and click on ‘Open in Tile View’ next to the permeability dataset that you would like to analyze.

  3. Sort the records in descending order by ‘Iteration Number’ (highlighted by the red box in the figure below).

  4. Select the first record after the sorting.

  5. Click on ‘Add Dataset & Select Records’.

find_last_record

Finally, back on the job form, fill in entries for the ‘Output Dataset of Failed Molecules’ and ‘Output Dataset’, which provide output storage for the analyzed simulation.

Execute the Floe

Since no remaining parameters need to be set for analysis, simply click the ‘Start Job’ button to launch the Floe. Once the floe is complete, a floe report will be available.

Inspect the Results

Once the Permeability Analysis Floe completes, it is best to visually inspect the results to assess the quality of the calculation. Here, we will discuss two important plots that are automatically generated to help the user inspect the quality of the results.

In the Permeability Analysis Floe report, a plot of the time evolution of the permeability estimate will be presented. An example from tacrine is presented below:

perm_evolution

This plot shows the permeability estimate as a function of the WE iteration number. Initially, the permeability estimate is 0 because there are no crossing events. Without crossing events, no weight can reach the target state in the acceptor water compartment on the opposite side of the membrane, so the rate constant is 0 and therefore the permeability is also 0.

Each time a crossing event occurs, more instantaneous weight is averaged into the rate constant and the permeability estimate “jumps” a small amount. A slow decrease in the permeability estimate could indicate that no new crossing events have been simulated, and the system lacks strong convergence.
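The shape of such a curve can be reproduced with a toy running average. The sketch below assumes the estimate is a simple running mean of the per-iteration flux into the target state; the estimator actually used by the analysis floe is not described here, and the flux values are made up for illustration.

```python
def running_mean_flux(flux_per_iteration):
    """Running mean of the probability flux arriving in the target state."""
    estimates = []
    total = 0.0
    for n, flux in enumerate(flux_per_iteration, start=1):
        total += flux
        estimates.append(total / n)
    return estimates


# Toy data (made up): weight reaches the target state only in iterations
# 4 and 7; every other iteration contributes zero flux.
toy_flux = [0.0, 0.0, 0.0, 1e-6, 0.0, 0.0, 2e-6, 0.0, 0.0, 0.0]
for iteration, estimate in enumerate(running_mean_flux(toy_flux), start=1):
    print(f"iteration {iteration:2d}: mean flux = {estimate:.2e}")
```

In this toy series the estimate stays at zero until the first crossing, jumps at each crossing, and decays in between because the same accumulated flux is divided by a growing number of iterations.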

Note

To go from iteration number to molecular time, simply multiply the amount of time simulated per segment by the number of iterations.

For instance, if each MD segment requires 100 ps, and there are 500 iterations total, then the total “molecular time” simulated would be 50 ns.
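The same conversion as a one-line helper (the function name is hypothetical):

```python
def molecular_time_ns(iterations: int, segment_ps: float) -> float:
    """Molecular time in nanoseconds for a given number of WE iterations."""
    return iterations * segment_ps / 1000.0  # 1 ns = 1000 ps


print(molecular_time_ns(500, 100))  # full-length run: 50.0
print(molecular_time_ns(10, 100))   # short tutorial run: 1.0
```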

To gain a better sense of what exactly is happening inside the permeability simulations, one should also visually inspect the plot of the successful crossing events. An example from the same tacrine dataset is given below:

perm_crossing

In this plot, the z-position of the successful crossing events is shown as a function of time. Here, roughly at iteration number 20, a single walker enters the membrane. At about iteration 50, the single walker splits into 3 separate walkers, each of which attempts to exit the membrane through the opposing membrane leaflet. After about 100 iterations, all walkers are on the other side of the membrane attempting to escape to the acceptor aqueous phase.
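The splitting step can be pictured with a minimal sketch. In weighted ensemble sampling, a split replaces one walker with several copies that share the parent's weight, so the total weight is conserved. The `Walker` class and `split` function below are hypothetical names for illustration and are not the floe's or WESTPA's API.

```python
from dataclasses import dataclass


@dataclass
class Walker:
    z: float       # position along the bilayer normal
    weight: float  # statistical weight carried by this trajectory


def split(walker: Walker, n_children: int) -> list:
    """Replace one walker with n_children copies that share its weight equally."""
    if n_children < 1:
        raise ValueError("n_children must be at least 1")
    child_weight = walker.weight / n_children
    return [Walker(walker.z, child_weight) for _ in range(n_children)]


parent = Walker(z=0.5, weight=3e-3)
children = split(parent, 3)
print(len(children), sum(c.weight for c in children))
```

Because all children start from the same parent configuration, they share a common history, which is why crossings by such walkers are not fully independent events.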

Note

A small number of non-independent crossing events may lead to a permeability prediction that lacks statistical convergence. In such a case, multiple permeability simulations may be required.

What Does Bad Permeability Data Look like?

With any algorithm, it is important to know when you can, and more importantly when you cannot, trust your data. Here, we will show an example of a bad permeability result, and potential steps to correct the issue. For this section, it is assumed that you have already run the Permeability Analysis Floe.

Below is an example of “bad data” that was generated with the Permeability Floe, and analyzed with the Permeability Analysis Floe. First, take note of the permeability evolution plot:

bad_perm_evolution

Here, one should observe the very low permeability value (~10^-30 cm/s) at the end of 500 WE iterations compared to the much more reasonable MDCK-LE experiment (10^-4.6 cm/s).

In order to understand the source of the erroneous prediction, it can be useful to look at the plot of the “successful” crossings. Below we provide the corresponding plot of “successful” crossing events for the unsuccessful simulation.

bad_perm_crossing

In this plot, one may notice that only a single crossing event takes place during 50 ns of molecular time. This crossing event occurs at roughly WE iteration 50, even before the first splitting can occur inside the membrane. This is a clear indicator that only one independent trajectory crossed the membrane, which is far too few for an accurate estimate of permeation kinetics.

The following steps may help mitigate such an issue:

  1. Continue the simulation for more WE iterations. It might be possible that an additional 10% of molecular time could provide enough sampling to produce another membrane crossing event.

  2. Restart the permeability simulation from scratch. It may be that your particular system became kinetically or thermodynamically trapped in a state that it cannot escape, so a fresh restart could help this situation.

  3. Is your molecule charged? If your molecule can have multiple charge states, check that a charged molecule was not inadvertently simulated. A charged molecule will likely have a low permeability value.

  4. If you have a large, highly flexible molecule, it is possible that a separate progress coordinate may be needed to improve conformational sampling. Please email support@eyesopen.com to discuss methods to facilitate this type of sampling within your permeability simulation.

Run the Auxiliary Coordinates Calculation Floe

The dataset generated by either the simulation or the analysis floe can be used to calculate extra auxiliary coordinates of the system in addition to the main progress coordinate, the position of the molecule relative to the center of the membrane along the lipid bilayer normal (z), which is calculated during the simulation. Currently, the floe supports the calculation of four auxiliary coordinates:

  • The cosine of the angle of the molecule relative to the lipid bilayer normal, which is defined through the dot (scalar) product of the unit electric dipole moment of the molecule and the bilayer normal.

  • The number of hydrophobic contacts, which is defined to be the number of aliphatic atoms of the lipid tails within 10 Å of any hydrophobic atom of the drug molecule.

  • The number of hydrogen bonds between the drug and the membrane, which are identified using the Baker-Hubbard definition.

  • The end-to-end distance of each molecule.
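Two of these coordinates are simple geometric quantities and can be sketched directly. The snippet below is an illustration under stated assumptions, not the floe's implementation: the bilayer normal is taken to lie along +z, the function names are hypothetical, and the choice of which two atoms define the end-to-end distance is left to the caller because it is not specified here.

```python
import math


def dipole_cosine(dipole, normal=(0.0, 0.0, 1.0)):
    """Cosine of the angle between a dipole vector and the bilayer normal."""
    dot = sum(d * n for d, n in zip(dipole, normal))
    norm_d = math.sqrt(sum(d * d for d in dipole))
    norm_n = math.sqrt(sum(n * n for n in normal))
    if norm_d == 0.0 or norm_n == 0.0:
        raise ValueError("dipole and normal must be non-zero vectors")
    # Dividing by both norms is equivalent to a dot product of unit vectors.
    return dot / (norm_d * norm_n)


def end_to_end_distance(first_atom, last_atom):
    """Euclidean distance between two atom positions (same units as the input)."""
    return math.dist(first_atom, last_atom)


print(dipole_cosine((0.0, 0.0, 2.0)))    # dipole parallel to the normal
print(dipole_cosine((1.0, 0.0, 0.0)))    # dipole lying in the membrane plane
print(end_to_end_distance((0.0, 0.0, 0.0), (3.0, 4.0, 0.0)))
```

A cosine near +1 or -1 means the dipole is aligned with or against the bilayer normal, and a value near 0 means the dipole lies roughly in the plane of the membrane.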

Execute the Floe

Locate the floe with the name Permeability - Calculate Auxiliary Coordinates in the ‘Floes’ tab, set the dataset (or the last record in the dataset) generated by the analysis floe as the input, and simply click the ‘Start Job’ button to launch the floe. Once the floe is complete, the results will be presented in a floe report in the form of four 2D probability distributions, each showing the main progress coordinate (z) against one of the four auxiliary coordinates.

By default, these probability distributions are symmetrized along z to account for the symmetry of the membrane system. However, the original, unsymmetrized versions are also provided in Supplementary Information at the bottom of the floe report (expand the collapsible bar to view them).
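One simple way to picture symmetrization along z is to average a histogram with its mirror image. The sketch below assumes z bins centered on the membrane midplane and mirrors z only; whether an auxiliary coordinate should also be transformed under the mirror operation (the dipole cosine, for example, changes sign) is not addressed, and this is not necessarily how the floe performs the operation.

```python
def symmetrize_along_z(hist):
    """Average a 2D histogram with its mirror image along the z axis.

    hist[i][j] is the probability in z-bin i and auxiliary-coordinate bin j.
    The z bins are assumed to be centered on the membrane midplane (z = 0),
    so reversing the row order maps z to -z.
    """
    mirrored = hist[::-1]
    return [
        [0.5 * (a + b) for a, b in zip(row, mirrored_row)]
        for row, mirrored_row in zip(hist, mirrored)
    ]


# Toy 3 x 2 histogram (made-up values that sum to 1).
toy = [
    [0.4, 0.0],
    [0.2, 0.2],
    [0.0, 0.2],
]
for row in symmetrize_along_z(toy):
    print(row)
```

The averaging preserves the total probability, and the result is identical when read from either side of the membrane.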