Edge Mapper for RBFE calculations

A significant component of preclinical drug discovery involves optimizing the non-covalent binding between a target protein and its ligands to improve affinity [1]. Many computational methods have been developed to address this; among them, alchemical free energy methods are more expensive but target higher accuracy. These techniques can be used to compute either absolute binding free energies (ABFE) or relative binding free energies (RBFE). Although ABFE calculations are more attractive, RBFE calculations are more efficient and are therefore more widespread.

The RBFE approach requires a map of transformations between pairs of ligands. Each transformation is known as an “edge”, and the full set of edges must connect all the ligands, forming a map in which any ligand can be transformed into any other by following a path of edges (individual transformations). This map is generated before running the RBFE calculations. This tutorial shows how to use the “Edge Mapper for RBFE calculations” floe in the MD Affinity package, starting from an Orion dataset of ligands, to generate the map of edges needed for the RBFE calculations in the “Non-Equilibrium Switching” floe.

In general, given \(n\) ligands, \(n(n-1)/2\) edges are possible, but which ones are computationally feasible? The goal of the mapper is to produce a small set of edges in which the two ligands of each edge are “similar”; this is important for the RBFE calculation of that edge to succeed. Each edge has a computational cost, so the mapper tries to minimize the number of edges needed for a successful and accurate calculation. Nevertheless, within the map every ligand should be part of a ring (a cycle of edges), which provides a minimal redundancy of pathways and increases the accuracy of the RBFE calculation. In the free energy context, ligand similarity is still an open area of research, and the heuristics are mainly based on graph and chemical similarity rules. The mapper implemented here is principally based on the Lead Optimization Mapper (LOMAP) [1], with some additional heuristics, including one based on OpenEye’s proprietary ROCS similarity.
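The counting and the two structural requirements above (all ligands connected, every ligand on a cycle) can be illustrated with a short sketch. It is not the mapper's algorithm and does no similarity scoring; the function names and the toy ligand titles are hypothetical, and edges are treated as undirected pairs with no duplicates.

```python
from itertools import combinations


def all_possible_edges(ligands):
    """Return every unordered ligand pair: n(n-1)/2 candidate edges."""
    return list(combinations(ligands, 2))


def component(start, edges):
    """Return the set of ligands reachable from `start` along the edges."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    seen, stack = {start}, [start]
    while stack:
        for nxt in neighbors.get(stack.pop(), set()) - seen:
            seen.add(nxt)
            stack.append(nxt)
    return seen


def is_connected(ligands, edges):
    """True if any ligand can be reached from any other by a path of edges."""
    ligands = list(ligands)
    return not ligands or set(ligands) <= component(ligands[0], edges)


def ligands_outside_cycles(ligands, edges):
    """Return the ligands that are not part of any ring (cycle of edges).

    An edge lies on a cycle if its two ligands remain connected after the
    edge itself is removed; a ligand is on a cycle if one of its edges is.
    """
    edges = [tuple(edge) for edge in edges]
    on_cycle = set()
    for i, (a, b) in enumerate(edges):
        remaining = edges[:i] + edges[i + 1:]
        if b in component(a, remaining):
            on_cycle.update((a, b))
    return [lig for lig in ligands if lig not in on_cycle]


# Toy map (hypothetical titles): a triangle A-B-C plus a dangling ligand D.
ligands = ["A", "B", "C", "D"]
edges = [("A", "B"), ("B", "C"), ("C", "A"), ("C", "D")]
print(len(all_possible_edges(ligands)))        # 6 possible edges for n = 4
print(is_connected(ligands, edges))            # True
print(ligands_outside_cycles(ligands, edges))  # ['D']
```

In this toy map, ligand D is connected but has no redundant pathway, which is the situation the ring requirement is meant to avoid. For 16 ligands the same count gives 120 possible edges.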

The OE Mapper Floe with its Cubes and connections is shown below.

OE Mapper Floe

The Floe Inputs

To run the mapper floe, the usual input is an Orion dataset of posed ligands. The provided ligands must have reasonable 3D coordinates, all atoms, and correct chemistry such as bond orders and formal charges. This input ligand dataset could be the same one used as input for other protein-ligand MD floes such as “Bound Protein-Ligand MD”, “Short Trajectory MD with Analysis”, or “Ligand Bound and Unbound Equilibration for NES”; the output Bound dataset produced by “Ligand Bound and Unbound Equilibration for NES” could also be used.

Optionally, the user can import an externally generated map into the mapper floe to produce a mapper output dataset containing only the edges defined by that map. In this case the mapper does not attempt to build the mapper graph; it simply translates the edges given in the user-defined file into the output dataset. The edge score is nevertheless evaluated and displayed for the user-provided edges. The user-provided map must be a text file containing one edge per line in the following format:

ligA >> ligB

The first field is the title of the starting ligand, the second is “>>”, and the third is the title of the final ligand. Given the example line above, the mapper looks in the input ligand dataset for a ligand titled “ligA” and another titled “ligB”, and generates an edge record in the mapper output dataset defining the transformation of ligA into ligB. With user-defined maps, the input ligand dataset can contain ligands that are not used in any edge, but every ligand named in a user-defined edge must be present, by that title, in the input ligand dataset.
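Before submitting a user-defined map, it can help to check it locally against the ligand titles. The sketch below is a hypothetical helper, not part of the floe: it assumes fields are separated by whitespace, that ligand titles contain no spaces, and that blank lines can be skipped.

```python
def parse_edge_map(text, ligand_titles):
    """Parse 'ligA >> ligB' lines into (start, end) title pairs.

    Raises ValueError for a malformed line or for a title that is not in
    `ligand_titles`. Ligands that appear in no edge are allowed.
    """
    titles = set(ligand_titles)
    edges = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue  # assumption: blank lines are ignored
        fields = line.split()
        if len(fields) != 3 or fields[1] != ">>":
            raise ValueError(f"line {lineno}: expected 'ligA >> ligB', got {line!r}")
        start, end = fields[0], fields[2]
        for title in (start, end):
            if title not in titles:
                raise ValueError(f"line {lineno}: no ligand titled {title!r} in the dataset")
        edges.append((start, end))
    return edges


# Example with placeholder titles; "ligC" is present but unused, which is fine.
map_text = "ligA >> ligB\n"
print(parse_edge_map(map_text, ["ligA", "ligB", "ligC"]))  # [('ligA', 'ligB')]
```

The edge direction is preserved, since the format distinguishes the starting ligand from the final one.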

For this tutorial, the CDK2 receptor and a few ligands have been selected. The ligand file and the ligand mapping text file can be downloaded below.

CDK2_ligsElf10.oeb.gz

CDK2_edges.txt

How to use the floe

Running the floe is straightforward: after defining the names of the job and output datasets, select the Orion dataset of input ligands, and you are ready to go. The mapper uploads a .tar.gz file with useful information, and the user must specify this filename as well. The user must also set the name of the output dataset that holds all \(n(n-1)/2\) possible edges, which can be used in the Orion UI Analyze page to select a new set of edges. To run the floe with an external map file, use the additional field “Ligand Edge Map File” towards the bottom of the input form to specify that file. In this case, the OE Mapper does not attempt to connect “similar” ligands to create edges; it forms only the edges defined in the input map file. Below is the input form for the “Edge Mapper for RBFE calculations” floe.

OE Mapper Input Output Orion Selection

At the end of the run a floe report is produced with the generated map as shown below.

Each edge in the graph is represented by a pair of ligands connected by a blue line. The number shown on each line is the score computed by the mapper for that edge. Scores lie in the range [0, 1], where 0 is worst (the edge is unlikely to succeed) and 1 is best (the edge is likely to succeed). In addition to the graph diagram, two files are produced:

A map text file in which each edge is listed using the map file format described above; this file can be downloaded and customized by adding or removing edges, then used as user-defined input for a subsequent mapper run to produce a custom map.

A .tar.gz file containing edge information files, including the edge scores for all \(n(n-1)/2\) possible edges; this can be useful when deciding which edges to add or remove in customizing the map. An example of the uploaded file can be downloaded here:

OEMapper_CDK2_result.tar.gz
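Once downloaded, the archive can be inspected with Python's standard tarfile module. The sketch below only lists the files inside; it makes no assumption about their names or internal format, and the helper name is hypothetical.

```python
import os
import tarfile


def list_archive_members(path):
    """Return the names of the regular files stored in a .tar.gz archive."""
    with tarfile.open(path, "r:gz") as archive:
        return [member.name for member in archive.getmembers() if member.isfile()]


if __name__ == "__main__":
    # Assumes the example archive was saved in the current directory.
    archive_path = "OEMapper_CDK2_result.tar.gz"
    if os.path.exists(archive_path):
        for name in list_archive_members(archive_path):
            print(name)
```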

How to Edit the Mapper Edges

In this part of the tutorial, we modify the OE Mapper edges for CDK2 by removing the edge 1oiu – 22 and adding two new edges, 1oiu – 32 and 1oiu – 30. To do this, we choose edges from the dataset that contains all \(n(n-1)/2\) possible edges, which is produced by the OE Mapper floe. After this dataset is selected, its main content is shown as a spreadsheet in the Orion Analyze page (the layout may differ depending on how the user has configured the page):

The spreadsheet shows all the edges, 120 in this case. Since we want to modify the edges generated by the mapper, the first step is to select them. To do so, filter the spreadsheet on the “Chosen_OPLMD” column: edges chosen by the mapper have a value of 1 in this field, and all other edges have 0. The filter selection is shown below:

After applying the filter, select the records (spreadsheet rows) either with the scatter box selection tool in the plotting area, as shown below, or in the spreadsheet itself using shift + mouse click or command + mouse click:

If the selection is successful, all the spreadsheet rows should be marked as shown here:

Since we want to remove the 1oiu – 22 edge from the set, find this edge and deselect it using command + mouse click:

Next, we add the two extra edges that were not present in the generated mapper set. To do so, remove the “Chosen_OPLMD” column filter. All the edges are then displayed again in the spreadsheet, with the selected edges still marked. Scroll through the rows to find the 1oiu – 32 and 1oiu – 30 edges and add them to the selection using command + mouse click:

Finally, save the selected records, including the new edges, in a dataset that can be used in the NES runs.
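The same change can alternatively be made on the map text file produced by the mapper, which can then be supplied through the “Ligand Edge Map File” field in a subsequent run. The sketch below is a hypothetical helper, not part of the floe. It assumes the ligand titles are exactly “1oiu”, “22”, “32”, and “30”, that an edge should be removed whichever direction it was written in, and that new edges are written in the direction given.

```python
def edit_edge_map(text, remove=(), add=()):
    """Return map-file text with some edges removed and others appended.

    Edges are compared without regard to direction, so ('1oiu', '22') also
    matches a line written as '22 >> 1oiu'. Blank or malformed lines are
    dropped (an assumption of this sketch).
    """
    to_drop = {frozenset(pair) for pair in remove}
    kept = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3 or fields[1] != ">>":
            continue
        if frozenset((fields[0], fields[2])) in to_drop:
            continue
        kept.append((fields[0], fields[2]))
    present = {frozenset(edge) for edge in kept}
    for start, end in add:
        if frozenset((start, end)) not in present:
            kept.append((start, end))
            present.add(frozenset((start, end)))
    return "\n".join(f"{start} >> {end}" for start, end in kept) + "\n"


# Toy input: one tutorial edge plus a placeholder edge, not the real CDK2 map.
original = "1oiu >> 22\nligA >> ligB\n"
edited = edit_edge_map(
    original,
    remove=[("1oiu", "22")],
    add=[("1oiu", "32"), ("1oiu", "30")],
)
print(edited, end="")
```

Because the map file format is directional, it is worth checking that the direction chosen for each added edge is the one intended before reusing the file.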

References

[1] S. Liu, Y. Wu, et al. (2013), “Lead Optimization Mapper: Automating free energy calculations for lead optimization”, J. Comput. Aided Mol. Des., 27(9):755-770.