Dock Ten Million Molecules with Gigadock Warp and Analysis with Freeform Consensus
In this tutorial, ten million molecules selected at random from the Enamine Diverse database are docked to the heat shock protein 90 (HSP90) target using the Gigadock Warp floe. The Freeform delta G values for the top docked molecules are then calculated with the FreeForm Pose floe, and a consensus hit list combining the best docking scores and Freeform delta G values is generated with the Pareto Frontier Consensus floe.
Running all the Floes in this tutorial will cost approximately $50 in Orion compute charges; the cost will vary somewhat depending on current Amazon Web Services (AWS) pricing.
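The overall figure can be checked against the per-floe estimates quoted in the individual steps of this tutorial. The following is a minimal sketch of that arithmetic, not part of Orion: the function names are hypothetical, "less than $1" is treated as an upper bound of $1, and the two thresholds mirror the Job Cost Limits set for the Gigadock Warp job.

```python
# Approximate per-floe costs in USD, as quoted in the steps of this tutorial.
# Assumption: "less than $1" is represented by its upper bound, 1.0.
ESTIMATED_COST_USD = {
    "SPRUCE - Protein Preparation from PDB Code": 1.0,
    "Gigadock Warp": 25.0,
    "FreeForm Pose": 25.0,
    "Pareto Frontier Consensus": 1.0,
}


def total_estimated_cost(costs):
    """Sum the per-floe estimates into one budget figure."""
    return sum(costs.values())


def cost_status(cost, email_limit, terminate_limit):
    """Classify a running job cost against an email and a terminate threshold.

    Hypothetical helper: it only illustrates how the two limits relate,
    it does not call Orion.
    """
    if cost > terminate_limit:
        return "terminate"
    if cost > email_limit:
        return "email"
    return "ok"


if __name__ == "__main__":
    print("Estimated total (USD):", total_estimated_cost(ESTIMATED_COST_USD))
    # The Gigadock Warp job in this tutorial uses limits of $40 and $50.
    print("Gigadock Warp at $25:", cost_status(25.0, 40.0, 50.0))
```

With these upper bounds the total comes to slightly more than $50, consistent with the approximate figure above.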
This tutorial uses the following Floes:
SPRUCE - Protein Preparation from PDB Code from the OpenEye classic floes package.
Filter Collection from the OpenEye-large-scale-floes package.
Gigadock Warp from the OpenEye-large-scale-floes package.
FreeForm Pose from the OpenEye-large-scale-floes package.
Pareto Frontier Consensus from the OpenEye-large-scale-floes package.
Create a Tutorial Project and Working Directory
Note
If you have already created a Tutorial project while doing another tutorial you can re-use the existing one and skip this step.
Log into Orion
Click the Home button at the top of the left menubar.
Click the ‘Create New Project’ button, enter Tutorial as the project name in the pop-up dialog, and click ‘Save’.
Prepare Design Unit / Receptor
Note
If you have already prepared this design unit for another tutorial, you can skip this step and reuse that design unit.
This tutorial uses the HSP90 crystal structure 1uyg from the Protein Data Bank. To import this structure into Orion and prepare it for docking, locate the SPRUCE - Protein Preparation from PDB Code floe on the ‘Floes’ page as follows:
Click on the ‘Floes’ button in the left menu bar
Click on the ‘Floes’ tab
In the Floes filter click ‘All Floes’
In the search bar enter Spruce
A list of three SPRUCE floes will now be visible to the right. Click on SPRUCE - Protein Preparation from PDB Code and a Job Form will pop up. Specify the following parameter settings in the Job Form.
Job Properties
Output Path : Tutorial/My Data/Input Data
You will need to create the Input Data subfolder unless you have already created it in another tutorial (this can be done within the selection menu).
Promoted Parameters
Outputs
Dataset : hsp90_design_unit
PDB Code(s) to Download : 1uyg
Click the ‘Start Job’ button to launch the Floe. Wait for the Floe status to be complete before moving on to the next step in the tutorial (this may take ~10min). The cost will be less than $1.
View Prepared Design Unit / Receptor
Once the SPRUCE - Protein Preparation from PDB Code job finishes make the resulting dataset active as follows:
Go to the Project Data page by clicking on the blue ‘Data’ button on the left menubar.
Select ‘My Data’ under ‘Project Data’ from the list of options to the left of the page.
Select the folder ‘Input Data’ in the main view
In the ‘Show’ menu in the top center of the screen check ‘Datasets’ if it is not already checked.
Make sure no datasets are set as active by clicking on ‘Active Datasets’ in the top right of the window. If any are active, click ‘Clear All’ to clear them.
Make the hsp90_design_unit active by clicking on the circle with the plus symbol in the Active column next to the hsp90_design_unit name.
Now switch to the 3D Viewer by clicking on the 3D button in the left menu bar. Only the crystallographic ligand from the PDB structure will initially be visible. Do the following to make the receptor information in the design unit visible in the 3D view:
In the ‘All Data’ window expand the tree under ‘1UYG(A) > PU(A-1224)’ by clicking on the chevron immediately to the right of the name.
Expand the tree under the second ‘1UYG(A) > PU(A-1224)’ entry by clicking on the chevron immediately to the right of the name. Note: the ‘1UYG(A) > PU(A-1224)’ name appears twice, once by default and once again after completing step 1.
Expand the tree under ‘Receptor’ by clicking on the chevron immediately to the right.
Click the check button to the left of “Receptor Outer Contour” to make the contour visible in the 3D window.
The protein structure, crystallographic ligand and a blue contour are now visible. The blue contour (generally referred to in OpenEye documentation as ‘The Outer Contour’) encloses the region of space that all docked molecule heavy atoms will fit within.
Dock Molecules to Site
Locate the Gigadock Warp floe as follows
Click on the ‘Floes’ button in the left menu bar
Click on the ‘Floes’ tab in the upper left of the main window.
Click ‘All Floes’ in the left pane
In the search bar at the top of the right pane enter Gigadock Warp
The Gigadock Warp floe will be visible to the right. Click on the first Gigadock Warp floe to bring up the Job Form and set the following parameters.
Job Properties
Output Folder : Tutorial/My Data/Gigadock Warp and Analysis with Freeform Consensus
Job Cost Limits
Email me if this job cost exceeds : $40
Terminate this job if the cost exceeds : $50
Note
You will need to click on the ‘Job Cost Limits’ bar to open these options.
Note
This floe is expected to cost ~$25. These cost limit values are specific to the input of this tutorial. They will need to be higher if this floe is run with more than 10M input molecules or with a larger active site than that of HSP90.
Promoted Parameters
Inputs
Design Unit Or Receptor Dataset(s) : Tutorial/My Data/Input Data/hsp90_design_unit
This is the dataset with the protein we prepared at the beginning of this tutorial. Select it as follows.
Click the ‘Choose Input’ button for Design Unit Or Receptor Dataset(s) to open the Select Dataset modal.
Click the ‘My Data’ folder under ‘Project Data’ on the left of the modal that pops up.
Click the ‘Input Data’ folder in the main interface of the modal.
Select the hsp90_design_unit dataset in the main interface of the modal.
Click ‘Use dataset as Input’
Input Conformer Collection : Tutorial/Organization Data/OpenEye Data/Gigadock Collections/Mcule Ultimate 2020.2 Gigadock Random 10M v1.0 - external
Select this collection as follows.
Click the ‘Choose Input’ button for Input Conformer Collection to open the Select Collection modal.
Click the ‘Organization Data’ folder on the left of the modal that pops up.
Click ‘OpenEye Data’ in the main interface of the modal.
Click ‘Gigadock Collections’ in the main interface of the modal.
Select the Mcule Ultimate 2020.2 Gigadock Random 10M v1.0 - external collection in the main interface of the modal.
Click ‘Use dataset as Input’.
Options
Hit List Size : 1000
Once the parameters are set scroll to the bottom of the page and click ‘Start Job’. The job will take ~2.5h and include an Orion compute charge of ~$25. Wait for the floe to complete before continuing with the tutorial.
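The Hit List Size parameter means that only the 1000 best-scoring molecules out of the ten million docked are kept. How Gigadock Warp does this internally is not described in this tutorial; the following is only a minimal sketch of one common way to keep a fixed-size list of best scores while streaming over a large input, assuming that lower docking scores are better (as with the Chemgauss4 field used later as a "low values preferred" field). The function name and the (name, score) record layout are hypothetical.

```python
import heapq


def best_n_hits(scored_molecules, n):
    """Return the n lowest-scoring (best) entries from an iterable of (name, score).

    Memory use stays proportional to n, not to the number of docked molecules,
    which matters when the input has millions of entries.
    """
    heap = []  # min-heap of (-score, name); heap[0] is the worst score kept so far
    for name, score in scored_molecules:
        if len(heap) < n:
            heapq.heappush(heap, (-score, name))
        elif -score > heap[0][0]:
            # The new score is lower (better) than the worst kept score.
            heapq.heapreplace(heap, (-score, name))
    # Undo the negation and sort from best (lowest) to worst.
    return sorted(((name, -neg) for neg, name in heap), key=lambda hit: hit[1])


if __name__ == "__main__":
    # Made-up scores for illustration only.
    docked = [("a", -5.0), ("b", -12.5), ("c", -8.1), ("d", -3.2), ("e", -10.0)]
    for name, score in best_n_hits(docked, 3):
        print(name, score)
```

In this tutorial n would correspond to the Hit List Size of 1000.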
Compute Freeform Delta G of the Hit List Molecules
Locate the FreeForm Pose floe as follows.
Click on the ‘Floes’ button in the left menu bar
Click on the ‘Floes’ tab in the upper left of the main window.
Click ‘All Floes’ in the left pane
In the search bar at the top of the right pane enter FreeForm Pose
The FreeForm Pose floe will be visible to the right. Click on the first FreeForm Pose floe to bring up the Job Form and set the following parameters.
Job Properties
Output Folder : Tutorial/My Data/Gigadock Warp and Analysis with Freeform Consensus
Promoted Parameters
Inputs
Input Dataset : Tutorial/My Data/Gigadock Warp and Analysis with Freeform Consensus/Gigadock Warp Hit List
This dataset is the output hit list created by the Gigadock Warp floe in the previous step. Select it as follows.
Click the ‘Choose Input’ button for Input Dataset to open the Select Dataset modal.
Click the ‘My Data’ folder under ‘Project Data’ on the left of the modal that pops up.
Click the ‘Gigadock Warp and Analysis with Freeform Consensus’ folder in the main interface of the modal.
Select the Gigadock Warp Hit List dataset in the main interface of the modal.
Click ‘Use dataset as Input’
Outputs
Output Dataset : Gigadock Warp Hit List with FreeForm Delta G
Now click the ‘Start Job’ button to start the FreeForm Pose job. The job will take roughly 45min to run and incur an Orion compute charge of about $25. Once the floe has finished, move on to the next step of the tutorial below.
Compute Consensus of Docking Score and Freeform Delta G
Locate the Pareto Frontier Consensus floe as follows.
Click on the ‘Floes’ button in the left menu bar
Click on the ‘Floes’ tab in the upper left of the main window.
Click ‘All Floes’ in the left pane
In the search bar at the top of the right pane enter Pareto Frontier Consensus
The Pareto Frontier Consensus floe will be visible to the right. Click on the first Pareto Frontier Consensus floe to bring up the Job Form and set the following parameters.
Job Properties
Output Folder : Tutorial/My Data/Gigadock Warp and Analysis with Freeform Consensus
Promoted Parameters
Inputs
Input Dataset : Tutorial/My Data/Gigadock Warp and Analysis with Freeform Consensus/Gigadock Warp Hit List with FreeForm Delta G
This dataset is the output hit list created by the FreeForm Pose floe in the previous step. Select it as follows.
Click the ‘Choose Input’ button for Input Dataset to open the Select Dataset modal.
Click the ‘My Data’ folder under ‘Project Data’ on the left of the modal that pops up.
Click the ‘Gigadock Warp and Analysis with Freeform Consensus’ folder in the main interface of the modal.
Select the Gigadock Warp Hit List with FreeForm Delta G dataset in the main interface of the modal.
Click ‘Use dataset as Input’
Consensus Field(s) with Low Values Preferred :
Chemgauss4
FreeForm Pose Delta G
Note
Capitalization and spacing are important when entering these values.
To specify these values
Enter Chemgauss4 in the entry box for Consensus Field(s) with Low Values Preferred
Click the ‘+Add More’ button just below the value you entered in step 1
Enter FreeForm Pose Delta G in the new blank entry box that appears
Click the ‘Start Job’ button to launch the floe. The floe will take just a few minutes to run and cost less than $1.
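To illustrate the idea behind a Pareto frontier over the two fields entered above, here is a minimal sketch. It is not the floe's implementation, which this tutorial does not describe; it only shows the general concept that a molecule is on the frontier when no other molecule is at least as good in both fields and strictly better in one, with low values preferred. The function names, the record layout, and the numeric values are hypothetical.

```python
def dominates(a, b):
    """True if point a is no higher than b in every field and lower in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_frontier(records, fields):
    """Return the records that no other record dominates (low values preferred)."""
    points = [tuple(record[field] for field in fields) for record in records]
    return [
        record
        for record, point in zip(records, points)
        if not any(dominates(other, point) for other in points)
    ]


# Made-up values for illustration only; field names match those entered in the Job Form.
FIELDS = ["Chemgauss4", "FreeForm Pose Delta G"]
hits = [
    {"name": "m1", "Chemgauss4": -12.0, "FreeForm Pose Delta G": -3.0},
    {"name": "m2", "Chemgauss4": -10.0, "FreeForm Pose Delta G": -6.0},
    {"name": "m3", "Chemgauss4": -9.0, "FreeForm Pose Delta G": -5.0},  # worse than m2 in both fields
    {"name": "m4", "Chemgauss4": -8.0, "FreeForm Pose Delta G": -7.0},
]

if __name__ == "__main__":
    for record in pareto_frontier(hits, FIELDS):
        print(record["name"])
```

In this made-up example m3 is dropped because m2 has both a lower docking score and a lower delta G, while m1, m2 and m4 each represent a different trade-off between the two fields.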
View Results
Make the docked molecules and the structure they are docked to active as follows.
Go to the Project Data page by clicking on the blue ‘Data’ button on the left menubar.
Click on the ‘Active Datasets’ menu in the upper right. If any datasets are active, click ‘Clear All’ in the menu.
Select ‘My Data’ under ‘Project Data’ from the list of options to the left of the page.
Select the ‘Gigadock Warp and Analysis with Freeform Consensus’ Folder in the main window.
In the ‘Type’ drop-down menu in the top center, check ‘Datasets’ if it is not already checked.
Locate the dataset named Gigadock Warp Design Unit and make it active by clicking on the grey circle with the plus symbol to the left of the name. The grey circle will turn green.
Locate the dataset named Pareto Frontier Consensus and make it active by clicking on the grey circle with the plus symbol to the left of the name. The grey circle will turn green.
Now move to the 3D window and set up the view.
Click on the ‘3D’ button on the left menu bar and do the following in the ‘All Data’ window.
Click the faint grey dot to the right of ‘1UYG(A) > PU(A-1224)’. It will turn green.
On the same line click the blue ‘M’ badge. It will grey out.
Click on the first molecule under ‘Gigadock Pose Clustered Hit List’
You should now see the first docked molecule in the context of the active site. The up and down arrows can be used to select the next or previous docked structure.