Estimate the cost of a Giga Docking Run
Context
Running the Gigadock floe on billions of molecules generally incurs Orion compute charges on the order of tens of thousands of dollars. The exact cost varies significantly depending on the specifics of the receptor. This how-to explains how to estimate the cost of a full giga docking run for a given receptor.
Procedure
This procedure assumes you have the following:
Receptor This is the receptor that will be used in the full giga docking run.
Full Collection A giga docking collection of input molecules that will be used in the full giga docking run.
Create a collection of ~5M random molecules from the full collection.
This can be done with the Filter Collection floe (the Giga Docking tutorial covers this use case). For vendor databases that OpenEye has pre-generated for Orion, a random ~5M collection may already be available as well.
Run the Gigadock floe with the following parameters
Job Properties
Output Path : Choose/create any appropriate folder
Job Cost Limits
Email me if this job cost exceeds : Choose an appropriate value
Terminate this job if the cost exceeds : $500
Note
This limit should be more than sufficient for most runs, although using multiple large design units in Fred or FastFred mode could hit it (such systems are not recommended for the Gigadock floe due to excessive cost).
Promoted Parameters
Inputs
Receptor Dataset : The dataset with the design unit(s) / receptor(s) for the full giga docking run.
Input Conformer Collection : Set this to the random 5M collection.
Options
Docking Method
Set this to the value the full giga docking run will use. The cost estimate will differ for different docking methods.
Once the Gigadock floe started in the previous step finishes, note its total job cost, then estimate the cost of the full run by assuming the cost scales linearly with the number of molecules. For example, if the full collection has 2.7 billion molecules and the estimate floe cost $50, then the estimate for the full giga docking job would be $27K ($50 * 2,700,000,000 / 5,000,000).
Note
More or fewer than 5M random molecules from the full collection can be used, but the cost estimate becomes less accurate as fewer molecules are docked.
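The linear extrapolation in the final step can be sketched as a short calculation. This is only an illustration of the arithmetic under the linear-scaling assumption stated above; the function name and its parameters are hypothetical and are not part of Orion or the Gigadock floe.

```python
def estimate_full_run_cost(sample_cost_usd: float,
                           sample_size: int,
                           full_size: int) -> float:
    """Extrapolate the cost of a full run from a sample run.

    Assumes cost scales linearly with the number of molecules docked.
    This helper is a hypothetical illustration, not an Orion API.
    """
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    return sample_cost_usd * full_size / sample_size


# Values from the example in the procedure: a $50 sample run on
# 5M molecules, extrapolated to a 2.7 billion molecule collection.
estimate = estimate_full_run_cost(50.0, 5_000_000, 2_700_000_000)
print(f"Estimated full run cost: ${estimate:,.0f}")  # $27,000
```

If the sample collection is not exactly 5M molecules, pass its actual molecule count as the sample size so the scaling factor stays correct.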