Estimate the Cost of a Gigadock Run

Context

Running the Gigadock Floe on billions of molecules generally incurs Orion compute charges on the order of tens of thousands of dollars. The exact cost varies significantly depending on the specifics of the receptor. This guide explains how to estimate the cost of a full Gigadock run for a given receptor by first docking a small random sample of the full collection.

Procedure

This procedure assumes you have the following:

  • Receptor: A receptor that will be used in the full Gigadock run.

  • Full Collection: A Gigadock collection of input molecules that will be used in the full Gigadock run.

  1. Create a collection of ~5M random molecules from the full collection.

    • The Filter Collection Floe can create this collection, as explained in the Gigadock tutorial.

    • In the vendor databases provided by OpenEye, there may also be a random, pregenerated collection with ~5M molecules.

  2. Run the Gigadock Floe with the following parameters.

    • Job Cost Limits

      • Email me if this job cost exceeds: Choose an appropriate value

      • Terminate this job if the cost exceeds: $500

    Note

    A $500 limit should be far more than sufficient for most runs. Using multiple large design units in Fred or Fast-Fred mode could hit this limit, but such systems are not recommended for the Gigadock Floe due to excessive cost.

    • Inputs

      • Receptor Dataset: The dataset with the design unit(s)/receptor(s) for the full Gigadock run.

      • Input Conformer Collection: Use the random 5M collection.

    • Options

      • Docking Method: Select your preferred method for the full Gigadock run. The cost estimate will differ for each docking method.

  3. Once the Gigadock Floe finishes, note the total job cost. You can estimate the cost of the full run by assuming that the cost scales linearly with the number of molecules. For example, if the full collection has 2.7 billion molecules and the job on the 5M-molecule sample cost $50, then the estimate for the full Gigadock job would be $27 thousand ($50/5M molecules * 2.7B molecules).

Note

You can use more or fewer than 5M random molecules from the full collection, but the accuracy of the cost estimate decreases as the number of docked molecules decreases.
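
The linear extrapolation in step 3 can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not part of Orion or the Gigadock Floe; the function name `estimate_full_run_cost` and its parameters are hypothetical, and the numbers are the ones from the example in step 3.

```python
def estimate_full_run_cost(sample_cost_usd, sample_size, full_size):
    """Extrapolate the cost of a full run from a sample run.

    Assumes cost scales linearly with the number of molecules docked,
    as described in step 3. All names here are illustrative only.
    """
    if sample_size <= 0:
        raise ValueError("sample_size must be a positive number of molecules")
    # Multiply before dividing to keep the floating-point result tidy.
    return sample_cost_usd * full_size / sample_size


# Example from step 3: a $50 job on 5M molecules, 2.7B molecules in total.
estimate = estimate_full_run_cost(50.0, 5_000_000, 2_700_000_000)
print(f"Estimated full-run cost: ${estimate:,.0f}")  # Estimated full-run cost: $27,000
```

If the sample contains a different number of molecules, pass that count as `sample_size`; the estimate becomes less reliable as the sample shrinks.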