ZINC Download to Reaction & Reagent Database

Category Paths

Follow one of these paths in the Orion user interface to find the floe.

  • Solution-based/Hit to Lead/Generative Design/Reaction-based Libraries

  • Task-based/Library Prep & Design/Reaction-based Enumeration

  • Role-based/Cheminformatician/Medicinal Chemistry Support

  • Role-based/Cheminformatician/Corporate Collection Support

Description

This floe is used to populate a reaction & reagent database with a user-customized selection of ZINC compounds.

NOTE: The ZINC database is a shared resource for users all over the world. Please be considerate about both the number of compounds selected for processing and how often updates are run.

The input to the floe is either one or more previously downloaded SMILES tranche files from ZINC, or a set of tranche ids that the user selects with the tranche filtering capability of [ZINC tranche selection](http://zinc15.docking.org/tranches/home/).

ZINC Subset Selection

A common filtering approach is to select “In-Stock” from the Minimum Purchasability drop-down together with a molecular weight (or other property) range of interest. As a conservative guideline, limit the selection to fewer than 10M compounds in total, though higher limits may also succeed. Once the tranche selection has been finalized, choosing the download arrow on the web page presents a list of tranche ids that can be copied to the clipboard from the browser window. This list is the only information needed by this update floe. The tranches are downloaded on the fly when the floe runs, so no explicit download step is required from the user.
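A list copied from a browser window often carries stray line breaks, commas, or repeated entries. The following is a minimal sketch of tidying such a list before pasting it into the tranche id parameter. The function name `normalize_tranche_ids` is hypothetical, and the space-separated output is an assumption: this page does not state which delimiter the floe expects.

```python
import re


def normalize_tranche_ids(pasted_text):
    """Collapse a pasted list of tranche ids into one de-duplicated string.

    ASSUMPTION: ids are separated by whitespace, commas, or semicolons,
    and a space-separated string is acceptable as floe input.
    """
    tokens = re.split(r"[\s,;]+", pasted_text.strip())
    seen = set()
    ordered = []
    for token in tokens:
        # Skip empty tokens and keep only the first occurrence of each id.
        if token and token not in seen:
            seen.add(token)
            ordered.append(token)
    return " ".join(ordered)
```

The original order is preserved, so the cleaned list can still be compared against the selection shown on the ZINC web page.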

Launching the Floe

The floe requires a valid reaction definition file that defines the reactions and the associated reagent chemistry used to classify input structures. A sample reaction definition file, sample_reaction_classification_2022_2.txt, is available from the OpenEye Organization resources. At the time of this release, documentation for generating a custom version of this file is not yet available. If you need to create one, contact OpenEye Support (support@eyesopen.com) for additional details.

Name the output reaction & reagent database so that the name serves as a reminder of the tranche filtering used for the structure selection. The generated reaction & reagent database file will be an Orion file resource with the floe user’s credentials.

Promoted Parameters

Title in user interface (promoted name)

ZINC Processing Options

ZINC Tranche Data (tranche_file): A previously downloaded ZINC tranche data file (.smi) or any Orion file resource that provides at least a SMILES string and a unique id for each structure. Note that the default Orion ETL (conversion to dataset) activity should be suppressed for such file uploads.

  • Type: file_in

Strict ZINC Format (strictformat): If set, enforces the strict ZINC specification for input tranche file data (including the header). If unset (relaxed), the input only needs to provide SMILES and ID data.

  • Type: boolean

  • Default: True

  • Choices: [True, False]
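The difference between strict and relaxed input can be illustrated with a small pre-upload check. This is only a sketch of the idea, not the floe's own parser: the function name `read_smiles_records` is hypothetical, and the header text `smiles zinc_id` is an assumption about the ZINC tranche file layout that should be confirmed against an actual download.

```python
# ASSUMPTION: header fields of a strict ZINC tranche file; verify on a real file.
ASSUMED_ZINC_HEADER = ["smiles", "zinc_id"]


def read_smiles_records(lines, strict=True):
    """Return (smiles, identifier) pairs from whitespace-separated lines."""
    rows = [line.split() for line in lines if line.strip()]
    if strict:
        # Strict mode: the first non-empty line must be the expected header.
        if not rows or rows[0] != ASSUMED_ZINC_HEADER:
            raise ValueError("strict mode: missing or unexpected header line")
        rows = rows[1:]
    records = []
    seen_ids = set()
    for number, fields in enumerate(rows, start=1):
        # Both modes need at least a SMILES and an id on every record.
        if len(fields) < 2:
            raise ValueError(f"record {number}: expected a SMILES and an id")
        smiles, identifier = fields[0], fields[1]
        if identifier in seen_ids:
            raise ValueError(f"record {number}: duplicate id {identifier}")
        seen_ids.add(identifier)
        records.append((smiles, identifier))
    return records
```

Calling `read_smiles_records(open(path), strict=False)` on a plain SMILES and ID file checks only the two requirements stated for relaxed input: a SMILES string and a unique id per structure.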

ZINC Tranche IDs (tranches): A selection of ZINC tranche ids (see http://zinc15.docking.org/tranches/home/)

  • Type: string

Functional Group Transformations (enablefngroups): If ON, allows interconversion of simple functional groups per the reaction definitions during reagent classification

  • Type: boolean

  • Default: True

  • Choices: [True, False]

Strip Salts (saltchop): If set, retains only the largest fragment from each input structure prior to indexing

  • Type: boolean

  • Default: False

  • Choices: [True, False]
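To illustrate what retaining only the largest fragment means, the sketch below picks the largest dot-separated component of a SMILES string. It is a crude, string-based approximation written for illustration only: the floe's actual implementation is not described here, the helper names are hypothetical, and a cheminformatics toolkit should be used for real work. It assumes that fragment size can be estimated by counting organic-subset and bracket atoms, and that no ring-closure bond spans a dot.

```python
import re

# Bracket atoms, two-letter halogens, then one-letter organic-subset atoms.
ATOM_TOKEN = re.compile(r"\[[^\]]+\]|Br|Cl|[BCNOPSFI]|[bcnops]")
# Explicit hydrogen atoms such as [H], [H+], or [2H] are not heavy atoms.
EXPLICIT_HYDROGEN = re.compile(r"\[\d*H[+-]?\d*\]")


def heavy_atom_estimate(fragment):
    """Estimate the heavy-atom count of one SMILES component."""
    return sum(
        1
        for token in ATOM_TOKEN.findall(fragment)
        if not EXPLICIT_HYDROGEN.fullmatch(token)
    )


def largest_fragment(smiles):
    """Return the dot-separated component with the most heavy atoms.

    On a tie, the first such component is kept.
    """
    return max(smiles.split("."), key=heavy_atom_estimate)
```

For a sodium acetate input written as `CC(=O)[O-].[Na+]`, the acetate component is kept and the counter-ion is discarded.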

Neutralize Charges (neutralize): If ON, removes all formal charges except those on quaternary amines, and corrects hydrogen counts accordingly.

  • Type: boolean

  • Default: False

  • Choices: [True, False]

Molecule Filtering (filter_mols): If ON, performs filtering of reagents prior to classification

  • Required

  • Type: boolean

  • Default: True

  • Choices: [True, False]

Custom Filter File (filter_file): A filter file resource to load (supersedes the default)

  • Type: file_in

Filter Summary Report (filter_summary): If ON, enables a summary report of the filter rules for the filtered molecules

  • Type: boolean

  • Default: False

  • Choices: [True, False]

Verbosity (verbosity): Sets the output logging verbosity

  • Type: string

  • Default: warning

  • Choices: [‘info’, ‘warning’, ‘error’, ‘debug’, ‘ddebug’]
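The promoted names and defaults listed above can be collected into a single mapping and checked before a job is submitted. The following is a minimal sketch under stated assumptions: `build_parameters` is a hypothetical helper, file parameters are represented as plain strings, and the way such a mapping is passed to Orion is not covered here.

```python
# Defaults for the boolean promoted parameters, as listed on this page.
BOOLEAN_DEFAULTS = {
    "strictformat": True,
    "enablefngroups": True,
    "saltchop": False,
    "neutralize": False,
    "filter_mols": True,
    "filter_summary": False,
}
VERBOSITY_CHOICES = ("info", "warning", "error", "debug", "ddebug")
# Promoted names that have no listed default.
NAMES_WITHOUT_DEFAULT = ("tranche_file", "tranches", "filter_file")


def build_parameters(**overrides):
    """Merge user overrides with the documented defaults and validate them."""
    params = dict(BOOLEAN_DEFAULTS, verbosity="warning")
    for name, value in overrides.items():
        if name in BOOLEAN_DEFAULTS:
            if not isinstance(value, bool):
                raise TypeError(f"{name} must be True or False")
        elif name == "verbosity":
            if value not in VERBOSITY_CHOICES:
                raise ValueError(f"verbosity must be one of {VERBOSITY_CHOICES}")
        elif name not in NAMES_WITHOUT_DEFAULT:
            raise KeyError(f"unknown promoted name: {name}")
        params[name] = value
    # The floe input is either a tranche file or a set of tranche ids.
    if not (params.get("tranche_file") or params.get("tranches")):
        raise ValueError("provide either tranche_file or tranches")
    return params
```

A call such as `build_parameters(tranches="<pasted ids>", saltchop=True)` returns the full set of values with only the salt option changed from its default.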