Generative Structure Indexing - MMP Molecule Indexer

Category Paths

Follow one of these paths in the Orion user interface to find the floe.

  • Solution-based/Hit to Lead/Generative Design/Match Molecular Pairs (MMP)

  • Task-based/Virtual Screening - Structure-Based

  • Role-based/Cheminformatician/Medicinal Chemistry Support

  • Role-based/Cheminformatician/Corporate Collection Support

Description

This floe indexes a user-provided set of structures for use in the Generative Design Floes that apply the Matched Molecular Pair (MMP) method.

The floe takes a file as input and generates a file as output. When uploading a structure set for indexing, use the advanced options in the upload process to suppress ETL, especially for large input structure sets. In this context, a large dataset contains up to about 1M structures; larger datasets are not recommended at this time.
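A quick pre-upload check can confirm that a structure set stays under the recommended size. The sketch below is only an illustration and is not part of the floe: it assumes the input is an SD file, where each record ends with a `$$$$` line, and the function names and the exact cutoff of 1,000,000 are hypothetical choices based on the guidance above.

```python
# Upper bound taken from the size guidance in the description (assumed cutoff).
MAX_RECOMMENDED = 1_000_000


def count_sdf_records(path):
    """Count records in an SD file by counting '$$$$' delimiter lines."""
    count = 0
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            if line.rstrip("\r\n") == "$$$$":
                count += 1
    return count


def within_recommended_size(path, limit=MAX_RECOMMENDED):
    """Return True when the file holds fewer records than the limit."""
    return count_sdf_records(path) < limit
```

Other input formats would need a different record counter; only the comparison against the limit carries over.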

Once an MMP index has been created, run the Generative Structure Indexing - MMP Transform Extractor Floe to extract the desired transformations for use in the Generative Design Floes.

Promoted Parameters

Title in user interface (promoted name)

Inputs

Input File (input_file): Orion file resource containing the molecules to be indexed.

  • Required

  • Type: file_in

Outputs

Output MMP Index (output_mmpindex): Orion file resource for the output MMP index file.

  • Required

  • Type: file_out

Matched Molecular Pair Indexing Options

MolField (idxmol): Field containing the molecules to be indexed.

  • Type: field_parameter::mol

Sequence id (molid): Field containing the input molecule sequence identifier.

  • Required

  • Type: field_parameter

  • Default: molid

Data Handling (moldata): Desired handling of data on the input molecules.

  • Required

  • Type: string

  • Default: keep

  • Choices: [‘keep’, ‘clear’, ‘any’, ‘all’]

Fragmentation Cuts (fragcuts): Selects the type(s) of fragmentation cuts for the indexing activity.

  • Type: string

  • Default: [‘all’]

  • Choices: [‘all’, ‘single’, ‘double’, ‘triple’]

Min Indexed Fragment Size (fragGe): Lower bound of the indexable fragment size filter, expressed as a percentage of the input structure.

  • Type: decimal

  • Default: 85.0

Max Indexed Fragment Size (fragLe): Upper bound of the indexable fragment size filter, expressed as a percentage of the input structure.

  • Type: decimal

  • Default: 100.0
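The two fragment size bounds can be read as a percentage window. The sketch below illustrates that reading only. The listing does not say how the floe measures size or which part of the structure it measures, so the use of heavy atom counts and the helper name `fragment_in_range` are assumptions.

```python
def fragment_in_range(fragment_atoms, molecule_atoms, frag_ge=85.0, frag_le=100.0):
    """Return True if the fragment's share of the molecule lies in [frag_ge, frag_le] percent.

    Assumption: size is measured in heavy atoms. The floe may use another measure.
    """
    if molecule_atoms <= 0:
        raise ValueError("molecule_atoms must be positive")
    percent = 100.0 * fragment_atoms / molecule_atoms
    return frag_ge <= percent <= frag_le
```

With the default bounds, a fragment of 18 atoms from a 20-atom structure (90%) passes the filter, while a fragment of 10 atoms (50%) does not.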

Advanced Options

Strip salts (strip_salts): Whether to strip salts from the input structures.

  • Required

  • Type: boolean

  • Default: False

  • Choices: [True, False]

Strip stereochemistry (strip_stereo): Whether to strip stereochemistry from the input structures.

  • Required

  • Type: boolean

  • Default: False

  • Choices: [True, False]

Finalize MMP Index (finalize_index): Whether to finalize the index file or allow it to have additional molecules added at some later time.

  • Required

  • Type: boolean

  • Default: True

  • Choices: [True, False]

Log level (verbosity): Desired level of logging verbosity.

  • Type: string

  • Default: warning

  • Choices: [‘info’, ‘warning’, ‘error’, ‘debug’, ‘ddebug’]

Indexing Resource Options

Max Parallel Indexers (idx_max_parallel): Maximum number of parallel indexing instances.

  • Type: integer

  • Default: 1000

Disk Limit (mmp_disk_space): Maximum disk space for the serialized MMP index.

  • Type: decimal

  • Default: 30000

Memory Limit (Frag) (mmp_frag_memory): Maximum memory limit for molecule fragmentation.

  • Type: decimal

  • Default: 10000

Memory Limit (Merge) (mmp_merge_memory): Maximum memory limit for MMP index merge.

  • Type: decimal

  • Default: 50000

Memory Limit (Deduplication) (dedupe_memory): Structure deduplication may require significant memory resources; specify the desired memory limit in MB.

  • Type: decimal

  • Default: 1800
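When preparing a run, it can help to collect the promoted parameters in one place and check them against the listed choices before launching. The sketch below is a hypothetical helper and not an Orion API: the promoted names, defaults, and choices are copied from the listing above, while `build_parameters`, the plain-dictionary layout, and the check that fragGe does not exceed fragLe are assumptions made for illustration. The fragcuts default is written as the string 'all'.

```python
# Promoted names and defaults copied from the parameter listing above.
DEFAULTS = {
    "molid": "molid",
    "moldata": "keep",
    "fragcuts": "all",
    "fragGe": 85.0,
    "fragLe": 100.0,
    "strip_salts": False,
    "strip_stereo": False,
    "finalize_index": True,
    "verbosity": "warning",
    "idx_max_parallel": 1000,
    "mmp_disk_space": 30000,
    "mmp_frag_memory": 10000,
    "mmp_merge_memory": 50000,
    "dedupe_memory": 1800,
}

# Allowed values for the string parameters that list choices.
CHOICES = {
    "moldata": {"keep", "clear", "any", "all"},
    "fragcuts": {"all", "single", "double", "triple"},
    "verbosity": {"info", "warning", "error", "debug", "ddebug"},
}

# Parameters with no documented default that may still be supplied.
OPTIONAL = {"idxmol"}


def build_parameters(input_file, output_mmpindex, **overrides):
    """Merge overrides into the defaults and validate them (hypothetical helper)."""
    unknown = set(overrides) - set(DEFAULTS) - OPTIONAL
    if unknown:
        raise KeyError("unknown parameter(s): " + ", ".join(sorted(unknown)))
    params = dict(DEFAULTS)
    params.update(overrides)
    for name, allowed in CHOICES.items():
        if params[name] not in allowed:
            raise ValueError("%s must be one of %s" % (name, sorted(allowed)))
    # Assumption: the minimum bound should not exceed the maximum bound.
    if params["fragGe"] > params["fragLe"]:
        raise ValueError("fragGe must not exceed fragLe")
    params["input_file"] = input_file
    params["output_mmpindex"] = output_mmpindex
    return params
```

A call such as `build_parameters("mols.sdf", "index.mmpidx", fragcuts="single")` returns the defaults with only the fragmentation cut type changed; the file names here are placeholders.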