MMP Molecule Indexer

Category Paths

Follow one of these paths in the Orion user interface to find the Floe.

  • Solution-based/Hit to Lead/Generative Design/Match Molecular Pairs (MMP)

  • Task-based/Virtual Screening - Structure-Based

  • Role-based/Cheminformatician/Medicinal Chemistry Support

  • Role-based/Cheminformatician/Corporate Collection Support

Description

This Floe indexes a user-provided set of structures for use in the Generative Design Floes that apply the Matched Molecular Pair (MMP) method.

The Floe takes a file as input and generates a file as output. When uploading a structure set for indexing, use the Advanced options in the upload process to suppress ETL, especially for large input structure sets. Large datasets are those of up to about 1M structures; datasets beyond that are not recommended at this time.
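As a rough pre-upload check against the size guidance above, the number of structures in an input file can be counted locally. The sketch below assumes the input is an SD file, where each record ends with a `$$$$` line; the function names and the constant are hypothetical helpers, not part of Orion or the Floe.

```python
import io

# Size guidance taken from the description above ("<1M"); the name is hypothetical.
MAX_RECOMMENDED_STRUCTURES = 1_000_000


def count_sdf_records(stream):
    """Count SD records by counting the '$$$$' record-delimiter lines."""
    return sum(1 for line in stream if line.strip() == "$$$$")


def within_recommended_size(n_records, limit=MAX_RECOMMENDED_STRUCTURES):
    """Return True when the record count is below the recommended limit."""
    return n_records < limit


# In-memory stand-in for a two-record SD file (record bodies are placeholders).
sample = io.StringIO("mol1\nM  END\n$$$$\nmol2\nM  END\n$$$$\n")
n = count_sdf_records(sample)
print(n, within_recommended_size(n))
```

For a real file, pass an open text file handle instead of the `io.StringIO` object.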

Once an MMP index has been created, a separate Floe extracts the desired MMP transformations for use in the Generative Design Floe.

Promoted Parameters

Title in user interface (promoted name)

Inputs

Input File (input_file): Orion file resource containing the molecules to be indexed

  • Required

  • Type: file_in

Outputs

Output MMP Index (output_mmpindex): Orion file resource for the output MMP index file

  • Required

  • Type: file_out

Matched Molecular Pair Indexing Options

MolField (idxmol): Field containing the molecules to be indexed

  • Type: field_parameter::mol

Sequence id (molid): Field containing the input molecule sequence identifier

  • Required

  • Type: field_parameter

  • Default: molid

Data Handling (moldata): Desired handling of data on the input molecules

  • Required

  • Type: string

  • Default: keep

  • Choices: [‘keep’, ‘clear’, ‘any’, ‘all’]

Fragmentation Cuts (fragcuts): Selects the type(s) of fragmentation cuts for the indexing activity

  • Type: string

  • Default: [‘all’]

  • Choices: [‘all’, ‘single’, ‘double’, ‘triple’]
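To illustrate what the fragmentation cut types select, the sketch below enumerates cut sets over a list of candidate bonds. It assumes the usual MMP reading, in which single, double, and triple cuts break one, two, or three bonds at once and 'all' selects all three; the Floe's actual fragmentation rules are not described here, and the bond labels and function names are hypothetical.

```python
from itertools import combinations

# Assumed mapping from cut type to the number of bonds cut at once.
CUT_SIZES = {"single": 1, "double": 2, "triple": 3}


def cut_sizes(fragcuts):
    """Translate a list of fragcuts choices into a sorted list of cut counts."""
    sizes = set()
    for choice in fragcuts:
        if choice == "all":
            sizes.update(CUT_SIZES.values())
        else:
            sizes.add(CUT_SIZES[choice])
    return sorted(sizes)


def enumerate_cut_sets(bonds, fragcuts=("all",)):
    """Yield every combination of candidate bonds for the chosen cut types."""
    for k in cut_sizes(fragcuts):
        yield from combinations(bonds, k)


bonds = ["b1", "b2", "b3", "b4"]  # placeholder labels for cuttable bonds
print(len(list(enumerate_cut_sets(bonds, ["single"]))))  # 4
print(len(list(enumerate_cut_sets(bonds, ["all"]))))     # 4 + 6 + 4 = 14
```

The count grows quickly with the number of cuttable bonds, which is one reason restricting the cut types can reduce indexing work.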

Min Indexed Fragment Size (fragGe): Lower bound of the indexable fragment size filter, as a percentage of the input structure

  • Type: decimal

  • Default: 85.0

Max Indexed Fragment Size (fragLe): Upper bound of the indexable fragment size filter, as a percentage of the input structure

  • Type: decimal

  • Default: 100.0
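The two size bounds can be read as a percentage window on the indexed fragment. The sketch below is one possible interpretation: it assumes size is measured in heavy atoms and that both bounds are inclusive, neither of which is stated above. The function names are hypothetical.

```python
def fragment_percent(fragment_heavy_atoms, molecule_heavy_atoms):
    """Fragment size as a percentage of the whole input structure."""
    if molecule_heavy_atoms <= 0:
        raise ValueError("molecule must contain at least one heavy atom")
    return 100.0 * fragment_heavy_atoms / molecule_heavy_atoms


def passes_size_filter(fragment_heavy_atoms, molecule_heavy_atoms,
                       frag_ge=85.0, frag_le=100.0):
    """Apply the fragGe/fragLe window (defaults from the parameter list)."""
    pct = fragment_percent(fragment_heavy_atoms, molecule_heavy_atoms)
    return frag_ge <= pct <= frag_le


# For a 20-heavy-atom molecule under the defaults:
print(passes_size_filter(17, 20))  # 85% of the molecule, inside the window
print(passes_size_filter(16, 20))  # 80% of the molecule, below fragGe
```

Under this reading, the defaults keep only fragments that retain at least 85% of the input structure, so the varying part of each pair stays small.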

Advanced Options

Strip salts (strip_salts): Whether to strip salts from the input structures

  • Required

  • Type: boolean

  • Default: False

  • Choices: [True, False]

Strip stereochemistry (strip_stereo): Whether to strip stereochemistry from the input structures

  • Required

  • Type: boolean

  • Default: False

  • Choices: [True, False]

Finalize MMP Index (finalize_index): Whether to finalize the index file or leave it open so that more molecules can be added later

  • Required

  • Type: boolean

  • Default: True

  • Choices: [True, False]

Log level (verbosity): Desired level of logging verbosity

  • Type: string

  • Default: warning

  • Choices: [‘info’, ‘warning’, ‘error’, ‘debug’, ‘ddebug’]

Indexing Resource Options

Max Parallel Indexers (idx_max_parallel): Maximum number of parallel indexing instances

  • Type: integer

  • Default: 1000

Disk Limit (mmp_disk_space): Maximum disk space for the serialized MMP index

  • Type: decimal

  • Default: 30000

Memory Limit (Frag) (mmp_frag_memory): Maximum memory limit for molecule fragmentation

  • Type: decimal

  • Default: 10000

Memory Limit (Merge) (mmp_merge_memory): Maximum memory limit for MMP index merge

  • Type: decimal

  • Default: 50000
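When preparing a job, it can help to collect the promoted names and defaults listed above in one place and check overrides before submission. The sketch below is a plain-Python illustration, not an Orion API: the `build_parameters` helper is hypothetical, the required file and molecule-field parameters are left out because they have no defaults, and the check that fragGe does not exceed fragLe is an assumption.

```python
# Promoted names and defaults copied from the parameter list above.
DEFAULTS = {
    "molid": "molid",
    "moldata": "keep",
    "fragcuts": ["all"],
    "fragGe": 85.0,
    "fragLe": 100.0,
    "strip_salts": False,
    "strip_stereo": False,
    "finalize_index": True,
    "verbosity": "warning",
    "idx_max_parallel": 1000,
    "mmp_disk_space": 30000,
    "mmp_frag_memory": 10000,
    "mmp_merge_memory": 50000,
}

# Allowed values for the parameters that list choices.
CHOICES = {
    "moldata": {"keep", "clear", "any", "all"},
    "fragcuts": {"all", "single", "double", "triple"},
    "verbosity": {"info", "warning", "error", "debug", "ddebug"},
}


def build_parameters(**overrides):
    """Merge overrides into the defaults and validate the result."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise KeyError(f"unknown parameter(s): {sorted(unknown)}")
    params = {**DEFAULTS, **overrides}
    for name, allowed in CHOICES.items():
        value = params[name]
        values = value if isinstance(value, list) else [value]
        bad = [v for v in values if v not in allowed]
        if bad:
            raise ValueError(f"{name}: invalid choice(s) {bad}")
    # Assumed sanity check: the lower bound must not exceed the upper bound.
    if params["fragGe"] > params["fragLe"]:
        raise ValueError("fragGe must not exceed fragLe")
    return params


params = build_parameters(fragcuts=["single", "double"], verbosity="info")
print(params["fragcuts"], params["verbosity"])
```

How the resulting values are passed to the Floe (user interface form or another client) is outside the scope of this sketch.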