CHOMP - Generate BROOD Fragment Database
Category Paths
Follow one of these paths in the Orion user interface to find the floe.
Product-based/BROOD
Role-based/Computational Chemist
Solution-based/Virtual-screening/DB Search
Task-based/Scaffold-Hopping
Description
CHOMP - Generate BROOD Fragment Database is a utility for the lead generation tool BROOD. CHOMP allows users to fragment molecules, filter the fragments, generate 3D conformations, organize and index the fragments for rapid searching, and write a BROOD database.
The minimal input into CHOMP is a dataset, file, or collection of 2D molecules.
The output from the CHOMP floe is a BROOD Database collection that can be used as input for the BROOD floe. The CHOMP floe also optionally produces a tarball database file that can be used to run the BROOD application on a local machine. Please ensure that the Output database name does not contain spaces; otherwise, the floe will fail.
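The naming constraint above can be checked before a job is submitted. The following is a minimal sketch, not part of the floe: the helper name `check_db_name` is hypothetical, and it is stricter than the stated rule in that it rejects any whitespace (tabs and newlines as well as spaces).

```python
def check_db_name(name):
    """Return the name unchanged, or raise ValueError if it contains whitespace.

    Hypothetical helper: the floe documentation only says the Output
    database name must not contain spaces; rejecting all whitespace is
    an assumption made here for safety.
    """
    if any(ch.isspace() for ch in name):
        raise ValueError(f"Output database name must not contain spaces: {name!r}")
    return name
```

For example, `check_db_name("brood_database")` returns the name, while `check_db_name("brood database")` raises `ValueError`.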
The CHOMP floe requires machines with large amounts of memory and disk space at different stages, depending on the input. The default values for these parameters have been set to handle up to ~1 million drug-like molecules as input. For larger jobs, these cube parameters need to be scaled up. It is recommended that you adjust the control parameters according to the guidelines below before starting a job.
Chunk Size: Set the chunk size so the number of chunks is ~250. For example, for ~1 million drug-like molecules, the suggested chunk size is the default value of 4000.
Memory (MiB) (Chomp Fragments): Scale the default memory value by the same factor as the chunk size. For example, if the default chunk size is doubled, multiply the default memory by 2.
Memory (MiB) (Chomp Builder): Set this value to ~0.002 times the number of fragments.
Memory (MiB) (Chomp DB Generator): Set this value to ~0.01 times the number of fragments.
Temporary Disk Space (MiB) (Chomp DB Generator): Set this value to ~0.01 times the number of fragments.
Please note that these guidelines are approximate, and the specific values may differ for each input. Increasing the memory and disk space values by 10% is recommended to provide a margin of safety.
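The guidelines above amount to a few lines of arithmetic. The sketch below is illustrative only: the function name and dictionary keys are hypothetical labels rather than promoted parameter names, the number of fragments is an estimate the user must supply, and the default memory of the Chomp Fragments cube is passed in as an argument because it is not listed on this page. The 10% safety margin is applied to every memory and disk value.

```python
import math


def suggest_chomp_settings(n_molecules, n_fragments, default_fragments_memory_mib,
                           default_chunk_size=4000, target_chunks=250, margin=0.10):
    """Apply the approximate sizing guidelines; all results are rough estimates.

    n_fragments is the user's own estimate of how many fragments will be
    produced; default_fragments_memory_mib is the current default memory
    of the Chomp Fragments cube (an input here, not a documented value).
    """
    # Aim for roughly `target_chunks` chunks in total.
    chunk_size = math.ceil(n_molecules / target_chunks)
    # Memory for Chomp Fragments scales with the change in chunk size.
    scale = chunk_size / default_chunk_size
    pad = 1.0 + margin  # safety margin on memory and disk values
    return {
        "chunk_size": chunk_size,
        "memory_fragments_mib": default_fragments_memory_mib * scale * pad,
        "memory_builder_mib": 0.002 * n_fragments * pad,
        "memory_db_generator_mib": 0.01 * n_fragments * pad,
        "disk_db_generator_mib": 0.01 * n_fragments * pad,
    }
```

With 1 million input molecules the suggested chunk size is 4000, matching the default; the other values depend entirely on the fragment estimate supplied.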
Promoted Parameters
Title in user interface (promoted name)
Input parameters
Input dataset (in_dataset): Input dataset containing molecules or user fragments.
Type: data_source
Input file (in_file): Input file containing molecules or user fragments.
Type: file_in
Input collection (in_collection): Input collection containing molecules or user fragments.
Type: collection_source
Output parameters
Brood Fragments DB Collection (out_collection): Output collection containing fragments database.
Required
Type: collection_sink
Default: BROOD Fragments DB collection
Save BROOD Database Tarfile (save_db_file): Boolean flag indicating whether to save the BROOD database tarfile.
Type: boolean
Default: False
Choices: [True, False]
Output database name (out_db): Output BROOD database name.
Required
Type: file_out
Default: brood_database
Write 2D Fragments output dataset (write_2d_frags): Whether to write the 2D Fragments output dataset.
Required
Type: boolean
Default: False
Choices: [True, False]
Output 2D Dataset (out_2d): Output dataset of 2D Fragments
Required
Type: dataset_out
Default: Output of CHOMP - 2D Fragments
Failed Dataset (failed): Output dataset of failed calculations.
Required
Type: dataset_out
Default: Failed Output for CHOMP - Generate BROOD Fragment Database
Fragment generation and filtering parameters
SMARTS (smarts): SMARTS definition for bonds to break
Type: string
Default: all
Choices: [‘recap’, ‘rlf’, ‘both’, ‘all’]
Custom SMARTS File (smarts_file): Custom SMARTS file with definitions for bonds to break
Type: file_in
Filter (filter): Flag indicating whether the fragment filter is applied
Type: boolean
Default: True
Choices: [True, False]
Custom Filter File (filter_file): Custom Filter file for fragments filtering
Type: file_in
Maximum Heavy (max_heavy): Maximum number of heavy atoms per fragment
Type: integer
Default: 15
Heavy Frags Min Frequency (minFrequency): Minimum number of source molecules a fragment must be contained in
Type: integer
Default: 0
Heavy Fragment Size (minFreqHeavy): Minimum number of heavy atoms per fragment, beyond which the minimum frequency is applicable
Type: integer
Default: 9
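One way to read how the three size and frequency parameters combine is sketched below. This is an interpretation of the parameter descriptions, not CHOMP's actual implementation: the function name is hypothetical, and treating "beyond" as strictly greater than the Heavy Fragment Size is an assumption.

```python
def keeps_fragment(heavy_atoms, source_count,
                   max_heavy=15, min_frequency=0, min_freq_heavy=9):
    """Return True if a fragment passes the size and frequency limits.

    Assumed reading: fragments above max_heavy are always dropped, and
    the minimum frequency applies only to fragments with more than
    min_freq_heavy heavy atoms (strict '>' is an assumption).
    """
    if heavy_atoms > max_heavy:
        return False
    if heavy_atoms > min_freq_heavy and source_count < min_frequency:
        return False
    return True
```

Under this reading, the default Heavy Frags Min Frequency of 0 means no fragment is rejected on frequency grounds; only Maximum Heavy limits fragment size.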
Control parameters
Chunk Size (chunk_size): The chunk size for splitting records.
Type: integer
Default: 4000
Memory (MiB) (memory_builder): The minimum amount of memory in MiBs (1048576 B) this cube requires. Due to overhead, request a couple hundred MiB more than required.
Type: decimal
Default: 3686.4
Memory (MiB) (memory_generator): The minimum amount of memory in MiBs (1048576 B) this cube requires. Due to overhead, request a couple hundred MiB more than required.
Type: decimal
Default: 14745.6
Memory (MiB) (memory_merger): The minimum amount of memory in MiBs (1048576 B) this cube requires. Due to overhead, request a couple hundred MiB more than required.
Type: decimal
Default: 58982.4
Temporary Disk Space (MiB) (disk_space): The minimum amount of disk space in MiB (1048576 B) this cube requires. Due to overhead, request a couple hundred MiB more than required.
Type: decimal
Default: 58982.4