Parameters

Execution Parameters

-param

This flag specifies a parameter file that contains all of the command-line parameters in a simple text file. The parameter file is automatically written to “-prefix.param” with every execution. It is a record of the input that was used and can be used to rerun the exact same process. It can also be altered by hand to modify a prior execution. If a parameter is set in the param file and on the command line, the command line setting takes precedence. More information is available in the section on the parameter files below.
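
The precedence rule can be illustrated with a minimal Python sketch. The two dictionaries stand in for settings that have already been read from a parameter file and from the command line; the function name and the example values are hypothetical, and the sketch makes no attempt to parse the actual parameter file format.

```python
def effective_settings(param_file, command_line):
    """Merge settings so that command-line values override param-file values."""
    merged = dict(param_file)      # start from the parameter file
    merged.update(command_line)    # command-line settings take precedence
    return merged


# Hypothetical values, for illustration only.
from_file = {"-prefix": "run1", "-maxHit": "1000"}
from_cli = {"-maxHit": "200"}
print(effective_settings(from_file, from_cli))
```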

-mpi_np

This flag invokes BROOD with multi-processing on the current machine. We recommend specifying at most one more process than the number of cores available on your machine.
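
As a rough guide, the recommended upper limit can be computed from the core count. The helper below is a sketch, not part of BROOD; its name is hypothetical, and it assumes that the count reported by Python's os.cpu_count() is a reasonable stand-in for the number of available cores.

```python
import os


def max_recommended_processes(cores=None):
    """Return the recommended upper limit for -mpi_np: core count plus one."""
    if cores is None:
        cores = os.cpu_count() or 1  # os.cpu_count() may return None
    return cores + 1


print(max_recommended_processes())
```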

-mpi_hostfile

This flag invokes BROOD with multi-processing on the current machine or on multiple machines. Use this flag to specify a file that lists the machines on which to run the MPI master and slaves and the number of processes to run on each machine. For additional information and the format of the file, please see the section on MPI.

Input Parameters

-queryMol (-qMol)

Molecular file encapsulating the query. It contains the primary molecule of interest, the query fragment, description of the query potential function and constraints, and the protein structures for the active-site and for selectivity.

-queryFragment (-queryFrag)

File containing the molecular connection table for the fragment to be replaced in the query molecule. If you use the graphical interface to generate a query, the query fragment is specified directly in the -queryMol file, and this parameter is not required. This parameter is maintained for backwards compatibility.

-db (-database)

Specifies the directory containing the BROOD database files. Additional details on the files contained in the directory can be found in the section on the Fragment Database.

-prot (-protein)

Specifies an optional protein file. If a protein is given, after fragments and analogs are overlaid on the query fragment or molecule, they will be checked for clashes with the protein. Any analogs with clashes will be eliminated. Thus, if you have a co-crystal structure of your ligand of interest, this flag will allow BROOD to identify only analogs that can fit in the active site.

-select (-selection)

Specifies an optional protein file for selectivity. If a selection protein is given, after fragments and analogs are overlaid on the query fragment, they will be checked for clashes with this protein. As opposed to the active-site protein, new analogs are required to have an atom within a specified radius of this protein’s atoms. This is a simplistic screen for selectivity.

-noqueryprot (-noqueryprotein)

When a query is created with the GUI and an active-site protein is included, the active-site protein becomes part of the query. This flag makes BROOD ignore the active-site protein even though it is part of the query. The flag makes it easy to run the BROOD job with and without the active-site protein (see also brood_1.removed.oeb.gz).

-noqueryselectprot (-noqueryselectionprotein, -noqueryselection)

When a query is created with the GUI and a selection protein is included, the selection protein becomes part of the query. This flag makes BROOD ignore the selection protein even though it is part of the query. The flag makes it easy to run the BROOD job with and without the selection protein (see also brood_1.removed.oeb.gz).

-cpddb

The user can specify a file of available compounds (either commercial or internal) with this flag. If specified, molecules from this file that are similar to any created analogs will be used to annotate the newly suggested molecule. When possible, this will allow a chemist to consider available molecules for purchase rather than the synthetic effort required by the analog. This flag could also be used to specify known examples of inhibitors. In this manner, it could be used to annotate new analogs with competitor compounds which represent areas of IP space that the user would rather avoid.

Output Parameters

-prefix

This string flag determines the prefix of the info, log, report, param and output files. For instance, if -prefix is set to foo, then the output files will include foo.info, foo.log, foo.report, and foo.param. [default = brood]

-dots

When this flag is set, the program will write a single dot (.) to the terminal (stdout/cout) for every 500 fragments that are processed. [default = false]

-log

This flag can be used to override the -prefix flag to specify the filename of the log file. The log file contains general information concerning the program’s execution including a duplicate of the splash screen, all the parameters used in the execution, all the warnings and errors generated during the execution and a summary of the run.

-info

This flag can be used to override the -prefix flag to specify the filename of the info file. The info file contains running totals of the progress of the run. Examining the info file is the best means of checking on the progress of an execution.

-report

This flag can be used to override the -prefix flag to specify the filename of the report file. This file contains a 1-line per molecule encapsulation of the entire calculation. This data file is ready for import into a spreadsheet program for easy examination. The table contains complete entries for all of the molecules in the input file regardless of whether or not they are in a hitlist.
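
The naming scheme described for -prefix, -log, -info and -report can be summarized in a short Python sketch. The function is hypothetical and only mirrors the rules stated above: each file defaults to the prefix plus an extension, and an explicit -log, -info or -report value replaces the corresponding default.

```python
def output_filenames(prefix="brood", log=None, info=None, report=None):
    """Return the info, log, report and param filenames for a run."""
    names = {ext: f"{prefix}.{ext}" for ext in ("info", "log", "report", "param")}
    # An explicit -log, -info or -report value overrides the prefix-based name.
    for key, value in {"log": log, "info": info, "report": report}.items():
        if value is not None:
            names[key] = value
    return names


print(output_filenames("foo"))
print(output_filenames("foo", log="custom.log"))
```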

-format

This flag determines the file format (vide infra) of the molecular output of the hitlist. The preceding “.” is optional (e.g. both “.sdf” and “sdf” will work). The hitlist molecules are annotated with scores, a wide array of molecular properties, and molecular annotation (source data, etc.). By default these are represented in an OpenEye binary format for efficient communication with visualization tools. Whenever possible, they are also annotated as SD data so they can be preserved with third-party visualizers. [default = oeb.gz]

-csv

If set, this parameter causes BROOD to write a .csv file containing a comma-separated hitlist, with one line per analog molecule. Each line of the file will contain the molecule specified with SMILES and all of the associated scores and physical property data. This file is designed to be easily imported into a number of spreadsheets for detailed examination, particularly if you are not able to use the spreadsheet built into VIDA. This parameter was formerly known as the -txt parameter, and the associated output was tab-separated rather than comma-separated. [default = true]

-idea (-cluster)

This flag determines if the hitlist will be organized according to a reduced graph hierarchy. This is quite useful for clustering similar analogs and thus allowing a user to quickly scan the different analog families and drill-down into the most interesting clusters without needing to examine hundreds of analogs. [default = true]

-neutralpH

This flag controls whether the final molecules are set to an ionization state suitable at pH=7.4. Because new bonds are formed in the process of generating analogs, functional groups sometimes need adjustment even if their state was properly normalized prior to fragmentation. This flag allows the user to turn off pKa normalization. [default = true]

-tautomer

Similar to -neutralpH, adjustment of tautomer states can be necessary after joining the new fragments with the static portions of the original molecule. This flag allows the user to turn off this normalization. [default = true]

-hitlistProt

In cases where protein structures play an integral role in the fragment selection (see -protein and -select), the protein is by default part of the hitlist in order to enable convenient visualization of the hitlist molecules in the active site. For some visualization environments, proteins are not appropriate. This flag allows the user to control whether the proteins appear in the hitlist. [default = true]

Control Parameters

-quickLook (-quick)

This Boolean flag allows a user to do a quick (~2 minute) BROOD search. BROOD is designed to identify as many interesting hits as soon as possible. Thus while a thorough search may take minutes to hours, useful preliminary results can be generated in one or two minutes with this option. [default = false]

-ringOnly (-ring)

This is a complex flag whose purpose is to allow the user to control the number of ring atoms in the selected fragments. It allows input of “true, false, none, or an integer 1 <= x <= 12”. This flag has to do with a count of the number of ring atoms in the shortest path between attachment points in a fragment. In cases with more than 2 attachment points, all shortest paths are calculated and the number of ring atoms is summed.

By default, the flag is set to “true” and requires at least 2 ring atoms in the path. If the flag is set to “false”, no ring-atom filtering is done, while if the flag is set to “none”, only fragments without ring atoms in the shortest paths are returned. If the flag is set to an integer from 1 to 12, then that will be the minimum number of ring atoms required. [default = true]
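
The four kinds of -ringOnly value can be illustrated with the following Python sketch. It is not BROOD's implementation: the function names are hypothetical, and the ring-atom counts for each shortest path between attachment points are assumed to have been computed elsewhere and passed in as a list.

```python
def ring_filter(value):
    """Translate a -ringOnly value into (minimum, maximum) ring-atom limits.

    None means there is no limit on that side.
    """
    text = str(value).strip().lower()
    if text == "true":
        return (2, None)       # default: at least 2 ring atoms
    if text == "false":
        return (None, None)    # no ring-atom filtering
    if text == "none":
        return (0, 0)          # only fragments without ring atoms in the paths
    count = int(text)          # raises ValueError for any other string
    if not 1 <= count <= 12:
        raise ValueError("integer value must be between 1 and 12")
    return (count, None)


def passes_ring_filter(ring_atoms_per_path, value="true"):
    """Sum the ring atoms over all shortest paths and apply the limits."""
    total = sum(ring_atoms_per_path)
    low, high = ring_filter(value)
    if low is not None and total < low:
        return False
    if high is not None and total > high:
        return False
    return True


print(passes_ring_filter([4, 2], 6))    # two shortest paths, 6 ring atoms in total
print(passes_ring_filter([3], "none"))  # rejected: path contains ring atoms
```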

-ET (-et)

This Boolean flag determines whether or not the electrostatic Tanimoto similarity is calculated. If true, electrostatic similarity is calculated on all of the fragments that also have a good shape score. [default = false]

-linkOnly (-linkonly, -struc, -struct)

If this flag is set, then the shape and chemistry of the fragment are ignored and ONLY the attachment point geometry (and constraints) are used to identify similar fragments. This is similar to Bartlett’s Caveat algorithm, one of the first algorithms in this genre. This type of search can be useful when trying to bridge one or more fragments without any prior knowledge, such as when linking two fragments in fragment-based design.

-sdtag

This flag indicates whether the scoring information will be attached to output molecules as SD tag data. The possible values are “false”, which indicates that no SD data will be attached, and “verbose”, which indicates that all of the sub-scores will be attached. For further details on the scoring labels, please see the section on the Report File (vide infra). This parameter will only work for .sdf or .oeb file formats. [default = verbose]

-checkBond (-checkbond)

If this flag is set to True, BROOD will check all bonds formed between new fragments and the rest of the query molecule and eliminate fragments that form bond types that are typically unstable. Rather than being discarded entirely, these hits will be written in the brood_1.removed.oeb.gz file by default. [default = true]

-maxLocalStrain

Maximum local strain allowed to fit analog molecule on top of query. [default = 6.5]

-maxHit (-maxhit)

This flag determines the number of compounds saved in the hitlist. Allowable range is 1-50000. The legal range of this parameter has been increased by an order-of-magnitude. [default = 1000]

-title

When this string parameter is set, the score of each fragment in the hitlist will be appended to the title of the molecule. The parameter passed as the value is the delimiter between the original title and the addendum. [default = “”]

-attachColor

When this flag is set, each molecule in the hitlist is annotated with the color atoms used in calculation of the chemical similarity. [default = false]

-attachFrag

This flag allows the user to control the annotation of each hitlist molecule with the SMILES of the replacement fragment (which makes up a portion of the molecule). In VIDA, this is displayed as a depiction in the spreadsheet associated with the row of the molecule. [default = true]

Advanced Parameters

-bondOrder (-bo)

When this Boolean flag is set, only fragments with the same attachment bond orders as the query fragment are allowed. [default = true]

-attachmentCutoff (-attachcut)

Minimum acceptable attachment point score cutoff. The cutoff is for the shape-overlap score of the beginning and ending atom of each attachment bond. The default was chosen empirically to assure all fragments in the hitlist would have sensible attachment geometries. This value does not affect runs when the -linkOnly flag is set. [default = 0.78]

-shapeCutoff

Indicates the minimum required shape Tanimoto score required for a fragment to appear in the color, elect or queryAnalog hitlists. This cutoff is useful for cases when few shape-similar fragments exist in the database being searched. [default = 0.6]

-attachmentScale (-attach, -aScale)

This floating point value determines the balance between the chemical color score and the attachment point scores. Higher values indicate more weighting for the attachment-point alignment. This parameter has complex effects; please use it with care. [default = 1.5]

-checkGeometry

By default, regardless of the attachment score (used primarily to drive the optimization), the fit of a fragment at the attachment points is controlled by geometric constraints. These require that the two attachment vectors that will be joined into a bond are overlapping and pointing in roughly opposite directions. In some cases, particularly when the required geometry is unusual or strained, it can be useful to turn off this constraint check. [default = true]

-fromCT

This flag is rendered somewhat less relevant by the graphical query editing.

This flag indicates that the 3D conformer of the query molecule should be generated from the molecular graph, independent of the conformer in the input file. If this flag is False, BROOD will attempt to read the query molecule as a 3D structure. If the input format is 2D in nature, BROOD will generate a 3D conformer for the query molecule regardless of this flag (i.e. a user can specify the query without a 3D structure). Please note that the database file should always contain 3D structures. [default = false]

-fileChrg

When this flag is true, the partial charges for electrostatic Tanimoto calculations are taken from the input files. If this flag is false, or the input file does not contain partial charges, MMFF charges will be used. The partial charges on the database molecules are pregenerated. [default = false]

-interval

This is the interval at which data is written to the info file. The info file contains running totals of the progress of the run. Examining the info file is the best means of checking on the progress of an execution. If this flag is 50, then the info file is re-written every 50 molecules. Early in searches, the file may be written more frequently. [default = 5000]

-hitinterval

This integer flag indicates the interval at which intermediate copies of the hitlists should be written to disk. This allows a user to examine preliminary results while the database search is still executing. If the value is set to 0, the intermediate files will not be written. Early in searches, the file may be written more frequently. [default = 3000]

-maxFrag

Maximum number of fragments to search. This is useful for quick testing of parameters. For rapid searching with good results, the -quickLook flag is generally significantly superior. If set to 0, then there is no limit. [default = 0]

-rangeSize

By default, only fragments whose number of heavy atoms is within +/- N of that of the fragment being replaced are considered. This flag sets N, the size of the heavy-atom range examined around the query fragment’s heavy-atom count. It should not be necessary to change this value except in unusual circumstances. [default = 6]

-rangeOffset

By default, the range of fragment heavy atoms specified by the -rangeSize flag is centered on the number of heavy atoms in the original query fragment. The user can instead pass a nonzero value (positive or negative) to this flag to shift the range and bias the search toward smaller or larger fragments. [default = 0]
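
The interplay of -rangeSize and -rangeOffset can be sketched as a simple window calculation. This is an illustration only: the function name is hypothetical, the bounds are assumed to be inclusive, and a positive offset is assumed to shift the window toward larger fragments, which the text above does not state explicitly.

```python
def heavy_atom_window(query_heavy_atoms, range_size=6, range_offset=0):
    """Return the (lowest, highest) heavy-atom counts considered in a search."""
    # Assumption: a positive offset moves the center toward larger fragments.
    center = query_heavy_atoms + range_offset
    lowest = max(center - range_size, 0)  # an atom count cannot be negative
    highest = center + range_size
    return lowest, highest


print(heavy_atom_window(10))                   # defaults: window centered on the query
print(heavy_atom_window(10, range_offset=2))   # window shifted by the offset
```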

-bumpRadius

This real number flag determines the distance below which a ligand heavy atom and a protein heavy atom in the final relaxed solution are considered to clash. New analogs with atoms closer than this cutoff to the active-site protein are discarded from the hitlist. If a selectivity protein is being used, at least one clashing atom must be found in order for the hitlist ligand to be retained. Because the molecular model of the protein is rigid, this cutoff may need to be shorter than in a system with a flexible protein model. [default = 2.25]
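
The clash logic for the two kinds of protein can be sketched in Python as follows. This is a simplified illustration, not BROOD's implementation: the function names are hypothetical, atoms are represented as plain (x, y, z) tuples of heavy-atom coordinates, and every ligand-protein atom pair is compared directly.

```python
import math


def has_clash(ligand_atoms, protein_atoms, bump_radius=2.25):
    """True if any ligand heavy atom lies closer than bump_radius to a protein heavy atom."""
    return any(
        math.dist(ligand_atom, protein_atom) < bump_radius
        for ligand_atom in ligand_atoms
        for protein_atom in protein_atoms
    )


def keep_analog(ligand_atoms, active_site=None, selectivity=None, bump_radius=2.25):
    """Apply the active-site and selectivity screens described above."""
    # Active-site protein: any clash discards the analog.
    if active_site is not None and has_clash(ligand_atoms, active_site, bump_radius):
        return False
    # Selectivity protein: at least one clash is required to retain the analog.
    if selectivity is not None and not has_clash(ligand_atoms, selectivity, bump_radius):
        return False
    return True


ligand = [(0.0, 0.0, 0.0)]
print(keep_analog(ligand, active_site=[(1.0, 0.0, 0.0)]))  # clashes with the active site
print(keep_analog(ligand, active_site=[(5.0, 0.0, 0.0)]))  # no clash with the active site
```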

-forcefield

Force-field for final analog optimization. The current choice is between OEMMFF94 and OEMMFF94s. In both cases, the Sheffield solvation model will be included in the potential.

Property selection

-property (-prop)

This Boolean flag indicates whether ANY of the molecular properties should be used to eliminate compounds from consideration. While these filters can be useful, if a user expects to see a particular fragment or analog and it does not appear in the hitlist, it will often have been eliminated by the property filters. The Report File will indicate which if any property filter affected each fragment. If set to false, all of the related flags below become irrelevant. [default = true]

-minMolWt (-minmolecularweight)
-maxMolWt (-maxmolecularweight)

These flags indicate the upper and lower bound of the molecular weight range of any analog. Analogs higher or lower in molecular weight than these cutoffs will be eliminated from consideration. [default = 100,500]

-minlogp
-maxlogp

These flags indicate the upper and lower bound of the calculated LogP range of any analog. Analogs higher or lower in LogP than these cutoffs will be eliminated from consideration. [default = -1.0,5.0]

-minpsa (-mintpsa)
-maxpsa (-maxtpsa)

These flags indicate the upper and lower bound of the topological polar surface area (TPSA) range of any analog. Analogs higher or lower in PSA than these cutoffs will be eliminated from consideration. [Clark-1999] [Ertl-2000]. [default = 60,150]

-minRotBond (-minRotatableBond)
-maxRotBond (-maxRotatableBond)

These flags indicate the upper and lower bound of the rotatable bond range of any analog. Analogs higher or lower in number of rotatable bonds than these cutoffs will be eliminated from consideration. [default = 0,13]

-minHvyAtom (-minHeavyAtom)
-maxHvyAtom (-maxHeavyAtom)

These flags indicate the upper and lower bound of the heavy atom range of any analog. Analogs higher or lower in number of heavy atoms than these cutoffs will be eliminated from consideration. [default = 7,35]

-minLipinskiDon (-minLipinskiDonors, -minDonors)
-maxLipinskiDon (-maxLipinskiDonors, -maxDonors)

These flags indicate the upper and lower bound of the Lipinski hydrogen-bond donor range of any analog. Analogs higher or lower in number of Lipinski hydrogen-bond donors than these cutoffs will be eliminated from consideration. For the purpose of this measure, h-bond donors are determined by the method of Lipinski [Lipinski-1997]. [default = 1,8]

-minLipinskiAcc (-minLipinskiAcceptors, -minAcceptors)
-maxLipinskiAcc (-maxLipinskiAcceptors, -maxAcceptors)

These flags indicate the upper and lower bound of the Lipinski hydrogen-bond acceptor range of any analog. Analogs higher or lower in number of Lipinski hydrogen-bond acceptors than these cutoffs will be eliminated from consideration. For the purpose of this measure, h-bond acceptors are determined by the method of Lipinski [Lipinski-1997]. [default = 2, 11]
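
The range filters in this section all follow the same pattern, which the sketch below illustrates using the default values listed above. The dictionary keys, the function name and the example property values are hypothetical, and the bounds are assumed to be inclusive, since only analogs higher or lower than the cutoffs are eliminated.

```python
# Default (min, max) ranges taken from the flag descriptions above.
DEFAULT_RANGES = {
    "MolWt": (100, 500),
    "logP": (-1.0, 5.0),
    "PSA": (60, 150),
    "RotBond": (0, 13),
    "HvyAtom": (7, 35),
    "LipinskiDon": (1, 8),
    "LipinskiAcc": (2, 11),
}


def failed_filters(properties, ranges=DEFAULT_RANGES):
    """Return the names of the properties that fall outside their allowed range."""
    failed = []
    for name, (low, high) in ranges.items():
        if name not in properties:
            continue  # property not supplied, so it is not checked
        value = properties[name]
        if value < low or value > high:
            failed.append(name)
    return failed


# Hypothetical analog, for illustration only.
analog = {"MolWt": 612.0, "logP": 2.1, "PSA": 40.0, "RotBond": 5,
          "HvyAtom": 25, "LipinskiDon": 2, "LipinskiAcc": 5}
print(failed_filters(analog))
```

Listing the failed properties by name mirrors the way the Report File indicates which property filter affected each fragment.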

Synthetic Properties

-minComplexity (-minComp)
-maxComplexity (-maxComp)

These flags indicate the upper and lower bound of the molecular complexity range of any analog. Analogs higher or lower in molecular complexity than these cutoffs will be eliminated from consideration. For more on molecular complexity, please see the theory section. [default = 0.0, 1.0]

-minFrequency (-minFreq)
-maxFrequency (-maxFreq)

These flags indicate the upper and lower bound of the frequency range of any analog. Analogs higher or lower in frequency than these cutoffs will be eliminated from consideration. For the purpose of this calculation, the frequency is a percentile number indicating the frequency of each fragment normalized relative to the frequency of fragments in CHEMBL. Frequency as assessed here is a measure of how common each fragment is among the source molecules. The most commonly occurring fragment would be in the 99th percentile, while the least commonly occurring fragments would be in the 1st percentile. [default = 0, 100]

Derived Property Selection

-minLipinski
-maxLipinski

These flags indicate the upper and lower bound of the Lipinski violations range of any analog. Analogs higher or lower in number of Lipinski violations than these cutoffs will be eliminated from consideration. [Lipinski-1997]. In Lipinski’s work, in order to segregate molecules that progressed through clinical trials, he determined that one violation was acceptable, but two were not. [default = 0, 1]
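
A violation count of this kind can be sketched as follows. The thresholds used here are the commonly cited rule-of-five values (molecular weight above 500, LogP above 5, more than 5 donors, more than 10 acceptors); this section does not state which thresholds BROOD applies, so treat them, and the function names, as assumptions.

```python
def lipinski_violations(mol_wt, logp, donors, acceptors):
    """Count rule-of-five violations using the commonly cited thresholds (assumed)."""
    checks = [
        mol_wt > 500,
        logp > 5,
        donors > 5,
        acceptors > 10,
    ]
    return sum(checks)


def passes_lipinski(mol_wt, logp, donors, acceptors, min_violations=0, max_violations=1):
    """Apply the -minLipinski and -maxLipinski range to the violation count."""
    count = lipinski_violations(mol_wt, logp, donors, acceptors)
    return min_violations <= count <= max_violations


print(lipinski_violations(550.0, 5.5, 2, 5))
print(passes_lipinski(550.0, 2.0, 2, 5))
```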

-minMartin (-minAbbott, -minABS)
-maxMartin (-maxAbbott, -maxABS)

These flags indicate the upper and lower bound of the Abbott Bioavailability Score (ABS) range of any analog. Analogs higher or lower in ABS than these cutoffs will be eliminated from consideration. This floating point ABS parameter (range 0.0-1.0) indicates the minimum allowable probability that F will be >10% in rats according to the QSAR model developed and published by Yvonne Martin [Martin-2005]. A value of 0.0 will allow all compounds to pass. [default = 0.2, 1.0]

-eganEgg (-egan)

This Boolean parameter determines whether analog compounds will be required to fulfill the “Egan egg” measure of bioavailability. This measure was published by Bill Egan while at Pharmacopeia [Egan-2000], and rejects compounds with a LogP > 5.88 or a PSA > 131.6. [default = true]

-veber (-gsk)

This Boolean parameter determines whether analog compounds will be required to fulfill the measure of bioavailability Veber published at GSK [Veber-2002]. His measure of bioavailability eliminates compounds with a PSA > 140 or more than 10 rotatable bonds. [default = false]
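
The Egan and Veber cutoffs quoted above translate directly into two small checks. The Python sketch below uses hypothetical function names and takes the LogP, PSA and rotatable bond values as inputs that have been calculated elsewhere.

```python
def passes_egan_egg(logp, psa):
    """Reject compounds with LogP > 5.88 or PSA > 131.6."""
    return not (logp > 5.88 or psa > 131.6)


def passes_veber(psa, rotatable_bonds):
    """Reject compounds with PSA > 140 or more than 10 rotatable bonds."""
    return not (psa > 140 or rotatable_bonds > 10)


print(passes_egan_egg(logp=3.2, psa=95.0))
print(passes_veber(psa=135.0, rotatable_bonds=12))
```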

-minFsp3C
-maxFsp3C

These flags indicate the upper and lower bound of the range of the fraction of carbons that are sp3 for any analog. Analogs higher or lower in the fraction of sp3 carbons than these cutoffs will be eliminated from consideration. There is some evidence that increasing the fraction of sp3 carbons in a series (escaping from “Flatland”) improves success in clinical trials [Lovering-2009]. [default = 0.3, 1.0]
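
The quantity being filtered is a simple ratio, sketched below in Python. The function names are hypothetical, the carbon counts are assumed to come from an external perception step, and returning 0.0 for a molecule without carbons is an assumption made only to avoid dividing by zero.

```python
def fraction_sp3_carbons(sp3_carbons, total_carbons):
    """Fraction of carbon atoms that are sp3 hybridized."""
    if total_carbons == 0:
        return 0.0  # assumption: no carbons means a fraction of zero
    return sp3_carbons / total_carbons


def passes_fsp3(sp3_carbons, total_carbons, minimum=0.3, maximum=1.0):
    """Apply the -minFsp3C and -maxFsp3C range (defaults from the text above)."""
    return minimum <= fraction_sp3_carbons(sp3_carbons, total_carbons) <= maximum


print(fraction_sp3_carbons(5, 20))
print(passes_fsp3(5, 20))
```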

-minAromFJCt
-maxAromFJCt

These flags indicate the upper and lower bound of the aromatic ring range of any analog. Analogs higher or lower in number of aromatic rings than these cutoffs will be eliminated from consideration. For the purpose of this calculation, the number of aromatic rings is #aromatic bonds - #aromatic atoms + 1. [default = 0, 5]
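
The counting formula can be checked on two familiar ring systems. In the Python sketch below the function name is hypothetical, the aromatic atom and bond counts are supplied by hand, and returning 0 for a molecule with no aromatic atoms is an assumption, since the formula alone would give 1 in that case.

```python
def aromatic_ring_count(aromatic_bonds, aromatic_atoms):
    """Aromatic ring count as #aromatic bonds - #aromatic atoms + 1."""
    if aromatic_atoms == 0:
        return 0  # assumption: no aromatic atoms means no aromatic rings
    return aromatic_bonds - aromatic_atoms + 1


# Benzene: 6 aromatic atoms and 6 aromatic bonds.
print(aromatic_ring_count(aromatic_bonds=6, aromatic_atoms=6))
# Naphthalene: 10 aromatic atoms and 11 aromatic bonds.
print(aromatic_ring_count(aromatic_bonds=11, aromatic_atoms=10))
```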