BROOD Command Line Usage

Getting Started

If you are just getting started with BROOD, we highly recommend you use the graphical interface. Even if you prefer to use the command line, walking through the graphical interface one time will give you a good overview of the workflow involved in running BROOD.

The basic idea behind BROOD is that you have a lead molecule and would like to generate analogs with somewhat different properties by changing a portion of the molecule. Normally, you will load the lead molecule into the GUI, then select and edit the fragment you would like to replace. The GUI will also allow you to edit the chemical description of the query as well as define required motifs (constraints).

The BROOD GUI allows you to specify all the control parameters accepted by the BROOD application. The tooltips in the GUI show which command-line argument each button or input controls. When the GUI executes BROOD, it is executing the command-line application. This guarantees that any run carried out in the GUI can also be carried out on the command line. At the beginning of each run, the GUI reports in its log window the command-line parameters used to initiate the BROOD search.

BROOD will search the database of fragments for fragments with similar shape, chemistry and electrostatics to your query fragment, substitute them for the query fragment in your query molecule and generate a hitlist of analog molecules. To carry out a search like this, launch the graphical interface from the command line:

prompt> vbrood

When using the GUI to generate a query, the GUI can be used to write the query file and the .param file (which contains all of the pertinent control parameters). These files are written to the working directory and can be used to subsequently execute the command line (if running from inside the GUI is not desirable).

prompt> brood -queryMol brood_1.query.oeb -param brood_1.param
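When many runs are launched from scripts, it can help to assemble the argument list programmatically. The following is a minimal sketch, not part of BROOD itself: the helper name build_brood_command is hypothetical, and only flags documented in this section are used. The command is printed rather than executed.

```python
import shlex


def build_brood_command(query_file, param_file=None, extra_flags=None):
    """Assemble a brood argument list without running it (hypothetical helper)."""
    command = ["brood", "-queryMol", query_file]
    if param_file is not None:
        command += ["-param", param_file]
    # extra_flags maps a documented flag name (e.g. "-prefix") to its value.
    for flag, value in (extra_flags or {}).items():
        command += [flag, str(value)]
    return command


command = build_brood_command(
    "brood_1.query.oeb", "brood_1.param", {"-prefix": "run2"}
)
# Show the command as it would be typed at the prompt.
print(shlex.join(command))
```

If the brood executable is on your PATH, the resulting list can be handed to subprocess.run(command, check=True).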

The output file brood.oeb can be viewed in VIDA. We recommend you use the VIDA extension that ships with the BROOD application. This extension allows you to simultaneously view the 3D structure of the analogs generated by BROOD and a variety of their physical properties. Further, it aids in the exploration of the clusters in the BROOD hitlist. If that is not possible, you should examine the SD data attached to each molecule in whatever viewer you deem best. In addition, a .txt file of all the molecular data is written that can be imported into most spreadsheet programs for detailed examination.

Command-Line Interface

Help

Executing BROOD with no arguments prints basic version, platform and license information.

prompt> brood

will generate the following output:

Brood 3.0.0.1, 20150619
OEChem version 2.0.1 , 20150619
Platform: osx-10.10-clang++6-x64
**OpenEye Scientific Software, Inc.**

Supported Run Modes:

       Single processor

Licensed for the exclusive use of OpenEye.
Licensed for use only in Worldwide.

A description of the command-line interface can be obtained by executing BROOD with the --help option. This lists help for the simple BROOD parameters as well as minor instructions for accessing additional help.

prompt> brood --help

will generate the following output:

Simple parameter list
  Execute Options
    -param : A parameter file

  Brood
    Input
      -queryFrag : Query fragment to use as search template (required)
      -queryMol : Query molecule for building analogs (not required)
      -db : Fragment database to search

    Control parameters
      -quickLook : Do a brief (~2 min) search and return a quick set of
                   results
      -ringOnly : Only select fragments with ring in attachment path
                  (-ringOnly parameter must be true, false, none or a number 1 <= x <= 12
                  indicating the minimum number of ring atoms required in the shortest
                  path between attachments).

    Property Selection
      -property : Filter fragments by property



Additional help functions:
  brood --help simple      : Get a list of simple parameters (as seen above)
  brood --help all         : Get a complete list of parameters
  brood --help defaults    : List the defaults for all parameters
  brood --help <parameter> : Get detailed help on a parameter
  brood --help html        : Create an html help file for this program
  brood --help versions    : List the toolkits and versions used in the application

Simple Help

To see only the most important command-line options, use --help simple.

prompt> brood --help simple

will generate the following output:

Simple parameter list
  Execute Options
    -param : A parameter file

  Brood
    Input
      -queryFrag : Query fragment to use as search template (required)
      -queryMol : Query molecule for building analogs (not required)
      -db : Fragment database to search

    Control parameters
      -quickLook : Do a brief (~2 min) search and return a quick set of
                   results
      -ringOnly : Only select fragments with ring in attachment path
                  (-ringOnly parameter must be true, false, none or a number 1 <= x <= 12
                  indicating the minimum number of ring atoms required in the shortest
                  path between attachments).

    Property Selection
      -property : Filter fragments by property

Complete Help

To see all of the command-line options, use --help all.

prompt> brood --help all

will generate the following output:

Complete parameter list
  Execute Options
    -param : A parameter file
    -mpi_np : Number of MPI processes to launch.
    -mpi_hostfile : Path to hostfile to be used for launching MPI processes.

  Brood
    Input
      -queryFrag : Query fragment to use as search template (required)
      -queryMol : Query molecule for building analogs (not required)
      -db : Fragment database to search
      -prot : Macro molecule for bump-check of fragments and build analogs
      -select : Macro molecule for required bump of fragments and build
                analogs
      -noqueryprot : Ignore the protein in a query
      -noqueryselectprot : Ignore the selection protein in a query
      -cpddb : Database of known compounds to identify available compounds
               similar to hits
      -param : Control parameter file

    Output
      -prefix : Prefix for generic output files
      -dots : Write a dot to the terminal for every 500 cpds processed
      -log : Write to specified log file (override -prefix)
      -info : Write to specified info file (override -prefix)
      -report : Write complete output in table form (override -prefix)
      -format : Molecular output format
      -csv : Generate comma separated hitlist for reading into spreadsheets
      -idea : Generate cluster information for hitlists
      -neutralpH : Apply neutral pH model to newly constructed analogs
      -tautomer : Apply reasonable tautomer model to newly constructed analogs
      -hitlistProt : Include protein(s) in hitlist (if applicable)

    Control parameters
      -quickLook : Do a brief (~2 min) search and return a quick set of
                   results
      -ringOnly : Only select fragments with ring in attachment path
                  (-ringOnly parameter must be true, false, none or a number 1 <= x <= 12
                  indicating the minimum number of ring atoms required in the shortest
                  path between attachments).
      -ET : Use electrostatic Tanimoto to generate the similarity hitlist
      -linkOnly : Identify linkers that mimic geometry attachment ONLY
                  (Caveat-like). Requires 2 or more attachment points.
      -sdTag : Add bioisostere scores as SD Tags (SDF and OEB only)
      -checkBond : Check for chemically stable attachment bonds
      -maxLocalStrain : Maximum local strain allowed to fit analog molecule on top of query.
      -maxHit : Size of hitlists (1-50000)
      -title : Add scores to molecule title with this delimiter
      -attachColor : Attach color potential surface to each molecule
      -attachFrag : Attach fragment SMILES as SD data to each molecule

    Advanced parameters
      -bondOrder : Require same attachment bond order
      -attachmentCutoff : Minimum acceptable attachment point Tanimoto
      -shapeCutoff : Minimum acceptable shape Tanimoto
      -attachmentScale : Scale factor weighting the importance of attachment
                         points
      -checkGeometry : Require minimal geometric difference at attachment
                       points
      -fromCT : Generate query conformer from the connection table.
      -fileChrg : Take partial charges from the input molecule.
      -interval : Update info file every N molecules
      -hitinterval : Write intermediate hitlist files every N molecules
      -maxFrag : Maximum number of fragments to search
      -rangeSize : Range of heavy atoms around query to examine
      -rangeOffset : Bias search toward smaller or larger fragments with - or
                     + values
      -bumpRadius : Protein-ligand bump radius
      -forcefield : Force-field for final analog optimization

    Property Selection
      -property : Filter fragments by property
      -maxMolWt : Require molecular weight less than this value
      -minMolWt : Require at least this molecular weight
      -maxlogp : Require LogP less than this value
      -minlogp : Require at least this LogP
      -maxpsa : Require PSA less than this value
      -minpsa : Require at least this PSA
      -maxRotBond : Require fewer rotatable bonds than this value
      -minRotBond : Require at least this many rotatable bonds
      -maxHvyAtom : Require fewer heavy atoms than this value
      -minHvyAtom : Require at least this many heavy atoms
      -maxLipinskiDon : Require fewer Lipinski donors than this value
      -minLipinskiDon : Require at least this many Lipinski donors
      -maxLipinskiAcc : Require fewer Lipinski acceptors than this value
      -minLipinskiAcc : Require at least this many Lipinski acceptors

    Synthetic Properties
      -maxComplexity : Maximum acceptable molecular complexity
      -minComplexity : Minimum acceptable molecular complexity
      -maxFreq : Maximum frequency of fragment in molecules as scaled
                 percentile
      -minFreq : Minimum frequency of fragment in molecules as scaled
                 percentile

    Derived Property Selection
      -maxLipinski : Maximum number of allowed Lipinski violations
      -minLipinski : Minimum number of allowed Lipinski violations
      -maxMartin : Maximum probability of F>10% in rats according to Martin's
                   QSAR model
      -minMartin : Minimum probability of F>10% in rats according to Martin's
                   QSAR model
      -eganEgg : Boolean for whether to require analogs to pass Egan
                 bioavailability model
      -veber : Boolean for whether to require analogs to pass Veber
               bioavailability model
      -maxFsp3C : Maximum allowed fraction of carbons that are sp3
      -minFsp3C : Minimum allowed fraction of carbons that are sp3
      -maxAromFJCt : Maximum number of aromatic rings
      -minAromFJCt : Minimum number of aromatic rings

Optional Parameters

Execution Parameters

-param

This flag specifies a parameter file that contains all of the command-line parameters in a simple text file. The parameter file is automatically written to “<prefix>.param” (where <prefix> is the value of -prefix) with every execution. It is a record of the input that was used and can be used to rerun the exact same process. It can also be altered by hand to modify a prior execution. If a parameter is set both in the param file and on the command line, the command-line setting takes precedence. More information is available in the section on the parameter files below.
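The precedence rule can be illustrated with a small sketch. It models settings as plain dictionaries and does not parse a real .param file, whose exact layout is not described here; the function name resolve_settings is hypothetical.

```python
def resolve_settings(param_file_settings, command_line_settings):
    """Merge settings so that command-line values override param-file values."""
    merged = dict(param_file_settings)    # start from the param file
    merged.update(command_line_settings)  # the command line wins on conflicts
    return merged


from_param_file = {"-maxHit": "1000", "-ET": "false"}
from_command_line = {"-maxHit": "5000"}

effective = resolve_settings(from_param_file, from_command_line)
print(effective)
```

Here -maxHit takes the command-line value, while -ET keeps the value recorded in the parameter file.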

-mpi_np

This flag invokes BROOD with multi-processing on the current machine. It is recommended that the number of processes not exceed the number of cores available on your machine plus one.
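As a rough guide, the recommended upper limit can be computed from the core count reported by the operating system. This is only a sketch of the recommendation above; the helper name max_recommended_mpi_np is hypothetical, and os.cpu_count() may report logical rather than physical cores.

```python
import os


def max_recommended_mpi_np(core_count):
    """Upper limit suggested above: one more than the number of cores."""
    return core_count + 1


cores = os.cpu_count() or 1  # cpu_count() can return None on some platforms
print(f"-mpi_np {max_recommended_mpi_np(cores)}")
```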

-mpi_hostfile

This flag invokes BROOD with multi-processing on the current machine or on multiple machines. Use this flag to specify a file that indicates the machines on which to run the MPI master and slaves and the number of processes to run on each machine. For additional information and the format of the file, please see the section on MPI.

Input Parameters

-queryMol (-qMol)

Molecular file encapsulating the query. It contains the primary molecule of interest, the query fragment, description of the query potential function and constraints, and the protein structures for the active-site and for selectivity.

-queryFragment (-queryFrag)

File containing the molecular connection table for the fragment to be replaced in the query molecule. If you use the graphical interface to generate a query, the query fragment will be specified directly in the -queryMol file, and this parameter is not required. This parameter is maintained for backwards compatibility.

-db (-database)

Specifies the directory containing the BROOD database files. Additional details on the files contained in the directory can be found in the section on the Fragment Database as well as in the section on Database Preparation.

-prot (-protein)

Specifies an optional protein file. If a protein is given, after fragments and analogs are overlaid on the query fragment or molecule, they will be checked for clashes with the protein. Any analogs with clashes will be eliminated. Thus, if you have a co-crystal structure of your ligand of interest, this flag restricts BROOD to analogs that can fit in the active site.

-select (-selection)

Specifies an optional protein file for selectivity. If a selection protein is given, after fragments and analogs are overlaid on the query fragment, they will be checked for clashes with this protein. In contrast to the active-site protein, new analogs are required to have an atom within a specified radius of this protein’s atoms. This is a simplistic screen for selectivity.

-noqueryprot (-noqueryprotein)

When a query is created with the GUI and an active-site protein is included, the active-site protein becomes part of the query. This flag makes BROOD ignore the active-site protein even though it is part of the query. The flag makes it easy to run the BROOD job with and without the active-site protein (see also brood_1.removed.oeb.gz).

-noqueryselectprot (-noqueryselectionprotein, -noqueryselection)

When a query is created with the GUI and a selection protein is included, the selection protein becomes part of the query. This flag makes BROOD ignore the selection protein even though it is part of the query. The flag makes it easy to run the BROOD job with and without the selection (see also brood_1.removed.oeb.gz).

-cpddb

The user can specify a file of available compounds (either commercial or internal) with this flag. If specified, molecules from this file that are similar to any created analogs will be used to annotate the newly suggested molecule. When possible, this will allow a chemist to consider available molecules for purchase rather than the synthetic effort required by the analog. This flag could also be used to specify known examples of inhibitors. In this manner, it could be used to annotate new analogs with competitor compounds which represent areas of IP space that the user would rather avoid.

Output Parameters

-prefix

This string flag determines the prefix of the info, log, report, param and output files. For instance, if -prefix is set to foo, then the output files will include foo.info, foo.log, foo.report, and foo.param. [default = brood]

-dots

When this flag is set, the program will write a single dot (.) to the terminal (stdout/cout) for every 500 fragments that are processed. [default = false]

-log

This flag can be used to override the -prefix flag to specify the filename of the log file. The log file contains general information concerning the program’s execution including a duplicate of the splash screen, all the parameters used in the execution, all the warnings and errors generated during the execution and a summary of the run.

-info

This flag can be used to override the -prefix flag to specify the filename of the info file. The info file contains running totals of the progress of the run. Examining the info file is the best means of checking on the progress of an execution.

-report

This flag can be used to override the -prefix flag to specify the filename of the report file. This file contains a 1-line per molecule encapsulation of the entire calculation. This data file is ready for import into a spreadsheet program for easy examination. The table contains complete entries for all of the molecules in the input file regardless of whether or not they are in a hitlist.
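The naming rules for -prefix and its -log, -info and -report overrides can be sketched as follows. The function output_files is a hypothetical illustration of the behavior described above, not BROOD code, and it covers only the four files named under -prefix.

```python
def output_files(prefix="brood", log=None, info=None, report=None):
    """Return the expected info, log, report and param file names."""
    # Every generic output file starts out as <prefix>.<kind>.
    files = {kind: f"{prefix}.{kind}" for kind in ("info", "log", "report", "param")}
    # -log, -info and -report replace the prefix-derived name when given.
    for kind, name in {"log": log, "info": info, "report": report}.items():
        if name is not None:
            files[kind] = name
    return files


print(output_files("foo"))
print(output_files("foo", log="run42.log"))
```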

-format

This flag determines the file format (vide infra) of the molecular output of the hitlist. The preceding “.” is optional (e.g. both “.sdf” and “sdf” will work). The hitlist molecules are annotated with scores, a wide array of molecular properties, and molecular annotation (source data etc.). By default these are represented in an OpenEye binary format for efficient communication with visualization tools. Whenever possible, they are also annotated as SD data so they can be preserved with third-party visualizers. [default = oeb.gz]

-csv

If set, this parameter causes BROOD to write a .csv file containing a comma-separated hitlist, with one line per analog molecule. Each line of the file will contain the molecule specified as SMILES and all of the associated scores and physical property data. This file is designed to be easily imported into a number of spreadsheets for detailed examination, particularly if you are not able to use the spreadsheet built into VIDA. This parameter was formerly known as the -txt parameter, and the associated output was tab-separated columns rather than comma-separated. [default = true]

-idea (-cluster)

This flag determines if the hitlist will be organized according to a reduced graph hierarchy. This is quite useful for clustering similar analogs and thus allowing a user to quickly scan the different analog families and drill-down into the most interesting clusters without needing to examine hundreds of analogs. [default = true]

-neutralpH

This flag controls whether the final molecules are set to an ionization state suitable at pH=7.4. Because new bonds are formed in the process of generating analogs, functional groups sometimes need adjustment even if their state was properly normalized prior to fragmentation. This flag allows the user to turn off pKa normalization. [default = true]

-tautomer

Similar to -neutralpH, adjustment of tautomer states can be necessary after joining the new fragments with the static portions of the original molecule. This flag allows the user to turn off this normalization. [default = true]

-hitlistProt

In cases where protein structures play an integral role in the fragment selection (see -prot and -select), the protein is by default included in the hitlist to enable convenient visualization of the hitlist molecules in the active site. For some visualization environments, proteins are not appropriate. This flag allows the user to control whether the proteins appear in the hitlist. [default = true]

Control Parameters

-quickLook (-quick)

This Boolean flag allows a user to do a quick (~2 minute) BROOD search. BROOD is designed to identify as many interesting hits as soon as possible. Thus while a thorough search may take minutes to hours, useful preliminary results can be generated in one or two minutes with this option. [default = false]

-ringOnly (-ring)

This is a complex flag whose purpose is to allow the user to control the number of ring atoms in the selected fragments. It allows input of “true, false, none, or an integer 1 <= x <= 12”. This flag has to do with a count of the number of ring atoms in the shortest path between attachment points in a fragment. In cases with more than 2 attachment points, all shortest paths are calculated and the number of ring atoms is summed.

By default, the flag is set to “true” and requires at least 2 ring atoms in the path. If the flag is set to “false”, no ring-atom filtering is done, while if the flag is set to “none”, only fragments without ring atoms in the shortest paths are returned. If the flag is set to a number from 1 to 12, then that will be the minimum number of ring atoms required. [default = true]
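The four kinds of -ringOnly value can be summarized in a short sketch. The function names ring_atom_bounds and passes_ring_filter are hypothetical, and the code only mirrors the description above (including the minimum of 2 ring atoms for “true”); it is not BROOD's implementation.

```python
def ring_atom_bounds(ring_only):
    """Translate a -ringOnly value into (minimum, maximum) ring-atom counts.

    A maximum of None means there is no upper limit.
    """
    value = str(ring_only).strip().lower()
    if value == "true":
        return (2, None)  # default behavior as described above
    if value == "false":
        return (0, None)  # no ring-atom filtering
    if value == "none":
        return (0, 0)     # only fragments with no ring atoms in the paths
    number = int(value)   # raises ValueError for any other text
    if not 1 <= number <= 12:
        raise ValueError("-ringOnly number must satisfy 1 <= x <= 12")
    return (number, None)


def passes_ring_filter(ring_atoms_in_paths, ring_only="true"):
    """Check a fragment's summed ring-atom count against the setting."""
    low, high = ring_atom_bounds(ring_only)
    if ring_atoms_in_paths < low:
        return False
    return high is None or ring_atoms_in_paths <= high


print(passes_ring_filter(3, "true"))
print(passes_ring_filter(1, "none"))
print(passes_ring_filter(4, "5"))
```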

-ET (-et)

This Boolean flag determines whether or not the electrostatic Tanimoto similarity is calculated. If true, electrostatic similarity is calculated on all of the fragments that also have a good shape score. [default = false]

-linkOnly (-linkonly, -struc, -struct)

If this flag is set, then the shape and chemistry of the fragment are ignored and ONLY the attachment point geometry (and constraints) are used to identify similar fragments. This is similar to Bartlett’s Caveat algorithm, one of the first algorithms in this genre. This type of search can be useful when trying to bridge one or more fragments without any prior knowledge, such as when linking two fragments in fragment-based design.

-sdTag

This flag indicates whether the scoring information will be attached to output molecules as SD tag data. The possible values are “false”, which indicates that no SD data will be attached, and “verbose”, which indicates that all of the sub-scores will be attached. For further details on the scoring labels, please see the section on the Report File (vide infra). This parameter will only work for .sdf or .oeb file formats. [default = verbose]

-checkBond (-checkbond)

If this flag is set to True, BROOD will check all bonds formed between new fragments and the rest of the query molecule and eliminate fragments that form bond types that are typically unstable. Rather than being discarded entirely, these hits will be written in the brood_1.removed.oeb.gz file by default. [default = true]

-maxLocalStrain

Maximum local strain allowed to fit analog molecule on top of query. [default = 6.5]

-maxHit (-maxhit)

This flag determines the number of compounds saved in the hitlist. Allowable range is 1-50000. The legal range of this parameter has been increased by an order-of-magnitude. [default = 1000]

-title

When this string parameter is set, the score of each fragment in the hitlist will be appended to the title of the molecule. The parameter passed as the value is the delimiter between the original title and the addendum. [default = “”]

-attachColor

When this flag is set, each molecule in the hitlist is annotated with the color atoms used in calculation of the chemical similarity. [default = false]

-attachFrag

This flag allows the user to control the annotation of each hitlist molecule with the SMILES of the fragment replacement (which makes up a portion). In VIDA, this is displayed as a depiction in the spreadsheet associated with the row of the molecule. [default = true]

Advanced Parameters

-bondOrder (-bo)

When this Boolean flag is set, only fragments with the same attachment bond orders as the query fragment are allowed. [default = true]

-attachmentCutoff (-attachcut)

Minimum acceptable attachment point score cutoff. The cutoff is for the shape-overlap score of the beginning and ending atom of each attachment bond. The default was chosen empirically to assure all fragments in the hitlist would have sensible attachment geometries. This value does not affect runs when the -linkOnly flag is set. [default = 0.78]

-shapeCutoff

Indicates the minimum shape Tanimoto score required for a fragment to appear in the color, elect or queryAnalog hitlists. This cutoff is useful for cases when few shape-similar fragments exist in the database being searched. [default = 0.6]

-attachmentScale (-attach, -aScale)

This floating point value determines the balance between the chemical color score and the attachment point scores. Higher values indicate more weighting for the attachment-point alignment. This parameter has complex effects; please use it with care. [default = 1.5]

-checkGeometry

By default, regardless of the attachment score (used primarily to drive the optimization), the fit of a fragment at the attachment points is controlled by geometric constraints. These require that the two attachment vectors that will be joined into a bond are overlapping and pointing in roughly opposite directions. In some cases, particularly when the required geometry is unusual or strained, it can be useful to turn off this constraint check. [default = true]

-fromCT

This flag is rendered somewhat less relevant by the graphical query editing.

This flag indicates that the 3D conformer of the query molecule should be generated from the molecular graph, independent of the conformer in the input file. If this flag is False, BROOD will attempt to read the query molecule as a 3D structure. If the input format is 2D in nature, BROOD will generate a 3D conformer for the query molecule regardless of this flag (i.e. a user can specify the query without a 3D structure). Please note that the database file should always contain 3D structures. [default = false]

-fileChrg

When this flag is true, the partial charges for electrostatic Tanimoto calculations are taken from the input files. If this flag is false, or the input file does not contain partial charges, MMFF charges will be used. The partial charges on the database molecules are pregenerated. [default = false]

-interval

This is the interval at which data is written to the info file. The info file contains running totals of the progress of the run. Examining the info file is the best means of checking on the progress of an execution. If this flag is 50, then the info file is re-written every 50 molecules. Early in searches, the file may be written more frequently. [default = 5000]

-hitinterval

This integer flag indicates the interval at which intermediate copies of the hitlists should be written to disk. This allows a user to examine preliminary results while the database search is still executing. If the value is set to 0, the intermediate files will not be written. Early in searches, the file may be written more frequently. [default = 3000]

-maxFrag

Maximum number of fragments to search. This is useful for quick testing of parameters. For rapid searching with good results, the -quickLook flag is generally significantly superior. If set to 0, then there is no limit. [default = 0]

-rangeSize

By default, only fragments whose heavy-atom count is within +/- N of the fragment being replaced are considered. This flag sets N, the range of heavy atoms around the query fragment’s heavy-atom count to examine. It should not be necessary to change this value except in unusual circumstances. [default = 6]

-rangeOffset

By default, the range of fragment heavy atoms specified by the -rangeSize flag is centered on the number of heavy atoms in the original query fragment. Instead, the user can bias the search toward smaller or larger fragments by passing negative or positive values, respectively, to this flag. [default = 0]
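The interplay between -rangeSize and -rangeOffset can be sketched as a simple window calculation. This assumes the window is symmetric around the shifted center and inclusive at both ends, and it clamps the lower bound at one heavy atom; these details, and the name heavy_atom_window, are assumptions made for illustration.

```python
def heavy_atom_window(query_heavy_atoms, range_size=6, range_offset=0):
    """Return the (lowest, highest) fragment heavy-atom counts to examine."""
    center = query_heavy_atoms + range_offset  # the offset shifts the center
    low = max(center - range_size, 1)          # assumed: at least one heavy atom
    high = center + range_size
    return low, high


# A 10-heavy-atom query fragment with the defaults, then biased smaller.
print(heavy_atom_window(10))
print(heavy_atom_window(10, range_offset=-2))
```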

-bumpRadius

This real number flag sets the distance between ligand heavy atoms and protein heavy atoms, in the final relaxed solution, below which the atoms are considered to clash. New analogs with atoms closer than this cutoff to the active-site protein are discarded from the hitlist. If a selectivity protein is being used, at least one clashing atom must be found in order for the hitlist ligand to be retained. Because the molecular model of the protein is rigid, this cutoff may need to be shorter than in a system with a flexible protein model. [default = 2.25]
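The two uses of the bump radius can be illustrated with plain coordinate lists. This is a sketch of the logic described above, not BROOD's implementation: the function names are hypothetical, atoms are given as (x, y, z) tuples of heavy-atom positions, and a simple all-pairs distance check is assumed.

```python
import math


def closest_distance(ligand_atoms, protein_atoms):
    """Smallest distance between any ligand atom and any protein atom."""
    return min(math.dist(a, b) for a in ligand_atoms for b in protein_atoms)


def keep_analog(ligand_atoms, active_site=None, selectivity=None, bump_radius=2.25):
    """Apply the active-site and selectivity bump checks to one analog."""
    # Active-site protein: any clash discards the analog.
    if active_site and closest_distance(ligand_atoms, active_site) < bump_radius:
        return False
    # Selectivity protein: at least one clash is required to keep the analog.
    if selectivity and closest_distance(ligand_atoms, selectivity) >= bump_radius:
        return False
    return True


ligand = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
print(keep_analog(ligand, active_site=[(5.0, 0.0, 0.0)]))
print(keep_analog(ligand, active_site=[(3.0, 0.0, 0.0)]))
```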

-forcefield

Force-field for final analog optimization. The current choice is between OEMMFF94 and OEMMFF94s. In both cases, the Sheffield solvation model will be included in the potential.

Property selection

-property (-prop)

This Boolean flag indicates whether ANY of the molecular properties should be used to eliminate compounds from consideration. While these filters can be useful, if a user expects to see a particular fragment or analog and it does not appear in the hitlist, it will often have been eliminated by the property filters. The Report File will indicate which if any property filter affected each fragment. If set to false, all of the related flags below become irrelevant. [default = true]

-minMolWt (-minmolecularweight)
-maxMolWt (-maxmolecularweight)

These flags indicate the upper and lower bound of the molecular weight range of any analog. Analogs higher or lower in molecular weight than these cutoffs will be eliminated from consideration. [default = 100,500]

-minlogp
-maxlogp

These flags indicate the upper and lower bound of the calculated LogP range of any analog. Analogs higher or lower in LogP than these cutoffs will be eliminated from consideration. [default = -1.0,5.0]

-minpsa (-mintpsa)
-maxpsa (-maxtpsa)

These flags indicate the upper and lower bound of the topological polar surface area (TPSA) range of any analog [Clark-1999] [Ertl-2000]. Analogs higher or lower in PSA than these cutoffs will be eliminated from consideration. [default = 60,150]

-minRotBond (-minRotatableBond)
-maxRotBond (-maxRotatableBond)

These flags indicate the upper and lower bound of the rotatable bond range of any analog. Analogs higher or lower in number of rotatable bonds than these cutoffs will be eliminated from consideration. [default = 0,13]

-minHvyAtom (-minHeavyAtom)
-maxHvyAtom (-maxHeavyAtom)

These flags indicate the upper and lower bound of the heavy atom range of any analog. Analogs higher or lower in number of heavy atoms than these cutoffs will be eliminated from consideration. [default = 7,35]

-minLipinskiDon (-minLipinskiDonors, -minDonors)
-maxLipinskiDon (-maxLipinskiDonors, -maxDonors)

These flags indicate the upper and lower bound of the Lipinski hydrogen-bond donor range of any analog. Analogs higher or lower in number of Lipinski hydrogen-bond donors than these cutoffs will be eliminated from consideration. For the purpose of this measure, h-bond donors are determined by the method of Lipinski [Lipinski-1997]. [default = 1,8]

-minLipinskiAcc (-minLipinskiAcceptors, -minAcceptors)
-maxLipinskiAcc (-maxLipinskiAcceptors, -maxAcceptors)

These flags indicate the upper and lower bound of the Lipinski hydrogen-bond acceptor range of any analog. Analogs higher or lower in number of Lipinski hydrogen-bond acceptors than these cutoffs will be eliminated from consideration. For the purpose of this measure, h-bond acceptors are determined by the method of Lipinski [Lipinski-1997]. [default = 2, 11]
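The property ranges in this section can be thought of as a table of (minimum, maximum) pairs applied to each analog. The sketch below uses the documented defaults; the dictionary keys, the function name failed_filters and the example property values are made up for illustration. Following the wording of the --help output (“at least” for minimums, “less than” or “fewer than” for maximums), the minimum is treated as inclusive and the maximum as exclusive, which is an assumption about the exact boundary behavior.

```python
# Documented defaults, keyed by the stem shared by each -min/-max flag pair.
DEFAULT_RANGES = {
    "MolWt": (100, 500),
    "logp": (-1.0, 5.0),
    "psa": (60, 150),
    "RotBond": (0, 13),
    "HvyAtom": (7, 35),
    "LipinskiDon": (1, 8),
    "LipinskiAcc": (2, 11),
}


def failed_filters(properties, ranges=DEFAULT_RANGES, use_property_filters=True):
    """List the properties that would eliminate an analog."""
    if not use_property_filters:  # corresponds to -property false
        return []
    failed = []
    for name, (minimum, maximum) in ranges.items():
        value = properties[name]
        if not (minimum <= value < maximum):
            failed.append(name)
    return failed


# Hypothetical analog with a polar surface area below the default minimum.
analog = {"MolWt": 342.4, "logp": 3.1, "psa": 45.0, "RotBond": 5,
          "HvyAtom": 25, "LipinskiDon": 1, "LipinskiAcc": 4}
print(failed_filters(analog))
print(failed_filters(analog, use_property_filters=False))
```

An analog that is unexpectedly missing from a hitlist can be checked against such a table before consulting the Report File.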

Synthetic Properties

-minComplexity (-minComp)
-maxComplexity (-maxComp)

These flags indicate the upper and lower bound of the molecular complexity range of any analog. Analogs higher or lower in molecular complexity than these cutoffs will be eliminated from consideration. For more on molecular complexity, please see the theory section. [default = 0.0, 1.0]

-minFrequency (-minFreq)
-maxFrequency (-maxFreq)

These flags indicate the upper and lower bound of the frequency range of any analog. Analogs higher or lower in frequency than these cutoffs will be eliminated from consideration. For the purpose of this calculation, the frequency is a percentile number indicating the frequency of each fragment normalized relative to the frequency of fragments in ChEMBL. Frequency as assessed here is a measure of how common each fragment is among the source molecules. The most commonly occurring fragment would be in the 99th percentile, while the least commonly occurring fragments would be in the 1st percentile. [default = 0, 100]

Derived Property Selection

-minLipinski
-maxLipinski

These flags indicate the upper and lower bound of the Lipinski violations range of any analog [Lipinski-1997]. Analogs higher or lower in number of Lipinski violations than these cutoffs will be eliminated from consideration. In his work to segregate molecules that progressed through clinical trials, Lipinski determined that one violation was acceptable, but two were not. [default = 0, 1]
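As an illustration, violations can be counted with the commonly cited rule-of-five thresholds (molecular weight above 500, LogP above 5, more than 5 donors, more than 10 acceptors). Whether BROOD uses exactly these thresholds is an assumption, and the function names are hypothetical.

```python
def lipinski_violations(mol_wt, logp, donors, acceptors):
    """Count rule-of-five violations using the commonly cited thresholds."""
    checks = (mol_wt > 500, logp > 5, donors > 5, acceptors > 10)
    return sum(checks)  # each True counts as one violation


def passes_lipinski(violations, min_lipinski=0, max_lipinski=1):
    """Apply the -minLipinski/-maxLipinski range (defaults 0 and 1)."""
    return min_lipinski <= violations <= max_lipinski


count = lipinski_violations(mol_wt=650.0, logp=6.2, donors=2, acceptors=8)
print(count, passes_lipinski(count))
```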

-minMartin (-minAbbott, -minABS)
-maxMartin (-maxAbbott, -maxABS)

These flags indicate the upper and lower bound of the Abbott Bioavailability Score (ABS) range of any analog. Analogs higher or lower in ABS than these cutoffs will be eliminated from consideration. This floating point ABS parameter (range 0.0-1.0) indicates the minimum allowable probability that the bioavailability, F, will be >10% in rats according to the QSAR model developed and published by Yvonne Martin [Martin-2005]. A value of 0.0 will allow all compounds to pass. [default = 0.2, 1.0]

-eganEgg (-egan)

This Boolean parameter determines whether analog compounds will be required to fulfill the “Egan egg” measure of bioavailability. This measure was published by Bill Egan while at Pharmacopeia [Egan-2000], and rejects compounds with a LogP > 5.88 or a PSA > 131.6. [default = true]

-veber (-gsk)

This Boolean parameter determines whether analog compounds will be required to fulfill the measure of bioavailability Veber published at GSK [Veber-2002]. His measure of bioavailability eliminates compounds with a PSA > 140 or more than 10 rotatable bonds. [default = false]
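The two Boolean bioavailability filters above reduce to simple threshold tests. The following sketch uses only the cutoffs quoted in this section; the function names are hypothetical, and the LogP, PSA and rotatable-bond values are assumed to have been computed elsewhere.

```python
def passes_egan_egg(logp, psa):
    """Reject compounds with LogP > 5.88 or PSA > 131.6 (cutoffs as quoted above)."""
    return not (logp > 5.88 or psa > 131.6)


def passes_veber(psa, rotatable_bonds):
    """Reject compounds with PSA > 140 or more than 10 rotatable bonds."""
    return not (psa > 140.0 or rotatable_bonds > 10)


# A compound with PSA 135 fails the Egan egg but still passes the Veber measure.
print(passes_egan_egg(logp=3.0, psa=135.0))
print(passes_veber(psa=135.0, rotatable_bonds=7))
```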

-minFsp3C
-maxFsp3C

These flags indicate the upper and lower bound of the range, for any analog, of the fraction of carbons that are sp3. Analogs with a higher or lower fraction of sp3 carbons than these cutoffs will be eliminated from consideration. There is some evidence that increasing the fraction of carbons in a series that are sp3 (escaping from “Flatland”) improves success in clinical trials [Lovering-2009]. [default = 0.3, 1.0]
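As a rough illustration, the fraction can be computed from two carbon counts. This is a sketch only: the function name is hypothetical, and returning 0.0 for a molecule with no carbons is an assumption, since this section does not say how BROOD treats that case.

```python
def fraction_sp3_carbons(n_sp3_carbons, n_carbons):
    """Fraction of carbon atoms that are sp3 hybridized."""
    if n_carbons == 0:
        return 0.0  # assumption: no carbons is reported as 0.0
    return n_sp3_carbons / n_carbons


# Ethylbenzene has 8 carbons, 2 of them sp3, so it falls below the default minimum of 0.3.
fsp3 = fraction_sp3_carbons(2, 8)
print(fsp3, 0.3 <= fsp3 <= 1.0)
```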

-minAromFJCt
-maxAromFJCt

These flags indicate the upper and lower bound of the aromatic ring range of any analog. Analogs higher or lower in number of aromatic rings than these cutoffs will be eliminated from consideration. For the purpose of this calculation, the number of aromatic rings is #aromatic bonds - #aromatic atoms + 1. [default = 0, 5]
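The ring-count formula above can be checked by hand on small cases. The sketch below applies it as written; the function name is hypothetical, and the guard for molecules with no aromatic atoms is an assumption (the bare formula would give 1). The formula as stated applies to a single connected aromatic system; how BROOD treats several separate aromatic systems is not described here.

```python
def aromatic_ring_count(n_aromatic_bonds, n_aromatic_atoms):
    """#aromatic bonds - #aromatic atoms + 1, as stated in the text."""
    if n_aromatic_atoms == 0:
        return 0  # assumption: no aromatic atoms means no aromatic rings
    return n_aromatic_bonds - n_aromatic_atoms + 1


print(aromatic_ring_count(6, 6))    # benzene: 6 aromatic bonds, 6 aromatic atoms
print(aromatic_ring_count(11, 10))  # naphthalene: 11 aromatic bonds, 10 aromatic atoms
```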

Example Executions

This section has a series of example BROOD command-line executions. Most BROOD runs are carried out using vBROOD. Each example is followed by a brief description of its behavior.

If you would like to execute the following examples as written, the appropriate paths to the executable file and the database file must be included. In addition, the file amide.smi will need to be in the working directory. This can be accomplished with the following command:

prompt> echo "*C(=O)NC*" >> amide.smi

This file can now be used as the query for each case below.

prompt> brood -param 4dfr.param

This execution of BROOD will read all the command-line arguments from the file 4dfr.param. Every time BROOD is executed, a param file is generated that can be used to exactly reproduce the run (vide infra). This option is most useful when the job is set up in vBROOD and the query and param file are written out for later execution. Using the command detailed here with a query and param file from vBROOD will give the same results as running the search from within vBROOD.

prompt> brood -param 4dfr.param -ringOnly false

This example illustrates two points: that command-line arguments take precedence over arguments in the .param file, and that the -ringOnly flag is an important flag in altering BROOD results. This command line will execute BROOD with the parameters from “4dfr.param”, but the -ringOnly value given on the command line will take precedence over the -ringOnly parameter specified in 4dfr.param. A similar outcome could be achieved by editing 4dfr.param in a text editor, but this command-line alteration is a more direct means to the same behavior.

By default, BROOD only returns fragments that have at least two ring atoms on the shortest path between attachment points, or at least three ring atoms total if there is only one attachment point. This example shows that a user can turn off this constraint using the -ringOnly flag. If one wants only results that avoid the rings described above, one can pass None to the -ringOnly parameter.
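The precedence rule in this example (command-line arguments win over .param file arguments) can be pictured as a dictionary merge. This is a conceptual sketch, not BROOD's implementation: it does not parse a real .param file, whose format is not described here, and the function name and dictionary keys are hypothetical.

```python
def effective_settings(param_file_settings, command_line_settings):
    """Start from the .param file values, then let command-line values override them."""
    merged = dict(param_file_settings)
    merged.update(command_line_settings)
    return merged


# Hypothetical values standing in for the contents of 4dfr.param.
from_param_file = {"ringOnly": "true", "db": "pubchemDB"}
from_command_line = {"ringOnly": "false"}

print(effective_settings(from_param_file, from_command_line))
```

Settings that appear only in the .param file pass through unchanged, which is why a single flag can be overridden without editing the file.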

prompt> brood -queryMol brood.query_1.oeb -db pubchemDB -quickLook

This execution will use the query in brood.query_1.oeb to search the fragments in the database pubchemDB. The use of the -quickLook parameter will limit the search to approximately 2 minutes. The parameter is a great way to get a quick notion of interesting results. The default databases are organized to allow rapid identification of some interesting results. The full search can take minutes to hours, and is often more appropriate when designing molecules that may take days or weeks to synthesize. Nevertheless, in an iterative design session, it can be useful to quickly generate ideas using the -quickLook parameter.

prompt> brood -queryMol brood.query_1.oeb -db pubchemDB -cpddb myCorporateCollection.smi

This example points out the utility of the -cpddb parameter. When users pass a collection of molecules to this flag, BROOD will annotate any new analogs it generates with similar molecules from the file passed to this parameter. This can be useful either for identifying desirable analogs (perhaps similar compounds that are publicly available) or undesirable analogs (perhaps known inhibitors of off-target proteins). This execution will use the query in brood.query_1.oeb to search the fragments in the database pubchemDB.

prompt> brood -queryMol brood.query_1.oeb -db pubchemDB -prot target.pdb -select antiTarget.pdb

This example shows that BROOD can accept both an active-site protein, specified by -prot, and a selection protein, specified by -select. Both proteins can be specified in vBROOD and in that case would already be encapsulated in brood.query_1.oeb. This example shows how the proteins can be added to a query originally constructed without them. It is critical that the 3D query, active-site protein and selection protein be oriented in the same frame of reference. When this is true, BROOD will build the analog molecules in the active site of the protein specified by -prot, and eliminate compounds which clash. Similarly, BROOD will compare the analog molecules with the selection protein and require that all hits have at least one clash with that protein. This execution will use the query in brood.query_1.oeb to search the fragments in the database pubchemDB.
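The combined effect of the two proteins on the hitlist can be summarized as a small decision rule. The sketch below assumes clash counts against each protein are already available; the function and argument names are hypothetical, and how BROOD actually detects a clash is not covered here.

```python
def keep_analog(clashes_with_active_site, clashes_with_selection):
    """Keep an analog only if it fits the -prot protein and clashes with the -select protein."""
    fits_target = clashes_with_active_site == 0       # eliminated if it clashes with -prot
    hits_anti_target = clashes_with_selection >= 1    # must clash at least once with -select
    return fits_target and hits_anti_target


print(keep_analog(clashes_with_active_site=0, clashes_with_selection=2))
print(keep_analog(clashes_with_active_site=0, clashes_with_selection=0))
```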

prompt> brood -queryMol brood.query_1.oeb -db pubchemDB -neutralpH false -tautomer false

When BROOD is building analogs by putting similar fragments into a molecular framework, sometimes the electronic environment of certain functional groups changes dramatically (amines converted to amides, for instance). To properly handle these cases, it is useful for BROOD to normalize the ionization and tautomer states of the new molecules before generating the final hitlist. In some cases users are better served without these normalizations. The ionization and tautomer normalizations can be turned off with the -neutralpH and -tautomer flags respectively.

prompt> brood -queryMol broodFragments.query_1.oeb -db pubchemDB -linkOnly

By default, BROOD compares the shape, chemistry, electrostatics and attachment geometry of fragments in order to suggest analog molecules. For some applications, such as joining fragments or closing rings, it is useful to build fragments into empty space. In these cases, one wants to only compare the attachment geometry of fragments, in a manner similar to the original CAVEAT searches [Bartlett-1994]. The -linkOnly flag instructs BROOD to compare fragments using only their ability to form low-energy bridges between attachment points.

prompt> brood -queryMol brood.query_1.oeb -db pubchemDB -attachColor

Like ROCS, BROOD uses color-atoms to encode and compare the chemistry of molecules. The -attachColor flag indicates that BROOD should annotate the hitlist molecules with the color atoms that were used in the comparison. The color atoms are added as a separate molecule attached to the analog molecule and can be visualized in the BROOD results viewer.

prompt> brood -queryMol brood.query_1.oeb -db pubchemDB -property false

By default, BROOD filters the final analog hitlist with a series of property filters, each specified with a pair of range parameters. While this can be useful, on some occasions, users prefer to avoid all of the property filters. By passing false to the -property flag, all of the property filters can be turned off in one easy step.
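The relationship between the paired range parameters and the -property switch can be sketched as follows. The property names, values and function name are hypothetical; the sketch only illustrates that each filter is an inclusive minimum/maximum test and that disabling property filtering bypasses all of them at once.

```python
def passes_property_filters(properties, ranges, use_property_filters=True):
    """Apply every (min, max) range test unless property filtering is switched off."""
    if not use_property_filters:
        return True  # corresponds to passing false to -property
    return all(low <= properties[name] <= high
               for name, (low, high) in ranges.items())


# Hypothetical property values for one analog, with ranges taken from defaults in this section.
analog = {"Fsp3C": 0.25, "AromFJCt": 2}
ranges = {"Fsp3C": (0.3, 1.0), "AromFJCt": (0, 5)}

print(passes_property_filters(analog, ranges))
print(passes_property_filters(analog, ranges, use_property_filters=False))
```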

Molecular File Formats

BROOD can read and write a variety of molecular file formats. The file format is automatically interpreted from the filename suffix.

File Type      Extensions
SMILES         .smi .ism .can .smi.gz .ism.gz .can.gz
SDF            .sdf .mol .sdf.gz .mol.gz
SKC            .skc .skc.gz
CDK            .cdk .cdk.gz
MOL2           .mol2 .mol2.gz
PDB            .pdb .ent .pdb.gz .ent.gz
MacroModel     .mmod .mmod.gz
OEBinary v2    .oeb .oeb.gz

Gzipped OEBinary version 2 (oeb.gz) is the recommended output format.

BROOD is capable of piping formatted input and output. The simple “-” can be used in place of a filename to indicate std::cin or std::cout with the default SMILES format.

prompt> brood -in .oeb.gz -db myDB < brood.query_1.oeb.gz

This execution will run BROOD with std::cin as the input with .oeb.gz format. The format is controlled by the suffix.
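Suffix-based format selection, including the bare-suffix form used with std::cin above, can be illustrated with a short sketch. This is not BROOD's own code: the lookup table simply mirrors the extensions listed in this section, and the function name is hypothetical.

```python
# Uncompressed suffixes from the table above; ".gz" is handled separately.
SUFFIX_TO_FORMAT = {
    ".smi": "SMILES", ".ism": "SMILES", ".can": "SMILES",
    ".sdf": "SDF", ".mol": "SDF",
    ".skc": "SKC",
    ".cdk": "CDK",
    ".mol2": "MOL2",
    ".pdb": "PDB", ".ent": "PDB",
    ".mmod": "MacroModel",
    ".oeb": "OEBinary v2",
}


def format_from_name(name):
    """Return (format name, is_gzipped) for a filename or a bare suffix such as '.oeb.gz'."""
    lower = name.lower()
    gzipped = lower.endswith(".gz")
    if gzipped:
        lower = lower[:-3]  # strip the trailing ".gz"
    for suffix, file_format in SUFFIX_TO_FORMAT.items():
        if lower.endswith(suffix):
            return file_format, gzipped
    return None, gzipped  # unrecognized suffix


print(format_from_name("brood.query_1.oeb.gz"))
print(format_from_name(".oeb.gz"))
print(format_from_name("amide.smi"))
```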