Usage

Introduction

EON calculates the electrostatic similarity between two small molecules in the form of an Electrostatic Tanimoto (ET) score. Given a query molecule and a set of interesting molecules (ROCS overlay hits, for example), EON will calculate the Electrostatic Tanimoto between each database molecule and the query. Note that EON does not perform any overlay or alter the input orientation of the structures. They must be pre-aligned to the query on input. Also, since electrostatics calculations require high quality partial charges, EON will calculate new partial charges for the input structures using MMFF94. If the user provides an input file that contains structures with higher-quality partial charges, EON can use them as well.
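
To make the score concrete, the sketch below computes a generic Tanimoto coefficient between two fields sampled on the same grid points. It is an illustration only: this section does not spell out EON's formula, so the functional form, the function name tanimoto, and the tiny three-point grids are all assumptions, not EON's implementation.

```python
def tanimoto(a, b):
    """Generic Tanimoto similarity between two equally sampled fields.

    Assumed form: T = A.B / (A.A + B.B - A.B), with sums over grid points.
    """
    if len(a) != len(b):
        raise ValueError("grids must have the same number of points")
    ab = sum(x * y for x, y in zip(a, b))
    aa = sum(x * x for x in a)
    bb = sum(y * y for y in b)
    denom = aa + bb - ab
    if denom == 0.0:
        raise ValueError("both fields are zero everywhere")
    return ab / denom


# Hypothetical potential values at three grid points.
same = tanimoto([1.0, -2.0, 0.5], [1.0, -2.0, 0.5])
opposite = tanimoto([1.0, -2.0, 0.5], [-1.0, 2.0, -0.5])
```

Under this form, identical fields score 1.0 and a field compared with its exact negative scores -1/3.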

EON is also dependent on pKa state and formal charges as these have a significant impact on electrostatics. EON now has the ability to adjust both the query and database molecule to a neutral pH model. This feature is on by default, but can be turned off by using appropriate command line flags.

Since electrostatics overlays are very dependent on alignment and require a good quality alignment between query and database molecule, ROCS provides the best input to EON. However, electrostatic complementarity is more dependent on subtle conformational changes than shape is, so there are several steps that can be taken to ensure the best possible success with EON.

Firstly, one can ensure that ROCS outputs multiple interesting conformers per molecule. ROCS includes a flag -eon_input that allows generation of a multi-conformer set of ROCS-aligned output specifically for input into EON. This file can be generated in parallel with a ROCS hit list so that in a single ROCS run you can find ROCS hits and prepare EON input. Please see the ROCS documentation for more detail on these flags.

Secondly, EON reads one or more conformers from the input file and uses technology from OMEGA to expand terminal torsions to search for subtle changes in conformation that might increase the score without changing the overall shape overlap with the query. To score just the input conformers and not search for alternate terminal conformations, the -scoreonly flag is provided.

Part of understanding EON results is visualization of the electrostatic grids used in the calculation. Although off by default, when writing EON results to a binary (OEB) file, ET grids can be attached to each molecule and visualized using the EON View mode in VIDA.

Since EON calculations can be time-consuming (approximately 1 molecule per second per CPU), EON can use the same distributed computing technology, Open MPI, that ROCS uses to help distribute the workload across a cluster of machines.

Command Line Interface

A description of the command-line interface can be obtained by executing EON with the --help option.

prompt> eon --help

will generate the following output:

Help functions:
  eon --help simple      : Get a list of simple parameters
  eon --help all         : Get a complete list of parameters
  eon --help <parameter> : Get detailed help on a parameter
  eon --help html        : Create an html help file for this program

Required Parameters

-dbase <filename>

File containing one or more 3D molecules to score against a reference or query molecule. If only this flag is given, EON will use the first molecule in the dbase file as the query and score all the remaining molecules against it. This is most useful for scoring ROCS results generated with -eon_input set to true, since the ROCS query, and therefore the EON query, will be the first molecule in the file. The query or reference molecule can also be specified separately using the -query flag below.

File format for -dbase can be one of:

File type     Extensions
OEBinary      .oeb .oeb.gz
SDF           .sdf .mol .sdf.gz .mol.gz
MOL2          .mol2 .mol2.gz
PDB           .pdb .ent .pdb.gz .ent.gz
MacroModel    .mmod .mmod.gz

Optional Parameters

Execute Options

-param
The argument for this flag is the name of a file containing control parameters. The control parameter file acts to either replace or augment the command line interface. All parameters necessary for program execution may be provided in the control parameter file, although any command given explicitly on the command line will supersede options found in the parameter file. The application generates a new parameter file containing the full set of execution parameters upon every execution. The name of the parameter file is created by combining the prefix base name with the ‘.parm’ extension.
-mpi_np <n>
Specifies the number of processors n when the application is run in MPI mode.
-mpi_hostfile <filename>
Specifies the name of the file containing the processor configuration. For every host, this file should contain a line of the form host_name slots=n, where n is the number of processors on that host.
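
As an illustration of the host file layout described for -mpi_hostfile, the following sketch builds the file contents from a mapping of host names to processor counts. The helper name hostfile_text and the host names node01 and node02 are hypothetical; only the "host_name slots=n" line format comes from the description above.

```python
def hostfile_text(hosts):
    """Build host file contents from a mapping of host name -> slot count."""
    lines = []
    for name, slots in hosts.items():
        if slots < 1:
            raise ValueError(f"{name}: slots must be at least 1")
        # One line per host, in the "host_name slots=n" form.
        lines.append(f"{name} slots={slots}")
    return "\n".join(lines) + "\n"


# Hypothetical two-node cluster.
text = hostfile_text({"node01": 8, "node02": 4})
print(text, end="")
```

Writing this text to a file and passing the file name to -mpi_hostfile would give EON one line per host.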

Input Options

-query <filename>
File containing one 3D molecule to use as a query. File format can be any of the formats given in the table for -dbase above.
-charges

Specifies the partial charges to be used on the input molecules. The default is to calculate them internally with MMFF94 (mmff). The option -charges existing will use precalculated charges, which must already be present in the input files.

[default = mmff]

-scdbase

Since EON reads multi-conformer input molecules, when reading from non-binary files, EON will compare consecutive molecules and if they are determined to be the same structure, they will be concatenated into a single, multi-conformer molecule. If the user desires to score each input conformation independently, then using this flag will turn off the conformer comparison step.

[default = false]
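
The consecutive-molecule merging described for -scdbase can be pictured with the sketch below. It is not EON's code: how EON decides that two records are the same structure is not described here, so the same_structure_key function and the (identifier, conformer) records are hypothetical stand-ins.

```python
from itertools import groupby


def merge_consecutive(records, same_structure_key):
    """Group runs of consecutive records that share a key into one entry."""
    merged = []
    for key, group in groupby(records, key=same_structure_key):
        merged.append((key, list(group)))
    return merged


# Hypothetical file order: (structure identifier, conformer label).
records = [
    ("mol_A", "conf1"),
    ("mol_A", "conf2"),
    ("mol_B", "conf1"),
    ("mol_A", "conf3"),
]
merged = merge_consecutive(records, same_structure_key=lambda r: r[0])
counts = [(key, len(confs)) for key, confs in merged]
```

Only adjacent records are combined in this sketch, so the final mol_A record stays separate from the first two. Scoring every record on its own, as -scdbase requests, corresponds to skipping the grouping step entirely.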

Output Options

-oformat <extension>

Format of output structure file(s). The default is oeb so that the ET score can be included as tag data. The format of the file is determined by the extension given. Valid values include all formats listed above for -dbase.

[default = oeb]

-prefix <prefix>

Defines a prefix used to name output files. Using -prefix FOO will create a hits structure file named FOO_hits.oeb and a report file named FOO.rpt.

[default = EON]

-besthits <N>

Process the entire dbase file but keep only the N best-scoring molecules, ranked by the property specified with -rankby. A value of 0 means that no hit list is maintained; all structures are scored and written in input order.

[default = 500]

-maxhits <N>

Stop after finding the first N hits. This option overrides any setting for -besthits.

[default = 0]

-rankby <score>

Property to use to sort hitlist. Values include ET_combo, ET_pb and ET_coul.

[default = ET_combo]

-cutoff <score>

Minimum score to keep as a hit.

[default = -1.0]

-outputquery
Write the query to the top of the hits file. This makes visualization of results much easier inside VIDA.
-scoreonly
Turn off the terminal torsion conformer search and just score each input conformation as-is.
-hitsfile <filename>
Explicit filename for writing hits. Overrides the default filename created from -prefix.
-reportfile <filename>
Explicit filename for writing the report. Overrides the default filename created from -prefix.
-sdTags
This parameter controls whether to attach score information to output molecules as SD data.
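
The interplay of -rankby, -besthits and -cutoff can be sketched as follows. This is one reading of the descriptions above, not EON's implementation: it assumes the cutoff is applied to the ranking score, that a score equal to the cutoff is kept, and that the cutoff still applies when -besthits is 0. The function name select_hits and all score values are made up for illustration.

```python
import heapq


def select_hits(scores, rankby="ET_combo", besthits=500, cutoff=-1.0):
    """Return the hits kept under the assumed cutoff and hit-list rules."""
    # Assumption: keep a molecule when its ranking score is >= cutoff.
    kept = [s for s in scores if s[rankby] >= cutoff]
    if besthits == 0:
        # No hit list is maintained: preserve input order.
        return kept
    # Otherwise keep the N best, highest score first.
    return heapq.nlargest(besthits, kept, key=lambda s: s[rankby])


# Made-up scores for three hypothetical molecules.
scores = [
    {"Name": "m1", "ET_pb": 0.40, "ET_coul": 0.35, "ET_combo": 1.20},
    {"Name": "m2", "ET_pb": 0.70, "ET_coul": 0.65, "ET_combo": 1.30},
    {"Name": "m3", "ET_pb": -0.10, "ET_coul": -0.05, "ET_combo": 0.40},
]

top_two = [s["Name"] for s in select_hits(scores, besthits=2)]
by_pb = [s["Name"] for s in select_hits(scores, rankby="ET_pb", cutoff=0.5)]
unranked = [s["Name"] for s in select_hits(scores, besthits=0)]
```

With these values, the default ranking keeps m2 ahead of m1, an ET_pb cutoff of 0.5 leaves only m2, and a hit-list size of 0 returns all three molecules in input order.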

Log Output Options

-logfile <filename>
Filename for log file. Overrides log filename created from -prefix.
-progress

Method for showing job progress on the command line. Choices include:

  • percent - show a percent complete progress bar (DEFAULT)
  • log - echo the log message for each molecule
  • dots - show dots as in EON 1.1
  • none - print nothing to console

[default = percent]

-statusfile
Write status info to this file. Use “none” for no status file.
-verbose

Give verbose output to console instead of simple progress.

[default = false]

ZAP/PB Options

-fixpka_query

Apply a neutral pH model to the query molecule.

[default = true]

-fixpka_dbase

Apply a neutral pH model to the database molecules.

[default = true]

-salt

Sets the salt concentration used in the Zap calculation. Salt is added to help moderate large local charges. Legal values are between 0.0 and 0.1 (mM).

[default = 0.04]

-spacing

Sets the grid spacing for the internal Zap calculations.

[default = 0.5]

-writegrid

Write ET grid to output attached to each molecule. Useful for visualization in VIDA but this only works when writing hits to an OEB (.oeb or .oeb.gz) file. Note that while this feature is quite useful, grids do take a large amount of memory so care should be taken when using this feature for hit lists of more than 500 molecules.

[default = false]

Omega Options

-ewindow

Omega energy window used for conformer selection.

[default = 10.0]

-rms

Omega RMS threshold used to determine duplicate conformations.

[default = 0.3]

-sampleHydrogens

Sets whether hydrogens will be sampled. This option enables sampling of hydrogen locations for -OH, -SH, and amines.

[default = false]

Example Commands

The simplest way to run EON is to use the -eon_input flag in ROCS to create a set of ROCS-aligned structures with the ROCS query at the top of the file. By default, if EON is only provided a dbase file, it will assume the first molecule is the query.

prompt> eon -dbase rocs_eon_input.oeb.gz

Or you can provide a single molecule in a query file and a set of molecules in a dbase file. A common example would be to use a ROCS query as the EON query and the ROCS hits file as the dbase file for EON.

So for example:

prompt> eon -dbase rocs_hits_1.oeb -query rocsquery.sdf

will score all the structures in rocs_hits_1.oeb against the molecule in rocsquery.sdf, write the structures to EON_hits.oeb (with ET scores in SD tag data), and write a table of results to EON.rpt.

To keep only the best 100 hits, use:

prompt> eon -dbase rocs_hits_1.oeb -query rocsquery.sdf -besthits 100

Note that the output structure file exists mainly to provide a single file containing both structures and tag data, so that loading it into VIDA gives easy analysis of the results. If, however, only the numerical scores are needed, use -nostructs to suppress the creation of an output structure file.

prompt> eon -dbase rocs_hits_1.oeb -query rocsquery.sdf -nostructs

By default, EON calculates partial charges using MMFF94. However, if you have structure files that already contain good partial charges in both your dbase file and query file, you can tell EON to use them instead:

prompt> eon -dbase rocs_hits_chgs.mol2 -query rocsquery_chg.mol2 -charges existing

To prevent continually over-writing output files, the -prefix flag allows you to give unique names to these files.

prompt> eon -dbase rocs_hits_1.oeb -query rocsquery.sdf -prefix FOO

will write the hit structures into a file named FOO_hits.oeb and the numerical values will be in FOO.rpt. The parameter file for this run will likewise be named FOO.parm.

To prevent EON from searching alternate terminal torsions, use the -scoreonly flag.

prompt> eon -dbase rocs_eon_input.oeb.gz -scoreonly

Finally, to create the ET grids used in the calculation and attach them to each output molecule, use the -writegrid flag. Note that this only works for OEB output and can be very memory intensive.

prompt> eon -dbase rocs_eon_input.oeb.gz -oformat .oeb.gz -writegrid
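
When EON runs are scripted, the command lines above can be assembled programmatically. The sketch below is a hypothetical helper (the name eon_command is not part of EON) that uses only flags documented in this section and assumes the eon executable is on the PATH when the command is eventually run.

```python
import shlex


def eon_command(dbase, query=None, prefix=None, besthits=None, scoreonly=False):
    """Assemble an EON argument list from a few documented flags."""
    cmd = ["eon", "-dbase", dbase]
    if query is not None:
        cmd += ["-query", query]
    if prefix is not None:
        cmd += ["-prefix", prefix]
    if besthits is not None:
        cmd += ["-besthits", str(besthits)]
    if scoreonly:
        # Given without a value, as in the -scoreonly example above.
        cmd.append("-scoreonly")
    return cmd


cmd = eon_command("rocs_hits_1.oeb", query="rocsquery.sdf",
                  prefix="FOO", besthits=100)
print(shlex.join(cmd))
# To launch the run where EON is installed:
# import subprocess; subprocess.run(cmd, check=True)
```

Building the arguments as a list, not as a single string, avoids shell quoting problems when file names contain spaces.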

Report File

The EON report is a tab-delimited file with the fields listed below. A tab-delimited layout is used because the names of the query and the hits are of indeterminate length, and fixed-width fields could truncate them. The resulting file is hard to read in a terminal session, but it loads easily into a spreadsheet program or the data manager in VIDA.

Name
This is the name of the database molecule.
EONQuery
This is the name of the query molecule.
Rank
The numerical ranking in the hitlist, based on the score chosen for ranking. By default this is ET_combo; it can be changed with the -rankby flag. If no hitlist was used in the calculation, this field will be 0 (zero).
ET_pb
This is the value of electrostatic Tanimoto, using full Poisson-Boltzmann (PB) electrostatics.
ET_coul
This is the value of electrostatic Tanimoto using only the coulombic part of PB electrostatics.
ET_combo
Sum of ET_pb and EON_shape_tani. This is a useful score that takes into account both shape match and ET match.
EON_shape_tani
This is the shape Tanimoto between the given molecule and the query. For calculations that use -scoreonly, this will be the same as the output Tanimoto from ROCS. When EON is allowed to alter terminal torsions, this will give the final shape Tanimoto.
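
Because the report is tab-delimited, it can also be read with a few lines of script. The sketch below assumes the file begins with a header row carrying the field names listed above, in that order; the sample rows and the helper name read_report are made up for illustration.

```python
import csv
import io

NUMERIC_FIELDS = ("ET_pb", "ET_coul", "ET_combo", "EON_shape_tani")


def read_report(handle):
    """Parse a tab-delimited report with a header row into a list of dicts."""
    rows = []
    for row in csv.DictReader(handle, delimiter="\t"):
        for field in NUMERIC_FIELDS:
            row[field] = float(row[field])
        row["Rank"] = int(row["Rank"])
        rows.append(row)
    return rows


# Made-up report contents; a real run would use open("EON.rpt", newline="").
sample = io.StringIO(
    "Name\tEONQuery\tRank\tET_pb\tET_coul\tET_combo\tEON_shape_tani\n"
    "mol_1\tquery_1\t1\t0.70\t0.65\t1.30\t0.60\n"
    "mol_2\tquery_1\t2\t0.40\t0.35\t1.20\t0.80\n"
)
rows = read_report(sample)
best_pb = max(rows, key=lambda r: r["ET_pb"])
```

In the sample rows, ET_combo equals ET_pb plus EON_shape_tani, matching the field description above.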