ScorePose
Overview
ScorePose scores the poses in a -dbase database in the context of a single receptor, using the Chemgauss4 scoring function. Poses may optionally be optimized against Chemgauss4 with the -optimize option.
Input Preparation
Ligand Preparation
The ligand input to ScorePose should already be docked into the
receptor site. For the purposes of this document, we’ll
call the file(s) of poses to be scored the database file(s), or dbase file(s).
Supported formats of the database file include SDF, MOL2 and PDB. ScorePose determines the database file format from the file extension: .sdf or .mol for SDF, .mol2 for MOL2, and .pdb or .ent for PDB. Gzip-compressed files of these same formats are allowed as well; for example, ScorePose will interpret infile.sdf.gz as a gzipped SDF file.
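The extension rules above can be illustrated with a small sketch. This is not ScorePose code: the function name guess_pose_format and the returned labels are hypothetical, and only the extensions named in this section are covered.

```python
from pathlib import PurePath

# Extension-to-format table taken from the description above. The names used
# here are illustrative only and are not ScorePose identifiers.
EXTENSION_FORMATS = {
    ".sdf": "SDF",
    ".mol": "SDF",
    ".mol2": "MOL2",
    ".pdb": "PDB",
    ".ent": "PDB",
}


def guess_pose_format(filename):
    """Return (format, is_gzipped) for a database filename, or raise ValueError."""
    suffixes = [s.lower() for s in PurePath(filename).suffixes]
    gzipped = bool(suffixes) and suffixes[-1] == ".gz"
    if gzipped:
        suffixes = suffixes[:-1]  # look at the extension underneath .gz
    if not suffixes or suffixes[-1] not in EXTENSION_FORMATS:
        raise ValueError(f"unrecognized pose file extension: {filename}")
    return EXTENSION_FORMATS[suffixes[-1]], gzipped
```

For example, guess_pose_format("infile.sdf.gz") reports a gzipped SDF file, matching the behavior described above.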
Note
Although all of these formats are supported, using SDF, PDB or MOL2 can reduce speed because of the I/O overhead of these formats. We recommend using gzipped OEB format for maximum speed.
By default, ScorePose will interpret conformers in the database file(s) as part of a single multi-conformer molecule as long as they:
Are contiguous in the input file.
Have the same numbers of atoms and bonds in the same order.
Have atom and bond properties identical to those of the correspondingly ordered atoms and bonds in the subsequent connection table.
Have the same atom and bond stereochemistry.
While this may appear to be a restrictive list, many programs write multi-conformer molecules into SDF or MOL2 files such that the above rules will be satisfied. If the conformers are named differently (e.g., they have a conformer number appended to the base name, like acetsali_1, acetsali_2), ScorePose will still consider them part of a single multi-conformer molecule if the criteria above are met.
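The grouping rules above can be sketched as follows. This is a simplified illustration, not ScorePose's implementation: PoseRecord is a hypothetical stand-in for a record read from a file, and the "properties" are reduced to plain tuples.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass(frozen=True)
class PoseRecord:
    """Hypothetical, simplified stand-in for one record read from a file."""
    name: str
    atoms: tuple   # per-atom properties in file order, e.g. ("C", "O")
    bonds: tuple   # per-bond properties in file order, e.g. ((0, 1, 1),)
    stereo: tuple  # atom and bond stereo descriptors


def connection_key(record):
    # The name is deliberately excluded: acetsali_1 and acetsali_2 can
    # still belong to the same multi-conformer molecule.
    return (record.atoms, record.bonds, record.stereo)


def group_conformers(records):
    """Group contiguous records whose connection tables match."""
    # groupby only merges neighbouring items, which mirrors the
    # "contiguous in the input file" requirement.
    return [list(group) for _, group in groupby(records, key=connection_key)]
```

Because only neighbouring records are merged, two matching conformers separated by a different molecule end up in separate groups.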
Receptor Preparation
Note
Beginning with version 4.0.0, ScorePose accepts only an OEDU file containing a receptor as the receptor input file.
ScorePose requires the receptor file that the ligands in the database were docked to. This should generally already be available if the ligands were docked with FRED or HYBRID. If the ligands were docked with another program, a receptor can be created using one of the following programs.
Program | Type | Description |
---|---|---|
 | GUI | Interactive GUI for creating a receptor. |
 | Command Line | Generates a receptor inside a prepared OEDesignUnit in an OEDU file. |
 | Command Line | Converts a molecule-based receptor OEB or OEB.GZ file into a receptor inside an OEDesignUnit, in an OEDU file. |
Note
Receptors can also be created using the OpenEye Docking Toolkit (see the Docking Toolkit documentation).
Command Line Help
A description of the command line interface can be obtained by executing ScorePose with the --help option.
> scorepose --help
will generate the following output:
Help functions:
scorepose --help simple : Get a list of simple parameters
scorepose --help all : Get a complete list of parameters
scorepose --help defaults : List the defaults for all parameters
scorepose --help <parameter> : Get detailed help on a parameter
scorepose --help html : Create an html help file for this program
scorepose --help versions : List the toolkits and versions used in the application
Required Parameters
-receptor <receptor file>
Receptor file to rescore poses with.
[ Aliases = -rec ]
-dbase <input filename1> [<input filename2> ...]
File(s) containing ligand poses to rescore (see section Input Preparation).
The following file formats are supported.
File type | Extension |
---|---|
OEBinary | .oeb .oeb.gz |
SDF | .sdf .mol .sdf.gz .mol.gz |
MOL2 | .mol2 .mol2.gz |
PDB | .pdb .ent .pdb.gz .ent.gz |
MacroModel | .mmod .mmod.gz |
More than one file can be specified.
[ Aliases = -database, -in ]
Optional Parameters
Input Options
-param <parameter filename> [No Default]
A parameter file is a text file that lists parameter settings to be used during a run. If a parameter is specified both on the command line and in the parameter file, the value specified on the command line is used.
The format of the parameter file is as follows:
One parameter per line.
For non-list parameters, one key-value pair per line (e.g., -receptor rec.oedu).
For list parameters, a key followed by all the values (e.g., -dbase lig1.oeb.gz ligs2.oeb.gz).
Boolean parameters must be listed as a key followed by true or false (e.g., -annotate_poses true).
The parameter file may not contain the -param parameter.
Lines beginning with # are considered comments.
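To make the parameter file rules concrete, here is a minimal sketch of a reader that follows them. It is not ScorePose's parser: the function names are hypothetical, values are assumed to be whitespace-separated, and skipping blank lines is an assumption not stated above.

```python
def parse_param_file(text):
    """Parse parameter-file text into a {key: [values]} dictionary."""
    params = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank line (assumed ignorable) or comment
        key, *values = line.split()
        if key == "-param":
            raise ValueError("a parameter file may not contain -param")
        params[key] = values  # one value for scalars, several for lists
    return params


def merge_settings(file_params, cli_params):
    """Command-line values take precedence over parameter-file values."""
    merged = dict(file_params)
    merged.update(cli_params)
    return merged
```

A file containing the lines "-receptor rec.oedu" and "-dbase lig1.oeb.gz ligs2.oeb.gz" would parse to one single-valued and one two-valued entry; passing a different -receptor on the command line replaces the value from the file.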
-molnames <input filename> [No Default]
This parameter specifies a text file containing a list of molecule names (one name per line). If this parameter is set, only molecules in the database file(s) (see parameter -dbase) with names that match those in the text file will be read in.
The purpose of this flag is to provide an easy mechanism for reading a few specific molecules from a large database without having to extract them by hand.
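The effect of a name list can be pictured with the sketch below. It assumes exact, case-sensitive name matching, which this page does not specify, and both function names are hypothetical.

```python
def read_molnames(text):
    """Return the set of names listed one per line (blank lines skipped)."""
    return {line.strip() for line in text.splitlines() if line.strip()}


def filter_by_name(molecules, wanted):
    """Yield only the (name, molecule) pairs whose name is in wanted."""
    for name, mol in molecules:
        if name in wanted:  # assumption: exact, case-sensitive match
            yield name, mol
```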
Score Options
-optimize <level> [No Default]
If this parameter is specified, each pose will be optimized with a systematic solid body optimization at the resolution given in the table below.
Level | Translational Stepsize | Rotational Stepsize |
---|---|---|
High | 0.5 Ångström | 0.5 Ångström |
Standard | 0.5 Ångström | 0.75 Ångström |
Low | 0.75 Ångström | 1.0 Ångström |
If this parameter is not specified, poses will not be optimized prior to scoring.
[ Aliases : -opt ]
Output Files
-prefix <value> [Default: rescore]
This flag prefixes all default output filenames with the specified value.
Note
This flag does not affect output filenames explicitly set by the user.
Note
Default values are shown in square brackets after each parameter.
-rescored_mol_output_file <output filename> [Default: scored.oeb.gz]
File that rescored molecules will be written to. The file format is controlled by the extension of the filename. The following output formats are supported.
Format | Extension |
---|---|
OEBinary | .oeb |
SDF | .sdf |
Gzipped OEBinary | .oeb.gz |
Gzipped SDF | .sdf.gz |
Scores will be attached as SD data to each pose with the tag FRED Chemgauss4 Score, unless the -score_tag option is used to specify another tag.
By default the top 500 scoring molecules will be written to this file (see the -hitlist_size flag).
Note
If this flag is not set by the user, the default filename (i.e., scored.oeb.gz) will be automatically prefixed with the setting of the -prefix flag.
[ Aliases = -docked_file, -docked, -out ]
-score_file <filename> [Default: score.txt]
Specifies a tab-separated text file with the names and scores of the molecules.
Note
If this flag is not set by the user, the default filename (i.e., score.txt) will be automatically prefixed with the setting of the -prefix flag.
[ Aliases : -score ]
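A tab-separated score file can be read with the standard csv module. The sketch below assumes the molecule name is in the first column and numeric scores follow; the exact column layout and the presence of a header row are not documented here, so rows that do not parse as numbers are simply skipped.

```python
import csv
import io


def read_score_rows(text):
    """Return [(name, [scores...])] from tab-separated score-file text."""
    rows = []
    for fields in csv.reader(io.StringIO(text), delimiter="\t"):
        if len(fields) < 2:
            continue  # not a name-plus-score row
        try:
            scores = [float(value) for value in fields[1:]]
        except ValueError:
            continue  # assumed to be a header or other non-numeric row
        rows.append((fields[0], scores))
    return rows
```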
-report_file <filename> [Default: report.txt]
Specifies a file that a text report of the run will be written to.
Note
If this flag is not set by the user, the default filename (i.e., report.txt) will be automatically prefixed with the setting of the -prefix flag.
[ Aliases : -report ]
-settings_file <filename> [Default: settings.param]
Writes the settings of all parameters of the run to the specified output file. The settings will be listed in plain text, one parameter name followed by its value(s) per line. This format is compatible with the format of parameter files, so a settings file from a previous run can be passed to the -param flag to re-run the program with the same settings.
Note
If this flag is not set by the user, the default filename (i.e., settings.param) will be automatically prefixed with the setting of the -prefix flag.
[ Aliases : -settings ]
-status_file <filename> [Default: status.txt]
If this parameter is set, the status of the run will be written to the given output file every few seconds during the run (the previous contents of the file will be overwritten).
Note
If this flag is not set by the user, the default filename (i.e., status.txt) will be automatically prefixed with the setting of the -prefix flag.
[ Aliases : -status ]
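The interaction between -prefix and the default filenames in this section can be sketched as below. The separator placed between the prefix and the default name is an assumption (an underscore is used here); this page does not state how the two are joined, and the function name is hypothetical.

```python
# Default filenames as listed for each output parameter in this section.
DEFAULT_OUTPUTS = {
    "-rescored_mol_output_file": "scored.oeb.gz",
    "-score_file": "score.txt",
    "-report_file": "report.txt",
    "-settings_file": "settings.param",
    "-status_file": "status.txt",
}


def resolve_output_names(prefix="rescore", explicit=None, sep="_"):
    """Return the filename used for each output parameter.

    explicit maps flags set by the user to their filenames; those are used
    unchanged. The separator sep is an assumption, not documented behavior.
    """
    explicit = explicit or {}
    names = {}
    for flag, default in DEFAULT_OUTPUTS.items():
        if flag in explicit:
            names[flag] = explicit[flag]  # user-set names are never prefixed
        else:
            names[flag] = f"{prefix}{sep}{default}"
    return names
```

Only the defaults pick up the prefix; a filename passed explicitly, for example to -score_file, is returned as given.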
Output Options
-hitlist_size <num> [Default: 500]
This parameter controls whether rescored molecules are written out as they are scored or stored in an internal hitlist and written out at the end of the run.
If -hitlist_size is zero, the run will be in serial mode, i.e., each molecule will be written out as it is scored (unsorted). For single processor runs this will be the order in which the molecules appear in the database file(s). For MPI runs the output order is not guaranteed to match the order of the molecules in the database file(s).
If -hitlist_size is non-zero, a sorted internal hitlist of scored molecules will be maintained and written out at the end of the run. The maximum size of the hitlist is -hitlist_size. If more than this number of molecules are scored during the run, only the top scoring molecules will be written out and the rest will be discarded.
There is no formal limit on the number of molecules that can be sorted and written out at the end of the run. However, retaining a large number of molecules significantly increases the memory requirements. A good rule of thumb is that the total number of poses retained should not be larger than 10,000.
[ Aliases = -hitlist_size, -hitlist ]
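A bounded, sorted hitlist of this kind is commonly built on a heap. The sketch below is one way to do it and is not ScorePose's implementation; it assumes that a lower (more negative) score is better, and the class name Hitlist is hypothetical. It covers only the non-zero case.

```python
import heapq
from itertools import count


class Hitlist:
    """Keep at most max_size best-scoring entries (lower score is better)."""

    def __init__(self, max_size):
        if max_size < 1:
            raise ValueError("this sketch covers only a non-zero hitlist size")
        self.max_size = max_size
        self._heap = []        # min-heap on -score, so the worst hit is on top
        self._order = count()  # tie-breaker so names are never compared first

    def add(self, name, score):
        entry = (-score, next(self._order), name)
        if len(self._heap) < self.max_size:
            heapq.heappush(self._heap, entry)
        else:
            # Push the new entry, then discard whichever entry is now worst.
            heapq.heappushpop(self._heap, entry)

    def sorted_hits(self):
        """Return (name, score) pairs, best score first."""
        return [(name, -neg) for neg, _, name in sorted(self._heap, reverse=True)]
```

Memory use grows with max_size rather than with the number of molecules processed, which is why a large hitlist raises memory requirements.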
-sort_poses [Default: false]
If this option is selected, the poses of each molecule will be sorted by score.
If the molecules in the database do not have multiple poses this flag has no effect.
[ Aliases = -sortposes ]
-score_tag <tag> [No Default]
This parameter overrides the default SD data tag used to store molecule scores (the default is FRED Chemgauss4 Score).
[ Aliases = -scoretag ]
-annotate_scores [Default: false]
If this flag is set to true, VIDA score annotations will be added to the processed molecules. These annotations are visible in VIDA (OpenEye’s molecular visualization program) and show a per-atom breakdown of the score.
Note
The output molecule file format (see -rescored_mol_output_file) must be OEBinary when using score annotations.
[ Aliases = -annotate ]
-save_component_scores [Default: false]
If this flag is set to true, individual components of the total score will be saved to SD data on each pose and appear in the score file (see -score_file).
[ Aliases = -component_scores, -component ]
-no_extra_output_files [Default: false]
When set, the only default output of the program will be the rescored molecule file (see -rescored_mol_output_file).
Using this flag suppresses the default output of the following files:
Output | Default file | Parameter |
---|---|---|
text score file | score.txt | -score_file |
report file | report.txt | -report_file |
settings file | settings.param | -settings_file |
status file | status.txt | -status_file |
Only default output is suppressed. If any of these output parameters is explicitly set by the user, the relevant output file will still be written even if this switch is turned on.
[ Aliases = -no_extra, -noextra, -noextraoutputfiles, -no_extra_output, -noextraoutput ]
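The suppression rule can be summarized in a few lines. This is a sketch of the logic as described above, not ScorePose code; the function name is hypothetical.

```python
# The four outputs that -no_extra_output_files can suppress.
EXTRA_OUTPUT_FLAGS = ("-score_file", "-report_file", "-settings_file", "-status_file")


def outputs_written(explicit_flags, no_extra_output_files=False):
    """Return the output parameters whose files will be written."""
    written = ["-rescored_mol_output_file"]  # never suppressed by this switch
    for flag in EXTRA_OUTPUT_FLAGS:
        # An explicitly set output is always written; a default output is
        # written only when the switch is off.
        if flag in explicit_flags or not no_extra_output_files:
            written.append(flag)
    return written
```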
-no_dots [Default: false]
By default (false), a dot is written to standard error for each molecule processed (or an x in the case of a failure). Set this flag to true to suppress dot/x writing.
[ Aliases = -nodots ]