The ROCS report file is a tab-delimited file with the following fields. Because the names of the query and the hits are of indeterminate length, fixed-size fields for these names could result in loss of information. Unfortunately, this gives a file that is hard to read in a terminal session, but it can easily be read into a spreadsheet program or into the spreadsheet in VIDA.
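As a minimal sketch of reading such a file outside a spreadsheet, the Python standard library csv module can split the tab-delimited fields. The helper read_report and the column names in the sample (Name, ShapeQuery, Rank, TanimotoCombo) are illustrative assumptions and may not match the header of an actual report file.

```python
import csv
import io

def read_report(handle):
    """Return the report rows as a list of dicts keyed by the header line."""
    reader = csv.DictReader(handle, delimiter="\t")
    return list(reader)

# Hypothetical two-row report; the column names are assumptions and a
# real report file has more columns than shown here.
sample = (
    "Name\tShapeQuery\tRank\tTanimotoCombo\n"
    "mol_a_3\tquery_1\t1\t1.412\n"
    "mol_b_1\tquery_1\t2\t1.105\n"
)

rows = read_report(io.StringIO(sample))
for row in rows:
    # Every value arrives as a string, so numeric fields need converting.
    print(row["Name"], int(row["Rank"]), float(row["TanimotoCombo"]))
```

For a file on disk, pass an open file handle instead of the io.StringIO object, for example open(path, newline="").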
This is the name of the database molecule. If the database contains multi-conformer molecules, the specific conformer index is appended to the molecule name with an underscore if
This is the name of the query molecule. If the query is a multi-conformer molecule, then the specific conformer index is appended to the molecule name with an underscore.
The numerical ranking in the hitlist, based on the chosen score to sort by. This can be altered by using the -rankby command line switch. If the -stats command line switch is used with best or all, data is written into the report file in the order that the search is performed. If no hitlist was used in the calculation, this field will be 0 (zero).
To provide a score that includes both shape fit and color, the Shape Tanimoto is added to the Color Tanimoto, resulting in the TanimotoCombo score. This has a value between 0 and 2 and is the score used for ranking the hitlist when the -rankby TanimotoCombo command line switch is used.
This column gives the Shape Tanimoto, a value between 0 and 1 as calculated by the Tanimoto equation (see the Theory section).
This column gives the Color Tanimoto, a value between 0 and 1 as calculated by the Tanimoto equation (see the Theory section).
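The relationship between the three Tanimoto columns can be sketched as follows. The Tanimoto equation is given in the Theory section; the standard form overlap / (self_a + self_b - overlap) is assumed here, and the overlap values are invented for illustration only.

```python
def tanimoto(overlap, self_a, self_b):
    """Standard Tanimoto form (assumed): O_ab / (I_a + I_b - O_ab)."""
    return overlap / (self_a + self_b - overlap)

def tanimoto_combo(shape_tanimoto, color_tanimoto):
    """TanimotoCombo is the plain sum of the shape and color Tanimotos."""
    return shape_tanimoto + color_tanimoto

# Invented overlap values, chosen only to make the arithmetic easy to follow.
shape_t = tanimoto(overlap=600.0, self_a=800.0, self_b=1000.0)  # 600 / 1200
color_t = tanimoto(overlap=3.0, self_a=6.0, self_b=9.0)         # 3 / 12
combo = tanimoto_combo(shape_t, color_t)
print(shape_t, color_t, combo)
```

Because each Tanimoto lies between 0 and 1, the sum lies between 0 and 2, which matches the range stated for TanimotoCombo.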
To provide a Tversky score that includes both shape and color, the FitTversky is added to the FitColorTversky, resulting in the FitTverskyCombo score. This has a nominal value between 0 and 2, although, due to the field nature of shape matching, the value can be higher than 2.
Shape Tversky is calculated using the Tversky equation (see the Theory section) with the fit (database) molecule as the main self-overlap term, with beta = 0.95. This was previously called Tversky(d).
Color Tversky is calculated using the Tversky equation (see the Theory section) with the fit (database) molecule as the main self-overlap term, with beta = 0.95.
To provide a Tversky score that includes both shape and color, the RefTversky is added to the RefColorTversky, resulting in the RefTverskyCombo score. This has a nominal value between 0 and 2, although, due to the field nature of shape matching, the value can be higher than 2.
Shape Tversky is calculated using the Tversky equation (see the Theory section) with the reference (query) molecule as the main self-overlap term, with alpha = 0.95. This was previously called Tversky(q).
Color Tversky is calculated using the Tversky equation with the reference (query) molecule as the main self-overlap term with alpha = 0.95.
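The difference between the Ref and Fit variants can be illustrated with a short sketch. The Tversky equation itself is in the Theory section and is not reproduced in this section, so the form used below, overlap / (alpha * I_ref + beta * I_fit) with alpha + beta = 1, is an assumption, as are the function name and the invented overlap values.

```python
def tversky(overlap, self_ref, self_fit, alpha, beta):
    """Assumed Tversky form: O / (alpha * I_ref + beta * I_fit)."""
    return overlap / (alpha * self_ref + beta * self_fit)

# Invented values: a small reference (query) against a larger fit molecule.
overlap, self_ref, self_fit = 150.0, 100.0, 400.0

# Ref variant: the reference self-overlap is the main term (alpha = 0.95).
ref_tversky = tversky(overlap, self_ref, self_fit, alpha=0.95, beta=0.05)

# Fit variant: the fit self-overlap is the main term (beta = 0.95).
fit_tversky = tversky(overlap, self_ref, self_fit, alpha=0.05, beta=0.95)

print(ref_tversky, fit_tversky)
```

With these made-up numbers the Ref variant comes out above 1 while the Fit variant stays well below it, which shows how a Combo score built from two such terms can exceed its nominal upper bound of 2.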
This column provides the actual color score. Since differing color force field files can use different strengths for the color forces and since each molecule may have a different number of color atoms, there is no upper bound on this score. By default, the color score is calculated by looping over all the color atoms in the query molecule and summing the single best color interaction with the hit molecule. This leads to scores that mirror the one-to-one correspondence of features sometimes seen in pharmacophore matching programs.
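The default summation described above can be sketched in a few lines. This is not the ROCS implementation: the interaction table, the function name, and the assumption that the "best" interaction is the one with the largest value are all hypothetical.

```python
def color_score(interactions):
    """Sum, over query color atoms, the single best interaction with the hit.

    interactions[i][j] is a hypothetical interaction strength between query
    color atom i and hit color atom j; "best" is assumed to mean largest.
    """
    total = 0.0
    for row in interactions:
        if row:  # a query color atom with no candidate partner adds nothing
            total += max(row)
    return total

# Invented table: three query color atoms against three hit color atoms.
example = [
    [0.0, 0.8, 0.3],  # best partner contributes 0.8
    [0.5, 0.1, 0.0],  # best partner contributes 0.5
    [0.0, 0.0, 0.0],  # no matching feature, contributes 0.0
]
score = color_score(example)
print(score)
```

Each query color atom contributes at most one term, so the score grows with the number of query color atoms and with the strengths defined in the color force field, which is why it has no fixed upper bound.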
One additional score can be calculated by giving the -subtan command line argument. Since there is an additional time cost for this calculation, it is not included by default. SubTan is defined by taking the positions of the query and dbase molecule at the final overlay and removing all dbase atoms farther than 1.5 Angstroms from any query atom. A shape Tanimoto calculation is then performed using these two structures, and this Tanimoto coefficient is recorded as SubTan. Note that this has the effect of raising scores for small queries against much larger dbase molecules. In some respects, this is similar to Tversky for a sub-shape match, but it does result in different rankings than Tversky. For searches involving a small query against a dbase of large molecules, it is recommended that both Tversky and SubTan be considered.
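The atom-removal step of SubTan can be sketched as below. This is a hypothetical illustration rather than ROCS code: it reads the definition as keeping every dbase atom that lies within 1.5 Angstroms of at least one query atom, and it does not reproduce the shape Tanimoto calculation that follows. The function name and the coordinates are invented.

```python
import math

def prune_dbase_atoms(query_xyz, dbase_xyz, cutoff=1.5):
    """Keep dbase atoms lying within `cutoff` Angstroms of any query atom."""
    kept = []
    for atom in dbase_xyz:
        if any(math.dist(atom, q) <= cutoff for q in query_xyz):
            kept.append(atom)
    return kept

# Invented coordinates (Angstroms) along the x axis at the final overlay.
query = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
dbase = [(0.5, 0.0, 0.0), (2.5, 0.0, 0.0), (5.0, 0.0, 0.0)]

pruned = prune_dbase_atoms(query, dbase)
print(pruned)
```

The third dbase atom is 3.5 Angstroms from the nearest query atom and is dropped, so only the part of the large dbase molecule near the query takes part in the subsequent Tanimoto calculation.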
This is the absolute value of the volume overlap between the query and the dbase molecule. The value is in arbitrary units and is most useful when using a grid as the query.