Data Analysis Functions

The functions detailed in this section provide access to the data analysis and spreadsheet functionality in VIDA. Note the distinction between the two families: the “Datatable” functions refer to internal data collections, while the “Spreadsheet” functions refer to the tabular views of that data that appear in the Spreadsheet display.

DataAdd

void DataAdd(const OEPropDB::OEKey &key, const std::string &str,
             const std::string &data)

Adds a piece of data to the object specified by key. If a column in the spreadsheet does not already exist for this data type, it will be added.

DataGetDB

VFDataBase *DataGetDB()

Return the VFDataBase object.

DataGetTable

VFDataTable *DataGetTable(const std::string &name, bool throwError=true)

Return the internal data table with the specified name.

DatatableAddColumn

void DatatableAddColumn(std::string datatable, std::string columnName,
                        bool genericData=false)

Add a column to the specified data table with the specified name. If a column with the specified name already exists, this call will be ignored.

DatatableCommitChanges

void DatatableCommitChanges(const std::string &datatable)

Commits any outstanding changes to the internal database.

DatatableCurrentGet

std::string DatatableCurrentGet()

Returns the name of the current data table (e.g. ‘Molecules’).

DatatableCurrentSet

bool DatatableCurrentSet(const std::string &datatable)

Sets the current data table.

DatatableData

std::string DatatableData(const std::string &datatable, unsigned int row,
                          std::string header)

Returns the string representation of the data in the specified data table at the specified row and column.

DatatableDeleteColumn

void DatatableDeleteColumn(const std::string &datatable,
                           const std::string &name)

Delete the specified column from the specified data table.

DatatableEditableGet

bool DatatableEditableGet(const std::string &ss)

Returns whether or not the specified data table is editable.

DatatableEditableSet

void DatatableEditableSet(const std::string &ss, bool val)

Sets whether or not the specified data table is editable.

DatatableFilter

void DatatableFilter(const std::string &datatable, const std::string &filter,
                     const std::string &name, bool staticView=true,
                     bool permaFilter=false)

void DatatableFilter(const std::string &datatable, const OEDataFilter &filter,
                     const std::string &name, bool staticView=true,
                     bool permaFilter=false)

Creates a new data table view with the specified name from the specified data table using the specified filter. The filter is an arbitrary Python expression that is applied to every row in the source data table to determine whether or not it should be included in the filtered view.

The filter expression may assume that a local variable ROW is set to the current row number being evaluated. An example appears below:

 int(DatatableData(DatatableCurrentGet(), ROW, "foo")) < 400

In this expression, the string representation of the data in the current data table (which may not necessarily be the specified datatable) at row ROW and in column foo is converted to an integer and compared with the value 400. If the expression is True for the row being tested, the row is included in the filtered view; otherwise it is not.
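As a rough illustration of the row-by-row evaluation described above, the following Python sketch applies a filter expression to a small in-memory table. The TABLE dictionary, the filter_rows helper, and the stand-in definitions of DatatableData and DatatableCurrentGet are assumptions made so the example runs outside VIDA; they are not VIDA's implementation.

```python
# Hypothetical in-memory data standing in for a VIDA data table.
TABLE = {"Molecules": {"foo": ["120", "450", "399", "400"]}}


def DatatableData(datatable, row, header):
    """Stand-in for VIDA's DatatableData: return one cell as a string."""
    return TABLE[datatable][header][row]


def DatatableCurrentGet():
    """Stand-in for VIDA's DatatableCurrentGet."""
    return "Molecules"


def filter_rows(datatable, expression):
    """Evaluate the expression once per row, with ROW set to the row number."""
    namespace = {
        "DatatableData": DatatableData,
        "DatatableCurrentGet": DatatableCurrentGet,
    }
    num_rows = len(next(iter(TABLE[datatable].values())))
    kept = []
    for row in range(num_rows):
        # ROW is supplied as a local variable, as the text describes.
        if eval(expression, namespace, {"ROW": row}):
            kept.append(row)
    return kept


kept_rows = filter_rows(
    "Molecules",
    'int(DatatableData(DatatableCurrentGet(), ROW, "foo")) < 400',
)
```

Only the rows whose foo value is strictly below 400 remain in kept_rows; the row holding exactly 400 is excluded.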

DatatableFromList

void DatatableFromList(unsigned int listid, const std::string &name)

Creates a filtered data table with the specified name with row entries for each object in the specified list. If the specified name already exists, a new name will be used (e.g. name –> name1, name2, and so on).
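The renaming rule can be pictured with a small sketch. The unique_name helper below is hypothetical and only mirrors the name, name1, name2 pattern described above; the exact scheme VIDA uses internally may differ.

```python
def unique_name(name, existing):
    """Return name, or name1, name2, ... if name is already taken.

    Hypothetical helper: 'existing' is the set of names already in use.
    """
    if name not in existing:
        return name
    suffix = 1
    while f"{name}{suffix}" in existing:
        suffix += 1
    return f"{name}{suffix}"


first = unique_name("hits", {"Molecules"})
second = unique_name("hits", {"Molecules", "hits"})
third = unique_name("hits", {"Molecules", "hits", "hits1"})
```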

DatatableGetColumn

std::vector<std::string> DatatableGetColumn(const std::string &datatable,
                                            unsigned int index)

std::vector<std::string> DatatableGetColumn(const std::string &datatable,
                                            const std::string &name)

Returns a list of string representations for each entry in the specified column in the specified data table.

DatatableGetCurrentRow

int DatatableGetCurrentRow(std::string datatable)

Returns the row index for the first active or selected row in the specified data table.

DatatableGetDatatables

std::vector<std::string> DatatableGetDatatables()

Returns a list of the names of all the available datatables.

DatatableGetImageStreamAtRow

bool DatatableGetImageStreamAtRow(const std::string &datatable,
                                  const std::string &filename, int row,
                                  int width=256, int height=256)

Write an image file to the file specified by filename for the datatable named datatable at the row specified by row. The size of the image is specified by width and height.

DatatableGetKeys

std::vector<OEKey> DatatableGetKeys(const std::string &datatable)

Returns a list of keys for the database specified by datatable. The keys are returned in the current datatable sort order and are consistent with the ordering from the call to DatatableGetColumn.
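Because the two calls share the same row order, their results can be paired element by element. The sketch below uses hard-coded stand-in lists in place of the actual return values (real keys are OEKey objects, and the column name IC50 is only an example).

```python
# Stand-in results, as if returned by DatatableGetKeys("Molecules") and
# DatatableGetColumn("Molecules", "IC50"). Real keys are OEKey objects.
keys = ["key-a", "key-b", "key-c"]
ic50_column = ["1.60", "1.40", "1.80"]

# Both lists follow the same sort order, so zip pairs each key with its cell.
ic50_by_key = dict(zip(keys, ic50_column))
```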

DatatableGetNumRows

int DatatableGetNumRows(std::string datatable)

Return the number of rows for the specified datatable.

DatatableHeaders

std::vector<std::string> DatatableHeaders(const std::string &datatable)

Return a list of the names of all the headers for the specified datatable.

DatatableLingoSimSort

bool DatatableLingoSimSort(const std::string &spreadsheet, unsigned int row)

Sorts the specified datatable according to Lingo similarity to the molecule contained in the specified row.

DatatableMolNumberFunction

double DatatableMolNumberFunction(const std::string &datatable, int row,
                                  const std::string &func)

Calculates molecular data that returns a numeric value. datatable identifies the datatable containing the molecule and row is the datatable row with the molecule.

The function func to compute is one of:

  • “mw” Molecular weight.

  • “Num Atoms” Number of atoms in the molecule.

  • “Num Bonds” Number of bonds in the molecule.

  • “Carbon-Hetero ratio” the ratio of carbons to hetero atoms in the molecule.

    Returns -1 if there are no carbons.

  • “Energy” The molecular energy of the molecule as specified in the input file.

  • “Actual Charge” The sum of the partial charges on all atoms as specified in the input file.

  • “Formal Charge” The sum of the formal charges on all atoms.

  • “Halide Count” The number of halogen atoms in the molecule.

  • “Num Carbons” The number of carbon atoms in the molecule.

  • “Num Formal Charges” The number of atoms with a specified formal charge.

  • “Num Heavy Atoms” The number of heavy atoms (non hydrogen) in the molecule.

  • “Num Hetero Atoms” The number of hetero atoms in the molecule.

  • “Num Hydrogens” The number of hydrogen atoms in the molecule.

  • “Num Rigid Bonds” The number of rigid bonds in the molecule.

  • “Nom Rotatable Bonds” The number of rotatable bonds in the molecule.

  • “Num Chiral Atoms” The number of chiral atoms in the molecule.

  • “Num Chiral Bonds” The number of chiral bonds in the molecule.

DatatableMolStringFunction

std::string DatatableMolStringFunction(const std::string &datatable, int row,
                                       const std::string &func)

Calculates molecular data that returns a string value. datatable identifies the datatable containing the molecule and row is the datatable row with the molecule.

The function func to compute is one of:

  • “molformula” the molecular formula for the molecule.

DatatableNumRows

int DatatableNumRows(std::string datatable)

Return the number of rows for the datatable named datatable.

DatatableSetData

void DatatableSetData(const std::string &datatable, unsigned int row,
                      std::string header, std::string value, bool update=true)

Set the cell data for the datatable named datatable at row for the column named header to the string value value.

If update is True, the datatable is updated immediately. When setting many cells in a row, pass False for each call and then call DatatableUpdateContents on this datatable once afterwards.
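The deferred-update pattern can be sketched as follows. Both functions here are stand-ins written for this example so that it runs outside VIDA: they store cells in a dictionary and count refreshes. The single-argument form of DatatableUpdateContents and the column name score are assumptions.

```python
cells = {}
stats = {"refreshes": 0}


def DatatableSetData(datatable, row, header, value, update=True):
    """Stand-in: store the cell and count a refresh when update is True."""
    cells[(datatable, row, header)] = value
    if update:
        stats["refreshes"] += 1


def DatatableUpdateContents(datatable):
    """Stand-in: count one refresh of the whole datatable."""
    stats["refreshes"] += 1


scores = ["0.91", "0.47", "0.33"]
for row, score in enumerate(scores):
    # Defer the refresh while writing each cell.
    DatatableSetData("Molecules", row, "score", score, False)

# Refresh once after all cells have been written.
DatatableUpdateContents("Molecules")
```

With update left at its default, the same loop would trigger one refresh per cell instead of one in total.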

DatatableSetExpression

void DatatableSetExpression(const std::string &datatable,
                            const std::string &col, const std::string &expr)

A datatable expression defines an arbitrary piece of Python code that is called to generate data to be displayed in the datatable.

This function creates a new column named col for the specified datatable using the function defined by expr.

This expression may assume that a local variable ROW is assigned to the row currently being evaluated. For example,

DatatableData(DatatableCurrentGet(), ROW, "foo")

returns the string representation of the data in column foo for the current datatable which, for this function, is always datatable.
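To make the idea of a per-row expression concrete, the sketch below evaluates an expression for every row of a small in-memory table and collects the results as a new column. The TABLE contents, the evaluate_column helper, and the stand-in functions are assumptions for illustration only; in VIDA the evaluation is performed by DatatableSetExpression itself.

```python
# Hypothetical in-memory data standing in for a VIDA data table.
TABLE = {
    "Molecules": {
        "Num Atoms": ["12", "8"],
        "Num Heavy Atoms": ["6", "5"],
    }
}


def DatatableData(datatable, row, header):
    """Stand-in for VIDA's DatatableData: return one cell as a string."""
    return TABLE[datatable][header][row]


def evaluate_column(datatable, expr):
    """Evaluate expr once per row and return the values as strings."""
    namespace = {
        "DatatableData": DatatableData,
        # While the expression runs, the current datatable is 'datatable'.
        "DatatableCurrentGet": lambda: datatable,
    }
    num_rows = len(next(iter(TABLE[datatable].values())))
    return [str(eval(expr, namespace, {"ROW": row})) for row in range(num_rows)]


EXPR = (
    'int(DatatableData(DatatableCurrentGet(), ROW, "Num Atoms")) - '
    'int(DatatableData(DatatableCurrentGet(), ROW, "Num Heavy Atoms"))'
)
hydrogens = evaluate_column("Molecules", EXPR)
```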

DatatableSetRowData

void DatatableSetRowData(const std::string &datatable, unsigned int row,
                         std::vector<std::string> headers,
                         std::vector<std::string> rows, bool update=true)

Set data in the datatable named datatable. row is the row index to set, headers gives the names of the columns to set, and rows gives the string representations of the data for those columns.

Note: headers[i] should be the name of the column for rows[i].

The update parameter is no longer used and is kept for backwards compatibility.
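One way to keep headers and rows aligned is to build both lists from a single mapping. In this sketch DatatableSetRowData is a stand-in that writes into a dictionary, and the column names and values are made up for the example.

```python
table = {}


def DatatableSetRowData(datatable, row, headers, rows, update=True):
    """Stand-in: headers[i] names the column that receives rows[i]."""
    if len(headers) != len(rows):
        raise ValueError("headers and rows must have the same length")
    for header, value in zip(headers, rows):
        table[(datatable, row, header)] = value


record = {"target": "cox2", "IC50": "1.60"}

# Derive both lists from the same mapping so index i always matches.
headers = list(record)
rows = [record[header] for header in headers]

DatatableSetRowData("Molecules", 0, headers, rows)
```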

SpreadsheetAddColumn

void SpreadsheetAddColumn(std::string spreadsheet, std::string column,
                          bool genericData=false)

Add a column named column to the spreadsheet named spreadsheet. If a column with that name already exists in the spreadsheet, the call is ignored.

For the “Atoms” and “Residues” spreadsheets, only certain column names are allowed, and the values in these columns cannot be changed.

For the “Atoms” spreadsheet, the valid column names are:

  • x

  • y

  • z

  • Radius

  • Element

  • Residue_Idx

  • Name

  • oe_atom_Formal_Charge

  • oe_atom_Partial_Charge

  • oe_atom_Idx

  • oe_atom_Map_Idx

  • oe_atom_Isotope

  • oe_atom_Hyb

  • oe_atom_Implicit_H_Count

  • oe_atom_Explicit_H_Count

  • oe_atom_Int_Type

  • oe_atom_Type

  • oe_atom_Aromatic

  • oe_atom_Chiral

  • oe_atom_In_Ring

  • oe_residue_Name

  • oe_residue_Occupancy

  • oe_residue_BFactor

  • oe_residue_Number

  • oe_residue_Serial_Number

  • oe_residue_Model_Number

  • oe_residue_Fragment_Number

  • oe_residue_Secondary_Structure

  • oe_residue_Alternate_Location

  • oe_residue_Chain_ID

  • oe_residue_IsHetAtom

The oe_residue_ and oe_atom_ prefixes are not displayed in the spreadsheet headers, and are omitted when referring to these columns in other functions, such as SpreadsheetData and SpreadsheetRemoveColumn.
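The prefix rule can be sketched with a small helper. display_name is hypothetical and only strips the two prefixes named above; whether VIDA applies any further formatting to the header text is not covered here.

```python
PREFIXES = ("oe_atom_", "oe_residue_")


def display_name(column):
    """Return the column name without its oe_atom_ or oe_residue_ prefix."""
    for prefix in PREFIXES:
        if column.startswith(prefix):
            return column[len(prefix):]
    return column


formal_charge = display_name("oe_atom_Formal_Charge")
chain_id = display_name("oe_residue_Chain_ID")
radius = display_name("Radius")
```

The shortened form is the one to pass to functions such as SpreadsheetData.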

For the “Residues” spreadsheet, the valid column names are:

  • Residue

  • Average BFactor

  • Min Occupancy

  • Max Occupancy

  • Alt Group

  • Num AltConfs

SpreadsheetColumnColorerSet

void SpreadsheetColumnColorerSet(const std::string &spreadsheet,
                                 const std::string &column,
                                 const std::string &colorer, double min,
                                 double max)

Sets a colorer for the specified column in the specified spreadsheet. Valid values for the colorer parameter include:

  • “redtoblue”

  • “bluetored”

  • “rainbow”

  • “reverse rainbow”

  • “redyellowblue”

  • “greytogreen”

SpreadsheetColumnController

void SpreadsheetColumnController()

Open the spreadsheet column controller dialog. The column controller controls the visibility and other aspects of the columns in the current spreadsheet.

SpreadsheetColumnFontGet

std::string SpreadsheetColumnFontGet(const std::string &spreadsheet,
                                     const std::string &column)

Returns the font to be used when rendering text in the specified column of the specified spreadsheet.

SpreadsheetColumnFontSet

void SpreadsheetColumnFontSet(const std::string &spreadsheet,
                              const std::string &column,
                              const std::string &font)

Sets the font to be used when rendering text in the specified column of the specified spreadsheet.

SpreadsheetColumnReadonly

bool SpreadsheetColumnReadonly(std::string spreadsheet, unsigned int col)
bool SpreadsheetColumnReadonly(std::string spreadsheet, unsigned int col,
                               bool readonly)

Gets or sets the read-only state of a spreadsheet column.

Without a readonly value, returns the current read-only state for column col in the spreadsheet named spreadsheet.

With a readonly value specified, column col in the spreadsheet named spreadsheet is set to the state given by readonly.

SpreadsheetColumnSigFigGet

int SpreadsheetColumnSigFigGet(const std::string &spreadsheet,
                               const std::string &column)

Returns the number of significant figures used to display numerical data in the specified column.

SpreadsheetColumnSigFigSet

void SpreadsheetColumnSigFigSet(const std::string &spreadsheet,
                                const std::string &column, int digits)

Sets the number of significant figures used to display numerical data in the specified column.

SpreadsheetCommitChanges

void SpreadsheetCommitChanges(const std::string &spreadsheet)

Commit any unsaved changes to the spreadsheet named spreadsheet.

Normally, this is done automatically behind the scenes, but it is sometimes necessary to call it explicitly to keep the data in the spreadsheet synchronized with the molecules and data in the repository.

SpreadsheetCopy

std::string SpreadsheetCopy(const std::string &spreadsheet)

Copies to the clipboard and returns the currently selected set of cells in the specified spreadsheet.

SpreadsheetCreateColumnExpression

void SpreadsheetCreateColumnExpression()

Open the dialog to create a new column expression for the current spreadsheet. A column expression is a user-defined function that populates a column with arbitrary data.

SpreadsheetCreateFilter

void SpreadsheetCreateFilter()

Open the dialog to generate a new view of the spreadsheet with all rows filtered by an arbitrary Python expression.

SpreadsheetCurrentGet

std::string SpreadsheetCurrentGet()

Returns the name of the current spreadsheet (e.g. ‘Molecules’).

SpreadsheetCurrentSet

bool SpreadsheetCurrentSet(const std::string &spreadsheet)

Sets the current spreadsheet.

SpreadsheetData

std::string SpreadsheetData(const std::string &spreadsheet, unsigned int row,
                            std::string header)

Return the string representation of the data in the spreadsheet named spreadsheet for the row row and the column named header.

SpreadsheetDeleteColumn

void SpreadsheetDeleteColumn(const std::string &column)

Delete the column named column. Note that this deletes the column from all spreadsheets.

SpreadsheetEditableGet

bool SpreadsheetEditableGet(const std::string &ss)

Returns True if the spreadsheet named ss is editable.

SpreadsheetEditableSet

void SpreadsheetEditableSet(const std::string &ss, bool val)

Sets the spreadsheet named ss to be editable if val is True; otherwise sets it to be read-only.

SpreadsheetFilter

void SpreadsheetFilter(const std::string &spreadsheet, std::string filter,
                       std::string name, bool staticView)
void SpreadsheetFilter(const std::string &spreadsheet, const OEDataFilter &p,
                       const std::string &name, bool staticview,
                       bool permaFilter=false)

Create a new spreadsheet view from the spreadsheet named spreadsheet using the filter defined in filter. The new view is named name. filter is an arbitrary Python expression that is applied to every row in the spreadsheet.

This expression may assume that a local variable ROW is assigned to the row currently being evaluated. For example,

SpreadsheetData(SpreadsheetCurrentGet(), ROW, "foo")

returns the string representation of the data in column foo for the current spreadsheet. Note: the current spreadsheet may not be spreadsheet.

SpreadsheetFromList

void SpreadsheetFromList(unsigned int listid, const std::string &name)

Generate a filtered spreadsheet with name name from the list listid.

If a spreadsheet named name already exists, the new spreadsheet is named name1, name2, and so on.

SpreadsheetGetColumn

std::vector<std::string> SpreadsheetGetColumn(const std::string &spreadsheet,
                                              unsigned int index)
std::vector<std::string> SpreadsheetGetColumn(const std::string &spreadsheet,
                                              const std::string &name)

Returns the string representations of every entry in one column of the spreadsheet named spreadsheet. The column is identified either by name or by index.

SpreadsheetGetCurrentRow

int SpreadsheetGetCurrentRow(std::string spreadsheet)

Returns the row index for the first active or selected row for the spreadsheet named spreadsheet.

SpreadsheetGetIDForRow

unsigned int SpreadsheetGetIDForRow(std::string spreadsheet, unsigned int row)

Returns the repository id for the specified row in the spreadsheet named spreadsheet.

SpreadsheetGetImageStreamAtRow

bool SpreadsheetGetImageStreamAtRow(const std::string &spreadsheet,
                                    const std::string &filename, int row,
                                    int width=256, int height=256)

Write an image file to the file specified by filename for the spreadsheet named spreadsheet at the row specified by row. The size of the image is specified by width and height.

SpreadsheetGetKeyForRow

OEPropDB::OEKey SpreadsheetGetKeyForRow(std::string spreadsheet,
                                        unsigned int row)

Returns the OEKey value for the specified row in the spreadsheet named spreadsheet.

SpreadsheetGetNumRows

int SpreadsheetGetNumRows(std::string spreadsheet)

Return the number of rows for the spreadsheet named spreadsheet.

SpreadsheetGetRowForKey

unsigned int SpreadsheetGetRowForKey(std::string spreadsheet, OEPropDB::OEKey)

Returns the row that corresponds to the OEKey key. Returns “4294967295L” (the maximum 32-bit unsigned integer, 2^32 - 1) if key is not in the spreadsheet.
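Since a missing key is reported through a sentinel value rather than an error, callers may want to check for it before using the row. The NOT_FOUND constant and the row_or_none helper below are hypothetical names introduced for this sketch.

```python
# Sentinel returned when the key is not in the spreadsheet: 2**32 - 1.
NOT_FOUND = 0xFFFFFFFF


def row_or_none(row):
    """Map the 'not found' sentinel to None and pass real rows through."""
    return None if row == NOT_FOUND else row


missing = row_or_none(4294967295)
found = row_or_none(3)
```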

SpreadsheetGetSpreadsheets

std::vector<std::string> SpreadsheetGetSpreadsheets()

Return a list of the names for all the currently available spreadsheets.

SpreadsheetHeaders

std::vector<std::string> SpreadsheetHeaders(const std::string &spreadsheet)

Return a list of the names of all the headers for the spreadsheet named spreadsheet.

SpreadsheetHideColumn

void SpreadsheetHideColumn(const std::string &spreadsheet,
                           const std::string &name)

Hide the column named name for the spreadsheet named spreadsheet.

SpreadsheetHideTab

bool SpreadsheetHideTab(std::string name)

Hide the named spreadsheet from the spreadsheet view.

SpreadsheetImport

int SpreadsheetImport(const VFSpreadSplitter &splitter,
                      const std::string &filename,
                      unsigned int matchBy=ImportSpreadsheet::ImportByOrder,
                      const std::string &matchList="",
                      unsigned int importBy=ImportSpreadsheet::ImportAsIs,
                      const std::string &importColumnOrFunction="")

Imports the specified file into the spreadsheet.

SpreadsheetLingoSimSort

void SpreadsheetLingoSimSort(const std::string &spreadsheet, unsigned int row)

Sorts the specified spreadsheet according to Lingo similarity to the molecule in the specified row.

SpreadsheetLoadFilter

void SpreadsheetLoadFilter(const std::string &spreadsheet,
                           const OEDataFilter &filter)
void SpreadsheetLoadFilter(const std::string &spreadsheet,
                           const OEDataFilter &filter, bool wasStatic,
                           bool isStatic)

Creates a new spreadsheet using the specified filter.

SpreadsheetMolNumberFunction

double SpreadsheetMolNumberFunction(const std::string &spreadsheet, int row,
                                    const std::string &func)

Calculates molecular data that returns a numeric value.

spreadsheet identifies the spreadsheet containing the molecule and row is the spreadsheet row with the molecule.

The function func to compute is one of:

  • “mw” Molecular weight.

  • “Num Atoms” Number of atoms in the molecule.

  • “Num Bonds” Number of bonds in the molecule.

  • “Carbon-Hetero ratio” the ratio of carbons to hetero atoms in the molecule.

    Returns -1 if there are no carbons.

  • “Energy” The molecular energy of the molecule as specified in the input file.

  • “Actual Charge” The sum of the partial charges on all atoms as specified in the input file.

  • “Formal Charge” The sum of the formal charges on all atoms.

  • “Halide Count” The number of halogen atoms in the molecule.

  • “Num Carbons” The number of carbon atoms in the molecule.

  • “Num Formal Charges” The number of atoms with a specified formal charge.

  • “Num Heavy Atoms” The number of heavy atoms (non hydrogen) in the molecule.

  • “Num Hetero Atoms” The number of hetero atoms in the molecule.

  • “Num Hydrogens” The number of hydrogen atoms in the molecule.

  • “Num Rigid Bonds” The number of rigid bonds in the molecule.

  • “Nom Rotatable Bonds” The number of rotatable bonds in the molecule.

  • “Num Chiral Atoms” The number of chiral atoms in the molecule.

  • “Num Chiral Bonds” The number of chiral bonds in the molecule.

SpreadsheetMolStringFunction

std::string SpreadsheetMolStringFunction(const std::string &spreadsheet,
                                         int row, const std::string &func)

Calculate molecular data that returns a string value.

spreadsheet identifies the spreadsheet containing the molecule and row is the spreadsheet row with the molecule.

The function func to compute is one of:

  • “molformula” the molecular formula for the molecule.

SpreadsheetMoveColumn

void SpreadsheetMoveColumn(const std::string &spreadsheet,
                           const std::string &columnName, int position)

Moves the specified column to the specified position.

SpreadsheetNumRows

int SpreadsheetNumRows(std::string spreadsheet)

Return the number of rows for the spreadsheet named spreadsheet.

SpreadsheetPromptColumnExpression

std::string SpreadsheetPromptColumnExpression()

This function opens a dialog and returns the expression to evaluate to add the specified column expression.

SpreadsheetPromptExport

std::string SpreadsheetPromptExport(const std::string &filename)

This function opens a dialog and returns the expression to export the specified spreadsheet and columns.

SpreadsheetPromptFilter

std::string SpreadsheetPromptFilter()

This function opens a dialog and returns the expression to evaluate to filter the specified spreadsheet.

SpreadsheetPromptFormat

std::string SpreadsheetPromptFormat()

This function opens a dialog which allows the user to specify the formatting of the spreadsheet.

SpreadsheetPromptGraphemeOpts

std::string SpreadsheetPromptGraphemeOpts()

This function opens a dialog which allows the user to specify how the 2D depiction is rendered. The depiction uses OpenEye’s Grapheme toolkit to apply a property map of computed properties, or of user-defined properties that are attached to the atoms as generic data.

SpreadsheetPromptImport

std::string SpreadsheetPromptImport(const std::string &filename)

This function opens a dialog and returns the expression to import the spreadsheet file filename.

SpreadsheetPromptSort

std::string SpreadsheetPromptSort()

This function opens a dialog and returns the expression to evaluate to sort the specified spreadsheet and columns.

SpreadsheetRemoveTab

bool SpreadsheetRemoveTab(std::string name)

Remove the spreadsheet with name name. The base “Molecules” spreadsheet cannot be removed.

SpreadsheetRowHeightGet

int SpreadsheetRowHeightGet(const std::string &spreadsheet)

Returns the row height for the specified spreadsheet.

SpreadsheetRowHeightSet

void SpreadsheetRowHeightSet(const std::string &spreadsheet, int height)

Sets the default row height for the specified spreadsheet.

SpreadsheetSetData

void SpreadsheetSetData(const std::string &spreadsheet, unsigned int row,
                        std::string header, std::string value, bool update=true)

Set the cell data for the spreadsheet named spreadsheet at row for the column named header to the string value value.

If update is True, the spreadsheet is updated immediately. When setting many cells in a row, pass False for each call and then call SpreadsheetUpdateContents on the spreadsheet once afterwards.

SpreadsheetSetExpression

void SpreadsheetSetExpression(const std::string &spreadsheet,
                              const std::string &col, const std::string &expr)

A spreadsheet expression defines an arbitrary piece of Python code that is called to generate data to be displayed in the spreadsheet.

SpreadsheetSetExpression creates a new column named col for the spreadsheet named spreadsheet using the function defined by expr.

This expression may assume that a local variable ROW is assigned to the row currently being evaluated. For example,

SpreadsheetData(SpreadsheetCurrentGet(), ROW, "foo")

returns the string representation of the data in column foo for the current spreadsheet which, for this function, is always spreadsheet.

SpreadsheetSetRowData

void SpreadsheetSetRowData(const std::string &spreadsheet, unsigned int row,
                           std::vector<std::string> headers,
                           std::vector<std::string> rows, bool update=true)

Set data in the spreadsheet named spreadsheet. row is the row index to set, headers gives the names of the columns to set, and rows gives the string representations of the data for those columns.

Note: headers[i] should be the name of the column for rows[i].

If update is True, the spreadsheet is updated immediately. When setting many rows, pass False for each call and then call SpreadsheetUpdateContents on the spreadsheet once afterwards.

SpreadsheetShowAllColumns

void SpreadsheetShowAllColumns(const std::string &spreadsheet)

Make all columns visible for the spreadsheet named spreadsheet.

SpreadsheetShowAllTabs

void SpreadsheetShowAllTabs()

Make all spreadsheets visible.

SpreadsheetShowColumn

void SpreadsheetShowColumn(const std::string &spreadsheet, int section,
                           bool visible=true)
void SpreadsheetShowColumn(const std::string &spreadsheet,
                           const std::string &name, bool visible=true)

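Show or hide a column in the spreadsheet named spreadsheet. The column is identified either by name or by its section index. The column is made visible if visible is True and hidden otherwise.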

SpreadsheetShowStats

void SpreadsheetShowStats(const std::string &spreadsheet, bool show)

If show is true, show the column statistics for the spreadsheet named spreadsheet. Otherwise, hide the statistics.

SpreadsheetShowStatsGet

bool SpreadsheetShowStatsGet(const std::string &spreadsheet)

Returns True if the statistics window is shown for the spreadsheet named spreadsheet, False otherwise.

SpreadsheetShowStatsSet

void SpreadsheetShowStatsSet(const std::string &spreadsheet, bool show)

If show is True, the statistics window for spreadsheet is shown. Otherwise it is hidden.

SpreadsheetShowTab

bool SpreadsheetShowTab(std::string name)

Ensure that the spreadsheet named name is shown as a spreadsheet selection in the spreadsheet window.

SpreadsheetSort

void SpreadsheetSort(const std::string &spreadsheet,
                     const std::vector<std::string> &columns,
                     const std::vector<int> &directions,
                     bool moveToFirst=true)

Sort spreadsheet by the specified columns and directions. A direction of 1 sorts the corresponding column in ascending order; a direction of 2 sorts it in descending order.

If moveToFirst is True then the sorted columns will be placed first in the spreadsheet.

Example:

SpreadsheetSort( "Molecules", ["target", "IC50"], [1,2] )

This would sort the molecules by target ascending and then by IC50 descending:

target | IC50
cox1   | 1.80
cox1   | 1.67
cox2   | 1.60
cox2   | 1.40
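A two-key sort with these direction codes can be sketched in plain Python. The sort_rows and sort_key helpers and the row dictionaries are assumptions made for this example; they illustrate the ordering only and say nothing about how VIDA implements SpreadsheetSort.

```python
ASCENDING, DESCENDING = 1, 2


def sort_key(cell):
    """Compare numeric-looking cells as numbers, everything else as text."""
    try:
        return (0, float(cell))
    except ValueError:
        return (1, cell)


def sort_rows(rows, columns, directions):
    """Sort by several columns; the first column has the highest priority."""
    ordered = list(rows)
    # Python's sort is stable, so sorting by the last key first and the
    # first key last yields the combined multi-column ordering.
    for column, direction in reversed(list(zip(columns, directions))):
        ordered.sort(
            key=lambda r: sort_key(r[column]),
            reverse=(direction == DESCENDING),
        )
    return ordered


rows = [
    {"target": "cox2", "IC50": "1.40"},
    {"target": "cox1", "IC50": "1.67"},
    {"target": "cox2", "IC50": "1.60"},
    {"target": "cox1", "IC50": "1.80"},
]
ordered = sort_rows(rows, ["target", "IC50"], [ASCENDING, DESCENDING])
result = [(r["target"], r["IC50"]) for r in ordered]
```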