
# Data Analysis Functions

The functions in this section provide access to the data analysis and spreadsheet functionality in VIDA. Note the distinction between the two families: the “Datatable” functions operate on internal data collections, while the “Spreadsheet” functions operate on the tabular views of that data that appear in the Spreadsheet display.

## DataAdd

void DataAdd(const OEPropDB::OEKey &key, const std::string &str,
const std::string &data)

Adds a piece of data to the object specified by key. If a column in the spreadsheet does not already exist for this data type, it will be added.

## DataGetDB

VFDataBase *DataGetDB()

Returns the VFDataBase object.

## DataGetTable

VFDataTable *DataGetTable(const std::string &name, bool throwError=true)

Return the list of all internal databases.

bool genericData=false)

Add a column to the specified data table with the specified name. If a column with the specified name already exists, this call will be ignored.

## DatatableCommitChanges

void DatatableCommitChanges(const std::string &datatable)

Commits any outstanding changes to the internal database.

## DatatableCurrentGet

std::string DatatableCurrentGet()

Returns the name of the current data table (e.g. ‘Molecules’).

## DatatableCurrentSet

bool DatatableCurrentSet(const std::string &datatable)

Sets the current data table.

## DatatableData

std::string DatatableData(const std::string &datatable, unsigned int row,

Returns the string representation of the data in the specified data table at the specified row and column.

## DatatableDeleteColumn

void DatatableDeleteColumn(const std::string &datatable,
const std::string &name)

Delete the specified column from the specified data table.

## DatatableEditableGet

bool DatatableEditableGet(const std::string &ss)

Returns whether or not the specified data table is editable.

## DatatableEditableSet

void DatatableEditableSet(const std::string &ss, bool val)

Sets whether or not the specified data table is editable.

## DatatableFilter

void DatatableFilter(const std::string &datatable, const std::string &filter,
const std::string &name, bool staticView=true,
bool permaFilter=false)
void DatatableFilter(const std::string &datatable, const OEDataFilter &filter,
const std::string &name, bool staticView=true,
bool permaFilter=false)

Creates a new data table view with the specified name from the specified data table using the specified filter. The filter is an arbitrary Python expression that is applied to every row in the source data table to determine whether or not it should be included in the filtered view.

The filter expression may assume that a local variable ROW is set to the current row number being evaluated. An example appears below:

int(DatatableData(DatatableCurrent(), ROW, foo)) < 400

In this example, the string representation of the data in the current data table (which is not necessarily the specified datatable) at row ROW and column foo is converted to an integer and compared with 400. If the expression is True for the row being tested, the row is included in the filtered view; otherwise it is excluded.
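To make the per-row evaluation concrete, here is a minimal sketch in plain Python that runs outside VIDA. The `TABLE` contents, the `datatable_data` stand-in, and the `rows_passing` helper are all hypothetical; they only mimic the behaviour described above and are not part of the VIDA API.

```python
# Hypothetical in-memory stand-in for a data table (not VIDA's storage).
TABLE = {
    "Molecules": [
        {"foo": "120"},
        {"foo": "455"},
        {"foo": "399"},
    ]
}


def datatable_data(datatable, row, column):
    """Stand-in for DatatableData: return the cell contents as a string."""
    return TABLE[datatable][row][column]


def rows_passing(datatable, expression):
    """Evaluate the filter expression once per row with ROW bound to the row index."""
    namespace = {"DatatableData": datatable_data}
    kept = []
    for row in range(len(TABLE[datatable])):
        # ROW is supplied as a local variable, as the text above describes.
        if eval(expression, namespace, {"ROW": row}):
            kept.append(row)
    return kept


selected = rows_passing(
    "Molecules", 'int(DatatableData("Molecules", ROW, "foo")) < 400'
)
print(selected)
```

Rows whose expression evaluates to a true value are kept; all others are left out of the filtered view.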

## DatatableFromList

void DatatableFromList(unsigned int listid, const std::string &name)

Creates a filtered data table with the specified name with row entries for each object in the specified list. If the specified name already exists, a new name will be used (e.g. name –> name1, name2, and so on).
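The renaming rule can be sketched as follows. This is an illustration only: the `unique_name` helper is hypothetical, and the assumption that the first free numeric suffix is chosen is not confirmed by the text above.

```python
def unique_name(name, existing):
    """Return name if it is unused, otherwise name1, name2, ... (first free suffix)."""
    if name not in existing:
        return name
    suffix = 1
    while f"{name}{suffix}" in existing:
        suffix += 1
    return f"{name}{suffix}"


# Hypothetical set of data table names that already exist.
existing_tables = {"Molecules", "hits", "hits1"}
print(unique_name("hits", existing_tables))
print(unique_name("leads", existing_tables))
```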

## DatatableGetColumn

std::vector<std::string> DatatableGetColumn(const std::string &datatable,
unsigned int index)
std::vector<std::string> DatatableGetColumn(const std::string &datatable,
const std::string &name)

Returns a list of string representations for each entry in the specified column in the specified data table.

## DatatableGetCurrentRow

int DatatableGetCurrentRow(std::string datatable)

Returns the row index for the first active or selected row in the specified data table.

## DatatableGetDatatables

std::vector<std::string> DatatableGetDatatables()

Returns a list of the names of all the available datatables.

## DatatableGetImageStreamAtRow

bool DatatableGetImageStreamAtRow(const std::string &datatable,
const std::string &filename, int row,
int width=256, int height=256)

Write an image file to the file specified by filename for the datatable named datatable at the row specified by row. The size of the image is specified by width and height.

## DatatableGetKeys

std::vector<OEKey> DatatableGetKeys(const std::string &datatable)

Returns a list of keys for the database specified by datatable. The keys are returned in the current datatable sort order and are consistent with the ordering from the call to DatatableGetColumn.

## DatatableGetNumRows

int DatatableGetNumRows(std::string datatable)

Return the number of rows for the specified datatable.

Return a list of all the names of all the headers for the specified datatable.

## DatatableLingoSimSort

bool DatatableLingoSimSort(const std::string &spreadsheet, unsigned int row)

Sorts the specified datatable according to Lingo similarity to the molecule contained in the specified row.

## DatatableMolNumberFunction

double DatatableMolNumberFunction(const std::string &datatable, int row,
const std::string &func)

Calculates molecular data that returns a numeric value. datatable identifies the datatable containing the molecule and row is the datatable row with the molecule.

The function func to compute is one of:

• “mw” The molecular weight.
• “Num Atoms” The number of atoms in the molecule.
• “Num Bonds” The number of bonds in the molecule.
• “Carbon-Hetero ratio” The ratio of carbons to hetero atoms in the molecule. Returns -1 if there are no carbons.
• “Energy” The molecular energy of the molecule as specified in the input file.
• “Actual Charge” The sum of the partial charges on all atoms as specified in the input file.
• “Formal Charge” The sum of the formal charges on all atoms.
• “Halide Count” The number of halogen atoms in the molecule.
• “Num Carbons” The number of carbon atoms in the molecule.
• “Num Formal Charges” The number of atoms with a specified formal charge.
• “Num Heavy Atoms” The number of heavy (non-hydrogen) atoms in the molecule.
• “Num Hetero Atoms” The number of hetero atoms in the molecule.
• “Num Hydrogens” The number of hydrogen atoms in the molecule.
• “Num Rigid Bonds” The number of rigid bonds in the molecule.
• “Nom Rotatable Bonds” The number of rotatable bonds in the molecule.
• “Num Chiral Atoms” The number of chiral atoms in the molecule.
• “Num Chiral Bonds” The number of chiral bonds in the molecule.

## DatatableMolStringFunction

std::string DatatableMolStringFunction(const std::string &datatable, int row,
const std::string &func)

Calculates molecular data that returns a string value. datatable identifies the datatable containing the molecule and row is the datatable row with the molecule.

The function func to compute is one of:

• “molformula” The molecular formula of the molecule.

## DatatableNumRows

int DatatableNumRows(std::string datatable)

Return the number of rows for the datatable named datatable.

## DatatableSetData

void DatatableSetData(const std::string &datatable, unsigned int row,
std::string header, std::string value, bool update=true)

Set the cell data for the datatable named datatable at row for the column named header to the string value value.

If update is True, the datatable is updated immediately. When making many changes in a row, set update to False and call DatatableUpdateContents on the datatable once afterwards.
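The deferred-update pattern can be sketched as below. The two functions defined here are hypothetical stand-ins so that the example runs outside VIDA; in particular, the assumption that DatatableUpdateContents takes the datatable name as its only argument is not confirmed by this section.

```python
refresh_count = 0  # how many times the (simulated) table was refreshed
cells = {}         # simulated cell storage keyed by (datatable, row, header)


def DatatableSetData(datatable, row, header, value, update=True):
    """Hypothetical stand-in: store the value and refresh only if update is True."""
    global refresh_count
    cells[(datatable, row, header)] = value
    if update:
        refresh_count += 1


def DatatableUpdateContents(datatable):
    """Hypothetical stand-in for the refresh call named in the text above."""
    global refresh_count
    refresh_count += 1


new_values = ["active", "inactive", "active"]
for row, value in enumerate(new_values):
    # Defer the refresh while the batch of edits is applied.
    DatatableSetData("Molecules", row, "status", value, update=False)

# One refresh after all edits instead of one per cell.
DatatableUpdateContents("Molecules")
print(refresh_count)
```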

## DatatableSetExpression

void DatatableSetExpression(const std::string &datatable,
const std::string &col, const std::string &expr)

A datatable expression is an arbitrary piece of Python code that is called to generate the data displayed in the datatable.

This function creates a new column named col for the specified datatable using the function defined by expr.

This expression may assume that a local variable ROW is assigned to the row currently being evaluated. For example,

DatatableData(DatatableCurrent(), ROW, foo)

returns the string representation of the data in column foo for the current datatable which, for this function, is always datatable.

## DatatableSetRowData

void DatatableSetRowData(const std::string &datatable, unsigned int row,
std::vector<std::string> rows, bool update=true)

Set data in the datatable named datatable. row is the row index to set. headers defines the names of the columns to set and rows is the string representation of the data.

Note: headers[i] should be the name of the column for rows[i].

The update parameter is no longer used and is kept for backwards compatibility.
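The pairing of column names and values by position can be sketched as follows. The `pair_row_values` helper and the sample column names are hypothetical and serve only to show that the i-th header names the column for the i-th value.

```python
def pair_row_values(headers, rows):
    """Map each column name to its value, checking that the two lists line up."""
    if len(headers) != len(rows):
        raise ValueError("headers and rows must have the same length")
    return dict(zip(headers, rows))


# Hypothetical column names and string values for a single row.
headers = ["Name", "IC50", "target"]
rows = ["aspirin", "1.67", "cox1"]
print(pair_row_values(headers, rows))
```

Checking the lengths before making the call is a cheap way to catch a value that would otherwise land in the wrong column.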

bool genericData=false)

For the “Atoms” and “Residues” spreadsheets, only certain column names are allowed, and the values in these columns cannot be changed.

For the “Atoms” spreadsheet, the valid column names are:

• x
• y
• z
• Element
• Residue_Idx
• Name
• oe_atom_Formal_Charge
• oe_atom_Partial_Charge
• oe_atom_Idx
• oe_atom_Map_Idx
• oe_atom_Isotope
• oe_atom_Hyb
• oe_atom_Implicit_H_Count
• oe_atom_Explicit_H_Count
• oe_atom_Int_Type
• oe_atom_Type
• oe_atom_Aromatic
• oe_atom_Chiral
• oe_atom_In_Ring
• oe_residue_Name
• oe_residue_Occupancy
• oe_residue_BFactor
• oe_residue_Number
• oe_residue_Serial_Number
• oe_residue_Model_Number
• oe_residue_Fragment_Number
• oe_residue_Secondary_Structure
• oe_residue_Alternate_Location
• oe_residue_Chain_ID
• oe_residue_IsHetAtom

The oe_residue_ and oe_atom_ prefixes are not displayed in the spreadsheet headers, and are omitted when referring to these columns in other functions, such as SpreadsheetData and SpreadsheetRemoveColumn.
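As a small illustration, the following sketch derives the name to use in other functions from a full column name by dropping the prefix. The `displayed_header` helper is hypothetical, and the sketch assumes that nothing other than the prefix changes.

```python
PREFIXES = ("oe_atom_", "oe_residue_")


def displayed_header(column_name):
    """Drop a leading oe_atom_ or oe_residue_ prefix, if present."""
    for prefix in PREFIXES:
        if column_name.startswith(prefix):
            return column_name[len(prefix):]
    return column_name


for name in ["x", "Element", "oe_atom_Formal_Charge", "oe_residue_Chain_ID"]:
    print(name, "->", displayed_header(name))
```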

For the “Residues” spreadsheet, the valid column names are:

• Residue
• Average BFactor
• Min Occupancy
• Max Occupancy
• Alt Group
• Num AltConfs

const std::string &column,
const std::string &colorer, double min,
double max)

Sets a colorer for the specified column in the specified spreadsheet. Valid values for the colorer parameter include:

• “redtoblue”
• “bluetored”
• “rainbow”
• “reverse rainbow”
• “redyellowblue”
• “greytogreen”

Opens the spreadsheet column controller dialog. The column controller allows control of the visibility and other aspects of the current spreadsheet's columns.

const std::string &column)

Returns the font to be used when rendering text in the specified column of the specified spreadsheet.

const std::string &column,
const std::string &font)

Sets the font to be used when rendering text in the specified column of the specified spreadsheet.

const std::string &column)

Returns the number of significant figures used to display numerical data in the specified column.

const std::string &column, int digits)

Sets the number of significant figures used to display numerical data in the specified column.
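To illustrate what a significant-figure setting means for displayed values, the sketch below formats numbers with Python's general format specifier. It does not reproduce VIDA's own rendering, and the `format_significant` helper is hypothetical.

```python
def format_significant(value, digits):
    """Format value with the given number of significant figures."""
    return f"{value:.{digits}g}"


print(format_significant(3.14159, 3))
print(format_significant(0.0123456, 3))
print(format_significant(42.0, 4))
```

Note that the `g` format drops trailing zeros and switches to exponent notation for large magnitudes, which may differ from how the spreadsheet displays the same value.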

Normally, this is done automatically behind the scenes, but an explicit call is sometimes necessary to keep the data in the spreadsheet synchronized with the molecules and data in the repository.

Copies to the clipboard and returns the currently selected set of cells in the specified spreadsheet.

Open the dialog to create a new column expression for the current spreadsheet. A column expression is a user-defined function that populates a column with arbitrary data.

Opens the dialog to generate a new view of the spreadsheet with all rows filtered by an arbitrary Python expression.

Returns the name of the current spreadsheet (e.g. ‘Molecules’).

Return the string representation of the data in the spreadsheet named spreadsheet for the row row and the column named header.

Deletes the column named column. Note that this deletes the column from all spreadsheets.

Returns True if spreadsheet is editable.

void SpreadsheetEditableSet(const std::string &ss, bool val)

std::string name, bool staticView)
const std::string &name, bool staticview,
bool permaFilter=false)

Creates a new spreadsheet view named name from the spreadsheet named spreadsheet, using the filter defined in filter. filter is an arbitrary Python expression that is applied to every row in the spreadsheet.

This expression may assume that a local variable ROW is assigned to the row currently being evaluated. e.g.

returns the string representation of the data in column foo for the current spreadsheet. Note: the current spreadsheet may not be spreadsheet.

void SpreadsheetFromList(unsigned int listid, const std::string &name)

Generate a filtered spreadsheet with name name from the list listid.

If a spreadsheet named name already exists, the new spreadsheet is named name1, name2, and so on.

unsigned int index)
const std::string &name)

Returns string representations of every entry in a column of the spreadsheet named spreadsheet. The column is identified either by name or by index.

Returns the row index for the first active or selected row for the spreadsheet named spreadsheet.

Returns the repository id for the first active or selected row for the spreadsheet named spreadsheet.

const std::string &filename, int row,
int width=256, int height=256)

Write an image file to the file specified by filename for the spreadsheet named spreadsheet at the row specified by row. The size of the image is specified by width and height.

unsigned int row)

Returns the OEKey value for the first active or selected row for the spreadsheet named spreadsheet.

Returns the row that is generated from OEKey key. Returns “4294967295L” if key is not in the spreadsheet.
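The value 4294967295 is 2^32 - 1, the largest unsigned 32-bit integer, so a caller should test for it before using the result as a row index. A minimal sketch of such a check follows; the `row_or_none` helper and the `NOT_FOUND` name are hypothetical.

```python
NOT_FOUND = 0xFFFFFFFF  # 4294967295, i.e. 2**32 - 1


def row_or_none(row):
    """Translate the not-found sentinel into None; pass real row indices through."""
    return None if row == NOT_FOUND else row


print(row_or_none(12))
print(row_or_none(4294967295))
```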

Return a list of the names for all the currently available spreadsheets.

const std::string &name)

const std::string &filename,
const std::string &matchList="",
const std::string &importColumnOrFunction="")

Imports the specified file into the spreadsheet.

Sorts the specified spreadsheet according to Lingo similarity to the molecule in the specified row.

const OEDataFilter &filter)
const OEDataFilter &filter, bool wasStatic,
bool isStatic)

Creates a new spreadsheet using the specified filter.

const std::string &func)

Calculates molecular data that returns a numeric value.

The function func to compute is one of:

• “mw” The molecular weight.
• “Num Atoms” The number of atoms in the molecule.
• “Num Bonds” The number of bonds in the molecule.
• “Carbon-Hetero ratio” The ratio of carbons to hetero atoms in the molecule. Returns -1 if there are no carbons.
• “Energy” The molecular energy of the molecule as specified in the input file.
• “Actual Charge” The sum of the partial charges on all atoms as specified in the input file.
• “Formal Charge” The sum of the formal charges on all atoms.
• “Halide Count” The number of halogen atoms in the molecule.
• “Num Carbons” The number of carbon atoms in the molecule.
• “Num Formal Charges” The number of atoms with a specified formal charge.
• “Num Heavy Atoms” The number of heavy (non-hydrogen) atoms in the molecule.
• “Num Hetero Atoms” The number of hetero atoms in the molecule.
• “Num Hydrogens” The number of hydrogen atoms in the molecule.
• “Num Rigid Bonds” The number of rigid bonds in the molecule.
• “Nom Rotatable Bonds” The number of rotatable bonds in the molecule.
• “Num Chiral Atoms” The number of chiral atoms in the molecule.
• “Num Chiral Bonds” The number of chiral bonds in the molecule.

int row, const std::string &func)

Calculate molecular data that returns a string value.

The function func to compute is one of:

• “molformula” The molecular formula of the molecule.

const std::string &columnName, int position)

Moves the specified column to the specified position.

This function opens a dialog and returns the expression to evaluate to add the specified column expression.

This function opens a dialog and returns the expression to export the specified spreadsheet and columns.

This function opens a dialog and returns the expression to evaluate to filter the specified spreadsheet.

This function opens a dialog which allows the user to specify the formatting of the spreadsheet.

This function opens a dialog which allows the user to specify how the 2D depiction is rendered. The rendering uses OpenEye’s Grapheme toolkit to apply a property map of computed properties, or of user-defined properties that are tagged to the atoms using generic data.

This function opens a dialog and returns the expression to import spreadsheet filename.

This function opens a dialog and returns the expression to evaluate to sort the specified spreadsheet and columns.

Remove the spreadsheet with name name. The base “Molecules” spreadsheet cannot be removed.

Returns the row height for the specified spreadsheet.

Sets the default row height for the specified spreadsheet.

std::string header, std::string value, bool update=true)

Set the cell data for the spreadsheet named spreadsheet at row for the column named header to the string value value.

If update is True, the spreadsheet is updated immediately. When making many changes in a row, set update to False and call SpreadsheetUpdateContents on the spreadsheet once afterwards.

const std::string &col, const std::string &expr)

A spreadsheet expression is an arbitrary piece of Python code that is called to generate the data displayed in the spreadsheet.

SpreadsheetSetExpression creates a new column named col for the spreadsheet named spreadsheet using the function defined by expr.

This expression may assume that a local variable ROW is assigned to the row currently being evaluated. e.g.

returns the string representation of the data in column foo for the current spreadsheet which, for this function, is always spreadsheet.

std::vector<std::string> rows, bool update=true)

Set data in the spreadsheet named spreadsheet. row is the row index to set. headers defines the names of the columns to set and rows is the string representation of the data.

Note: headers[i] should be the name of the column for rows[i].

If update is True, the spreadsheet is updated immediately. When making many changes in a row, set update to False and call SpreadsheetUpdateContents(ss) once afterwards.

bool visible=true)
const std::string &name, bool visible=true)

If show is True, shows the column statistics for the spreadsheet named spreadsheet. Otherwise, hides the statistics.

Returns True if the statistics window for spreadsheet is shown, False otherwise.

If show is True, the statistics window for spreadsheet is shown. Otherwise it is hidden.

const std::vector<std::string> &columns,
const std::vector<int> &directions,
bool moveToFirst=true)

Sorts spreadsheet by the specified columns and directions. A direction of 1 sorts the corresponding column in ascending order; a direction of 2 sorts it in descending order.

If moveToFirst is True, the sorted columns are placed first in the spreadsheet.

Example:

SpreadsheetSort("Molecules", ["target", "IC50"], [1, 2])

| target | IC50 |
|--------|------|
| cox2   | 1.60 |
| cox2   | 1.40 |
| cox1   | 1.80 |
| cox1   | 1.67 |

This would sort the molecules by target ascending and then by IC50 descending.
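The multi-column ordering can be sketched in plain Python as follows. This is not VIDA's implementation: the `sort_rows` helper and the sample rows are hypothetical, and the sketch only applies the direction codes as described (1 for ascending, 2 for descending), with the first listed column as the primary key.

```python
ASCENDING, DESCENDING = 1, 2


def sort_rows(rows, columns, directions):
    """Sort dict rows by several columns; the first column is the primary key."""
    ordered = list(rows)
    # Python's sort is stable, so sorting by the last key first and the
    # primary key last yields a correct multi-key ordering.
    for column, direction in reversed(list(zip(columns, directions))):
        ordered.sort(key=lambda r: r[column], reverse=(direction == DESCENDING))
    return ordered


# Hypothetical sample rows, deliberately given in unsorted order.
molecules = [
    {"target": "cox2", "IC50": 1.40},
    {"target": "cox1", "IC50": 1.67},
    {"target": "cox2", "IC50": 1.60},
    {"target": "cox1", "IC50": 1.80},
]

result = sort_rows(molecules, ["target", "IC50"], [ASCENDING, DESCENDING])
for row in result:
    print(row["target"], row["IC50"])
```

Sorting in reverse key order and relying on sort stability avoids building a composite key, which would be awkward when ascending and descending columns are mixed.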