Analyze Page

The Analyze page comprises multiple features, including a molecular spreadsheet, graphing capabilities, 2D and 3D viewing, subset creation, filtering and tagging, and numerous cheminformatics functions, such as substructure search, R-group decomposition, and property calculation. A regression feature lets users add a multilinear best-fit prediction of a real-valued property computed from other properties. Floes that calculate new properties are also available from this page.
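To make the idea of a multilinear best-fit prediction concrete, the following is a minimal sketch of an ordinary least-squares fit in plain Python. It is not the page's implementation: the function names, the two descriptor columns, and the sample values are all hypothetical, and the page may use a different fitting procedure.

```python
def fit_multilinear(rows, targets):
    """Ordinary least squares: target ~ b0 + b1*x1 + ... + bk*xk.

    Illustrative only; names and data are assumptions, not the product's API.
    """
    # Design matrix with a leading 1.0 for the intercept term.
    X = [[1.0] + [float(v) for v in row] for row in rows]
    n = len(X[0])
    # Normal equations (X^T X) b = X^T y, stored as an augmented matrix.
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(n)]
        + [sum(x[i] * y for x, y in zip(X, targets))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("descriptor columns are linearly dependent")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(n):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][n] / A[i][i] for i in range(n)]


def predict(coeffs, row):
    """Apply the fitted coefficients to one row of descriptor values."""
    return coeffs[0] + sum(c * v for c, v in zip(coeffs[1:], row))


# Hypothetical data: two descriptor columns and one real-valued property.
rows = [(1, 0), (0, 1), (1, 1), (2, 1), (3, 5)]
targets = [3, -2, 0, 2, -8]

coeffs = fit_multilinear(rows, targets)  # intercept first, then one slope per column
prediction = predict(coeffs, (2, 2))
```

The predicted values would correspond to the new best-fit column that the regression feature adds alongside the existing properties.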

One pane is devoted to presenting information on recent floes. All panes can be moved, removed, resized, or duplicated. There are two views: a spreadsheet view and a 3D analysis view.

The graphing feature can represent five different variables at once, some categorical and some continuous, through point shape, size, and color together with the x and y values. Scatter plots, box plots, histograms, heatmaps, and violin plots are available, where appropriate, for the selected data type(s). Multiple graphs can be created. Points can be selected by lasso or by point-and-click. Depictions of structures can be accessed for any selected points. The graph can be moved, panned, or reset. Scales can be switched between linear and log forms. Regression lines (with error bars) can be added. Box plots contain quartile ranges and other annotations.

_images/analyze_point_and_row.png

Figure 1. When a point in the plot is selected, the spreadsheet navigates to the row representing that point. You can also select a row in the spreadsheet and see the corresponding point identified in the plot.
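As an illustration of the quartile ranges that box plots display, the sketch below computes a five-number style summary with Python's standard library. The quartile method ("inclusive") and the 1.5 x IQR whisker rule are assumptions chosen for the example; the documentation does not state which conventions the graphing panel uses, and the function name is hypothetical.

```python
from statistics import quantiles


def box_plot_summary(values):
    """Quartiles, whisker ends, and outliers for one column of numbers.

    Assumes the common Tukey convention (whiskers at 1.5 * IQR); the
    actual plot annotations may follow a different rule.
    """
    data = sorted(values)
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    # Whiskers end at the most extreme data points inside the fences.
    inside = [v for v in data if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": [v for v in data if v < low_fence or v > high_fence],
    }


# Hypothetical property values with one extreme point.
summary = box_plot_summary([1, 2, 3, 4, 5, 6, 7, 8, 30])
```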

Spreadsheet

The spreadsheet supports typical actions: sorting, pinning, reordering, or deleting columns, and indicating selected rows. Pop-up views of structures are also available. A data-handling panel controls the spreadsheet data: it allows columns to be turned on or off and lets users choose a style for treating imprecise data (for example, ranges, inequalities, or multiple values).
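To show what treating imprecise data can involve, here is a small sketch that classifies a cell as a range, an inequality, a set of multiple values, or an exact number, and then reduces it to a single number according to a chosen style. The cell syntax ("5-10", "<10", "1.5, 2.5"), the style names, and both function names are assumptions made for illustration; they are not taken from the data-handling panel itself.

```python
import re
from statistics import mean

# Matches "low-high" where either bound may itself be negative.
_RANGE = re.compile(r"\s*(-?\d+(?:\.\d+)?)\s*-\s*(-?\d+(?:\.\d+)?)\s*")


def parse_imprecise(cell):
    """Return (kind, values) for a hypothetical imprecise cell string."""
    text = cell.strip()
    if text.startswith(("<", ">")):
        # Inequality such as "<10" or ">=3": keep the bound only.
        return "inequality", [float(text.lstrip("<>= "))]
    if "," in text:
        return "multiple", [float(part) for part in text.split(",")]
    match = _RANGE.fullmatch(text)
    if match:
        return "range", [float(match.group(1)), float(match.group(2))]
    return "exact", [float(text)]


def to_number(cell, style="mean"):
    """Collapse a cell to one number using an assumed style name."""
    reducers = {"mean": mean, "min": min, "max": max}
    _, values = parse_imprecise(cell)
    return reducers[style](values)


midpoint = to_number("5-10")                    # range reduced by its mean
bound = to_number("<10")                        # inequality reduced to its bound
largest = to_number("1.5, 2.5, 5", style="max")
```

Whichever style is chosen, the point is that one rule is applied consistently to a column so that sorting and plotting operate on comparable numbers.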

JSON data fields are viewable in the spreadsheet.

Categorical string fields can be filtered using a “free text” setting. A free-text filter matches every distinct categorical value that contains the provided string, so a single filter can cover several categories.
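The behavior of a free-text filter can be sketched as a substring match over a categorical column. The row layout, the column name, and the case-insensitive comparison below are assumptions for illustration; the page's actual matching rules are not specified here.

```python
def free_text_filter(rows, column, query):
    """Keep rows whose categorical value contains the query string.

    Case-insensitive matching is an assumption of this sketch.
    """
    needle = query.casefold()
    return [row for row in rows if needle in str(row.get(column, "")).casefold()]


# Hypothetical spreadsheet rows with one categorical string column.
rows = [
    {"id": "mol-1", "target_class": "kinase inhibitor"},
    {"id": "mol-2", "target_class": "Kinase activator"},
    {"id": "mol-3", "target_class": "GPCR agonist"},
    {"id": "mol-4", "target_class": "protease inhibitor"},
]

# One query matches two distinct categorical values.
matched_ids = [row["id"] for row in free_text_filter(rows, "target_class", "inhib")]
```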

_images/analyze_copy_cell.png

Figure 2. Spreadsheet cells can be copied to the system clipboard.

Components of the Analyze Page

Analyze Page Layout

Name                   Description

Graphing Panel         Allows users to create plots. The plot type (scatter, box plot, heatmap,
                       histogram, or violin) can be selected from the Plot drop-down menu.

Spreadsheet            Displays imported data, molecular structures, and images. Users can append
                       calculated properties, such as PSA and XLogP.

Data & Columns Panel   Allows users to compute molecular properties, perform R-group decomposition,
                       and predict properties using a regression model.

Layout Drop-down       Enables 3D viewing through the Analyze with 3D mode.

Job History            Displays a summary of running and recent jobs.

Hint

Saved Views allow users to save the current state of the Analyze page, switch to another page or task, and then come back to exactly the same state. The Data Handling control allows users to apply column settings to all data columns of the same name.