Tutorial for Floe Report Summary Statistics

This tutorial addresses the statistics and output produced by model building floes and describes how to analyze each type. Analysis for these floes differs substantially from that of other floe packages. This tutorial shows generalized data; details specific to each floe are covered in the individual tutorials.

Types of data acquired for model building and prediction floes are:


  • Output analysis

  • Classification of performance metrics

  • Interesting molecules

  • Classification of hERG class prediction analysis

Model Building Analysis

These summary statistics are important for the model building floes:

  • Histograms of regression-based machine learning models

  • Summaries of model hyperparameters on the validation set

  • Heat maps for the hyperparameters most sensitive to MAE

  • Neural network epoch training plots

  • Score-sorted lists of the fully-connected neural network models generated

  • NN classification models and LIME analysis

  • NN hyperparameters and fingerprint hyperparameters

  • Regression outputs

  • Outlier prediction

After the floe has finished, click the link on the Floe Report tab in your window to preview the report. The report is large, so it may take a while to load. If this is the case, try refreshing or popping the report into a new window. All results reported in the Floe Report are on the validation data.

The histograms below summarize statistics on the whole input data. They show the spread of data for each property of the input molecules. For well-built models, the histograms should be close to Gaussian in shape. Skewed histograms or outliers might need cleanup to train better models.
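One rough way to quantify how far a property distribution departs from a symmetric, Gaussian-like shape is its skewness. The sketch below is illustrative only: the function name and the sample values are assumptions, not part of the floe, and the Floe Report itself only shows the histograms.

```python
from statistics import mean, pstdev

def sample_skewness(values):
    """Return the skewness of the values (about 0 for symmetric data)."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return 0.0  # constant data has no spread, so no skew
    n = len(values)
    return sum(((v - mu) / sigma) ** 3 for v in values) / n

# Hypothetical property values, not taken from any real dataset.
symmetric = [1.0, 2.0, 3.0, 4.0, 5.0]
right_skewed = [1.0, 1.0, 1.0, 2.0, 10.0]

print(sample_skewness(symmetric))
print(sample_skewness(right_skewed))
```

A value far from zero suggests a long tail or outliers that may deserve cleanup before training.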


Here is another example.


The graphs show the Mean Absolute Error (MAE) for different values of the neural network hyperparameters. They help us analyze how sensitive the model is to each hyperparameter and plan future model builds accordingly. Each plot shows how the MAE changes as a single hyperparameter is varied while all other hyperparameters are held constant, which reveals the sensitivity of the model to that hyperparameter. A lower MAE is better, and a flatter curve indicates that the model depends less on that hyperparameter.
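As a minimal sketch of the idea behind these plots, the MAE can be computed per hyperparameter value and the spread of MAE used as a simple sensitivity measure. The function names, hyperparameter names, and numbers below are hypothetical; the floe may compute sensitivity differently.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mae_spread(mae_by_value):
    """Range of MAE seen while varying one hyperparameter (others fixed)."""
    return max(mae_by_value.values()) - min(mae_by_value.values())

# Hypothetical validation MAE for each tested value of two hyperparameters.
results = {
    "dropout": {0.1: 0.52, 0.2: 0.50, 0.3: 0.51},
    "learning_rate": {0.001: 0.45, 0.01: 0.60, 0.1: 0.95},
}

# Rank hyperparameters from most to least sensitive.
ranking = sorted(results, key=lambda name: mae_spread(results[name]), reverse=True)
print(ranking)
```

Under this measure, the hyperparameter with the widest MAE range is the one most worth tuning in the next build.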


Here is another example.


There is also a heat map of the two most sensitive hyperparameters. In the example below, these are regularization 1 and regularization 0. Choosing values near the minimum in the MAE heat map (0 for reglayer1 and 0.04 for reglayer2) should build better models in future runs.
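Reading the minimum off a heat map amounts to finding the pair of hyperparameter values with the lowest MAE. The following sketch assumes the heat map is available as a dictionary keyed by value pairs; the grid values are invented for illustration.

```python
def grid_minimum(mae_grid):
    """Return the (value_1, value_2) pair with the lowest MAE."""
    return min(mae_grid, key=mae_grid.get)

# Hypothetical MAE for each (first regularizer, second regularizer) pair.
mae_grid = {
    (0.0, 0.0): 0.62,
    (0.0, 0.04): 0.48,
    (0.04, 0.0): 0.55,
    (0.04, 0.04): 0.51,
}

best_pair = grid_minimum(mae_grid)
print(best_pair, mae_grid[best_pair])
```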


Next, we tabulate the list of all models built by the fully connected network. These models are sorted by their R2 scores on the validation data. At first glance, the first row should be the best model, since it has the lowest error on the validation data. However, factors other than the best R2 also matter, starting with VLoss in the next column. Click on the Model Link to see a sample model. This will take you to a new Floe Report page. You can then look at the training curves under Neural Network Hyperparam and Fingerprint Parameters.


For each model, a linear color scale shows the rank. We can also see the cheminformatics and machine learning parameters the model was trained on.


We see that the training and validation MAE follow a similar trend, which is a good sign that the model is not overfitting. Had they diverged, we might have to go back and tweak parameters such as the number of hidden layer nodes, dropouts, and regularizers. Too much fluctuation in the graph suggests we need a lower learning rate.

Regularizers are a great way to prevent overfitting; that is, the model learns so many details of the training set that it is unable to predict for unseen molecules. While there are many techniques to use, we chose L2 regularizers since they are smoother and converge easily. Here is an example of a training history of a model that suggests divergence. Adding a regularizer would likely improve the ability of the model to generalize.
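A common form of regularizer is the L2 (squared-weight) penalty, which adds the sum of squared weights, scaled by a strength parameter, to the training loss. The sketch below shows only the arithmetic; the function names and numbers are illustrative assumptions and do not reflect how the floe implements it.

```python
def l2_penalty(weights, strength):
    """Sum of squared weights scaled by the regularization strength."""
    return strength * sum(w * w for w in weights)

def regularized_loss(data_loss, weights, strength):
    """Data loss plus the L2 penalty on the weights."""
    return data_loss + l2_penalty(weights, strength)

weights = [0.5, -1.0, 2.0]  # hypothetical layer weights
data_loss = 0.30            # hypothetical loss on the training data

print(regularized_loss(data_loss, weights, strength=0.0))
print(regularized_loss(data_loss, weights, strength=0.04))
```

Because large weights raise the penalized loss, training is pushed toward smaller weights, which tends to reduce overfitting.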


Once a model is trained, we can look at the training graphs in the Floe Report (Model Link) to gain more insight into the quality of the model. The graphs illustrate how the mean absolute error (MAE) and mean squared error (MSE) change with every epoch of training. The first picture tells us that the learning rate is too high, leading to large fluctuations from epoch to epoch.

Neural Network Epoch Training Plots

High learning rate training

This figure tells us that the validation error is much higher than the training error, meaning we have overfit our model. Increasing the regularization parameter or decreasing the number of nodes might be a good way to stop overfitting.

Overfit model training

Finally, this picture shows us how a well-trained model should look.

Well-trained model
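The three patterns described above (large fluctuations, a wide gap between validation and training error, and smooth convergence) can be told apart with simple checks on the per-epoch MAE curves. This is a rough sketch with hypothetical thresholds and invented curves, meant as a reading aid rather than a rule used by the floe.

```python
def fraction_rising(curve):
    """Fraction of epoch-to-epoch steps where the error went up."""
    steps = list(zip(curve, curve[1:]))
    return sum(1 for a, b in steps if b > a) / len(steps)

def diagnose(train_mae, val_mae, noise_tol=0.3, gap_tol=0.5):
    """Label a training history; both thresholds are assumptions."""
    if fraction_rising(train_mae) > noise_tol:
        return "high learning rate"
    relative_gap = (val_mae[-1] - train_mae[-1]) / train_mae[-1]
    if relative_gap > gap_tol:
        return "overfit"
    return "well trained"

# Hypothetical MAE per epoch for three different runs.
noisy = ([0.9, 0.6, 1.0, 0.5, 0.9, 0.5], [1.0, 0.6, 1.1, 0.5, 1.0, 0.55])
overfit = ([1.0, 0.7, 0.5, 0.35, 0.25, 0.2], [1.0, 0.8, 0.7, 0.68, 0.7, 0.72])
good = ([1.0, 0.7, 0.5, 0.4, 0.35, 0.33], [1.05, 0.75, 0.55, 0.45, 0.4, 0.37])

for train, val in (noisy, overfit, good):
    print(diagnose(train, val))
```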

The Regression Output plots real versus predicted data to show how well the predictions correlate with the actual values.

For some floes, it can be useful to look at the Sigmoid Confidence data.


The Sigmoid Confidence Charts show the certainty of the final sigmoid layer of the neural network. In the first graph of the diagram, IC Class 0 has a greater softmax probability than either class 1 or class 2. The same holds true for the prediction of the other two IC Classes. These plots illustrate that the sigmoid layer is confident in its prediction on the validation data. This is another sign of good model training.
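To make the notion of a class probability concrete, the sketch below converts raw output scores for three classes into softmax probabilities and reports the winning class with its probability. The scores are hypothetical and the function names are illustrative; this is not the floe's own code.

```python
import math

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict_with_confidence(scores):
    """Return (predicted class index, probability of that class)."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, probs[best]

# Hypothetical output scores for IC classes 0, 1, and 2.
scores = [3.0, 0.5, 0.2]
predicted_class, confidence = predict_with_confidence(scores)
print(predicted_class, round(confidence, 3))
```

A confident prediction is one where a single class takes most of the probability mass, as in the charts described above.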

Below that, click on an interesting molecule to see the annotated explainer of the machine learning model.


Molecular Property Prediction

Analyze OEModel Floe Report

Here is a sample image of how the floe report should look (assuming you ran a fully connected neural network model like M_1):


The top part contains the hyperparameters on which the model was trained. Next are histograms of the output predictions and of the confidence in each prediction. There is also a plot of prediction confidence versus the actual output. These overall statistics help analyze the predictions for the input molecules.


We also have outlier prediction using:

Lastly, there is a link to a page under Interesting Molecules that shows annotated images explaining sample molecules with low, medium, and high confidence.


Analyze Output

  • Go to the data section of Orion and activate the data that the floe produced. This should have the same name you chose for the Output Prediction field of your floe. The data can be activated by clicking on the small plus sign in a circle right next to it.

  • In the Analyze page in Orion, you should be able to see the molecules, their predicted pyrrolamide values, and the explanation of the output.

The output columns and their explanations are:

  • Confidence: How confident the model is with its prediction on a scale of 0-1.

  • Contributions: Explanation of predictions based on a local model. If the image has a dark background, it means there is an error or warning issued. Based on the choice of molecule explainer (fragment by default), different parts will be color-annotated, with blue denoting more contribution towards the physical property (solubility) and red denoting the opposite.

  • HighestTaniSimilarity: The highest 2D Tanimoto similarity to any molecule in the training set.

  • HighestTaniSimilarityProperty: The recorded NegIC50 of that training set molecule.

    • These two fields tell us whether there is a similar molecule in the training set and, if so, what its physical property value is.

  • Scope: If there is an error or warning, what caused the issue.

  • Class Predict (Physical Property): Predicts the property as High, Medium, or Low. The background color indicates how confident the model is: green (most confident), yellow (average confidence), and red (less confident or out of scope).

  • Prediction (Physical Property): Physical property prediction of the molecule.
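The two similarity columns can be understood through the Tanimoto coefficient: the number of fingerprint bits two molecules share, divided by the number of bits set in either. The sketch below represents fingerprints as plain sets of bit indices and uses invented training data; the function names are hypothetical and real fingerprints would come from a cheminformatics toolkit.

```python
def tanimoto(bits_a, bits_b):
    """Shared bits divided by all bits set in either fingerprint."""
    union = bits_a | bits_b
    if not union:
        return 0.0  # two empty fingerprints: treat as dissimilar
    return len(bits_a & bits_b) / len(union)

def nearest_training_molecule(query_bits, training_set):
    """Return (highest similarity, property of that training molecule)."""
    scored = ((tanimoto(query_bits, bits), prop) for bits, prop in training_set)
    return max(scored, key=lambda pair: pair[0])

# Hypothetical training set: (fingerprint bit indices, property value).
training_set = [
    ({1, 2, 3, 4}, 5.2),
    ({2, 3, 9}, 6.8),
    ({7, 8}, 4.1),
]

similarity, neighbor_property = nearest_training_molecule({1, 2, 3}, training_set)
print(similarity, neighbor_property)
```

A high similarity with a known property value gives context for judging whether a prediction is plausible.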



We assign an ID (#) to each molecule record. The IDs follow a linear ordering over all molecules. If you activate both the successful and failed predictions and sort them by #, the order should be the same as the input.
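Outside of Orion, the same reordering can be reproduced by merging the two outputs and sorting on the ID. The sketch assumes the records have been exported as dictionaries with a "#" key; that layout and the molecule names are assumptions for illustration.

```python
def restore_input_order(successes, failures, id_field="#"):
    """Merge both outputs and sort by the record ID."""
    return sorted(successes + failures, key=lambda record: record[id_field])

# Hypothetical exported records from the success and failure outputs.
successes = [{"#": 0, "name": "mol_a"}, {"#": 2, "name": "mol_c"}]
failures = [{"#": 1, "name": "mol_b"}]

ordered = restore_input_order(successes, failures)
print([record["name"] for record in ordered])
```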

We can also look at the annotated images of molecules. Shown below is an annotated image of a molecule with the bits that our algorithm thinks are important for the predicted property (solubility). We translated bit importance to ligand or atom importance and annotated them based on a color scale. We can view (a) ligand annotated, (b) atom annotated, and (c) ligand+atom annotated explainer images as shown in the picture below.


Fragments such as amide bonds and hydroxyl groups are considered more soluble than the hydrophobic (greasy) benzene or nitrile groups. Blue represents hydrophobic areas, red represents hydrophilic areas, and intermediate colors represent intermediate values. The models should be trained on different sets of fingerprints to see which explainer makes sense for the model. The color scheme can be tweaked using the QQAR option under “Parallel Find Property and Explanation Cube”. QQAR refers to the quantile distribution of the LIME votes, on which the default color stops are based. It allows you to place color stops on the color bar.

The data shown here are representative of the types of data you will see in the tutorials, how-to guides, and actual floes. This tutorial is meant to assist in the interpretation of your data. More specific information regarding each floe will appear in each tutorial and how-to guide.