ML Predict: Classification using Fingerprints for Small Molecules

This floe predicts properties of small, drug-like molecules using a pretrained ML model. Each prediction is one of the discrete string classes that the model was trained on.

It runs a TensorFlow-based fully connected neural network classification model for prediction. This user-provided model can be generated using the ML Build: Classification Model with Tuner using Fingerprints for Small Molecules Floe.
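As an illustration of how a network's numeric outputs can be turned into one of the discrete string classes, here is a minimal sketch in plain Python. It assumes the network emits one raw score per class and that the class with the highest softmax probability is reported; the function name decode_class, the class names, and the scores are hypothetical and are not taken from the floe.

```python
import math


def decode_class(scores, class_names):
    """Map raw per-class scores to (class label, softmax probability)."""
    if len(scores) != len(class_names):
        raise ValueError("one score per class is required")
    # Numerically stable softmax: subtract the maximum before exponentiating.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Pick the index of the most probable class (first one wins on ties).
    best = max(range(len(probs)), key=probs.__getitem__)
    return class_names[best], probs[best]


# Hypothetical class names and scores, for illustration only.
label, prob = decode_class([0.2, 2.1, -1.0], ["low", "medium", "high"])
print(label, round(prob, 3))
```

The returned probability is only a relative score among the trained classes; it does not say whether the molecule lies within the scope of the training set.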

The floe uses convex box and Monte Carlo approaches to predict the domain of application. The input TensorFlow dataset also contains a model-agnostic system that explains the predictions for each molecule.

The floe is inexpensive and fast: predicting the property for 50 molecules costs a few cents.

Outputs:

Failure Data: The molecule (a) is too large or too small or (b) has an unknown atom.

No Confidence Data: The molecule falls outside the scope of the training set. In this case, the model predicts with no guarantees. The explainer image has a red background.

Success Data: The molecule falls (a) within scope and the explainer has a green background or (b) at the edge of scope and the explainer has a yellow background.

Molecules outside the scope of the training set are sent to the “No Confidence” port, because their predictions cannot be considered reliable. Specifically, the scope is defined as a range in molecular weight, atom count, polar surface area, and calculated logP from the training set molecules. These ranges are reported in the Floe Report. If the model was trained with the Preprocess Molecule parameter On, it is recommended to set it On here as well.
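The range-based scope check described above can be sketched as a simple box test. This is a minimal illustration, not the floe's implementation: the Range class, the route function, the descriptor keys, and the numeric ranges are all hypothetical, and the sketch covers neither the Monte Carlo approach nor the "edge of scope" (yellow) case, whose definition is not given here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    """Closed interval observed for one descriptor in the training set."""
    low: float
    high: float

    def contains(self, value: float) -> bool:
        return self.low <= value <= self.high


def route(descriptors, training_ranges):
    """Return 'Success' if every descriptor lies inside its range, else 'No Confidence'."""
    for name, rng in training_ranges.items():
        if not rng.contains(descriptors[name]):
            return "No Confidence"
    return "Success"


# Hypothetical ranges; the real ones are reported in the Floe Report.
TRAINING_RANGES = {
    "molecular_weight": Range(150.0, 550.0),
    "atom_count": Range(10, 40),
    "polar_surface_area": Range(20.0, 140.0),
    "logp": Range(-1.0, 5.5),
}

# Hypothetical descriptor values for two molecules.
inside = {"molecular_weight": 320.4, "atom_count": 23,
          "polar_surface_area": 75.0, "logp": 2.8}
outside = dict(inside, molecular_weight=910.0)

print(route(inside, TRAINING_RANGES))
print(route(outside, TRAINING_RANGES))
```

A molecule is treated as in scope only when all four descriptors fall inside their ranges, so a single out-of-range value is enough to send it to the No Confidence port in this sketch.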

Inputs

Name | Description | Type
Input Small Molecule(s) Dataset to Predict Property of | The dataset(s) to read records from. | Molecule Dataset
Input TensorFlow Model | Machine learning model to predict property. | Machine Learning TensorFlow Model Dataset

ML Model Options

Name | Description | Type
Model ID of TensorFlow model to Use to Predict* | Which model to select. Make sure this matches the input model ID. | Int
Preprocess Molecule | Preprocess by neutral pH, largest molecule. | Bool
Apply Blockbuster Filter | For every molecule, stores only largest component, adjusts ionization to neutral pH. | Bool

Explanation and Validation

Name | Description | Type
Property Validation Field | If the dataset has a baseline, the floe reports a comparison between predictions in the Floe Report. | Float
Molecule Explainer Type | Select explainer visualization. Atom: annotate atoms only; Fragment: annotate fragments; Combined: annotate both. | List

Outputs

Name | Description | Type
Output Property | Output dataset to write to. | Dataset
Failure Property | Output dataset to write to. | Dataset