ML Predict: Regression Using Fingerprints for Small Molecules
This floe predicts properties of small, drug-like molecules using a pretrained machine learning (ML) model.
It runs a TensorFlow-based fully-connected neural network regression model for prediction. The user must supply this model, which can be generated using the ML Build: Regression Model with Tuner using Fingerprints for Small Molecules Floe.
This floe uses a convex box approach to predict the domain of applicability. The input TensorFlow model dataset also contains a model-agnostic system that explains the prediction for each molecule.
The user can also provide an optional TensorFlow-based probabilistic fully-connected neural network for better error bar prediction. All models run on 2D fingerprints.
This floe runs quickly and is inexpensive: predicting a property for 10 molecules costs about one cent.
Outputs:
Failure Data: (a) The molecule is too large or too small, or (b) the molecule has an unknown atom.
No Confidence Data: The molecule falls outside the scope of the training set. In this case, the model still makes a prediction, but with no guarantees. The explainer image has a red background.
Success Data: (a) The molecule falls within scope; the explainer has a green background, or (b) it falls at the edge of scope; the explainer has a yellow background.
Molecules outside the scope of the training set are sent to the “No Confidence” port because their predictions cannot be considered reliable. Specifically, the scope is defined by the ranges of molecular weight, atom count, polar surface area, and calculated logP spanned by the training set molecules. These ranges are given in the Floe Report.
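The scope check described above can be illustrated with a minimal sketch. This is not the floe's implementation: the descriptor keys, the function names, and in particular the `edge_fraction` rule for deciding when a molecule is "at the edge of scope" are assumptions made for illustration, since this page does not state how the edge is defined. The sketch records the per-descriptor range of the training set and then maps a new molecule to an output port and explainer background color.

```python
from typing import Dict, List, Tuple

# The four properties named in the text; the key names are hypothetical.
DESCRIPTORS = ("mol_weight", "atom_count", "psa", "logp")

Box = Dict[str, Tuple[float, float]]


def fit_box(training: List[Dict[str, float]]) -> Box:
    """Record the (min, max) range of each descriptor over the training set."""
    return {
        name: (min(m[name] for m in training), max(m[name] for m in training))
        for name in DESCRIPTORS
    }


def classify(mol: Dict[str, float], box: Box, edge_fraction: float = 0.05) -> str:
    """Return 'in', 'edge', or 'out' for one molecule.

    ASSUMPTION: a molecule is 'at the edge' when any descriptor lies within
    edge_fraction of the range width from a boundary. The real rule used by
    the floe is not documented here.
    """
    at_edge = False
    for name in DESCRIPTORS:
        low, high = box[name]
        value = mol[name]
        if value < low or value > high:
            return "out"  # outside the box on at least one descriptor
        margin = edge_fraction * (high - low)
        if value - low < margin or high - value < margin:
            at_edge = True
    return "edge" if at_edge else "in"


# Port and explainer background color, as listed under Outputs.
PORT_AND_COLOR = {
    "in": ("Success", "green"),
    "edge": ("Success", "yellow"),
    "out": ("No Confidence", "red"),
}

if __name__ == "__main__":
    training = [
        {"mol_weight": 200.0, "atom_count": 10, "psa": 20.0, "logp": -1.0},
        {"mol_weight": 500.0, "atom_count": 40, "psa": 140.0, "logp": 5.0},
    ]
    box = fit_box(training)
    query = {"mol_weight": 350.0, "atom_count": 25, "psa": 80.0, "logp": 2.0}
    print(PORT_AND_COLOR[classify(query, box)])
```

A box check of this kind is cheap and easy to report, which fits the statement that the ranges appear in the Floe Report, but it treats each descriptor independently and so cannot detect unusual combinations of in-range values.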
| Name | Description | Type |
|---|---|---|
| Input Small Molecule Dataset(s) to Predict Property of | The dataset(s) to read records from. | Molecule Dataset |
| Input TensorFlow Model | Machine learning model to predict property. | Machine Learning TensorFlow Model Dataset |
| Input TensorFlow Probability Model | The dataset(s) to read records from. | Machine Learning TensorFlow Probability Model Dataset |
| Name | Description | Type |
|---|---|---|
| Model ID of TensorFlow Model to Use to Predict | Which model to select. Make sure this matches the input model ID. | Int |
| Model ID of TensorFlow Probability (TFP) Model to Use to Predict* | Which model to select. Make sure this matches the input model ID. | Int |
| Preprocess Molecule | For every molecule, keeps only the largest component and adjusts ionization to neutral pH. | Bool |
| Apply Blockbuster Filter | Apply blockbuster filter. | Bool |
| Name | Description | Type |
|---|---|---|
| Property Validation Field | If the dataset has a baseline, the floe provides a comparison between predictions in the Floe Report. | Float |
| Molecule Explainer Type | Select explainer visualization. Atom: annotate atoms only; Fragment: annotate fragments; Combined: annotate both. | List |
| Name | Description | Type |
|---|---|---|
| Output Property | Output dataset to write to. | Dataset |
| Failure Property | Output dataset to write to. | Dataset |