ML Predict: Classification Using Fingerprints for Small Molecules
This floe predicts a property of small, drug-like molecules using a pretrained machine learning model. Each prediction is one of the discrete string classes that the model was trained on.
It runs a TensorFlow-based, fully connected neural network classification model for prediction. The user must provide this model, which can be generated with the ML Build: Classification Model with Tuner using Fingerprints for Small Molecules Floe.
The floe uses convex box and Monte Carlo methods to determine the domain of application of its predictions. The input TensorFlow dataset also contains a model-agnostic system that explains the prediction for each molecule.
This floe runs quickly and is very inexpensive, costing about one cent to predict a property for 50 molecules.
Outputs:
Failure Data: (a) The molecule is too large or too small, or (b) the molecule has an unknown atom.
No Confidence Data: The molecule’s property falls outside the scope of the training set. In this case, the model still makes a prediction but with no guarantees. The explainer image has a red background.
Success Data: (a) The molecule falls within scope; the explainer has a green background, or (b) it falls at the edge of the scope; the explainer has a yellow background.
Molecules outside the scope of the training set are sent to the “No Confidence” port, because their predictions cannot be considered reliable. Specifically, the scope is defined by the ranges of molecular weight, atom count, polar surface area, and calculated logP spanned by the training set molecules. These ranges are provided in the Floe Report. If preprocessing was turned on when the model was trained, it is recommended to turn it on in this floe as well.
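The range-based scope check described above can be illustrated with a minimal sketch. This is not the floe's implementation: the descriptor keys, the function names, and the sample values are all hypothetical, and the sketch covers only the per-descriptor range test, not the convex box or Monte Carlo methods or the "edge of scope" (yellow) case.

```python
from typing import Dict, List, Tuple

# Hypothetical descriptor keys; the floe's actual field names are not documented here.
DESCRIPTORS = ("mol_weight", "atom_count", "psa", "logp")

Ranges = Dict[str, Tuple[float, float]]


def training_ranges(training: List[Dict[str, float]]) -> Ranges:
    """Return the (min, max) of each descriptor over the training-set molecules."""
    return {
        name: (min(m[name] for m in training), max(m[name] for m in training))
        for name in DESCRIPTORS
    }


def in_scope(mol: Dict[str, float], ranges: Ranges) -> bool:
    """True only if every descriptor lies inside its training-set range."""
    return all(ranges[n][0] <= mol[n] <= ranges[n][1] for n in DESCRIPTORS)


def route(mol: Dict[str, float], ranges: Ranges) -> str:
    """Pick an output port name for a molecule (assumed, simplified routing)."""
    return "Success" if in_scope(mol, ranges) else "No Confidence"


# Made-up descriptor values, for illustration only.
training = [
    {"mol_weight": 180.2, "atom_count": 13, "psa": 63.6, "logp": 1.2},
    {"mol_weight": 450.5, "atom_count": 32, "psa": 120.0, "logp": 4.5},
]
ranges = training_ranges(training)

inside = {"mol_weight": 300.0, "atom_count": 21, "psa": 80.0, "logp": 2.5}
outside = {"mol_weight": 720.0, "atom_count": 50, "psa": 80.0, "logp": 2.5}

print(route(inside, ranges))
print(route(outside, ranges))
```

A molecule is routed to "No Confidence" as soon as any one descriptor leaves its training range, which matches the idea that the scope is the box spanned by the training set in all four descriptors at once.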
Name | Description | Type |
---|---|---|
Input Small Molecule Dataset(s) to Predict Property of | The dataset(s) to read records from. | Molecule Dataset |
Input TensorFlow Model | Machine learning model to predict property. | Machine Learning TensorFlow Model Dataset |
Name | Description | Type |
---|---|---|
Model ID of TensorFlow Model to Use to Predict | Which model to select. Make sure this matches input model ID. | Int |
Preprocess Molecule | Preprocess by neutral pH, largest molecule, blockbuster filter. | Bool |
Apply Blockbuster Filter | For every molecule, stores only largest component and adjusts ionization to neutral pH. | Bool |
Name | Description | Type |
---|---|---|
Property Validation Field | If the dataset has a baseline, the floe provides a comparison between predictions in Floe Report. | Float |
Molecule Explainer Type | Select explainer visualization. Atom: annotate atoms only; Fragment: annotate fragments; Combined: annotate both. | List |
Name | Description | Type |
---|---|---|
Output Property | Output dataset to write to. | Dataset |
Failure Property | Output dataset to write to. | Dataset |