ML Predict: Regression Using Feature Input
This is a floe that predicts the properties of small, drug-like molecules using a pretrained machine learning (ML) model.
It runs a TensorFlow-based, fully connected neural network regression model. The user must provide this model, which can be generated with the ML Build: Regression Using Feature Input Floe. Every molecule must also carry its user-provided features, stored as a float vector, as input.
The floe uses a convex box approach to decide whether a prediction falls within the model's domain of application. The input TensorFlow model dataset also contains a model-agnostic system that explains the prediction for each molecule.
This floe runs quickly and is very inexpensive, costing about one cent for a property prediction of 50 molecules.
Outputs:
Failure Data: (a) The molecule is too large or too small, or (b) the molecule has an unknown atom.
No Confidence Data: The molecule falls outside the scope of the training set. In this case, the model predicts with no guarantees. The explainer image has a red background.
Success Data: (a) The molecule falls within scope and the explainer image has a green background, or (b) it falls at the edge of scope and the explainer image has a yellow background.
Molecules outside the scope of the training set are sent to the “No Confidence” port, because their predictions cannot be considered reliable. Specifically, the scope is defined by the ranges of molecular weight, atom count, polar surface area, and calculated logP spanned by the training set molecules. These ranges are given in the Floe Report.
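The scope check described above can be sketched as a per-descriptor range test. This is a minimal sketch, not the floe's implementation: the descriptor key names, the `fit_box` and `classify` helpers, and in particular the `edge_fraction` rule for deciding what counts as the edge of scope are assumptions made for illustration. The source only states that the scope is a set of ranges taken from the training set.

```python
# Hypothetical descriptor keys; the actual ranges are listed in the Floe Report.
DESCRIPTORS = ("mol_weight", "atom_count", "polar_surface_area", "logp")


def fit_box(training_rows):
    """Return {descriptor: (min, max)} computed over the training-set rows."""
    return {
        name: (
            min(row[name] for row in training_rows),
            max(row[name] for row in training_rows),
        )
        for name in DESCRIPTORS
    }


def classify(row, box, edge_fraction=0.1):
    """Classify a molecule as 'in', 'edge', or 'out' of the training-set box.

    'out'  : at least one descriptor lies outside its training range.
    'edge' : all descriptors are inside, but at least one lies within
             edge_fraction of a range boundary (an assumed rule).
    'in'   : all descriptors lie comfortably inside their ranges.
    """
    status = "in"
    for name in DESCRIPTORS:
        low, high = box[name]
        value = row[name]
        if value < low or value > high:
            return "out"
        margin = edge_fraction * (high - low)
        if value < low + margin or value > high - margin:
            status = "edge"
    return status
```

Under this sketch, `in` and `edge` would correspond to the Success port (green and yellow explainer backgrounds) and `out` to the No Confidence port (red background).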
Name | Description | Type |
---|---|---|
Input Small Molecule Dataset(s) to Predict Property of | The dataset(s) to read records from. | Molecule Dataset |
Input TensorFlow Model | Machine learning model to predict property. | Machine Learning TensorFlow Model Dataset |
Name | Description | Type |
---|---|---|
Model ID of TensorFlow Model to Use to Predict* | Which model to select. Make sure this matches input model ID. | Int |
Preprocess Molecule | For every molecule, stores only largest component and adjusts ionization to neutral pH. | Bool |
Apply Blockbuster Filter | Apply blockbuster filter. | Bool |
Number of Features to Explain | Number of top features to provide results for LIME explanations. | Int |
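To illustrate what the Number of Features to Explain parameter controls, the sketch below keeps only the highest-ranked features from a LIME-style explanation. It assumes the explainer yields one signed weight per entry of the feature vector and that features are ranked by absolute weight; both are assumptions for illustration, and `top_features` is a hypothetical helper, not part of the floe.

```python
def top_features(weights, k):
    """Return the k (feature_index, weight) pairs with the largest |weight|.

    `weights` holds one signed local weight per feature, as a LIME-style
    linear surrogate would produce (assumed input format).
    """
    ranked = sorted(enumerate(weights), key=lambda pair: abs(pair[1]), reverse=True)
    return ranked[:k]
```

For example, `top_features([0.1, -0.8, 0.3, 0.0, 0.5], 2)` returns `[(1, -0.8), (4, 0.5)]`: the sign is preserved, so a strongly negative contribution ranks ahead of a weaker positive one.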
Name | Description | Type |
---|---|---|
Property Validation Field | If the dataset has a baseline, the floe provides a comparison between predictions in the Floe Report. | Float |
Custom Feature | Field containing feature vector to train model on. | FloatVec |
Name | Description | Type |
---|---|---|
Output Property | Output dataset to write to. | Dataset |
Failure Property | Output dataset to write to. | Dataset |