ML Predict: Regression using Fingerprints for Small Molecules
This floe predicts properties of small, drug-like molecules using a pretrained ML model.
It runs a TensorFlow-based fully connected neural network regression model for prediction. This user-provided model can be generated using the ML Build: Regression Model with Tuner using Fingerprints for Small Molecules Floe.
The floe uses a convex box approach to predict the domain of application. The input TensorFlow dataset also contains a model-agnostic system that explains the predictions for each molecule.
The user can also provide an optional TensorFlow-based probabilistic fully connected neural network for better error bar prediction. All models run on 2D fingerprints.
This floe is inexpensive and fast: predicting a property for 10 molecules costs a few cents.
Outputs:
Failure Data: The molecule (a) is too large or too small, or (b) has an unknown atom.
No Confidence Data: The molecule’s property falls outside the scope of the training set. In this case, the model predicts with no guarantees. The explainer image has a red background.
Success Data: The molecule falls (a) within scope and the explainer has a green background or (b) at the edge of scope and the explainer has a yellow background.
Molecules outside the scope of the training set will be sent to the “No Confidence” port, as a prediction cannot be considered reliable. Specifically, the scope is defined as a range in molecular weight, atom count, polar surface area, and calculated logP from the training set molecules. These ranges are reported in the Floe Report.
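The scope check described above can be illustrated with a minimal sketch. This is not the floe's implementation: the descriptor names, the `fit_box` and `classify` helpers, and the `edge_fraction` margin used to flag "edge of scope" molecules are all hypothetical, and the descriptor values are assumed to be computed elsewhere. The sketch only shows how a box built from per-descriptor minimum and maximum values over the training set can sort a molecule into the green, yellow, or red category.

```python
from dataclasses import dataclass

# Illustrative descriptor names; the floe's own field names may differ.
DESCRIPTORS = ("mol_weight", "atom_count", "polar_surface_area", "logp")


@dataclass(frozen=True)
class Range:
    low: float
    high: float


def fit_box(training_rows):
    """Record the min and max of each descriptor over the training set."""
    return {
        name: Range(
            min(row[name] for row in training_rows),
            max(row[name] for row in training_rows),
        )
        for name in DESCRIPTORS
    }


def classify(row, box, edge_fraction=0.05):
    """Return 'in_scope', 'edge', or 'out_of_scope' for one molecule.

    edge_fraction is an assumed setting: a value within this fraction of a
    range's width from either bound counts as being at the edge of scope.
    """
    label = "in_scope"
    for name in DESCRIPTORS:
        value, rng = row[name], box[name]
        if value < rng.low or value > rng.high:
            return "out_of_scope"  # any descriptor outside the box
        margin = edge_fraction * (rng.high - rng.low)
        if value < rng.low + margin or value > rng.high - margin:
            label = "edge"  # inside the box but close to a bound
    return label


# Made-up descriptor values standing in for a training set.
training = [
    {"mol_weight": 180.2, "atom_count": 13, "polar_surface_area": 63.6, "logp": 1.2},
    {"mol_weight": 350.4, "atom_count": 25, "polar_surface_area": 90.0, "logp": 3.1},
    {"mol_weight": 480.6, "atom_count": 34, "polar_surface_area": 120.5, "logp": 4.8},
]
box = fit_box(training)

inside = {"mol_weight": 300.0, "atom_count": 22, "polar_surface_area": 80.0, "logp": 2.5}
near_bound = dict(inside, mol_weight=470.0)
outside = dict(inside, mol_weight=620.0)

labels = [classify(m, box) for m in (inside, near_bound, outside)]
```

In this sketch, `in_scope` and `edge` would correspond to the Success port (green and yellow explainer backgrounds), and `out_of_scope` to the No Confidence port (red background).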

| Name | Description | Type |
|---|---|---|
| Input Small Molecule(s) Dataset to Predict Property of | The dataset(s) to read records from. | Molecule Dataset | 
| Input TensorFlow Model | Machine Learning model to predict property. | Machine Learning TensorFlow Model Dataset | 
| Input TensorFlow Probability Model | The dataset(s) to read records from. | Machine Learning TensorFlow Probability Model Dataset | 

| Name | Description | Type |
|---|---|---|
| Model ID of which TensorFlow Model to Use to Predict | Which model to select. Make sure this matches the input model ID. | Int | 
| Model ID of which TensorFlow Probability (TFP) Model to Use to Predict | Which model to select. Make sure this matches the model ID. | Int | 
| Preprocess Molecule | For every molecule, keeps only the largest component and adjusts ionization to neutral pH. | Bool | 
| Apply Blockbuster Filter | Apply Blockbuster filter. | Bool | 

| Name | Description | Type |
|---|---|---|
| Property Validation Field | If the dataset has a baseline, the floe reports a comparison between predictions in the Floe Report. | Float | 
| Molecule Explainer Type | Select the explainer visualization. Atom: annotate atoms only. Fragment: annotate fragments. Combined: annotate both. | List | 

| Name | Description | Type |
|---|---|---|
| Output Property | Output dataset to write to. | Dataset | 
| Failure Property | Output dataset to write to. | Dataset |