ML Predict: Regression using Fingerprints for Small Molecules
A floe that predicts a property of small, drug-like molecules using a pretrained machine learning (ML) model.
It runs a TensorFlow-based fully connected neural network regression model for prediction. The user must provide this model, which can be generated using the ML Regression Model Building using Fingerprints for Small Molecules floe.
It uses the ConvexBox approach for domain of application prediction. The input TensorFlow dataset also contains a model-agnostic system that explains the predictions on the molecule.
The user can also provide an optional TensorFlow-based probabilistic fully connected neural network for better error bar prediction. All the models run on 2D fingerprints.
Very cheap and quick: property prediction for 10 molecules costs about a cent.
Outputs:
Failure Data: (a) The molecule is too large or too small, or (b) the molecule has an unknown atom.
No Confidence Data: The molecule’s properties fall outside the scope of the training set. In this case, the model predicts with no guarantees. The explainer image has a red background.
Success Data: (a) The molecule falls within scope; the explainer image has a green background. (b) The molecule falls at the edge of scope; the explainer image has a yellow background.
Molecules outside the scope of the training set are sent to the ‘No Confidence’ port, because their predictions cannot be considered reliable. Specifically, the scope is defined by the ranges of molecular weight, atom count, polar surface area, and calculated logP observed in the training set molecules. These ranges are reported in the Floe report.
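The scope check described above can be sketched as a per-descriptor range test. This is a minimal illustration, not the floe's implementation: the descriptor keys, the `fit_box` and `classify` helpers, and the `edge_fraction` margin used to decide what counts as "the edge of scope" are all assumptions made for the example, and descriptor values are taken as precomputed numbers rather than calculated from structures.

```python
from dataclasses import dataclass

# Descriptor names are hypothetical keys; the floe defines scope over
# molecular weight, atom count, polar surface area, and calculated logP.
DESCRIPTORS = ("mol_weight", "atom_count", "polar_surface_area", "clogp")


@dataclass(frozen=True)
class Box:
    lower: dict
    upper: dict


def fit_box(training_rows):
    """Record the minimum and maximum of each descriptor over the training set."""
    lower = {d: min(row[d] for row in training_rows) for d in DESCRIPTORS}
    upper = {d: max(row[d] for row in training_rows) for d in DESCRIPTORS}
    return Box(lower, upper)


def classify(box, row, edge_fraction=0.1):
    """Return 'red' (out of scope), 'yellow' (near an edge), or 'green'.

    edge_fraction is an assumed margin: a value within this fraction of
    the range from either bound is treated as being at the edge of scope.
    """
    near_edge = False
    for d in DESCRIPTORS:
        lo, hi = box.lower[d], box.upper[d]
        value = row[d]
        if value < lo or value > hi:
            return "red"  # any descriptor outside its range: no confidence
        margin = edge_fraction * (hi - lo)
        if value - lo < margin or hi - value < margin:
            near_edge = True
    return "yellow" if near_edge else "green"


# Illustrative training ranges (made-up numbers, not real data).
train = [
    {"mol_weight": 200.0, "atom_count": 15, "polar_surface_area": 20.0, "clogp": 0.0},
    {"mol_weight": 500.0, "atom_count": 35, "polar_surface_area": 120.0, "clogp": 5.0},
]
box = fit_box(train)

inside = {"mol_weight": 350.0, "atom_count": 25, "polar_surface_area": 70.0, "clogp": 2.5}
edge = {"mol_weight": 210.0, "atom_count": 25, "polar_surface_area": 70.0, "clogp": 2.5}
outside = {"mol_weight": 650.0, "atom_count": 25, "polar_surface_area": 70.0, "clogp": 2.5}
```

Under these assumptions, a molecule classified as `red` would go to the No Confidence output, while `green` and `yellow` molecules would go to the Success output with the corresponding explainer background.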
Name | Description | Type |
---|---|---|
Input Small Molecule(s) Dataset to predict property of | The dataset(s) to read records from | Molecule Dataset |
Input tensorflow Model | Machine Learning model to predict property | Machine Learning Tensorflow Model Dataset |
Input tensorflow probability Model | The dataset(s) to read records from | Machine Learning Tensorflow Probability Model Dataset |
Name | Description | Type |
---|---|---|
Model ID of which Tensorflow model to use to predict | Which model to select. Make sure this matches the Model ID in the input model dataset | Int |
Model ID of which Tensorflow Probability (TFP) model to use to predict. | Which model to select. Make sure this matches the Model ID in the input model dataset | Int |
Preprocess Molecule | For every molecule, keeps only the largest component and adjusts ionization to neutral pH | Bool |
Apply Blockbuster filter | Apply the Blockbuster filter | Bool |
Name | Description | Type |
---|---|---|
Property Validation Field | If the dataset has a baseline value for the property, the floe reports a comparison between the predictions and the baseline in the Floe report | Float |
Molecule Explainer Type | Select explainer visualisation. Atom: annotate atoms only; Fragment: annotate fragments; Combined: annotate both | List |
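To make the Property Validation Field comparison concrete, the sketch below computes two common regression error summaries between predicted and baseline values. The source does not say which statistics the Floe report contains, so the choice of mean absolute error and root mean squared error, and the `compare_to_baseline` helper itself, are assumptions for illustration only.

```python
import math


def compare_to_baseline(predicted, baseline):
    """Summarize the disagreement between predictions and baseline values.

    Returns MAE and RMSE; these metrics are assumed for the example and
    are not confirmed to be the ones the floe reports.
    """
    if not predicted or len(predicted) != len(baseline):
        raise ValueError("predicted and baseline must be non-empty and equal in length")
    errors = [p - b for p, b in zip(predicted, baseline)]
    mae = sum(abs(e) for e in errors) / len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    return {"mae": mae, "rmse": rmse}


# Made-up values: three predictions against three baseline measurements.
summary = compare_to_baseline([1.0, 2.0, 4.0], [1.0, 3.0, 2.0])
```

Because RMSE weights large errors more heavily than MAE, a gap between the two hints that a few molecules are predicted much worse than the rest.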
Name | Description | Type |
---|---|---|
Output Property | Output dataset to write to | Dataset |
Failure Property | Output dataset to write to | Dataset |