ML Predict: Classification using Fingerprints for Small Molecules

A floe that predicts a property of small, drug-like molecules using a pretrained machine learning (ML) model. Predictions are drawn from the discrete string classes that the model was trained on.

It runs a TensorFlow-based, fully connected neural network classification model for prediction. This model must be provided by the user and can be generated using the ML Classifier Model Building using Fingerprints for Small Molecules floe.

Uses the ConvexBox and Monte Carlo approaches for Domain of Application prediction. The input TensorFlow dataset also contains a model-agnostic system that explains the prediction for each molecule.

Very cheap and quick: property prediction for 50 molecules costs about one cent.

Outputs:

Failure Data: (a) Molecule is too large or too small, or (b) molecule has an unknown atom.

No Confidence Data: Molecule’s property falls outside the scope of the training set. In this case, the model predicts with no guarantees. The explainer image has a red background.

Success Data: (a) Molecule falls within scope; the explainer has a green background. (b) Molecule falls at the edge of scope; the explainer has a yellow background.
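The routing described above can be sketched in a few lines of Python. This is only an illustration of the decision order, not the floe's actual implementation: the function name `route`, the element whitelist, the heavy-atom size limits, and the port labels are all hypothetical, and the scope status (`"in"`, `"edge"`, `"out"`) is assumed to have been computed elsewhere.

```python
from typing import List, Optional, Tuple

# Hypothetical values for illustration; the floe's real limits are not documented here.
KNOWN_ELEMENTS = {"H", "C", "N", "O", "F", "P", "S", "Cl", "Br", "I"}
MIN_HEAVY_ATOMS, MAX_HEAVY_ATOMS = 4, 100


def route(elements: List[str], scope: str) -> Tuple[str, Optional[str]]:
    """Return (output port, explainer background colour) for one molecule.

    elements: element symbols of the molecule's atoms.
    scope: "in", "edge", or "out" relative to the training set.
    """
    # Failure (b): molecule has an unknown atom.
    if any(symbol not in KNOWN_ELEMENTS for symbol in elements):
        return ("failure", None)

    # Failure (a): molecule is too large or too small.
    heavy_atoms = sum(1 for symbol in elements if symbol != "H")
    if not MIN_HEAVY_ATOMS <= heavy_atoms <= MAX_HEAVY_ATOMS:
        return ("failure", None)

    # No confidence: outside the training-set scope, red explainer background.
    if scope == "out":
        return ("no_confidence", "red")

    # Success: yellow at the edge of scope, green well inside it.
    if scope == "edge":
        return ("success", "yellow")
    return ("success", "green")
```

For example, `route(["C"] * 6, "edge")` yields a success record with a yellow background, while the same molecule with `scope="out"` goes to the no-confidence output.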

Molecules outside the scope of the training set are sent to the ‘No Confidence’ port, as their predictions cannot be considered reliable. Specifically, the scope is defined as a range in molecular weight, atom count, polar surface area, and calculated logP, derived from the training set molecules. These ranges are reported in the Floe report. If the Preprocess Molecule option was turned on when the model was trained, it is recommended to turn it on for prediction as well.
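A minimal sketch of such a range-based scope check follows, assuming the four descriptors have already been computed for each molecule. The descriptor keys, the function names, and in particular the `edge_fraction` rule for deciding what counts as "at the edge" are assumptions made for illustration; the floe's own edge criterion is not documented here.

```python
from typing import Dict, List, Tuple

# Hypothetical descriptor keys for the four properties that define the scope.
DESCRIPTORS = ("mol_weight", "atom_count", "polar_surface_area", "logp")


def training_ranges(training: List[Dict[str, float]]) -> Dict[str, Tuple[float, float]]:
    """Per-descriptor (min, max) over the training set molecules."""
    return {
        name: (min(mol[name] for mol in training), max(mol[name] for mol in training))
        for name in DESCRIPTORS
    }


def scope_status(
    query: Dict[str, float],
    ranges: Dict[str, Tuple[float, float]],
    edge_fraction: float = 0.05,  # assumed margin, not taken from the floe
) -> str:
    """Return "out", "edge", or "in" for one query molecule."""
    status = "in"
    for name in DESCRIPTORS:
        low, high = ranges[name]
        value = query[name]
        # Any descriptor outside its training range puts the molecule out of scope.
        if value < low or value > high:
            return "out"
        # Inside the range but close to a boundary: flag as edge of scope.
        margin = edge_fraction * (high - low)
        if value - low < margin or high - value < margin:
            status = "edge"
    return status


training = [
    {"mol_weight": 180.0, "atom_count": 13, "polar_surface_area": 60.0, "logp": 1.0},
    {"mol_weight": 480.0, "atom_count": 35, "polar_surface_area": 120.0, "logp": 5.0},
]
ranges = training_ranges(training)
inside = {"mol_weight": 300.0, "atom_count": 22, "polar_surface_area": 90.0, "logp": 3.0}
near_edge = dict(inside, mol_weight=185.0)
outside = dict(inside, mol_weight=600.0)
```

With these toy values, `inside` is reported as in scope, `near_edge` as edge of scope (its molecular weight sits within 5% of the lower bound), and `outside` as out of scope.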

Inputs
Name	Description	Type
Input Small Molecule(s) Dataset to predict property of	The dataset(s) to read records from	Molecule Dataset
Input tensorflow Model	Machine learning model to predict property	Machine Learning Tensorflow Model Dataset

ML Model Options
Name	Description	Type
Model ID of which Tensorflow model to use to predict	Which model to select. Make sure this matches the Model ID of the input model	Int
Preprocess Molecule	Preprocess by neutral pH, largest molecule, and Blockbuster filter	Bool
Apply Blockbuster filter	For every molecule, stores only the largest component and adjusts ionization to neutral pH	Bool

Explanation and Validation
Name	Description	Type
Property Validation Field	If the dataset has a baseline, the floe reports a comparison between the prediction and the baseline in the Floe report	Float
Molecule Explainer Type	Select explainer visualisation. Atom: annotate atoms only; Fragment: annotate fragments; Combined: annotate both	List

Outputs
Name	Description	Type
Output Property	Output dataset to write to	Dataset
Failure Property	Output dataset to write to	Dataset