ML Predict: Regression using Feature Input

A floe that predicts a property of small, drug-like molecules using a pretrained machine learning (ML) model.

It runs a TensorFlow-based fully connected neural network regression model for prediction. This model, which must be provided by the user, can be generated with the ML Regression Model Building using Feature Input floe. Every molecule needs user-provided features, supplied as a float vector, as input.
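To make the role of the float feature vector concrete, the following is a minimal sketch of a forward pass through a fully connected regression network in plain Python. It is not the floe's implementation: the layer sizes, the ReLU activation, and all weights (`HIDDEN_W`, `HIDDEN_B`, `OUT_W`, `OUT_B`) are hypothetical values chosen for illustration, whereas a real model would load its trained weights from the input model dataset.

```python
from typing import List, Sequence


def relu(values: Sequence[float]) -> List[float]:
    """Apply the ReLU activation element-wise."""
    return [max(0.0, v) for v in values]


def dense(inputs: Sequence[float],
          weights: Sequence[Sequence[float]],
          biases: Sequence[float]) -> List[float]:
    """Fully connected layer: one weight row and one bias per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def predict(features: Sequence[float]) -> float:
    """Map one feature vector to one predicted property value."""
    hidden = relu(dense(features, HIDDEN_W, HIDDEN_B))
    return dense(hidden, [OUT_W], [OUT_B])[0]


# Hypothetical weights for a 3-feature input and a 2-unit hidden layer.
HIDDEN_W = [[0.5, -1.0, 0.0],
            [1.0, 1.0, 1.0]]
HIDDEN_B = [0.0, -1.0]
OUT_W = [2.0, 0.5]
OUT_B = 0.1

# One molecule's user-provided feature vector (illustrative values).
features = [1.0, 0.25, 2.0]
prediction = predict(features)
print(f"predicted property: {prediction:.3f}")
```

The length of each feature vector has to match the input width the model was trained with, which is why the same feature field should be used for model building and for prediction.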

The floe uses the ConvexBox approach for domain of application prediction. The input TensorFlow model dataset also contains a model-agnostic system that explains the prediction for each molecule.

The floe is cheap and quick: predicting a property for 50 molecules costs about one cent.

Outputs:

Failure Data: (a) The molecule is too large or too small, or (b) the molecule has an unknown atom.

No Confidence Data: The molecule's properties fall outside the scope of the training set. In this case, the model predicts with no guarantees, and the explainer image has a red background.

Success Data: (a) The molecule falls within scope; the explainer image has a green background. (b) The molecule falls at the edge of scope; the explainer image has a yellow background.

Molecules outside the scope of the training set will be sent to the ‘No Confidence’ port, as a prediction cannot be considered reliable. Specifically, the scope is defined as a range in molecular weight, atom count, polar surface area, and calculated logP from the training set molecules. These ranges are reported in the Floe report.
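The range-based scope check described above can be sketched as follows. This is an illustrative approximation rather than the floe's actual ConvexBox code: the descriptor keys, the function names, and especially the definition of "edge of scope" (here, within 5% of a range boundary) are assumptions made for the example, since the source does not state how the edge is determined.

```python
from typing import Dict, List, Tuple

Ranges = Dict[str, Tuple[float, float]]

# Port and explainer background color for each scope status, as described above.
ROUTING = {
    "in_scope": ("Success", "green"),
    "edge_of_scope": ("Success", "yellow"),
    "out_of_scope": ("No Confidence", "red"),
}


def fit_box(training_rows: List[Dict[str, float]]) -> Ranges:
    """Record the (min, max) range of every descriptor over the training set."""
    names = training_rows[0].keys()
    return {name: (min(row[name] for row in training_rows),
                   max(row[name] for row in training_rows))
            for name in names}


def classify(row: Dict[str, float], box: Ranges,
             edge_fraction: float = 0.05) -> str:
    """Return the scope status of one molecule.

    edge_fraction is a hypothetical parameter: a value within this fraction
    of the range width from either boundary counts as the edge of scope.
    """
    status = "in_scope"
    for name, (low, high) in box.items():
        value = row[name]
        if value < low or value > high:
            return "out_of_scope"  # one descriptor outside its range is enough
        margin = edge_fraction * (high - low)
        if value - low <= margin or high - value <= margin:
            status = "edge_of_scope"
    return status


# Illustrative training descriptors; real values come from the training set.
training = [
    {"mol_weight": 200.0, "atom_count": 15, "psa": 40.0, "logp": 1.0},
    {"mol_weight": 300.0, "atom_count": 22, "psa": 70.0, "logp": 2.5},
    {"mol_weight": 400.0, "atom_count": 30, "psa": 100.0, "logp": 4.0},
]
box = fit_box(training)

query = {"mol_weight": 395.0, "atom_count": 22, "psa": 70.0, "logp": 2.5}
status = classify(query, box)
port, color = ROUTING[status]
print(status, port, color)
```

A per-descriptor range check like this is cheap to evaluate and easy to report, which fits the note that the ranges appear in the Floe report.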

Inputs

| Name | Description | Type |
| --- | --- | --- |
| Input Small Molecule(s) Dataset to predict property of | The dataset(s) to read records from | Molecule Dataset |
| Input tensorflow Model | Machine learning model used to predict the property | Machine Learning Tensorflow Model Dataset |

Machine Learning Model Options

| Name | Description | Type |
| --- | --- | --- |
| Model ID of which Tensorflow model to use to predict | Which model to select. Make sure this matches the Model ID in the input model dataset | Int |
| Preprocess Molecule | For every molecule, keeps only the largest component and adjusts ionization to neutral pH | Bool |
| Apply Blockbuster filter | Apply the Blockbuster filter | Bool |
| Number of features to explain | Number of top features to provide LIME explanations for | Int |
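As an illustration of what "Number of features to explain" controls, the sketch below selects the top features from a set of per-feature explanation weights. LIME produces such weights by fitting a simple local surrogate model around one prediction; how the floe ranks them is not stated here, so ranking by absolute weight, the function name `top_features`, and the weight values are all assumptions made for the example.

```python
from typing import Dict, List, Tuple


def top_features(weights: Dict[str, float], count: int) -> List[Tuple[str, float]]:
    """Return the `count` features with the largest absolute weight.

    The sign is kept so a reader can see whether a feature pushed the
    prediction up (positive) or down (negative).
    """
    ranked = sorted(weights.items(), key=lambda item: abs(item[1]), reverse=True)
    return ranked[:count]


# Hypothetical explanation weights for one molecule's feature vector.
explanation = {
    "feature_0": 0.12,
    "feature_1": -0.80,
    "feature_2": 0.05,
    "feature_3": 0.43,
}

selected = top_features(explanation, count=2)
for name, weight in selected:
    print(f"{name}: {weight:+.2f}")
```

A small value keeps the explainer image readable, while a larger value shows more of the feature vector at the cost of clutter.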

Explanation and Validation

| Name | Description | Type |
| --- | --- | --- |
| Property Validation Field | If the dataset has a baseline value for the property, the Floe report includes a comparison between it and the prediction | Float |
| Custom Feature | Field containing feature vector to train model on. | FloatVec |
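When a baseline value is available in the Property Validation Field, the comparison with the predictions can be summarized with standard regression error metrics. The sketch below computes mean absolute error and root mean squared error. Which metrics the Floe report actually shows is not stated here, so this choice, the function names, and the sample values are assumptions for illustration.

```python
import math
from typing import Sequence


def mean_absolute_error(baseline: Sequence[float], predicted: Sequence[float]) -> float:
    """Average absolute difference between baseline and predicted values."""
    return sum(abs(b - p) for b, p in zip(baseline, predicted)) / len(baseline)


def root_mean_squared_error(baseline: Sequence[float], predicted: Sequence[float]) -> float:
    """Square root of the average squared difference; penalizes large misses."""
    squared = sum((b - p) ** 2 for b, p in zip(baseline, predicted))
    return math.sqrt(squared / len(baseline))


# Illustrative values: one baseline and one prediction per molecule.
baseline_values = [1.0, 2.0, 3.0, 4.0]
predicted_values = [1.5, 2.0, 2.0, 4.5]

mae = mean_absolute_error(baseline_values, predicted_values)
rmse = root_mean_squared_error(baseline_values, predicted_values)
print(f"MAE = {mae:.3f}, RMSE = {rmse:.3f}")
```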

Outputs

| Name | Description | Type |
| --- | --- | --- |
| Output Property | Output dataset to write to | Dataset |
| Failure Property | Output dataset to write to | Dataset |