hERG Toxicity Prediction for Small Molecules Using ML and Cheminfo Fingerprints

A floe that predicts whether small, drug-like molecules are hERG active (toxic) or inactive (nontoxic). It runs a TensorFlow-based, fully connected neural network regression model for the predictions, and uses convex box and Monte Carlo based methods to estimate the domain of applicability and the error bars. The TensorFlow models were trained on 2D fingerprints.
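
The floe's own Monte Carlo method is not described here. As one common way to obtain error bars from a neural network, the sketch below uses Monte Carlo dropout: dropout stays on at prediction time, the network is evaluated many times, and the spread of the outputs serves as the error bar. The tiny network, its weights, the dropout rate, and the function names are all hypothetical and are not taken from the floe.

```python
import math
import random
import statistics

# Hypothetical weights of a tiny network with 4 fingerprint bits and 3 hidden
# units. They are illustrative only and unrelated to the trained floe models.
W_HIDDEN = [
    [0.8, -0.5, 0.3, 0.1],
    [-0.2, 0.9, -0.4, 0.6],
    [0.5, 0.2, 0.7, -0.3],
]
W_OUT = [1.2, -0.7, 0.9]


def forward(bits, rng, drop_rate):
    """One stochastic forward pass with dropout kept on at inference."""
    keep = 1.0 - drop_rate
    logit = 0.0
    for weights, w_out in zip(W_HIDDEN, W_OUT):
        hidden = max(0.0, sum(w * b for w, b in zip(weights, bits)))  # ReLU
        if rng.random() < keep:
            logit += w_out * hidden / keep  # inverted-dropout scaling
    return 1.0 / (1.0 + math.exp(-logit))  # sigmoid score in (0, 1)


def predict_with_error_bar(bits, n_samples=200, drop_rate=0.2, seed=0):
    """Return (mean score, standard deviation) over repeated dropout passes."""
    rng = random.Random(seed)
    samples = [forward(bits, rng, drop_rate) for _ in range(n_samples)]
    return statistics.mean(samples), statistics.stdev(samples)


if __name__ == "__main__":
    mean, std = predict_with_error_bar([1, 0, 1, 1])
    print(f"score = {mean:.3f} +/- {std:.3f}")
```

A wide spread signals that the prediction is sensitive to which units are dropped, which is one reason to treat it with less confidence.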

Finally, it uses LIME, a model-agnostic explainer, to explain the predicted hERG toxicity of each molecule. The floe is inexpensive and quick: property prediction for 10 molecules adds about 1.5 seconds.
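
To illustrate the idea behind LIME, the sketch below fits a locally weighted linear surrogate around a single fingerprint: it switches bits off at random, queries the model, weights each perturbed sample by how close it stays to the original, and reads the surrogate coefficients as per-bit attributions. This is a simplified, standard-library version of the technique, not the `lime` package and not the floe's implementation; `black_box`, the kernel width, and the ridge term are assumptions made for illustration.

```python
import math
import random


def black_box(bits):
    """Hypothetical stand-in for the trained model: bit 0 raises the score, bit 2 lowers it."""
    return 0.2 + 0.6 * bits[0] - 0.1 * bits[2]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def explain(instance, predict, n_samples=500, kernel_width=0.75, ridge=1e-3, seed=0):
    """Return one surrogate coefficient per feature of `instance`."""
    rng = random.Random(seed)
    d = len(instance)
    size = d + 1  # intercept plus one coefficient per feature
    xtwx = [[ridge if i == j else 0.0 for j in range(size)] for i in range(size)]
    xtwy = [0.0] * size
    for _ in range(n_samples):
        mask = [rng.randint(0, 1) for _ in range(d)]         # 1 keeps a feature, 0 switches it off
        perturbed = [v * k for v, k in zip(instance, mask)]
        frac_off = sum(1 - k for k in mask) / d               # squared distance to the original
        weight = math.exp(-frac_off / kernel_width ** 2)      # closer samples count more
        row = [1.0] + [float(k) for k in mask]
        y = predict(perturbed)
        for i in range(size):
            xtwy[i] += weight * row[i] * y
            for j in range(size):
                xtwx[i][j] += weight * row[i] * row[j]
    return solve(xtwx, xtwy)[1:]  # drop the intercept


if __name__ == "__main__":
    for index, value in enumerate(explain([1, 0, 1, 1], black_box)):
        print(f"bit {index}: {value:+.3f}")
```

Positive coefficients mark bits that push the prediction toward the toxic class and negative ones push it away, which is the kind of signal an atom or fragment annotation can then visualize.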

Outputs:

Failure Dataset: (a) The molecule is too large or too small, or (b) the molecule contains an atom not encountered in the training set.

No Confidence Dataset: The molecule is deemed out of scope compared to the training set (details below). In this case, the model predictions are unreliable. The explainer image has a red background.

Success Dataset: The molecule either (a) falls within scope, in which case the explainer image has a green background, or (b) falls at the edge of scope, in which case the explainer image has a yellow background.

Molecules outside the scope of the training set will be sent to the “No Confidence” port, as a prediction cannot be considered reliable. Specifically, the scope is defined as a range in molecular weight, atom count, polar surface area, and calculated logP from the training set molecules. These ranges are provided in the Floe Report.
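
The scope check described above can be sketched as a per-descriptor range test. This is a minimal illustration under stated assumptions: the range values, the 5% edge margin, the descriptor keys, and the function names are hypothetical, and the actual ranges are the ones given in the Floe Report.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    low: float
    high: float

    def position(self, value, edge_fraction):
        """Return 'out', 'edge', or 'in' for a value relative to this range."""
        if value < self.low or value > self.high:
            return "out"
        margin = edge_fraction * (self.high - self.low)
        if value < self.low + margin or value > self.high - margin:
            return "edge"
        return "in"


# Hypothetical training-set ranges; the real ones are listed in the Floe Report.
TRAINING_RANGES = {
    "mol_weight": Range(150.0, 650.0),
    "atom_count": Range(10, 50),
    "polar_surface_area": Range(0.0, 160.0),
    "logp": Range(-2.0, 7.0),
}


def classify_scope(descriptors, ranges=TRAINING_RANGES, edge_fraction=0.05):
    """Map a molecule's descriptors to an (output port, explainer background) pair."""
    positions = [
        ranges[name].position(descriptors[name], edge_fraction) for name in ranges
    ]
    if "out" in positions:
        return "no_confidence", "red"   # outside the box: prediction unreliable
    if "edge" in positions:
        return "success", "yellow"      # inside, but close to a boundary
    return "success", "green"           # comfortably inside every range


if __name__ == "__main__":
    molecule = {"mol_weight": 412.5, "atom_count": 29,
                "polar_surface_area": 78.0, "logp": 3.1}
    print(classify_scope(molecule))
```

A single out-of-range descriptor is enough to route the molecule to the No Confidence port, which matches the conservative reading of the text: the box must contain the molecule on every axis.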

Inputs

| Name | Description | Type |
|------|-------------|------|
| Input Small Molecule Dataset(s) to Predict Property of | The dataset(s) to read records from. | Molecule Dataset |

Explanation and Validation

| Name | Description | Type |
|------|-------------|------|
| Molecule Explainer Type | Select the explainer visualization. Atom: annotate atoms only; Fragment: annotate fragments; Combined: annotate both. | List |
| Property Validation Field | Field holding baseline values, if the dataset has them. The Floe Report then compares the predictions against this baseline. | Float |
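
The comparison against a baseline can be pictured as a simple tally of agreements and disagreements. The sketch below is only an assumption about what such a comparison might compute: the field names, the 0.5 cutoff that turns a float baseline into an active/inactive label, and the returned metrics are hypothetical, and the Floe Report may present the comparison differently.

```python
from collections import Counter


def compare_to_baseline(records, predicted_field="predicted", baseline_field="baseline"):
    """Tally predicted labels against baseline values and return summary counts."""
    counts = Counter()
    for record in records:
        baseline = record.get(baseline_field)
        if baseline is None:
            continue  # no baseline value for this record, nothing to compare
        predicted_active = record[predicted_field] == "active"
        baseline_active = float(baseline) >= 0.5  # assumed cutoff: 1.0 active, 0.0 inactive
        counts[(predicted_active, baseline_active)] += 1
    compared = sum(counts.values())
    correct = counts[(True, True)] + counts[(False, False)]
    return {
        "true_positive": counts[(True, True)],
        "true_negative": counts[(False, False)],
        "false_positive": counts[(True, False)],
        "false_negative": counts[(False, True)],
        "compared": compared,
        "accuracy": correct / compared if compared else None,
    }


if __name__ == "__main__":
    example = [
        {"predicted": "active", "baseline": 1.0},
        {"predicted": "inactive", "baseline": 0.0},
        {"predicted": "active", "baseline": 0.0},
        {"predicted": "inactive"},  # skipped: no baseline
    ]
    print(compare_to_baseline(example))
```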

Outputs

| Name | Description | Type |
|------|-------------|------|
| Output hERG Toxicity | Output dataset to write to. | Dataset |
| No Confidence hERG Toxicity | Output dataset to write to. | Dataset |
| Failed Dataset Name | Output dataset to write to. | Dataset |