hERG Toxicity Prediction for Small Molecules using ML and Cheminfo Fingerprints
A Floe that predicts the hERG toxicity of small, drug-like molecules as Active (toxic) or Inactive (non-toxic). It runs a TensorFlow-based, fully connected neural network regression model for prediction, and uses a ConvexBox and Monte Carlo based approach for Domain of Application and error bar prediction. The TensorFlow models have been trained on 2D fingerprints.
Finally, it uses LIME, a model-agnostic explainer, to explain the hERG toxicity of the molecule(s). The Floe is cheap and quick, adding about 1.5 seconds for property prediction of 10 molecules.
Outputs:
Failure Dataset: (a) Molecule is too large or too small, or (b) molecule has an atom not encountered in the training set.
No Confidence Dataset: Molecule is deemed out of scope compared to the training set (details below). In this case, the model predictions are unreliable. The explainer image has a red background.
Success Dataset: (a) Molecule falls within scope; the explainer image has a green background. (b) Molecule falls at the edge of scope; the explainer image has a yellow background.
Molecules outside the scope of the training set will be sent to the ‘No Confidence’ port, as a prediction cannot be considered reliable. Specifically, the scope is defined as a range in molecular weight, atom count, polar surface area, and calculated logP from the training set molecules. These ranges are reported in the Floe report.
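The range-based scope check described above can be illustrated with a minimal Python sketch. It is not the Floe's implementation: the descriptor keys, the example values, and the `edge_fraction` parameter are assumptions made for illustration. In particular, this page does not say how "edge of scope" is defined, so the sketch simply treats the outer 5% of each training range as the edge.

```python
# Descriptor names mirror the four properties listed above. The keys and all
# numeric values are hypothetical, not the Floe's actual training ranges.
DESCRIPTORS = ("mol_weight", "atom_count", "polar_surface_area", "clogp")


def fit_ranges(training_rows):
    """Return {descriptor: (min, max)} computed over the training set."""
    return {
        name: (
            min(row[name] for row in training_rows),
            max(row[name] for row in training_rows),
        )
        for name in DESCRIPTORS
    }


def classify_scope(row, ranges, edge_fraction=0.05):
    """Return 'in_scope', 'edge', or 'out_of_scope' for one molecule.

    edge_fraction is an assumed parameter: a value within this fraction of
    either end of a training range is counted as being at the edge of scope.
    """
    status = "in_scope"
    for name in DESCRIPTORS:
        low, high = ranges[name]
        value = row[name]
        if value < low or value > high:
            # Any single descriptor outside its range puts the molecule out of scope.
            return "out_of_scope"
        margin = edge_fraction * (high - low)
        if value < low + margin or value > high - margin:
            status = "edge"
    return status


def route(status):
    """Map a scope status to (output dataset, explainer background colour)."""
    return {
        "in_scope": ("Success", "green"),
        "edge": ("Success", "yellow"),
        "out_of_scope": ("No Confidence", "red"),
    }[status]


# Made-up training descriptors for three molecules.
training = [
    {"mol_weight": 180.2, "atom_count": 13, "polar_surface_area": 20.0, "clogp": -1.0},
    {"mol_weight": 350.4, "atom_count": 25, "polar_surface_area": 75.0, "clogp": 2.5},
    {"mol_weight": 520.6, "atom_count": 37, "polar_surface_area": 140.0, "clogp": 6.0},
]
ranges = fit_ranges(training)

inside = {"mol_weight": 350.0, "atom_count": 25, "polar_surface_area": 75.0, "clogp": 2.5}
near_edge = {"mol_weight": 510.0, "atom_count": 30, "polar_surface_area": 80.0, "clogp": 3.0}
outside = {"mol_weight": 700.0, "atom_count": 30, "polar_surface_area": 80.0, "clogp": 3.0}

for query in (inside, near_edge, outside):
    print(route(classify_scope(query, ranges)))
```

A per-descriptor range check like this is cheap to evaluate and easy to report, which fits the statement above that the ranges appear in the Floe report. It does not capture correlations between descriptors, so a molecule can pass every individual range and still be unlike anything in the training set.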
Name | Description | Type |
---|---|---|
Input Small Molecule(s) Dataset to predict property of | The dataset(s) to read records from | Molecule Dataset |

Name | Description | Type |
---|---|---|
Molecule Explainer Type | Select explainer visualisation. Atom: annotate atoms only; Fragment: annotate fragments; Combined: annotate both | List |
Property Validation Field | If the dataset has a baseline, the Floe reports a comparison between the prediction and the baseline in the Floe report | Float |

Name | Description | Type |
---|---|---|
Output hERG Toxicity | Output dataset to write to | Dataset |
No-confidence hERG Toxicity | Output dataset to write to | Dataset |
Failed Dataset Name | Output dataset to write to | Dataset |