ML Classifier Model Building using Fingerprints for Small Molecules
‘ML Classifier Model Building using Fingerprints for Small Molecules’ is a floe that trains multiple neural network classifier models on physical properties of small molecules. It builds a machine learning model for every possible combination of the cheminformatics and neural network hyperparameters provided below, then generates a floe report containing details of the best models built. The user can pick any of these models and use it to predict properties of other molecules in a separate floe (Predict Physical Properties). The floe report presents detailed statistics on the hyperparameters so that they can be tuned to build better models (see documentation). In addition to predictions, the built models provide an explanation of each prediction and a confidence interval.

NOTE: By default, this floe builds about 1k machine learning models. On a large dataset, this may be expensive. Refer to the documentation for how to build a cheaper version of the same floe.
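To give a sense of why the model count grows so quickly, the following minimal sketch enumerates every combination of a small set of hyperparameter lists. The parameter names and values are illustrative assumptions and are not the floe's actual defaults or internal API.

```python
from itertools import product

# Hypothetical hyperparameter lists. The names and values are illustrative
# assumptions, not the floe's actual defaults.
grid = {
    "min_radius": [0, 1],
    "max_radius": [2, 3],
    "fp_bit_length": [2048, 4096],
    "fp_type": [0, 1],
    "dropout": [0.1, 0.2],
    "hidden_layers": [(150, 100, 50), (100, 50)],
    "learning_rate": [0.001, 0.0001],
}


def enumerate_grid(grid):
    """Return one dict per combination of the hyperparameter values."""
    names = list(grid)
    return [dict(zip(names, values)) for values in product(*grid.values())]


combos = enumerate_grid(grid)
print(len(combos))  # 7 parameters with 2 values each: 2**7 = 128
```

Each additional value in any list multiplies the total, so trimming even one or two lists to a single value is a simple way to reduce the number of models trained.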
Name | Description | Type |
---|---|---|
Input Small Molecules to train machine learning models on. | Input dataset file with each record containing molecule and response value (String) to train on | Molecule Dataset |
Name | Description | Type |
---|---|---|
Response Value Field | Name of the field containing the primary data being trained on and predicted. | Float |
Number of Models to show in Floe report | Number of best models to include in the floe report. By default, the 20 best models (ranked by accuracy) are kept so that the report meets memory requirements. | Int |
Preprocess Molecule | Preprocess molecules: neutral pH, largest molecule, Blockbuster filter. | Bool |
Apply Blockbuster filter | For every molecule, stores only the largest component and adjusts ionization to neutral pH. | Bool |
Molecule Explainer Type | Select the explainer visualization. Atom: annotate atoms only; Fragment: annotate fragments; Combined: annotate both. | List |
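As a rough illustration of keeping only the best models by accuracy, the sketch below selects the top `k` records from a list of results. The record layout (`id`, `accuracy`) and the helper name `best_models` are hypothetical; the floe's actual ranking and storage logic is not documented here.

```python
import heapq


def best_models(models, k=20):
    """Return the k records with the highest accuracy, best first."""
    return heapq.nlargest(k, models, key=lambda m: m["accuracy"])


# Hypothetical results: one record per trained model.
models = [
    {"id": i, "accuracy": acc}
    for i, acc in enumerate([0.71, 0.88, 0.64, 0.93])
]

top = best_models(models, k=2)
print([m["id"] for m in top])  # [3, 1]
```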
Name | Description | Type |
---|---|---|
Min Radius | Minimum radius for cheminfo fingerprint. | IntVec |
Max Radius | Maximum radius for cheminfo fingerprint. | IntVec |
Bit Length of FP | Bit length of cheminfo fingerprint. | IntVec |
Type of FP | Type of cheminfo fingerprints. | IntVec |
Name | Description | Type |
---|---|---|
Dropouts | List of dropout hyperparameters. | FloatVec |
Sets of Hidden Layers | List(s) of hidden layers, with sets separated by -1. Input and output layers are determined by the data. E.g., 150,100,50 creates a NN with 3 hidden layers of sizes 150, 100, and 50. | IntVec |
Sets of Regularisation Layers | List(s) of regularisation layers, with sets separated by -1. No regularisation is applied to the input and output layers. | FloatVec |
Learning Rates | List of all the learning rate hyperparameters to train the model with. | FloatVec |
Max Epochs | Maximum number of epochs to train the model. | Int |
Activation | Activation functions: ReLU, LeakyReLU, PReLU, tanh, SELU, ELU. | List |
Batch Size | Batch size for training the classifier. | Int |
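The -1 separator convention used for the hidden layer and regularisation sets can be illustrated with a short sketch. The helper `split_layer_sets` is hypothetical, and skipping empty sets is an assumption; the floe's own parsing may differ.

```python
def split_layer_sets(flat, sentinel=-1):
    """Split a flat list into sub-lists at each sentinel value.

    Empty sets (e.g. from two consecutive sentinels) are skipped;
    this is an assumption of this sketch.
    """
    sets, current = [], []
    for value in flat:
        if value == sentinel:
            if current:
                sets.append(current)
            current = []
        else:
            current.append(value)
    if current:
        sets.append(current)
    return sets


# Two architectures: three hidden layers, then two hidden layers.
hidden = split_layer_sets([150, 100, 50, -1, 100, 50])
print(hidden)  # [[150, 100, 50], [100, 50]]
```

Under this reading, each resulting sub-list describes one network architecture, and the same splitting applies to a regularisation FloatVec such as `[0.04, 0.02, 0.01, -1, 0.04, 0.02]`.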
Name | Description | Type |
---|---|---|
Models Built | Output of Generated Models | Dataset |
Failure Output | Output of Failure | Dataset |