Machine Learning Model Building Documentation
Introduction and Tutorials
- Tutorial: Building Machine Learning Regression Models for Property Prediction of Small Molecules
- Tutorial: Use Pretrained Model to Predict Generic Property of Molecules
- Tutorial: Cheaper and Faster Version of Building Machine Learning Regression Models for Physical Property Prediction of Small Molecules
- Tutorial: Building Machine Learning Classifier Fingerprint Models for Physical Property Prediction of Small Molecules
- Tutorial: Use Pretrained Classification Fingerprint Model to Predict Generic Property of Molecules
- Tutorial: Building Machine Learning Regression Models on Feature Vector Input
- Tutorial: Use Custom Feature Input to Predict Regression Properties
- Tutorial: Use Small Molecule Data Processing Floe to Preprocess ML Data
- Tutorial: Predict Solubility of Druglike Molecules
- Tutorial: Predict hERG Toxicity of Druglike Molecules
- Tutorial: Use Transfer Learning to ReBuild Previous Machine Learning Model Builds Using New Data
How-To Guides
Floe Reference Documentation
- ML Build: Regression Model with Tuner using Fingerprints for Small Molecules
- ML Predict: Regression using Fingerprints for Small Molecules
- Solubility Prediction for Small Molecule using ML and Cheminfo Fingerprints
- ML Build: Classification Model with Tuner using Fingerprints for Small Molecules
- ML Predict: Classification using Fingerprints for Small Molecules
- ML Build: Regression Model using Feature Input
- ML Predict: Regression using Feature Input
- Data Processing of Small Molecule for ML Model Building
- ML ReBuild: Transfer Learn ML Regression Model using Fingerprints for Small Molecules
- hERG Toxicity Prediction for Small Molecules using ML and Cheminfo Fingerprints
Release Notes
FAQs
- Frequently Asked Questions
- For model building floes, how do we compare multiple models to decide which one is best?
- How good is your solubility model?
- Does the model use or train on 3D features?
- The inputs are still based on expert-defined parameters such as fingerprints, which are biased by the rules the expert user defines. Any insights into how to overcome this flaw?
- Are the predictions for crystalline solubility? Is the data for crystals?
- What made you prefer NNs instead of, say, XGBoost?
- Neural networks don’t always show good performance in low data regimes. What measures do you take to improve performance?
- How are the confidences computed?
- What methods do you find most effective to eliminate overfitting of your models?
- What percentage of solubility predictions have high or medium confidence?
- In the model builder floe, am I restricted to fingerprints, or can I add other descriptors, such as molecular properties or other calculated or experimental parameters?