Machine Learning Model Building Documentation
Introduction
Tutorials
Tutorials to Build Machine Learning Models
- ML Build: Regression Model with Tuner using Fingerprints for Small Molecules
- ML Build: Cheaper and Faster Machine Learning Regression Models with Tuner using Fingerprints for Small Molecules
- ML Build: Classification Model with Tuner using Fingerprints for Small Molecules
- ML Build: Regression Model with Tuner on User-Based Feature Vector Input
- ML ReBuild: Transfer Learn ML Regression Model Using Fingerprints for Small Molecules
Tutorials to Predict Molecular Properties with Trained Machine Learning Models
- ML Predict: Use Pretrained Regression Models to Predict Properties of Molecules
- ML Predict: Use Pretrained Classification Fingerprint Model to Predict Properties of Molecules
- ML Predict: Regression using Feature Input Floe
- Predict the Solubility of Small Molecules
- Predict hERG Toxicity of Drug-Like Molecules
How-To Guides
Floe Reference Documentation
- Data Processing of Small Molecules for ML Model Building
- ML Build: Regression Model with Tuner using Fingerprints for Small Molecules
- ML Build: Classification Model with Tuner using Fingerprints for Small Molecules
- ML Build: Regression Model with Tuner using Feature Input
- Solubility Prediction for Small Molecules using ML and Cheminfo Fingerprints
- ML Predict: Regression using Fingerprints for Small Molecules
- ML Predict: Classification using Fingerprints for Small Molecules
- ML Predict: Regression using Feature Input
- ML ReBuild: Transfer Learn ML Regression Model using Fingerprints for Small Molecules
- hERG Toxicity Prediction for Small Molecules using ML and Cheminfo Fingerprints
Theory
FAQs
- Frequently Asked Questions
- For the model building floes, how do you compare multiple models to decide which one is the best?
- How good is your solubility model?
- Does the model use or train on 3D features?
- The inputs are still based on expert parameters such as fingerprints, which are all biased by the rules defined by the expert user. Any insights into how to overcome this flaw?
- Are the predictions for crystalline solubility? Is the data for crystals?
- What data was used to train the hERG toxicity model?
- Why do you prefer neural networks instead of something like XGBoost?
- Neural networks don’t always show good performance in low data regimes. What measures do you take to improve performance?
- How are the confidences computed?
- What methods do you find most effective to eliminate overfitting of your models?
- What percentage of the solubility predictions have high or medium confidence?
- In the model builder floes, am I restricted to fingerprints, or can I add other descriptors, such as molecular properties or other calculated or experimental parameters?