Machine Learning Model Building Documentation
Introduction and Tutorials
- Introduction and Layout of Machine Learning Model Building package
- Tutorial: Building Machine Learning Regression Models for Property Prediction of Small Molecules
- Tutorial: Use Pretrained Model to Predict Generic Property of Molecules
- Tutorial: Cheaper and Faster Version of Building Machine Learning Regression Models for Physical Property Prediction of Small Molecules
- Tutorial: Building Machine Learning Classifier Fingerprint Models for Physical Property Prediction of Small Molecules
- Tutorial: Use Pretrained Classification Fingerprint Model to Predict Generic Property of Molecules
- Tutorial: Building Machine Learning Regression Models on Feature Vector Input
- Tutorial: Use Custom Feature Input to Predict Regression Properties
- Tutorial: Use Small Molecule Data Processing Floe to Preprocess ML Data
- Tutorial: Predict Solubility of Druglike Molecules
- Tutorial: Predict hERG Toxicity of Drug-Like Molecules
- Tutorial: Use Transfer Learning to Rebuild Previous Machine Learning Model Builds Using New Data
How-To Guides
Floe Reference Documentation
- ML Build: Regression Model with Tuner Using Fingerprints for Small Molecules
- ML Predict: Regression Using Fingerprints for Small Molecules
- Solubility Prediction for Small Molecules Using ML and Cheminfo Fingerprints
- ML Build: Classification Model with Tuner Using Fingerprints for Small Molecules
- ML Predict: Classification Using Fingerprints for Small Molecules
- ML Build: Regression Model Using Feature Input
- ML Predict: Regression Using Feature Input
- Data Processing of Small Molecule for ML Model Building
- ML ReBuild: Transfer Learn ML Regression Model Using Fingerprints for Small Molecules
- hERG Toxicity Prediction for Small Molecules Using ML and Cheminfo Fingerprints
Release Notes
FAQs
- Frequently Asked Questions
- For model building floes, how do we compare multiple models to decide which one is the best?
- How good is the solubility model?
- Does the model use or train on 3D features?
- The inputs are based on expert parameters such as fingerprints, which are all biased by the rules that are defined by the expert user. Any insights into how to overcome this flaw?
- Are the predictions for crystalline solubility? Is the data for crystals?
- Why do you prefer NNs instead of something like XGBoost?
- Neural networks don’t always show good performance in low data regimes. What measures can you take to improve performance?
- How are the confidences computed?
- What methods are most effective to eliminate overfitting of your models?
- How many percentages can the solubility prediction have for high/med confidences?
- In the model builder floe, am I restricted to fingerprints, or can I add other descriptors, such as molecular properties or other calculated or experimental parameters?