The Python data science ecosystem is a powerful, open-source toolset used daily by thousands of data scientists and machine learning engineers. But with so many Python machine learning libraries to choose from, which one best fits your needs?
In this liveProject, you’ll go hands-on with the scikit-learn and H2O frameworks, using both to build working machine learning classifiers. You’ll use raw financial data and the tried-and-true random forest model to predict the likelihood of loan defaults. Once you’ve built your models, you’ll compare the two implementations to find out which works best and evaluate your results against existing hard-coded tools.
This project is designed for learning purposes and is not a complete, production-ready application or solution.
prerequisites
This liveProject is for aspiring data scientists and machine learning engineers who want to practice their skills in a real-world environment. To begin this liveProject, you will need to be familiar with:
TOOLS
- Intermediate Python
- Beginner Jupyter Notebook
- Beginner Matplotlib
- Beginner pandas
- Beginner scikit-learn
TECHNIQUES
- Beginner Plotting and visualization
- Beginner Data munging with pandas
you will learn
In this liveProject, you’ll learn core skills of data science and machine learning that are easy to transfer across roles and industries.
- Exploratory data analysis
- Working with pandas DataFrames
- Machine learning feature engineering
- Machine learning modeling with random forests
- Optimizing ML and random forest models
- Model evaluation and comparison
- Deploying a model in a Python module
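To give a flavor of the random forest modeling and model evaluation skills listed above, here is a minimal sketch using scikit-learn. It is not taken from the project materials: the data is synthetic (generated with make_classification as a stand-in for loan records), and the hyperparameters and the choice of ROC AUC as the metric are illustrative assumptions rather than the project's settings.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for loan data (assumption): roughly 10% of rows
# belong to the positive class, which plays the role of "default".
X, y = make_classification(
    n_samples=2000,
    n_features=10,
    n_informative=5,
    weights=[0.9, 0.1],
    random_state=0,
)

# Hold out 25% of the rows for evaluation, keeping the class ratio intact.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Illustrative hyperparameters; class_weight="balanced" reweights the rare class.
model = RandomForestClassifier(
    n_estimators=200, class_weight="balanced", random_state=0
)
model.fit(X_train, y_train)

# Predicted probability of the positive ("default") class for each held-out row.
default_proba = model.predict_proba(X_test)[:, 1]

# ROC AUC measures how well the probabilities rank defaults above non-defaults.
auc = roc_auc_score(y_test, default_proba)
print(f"ROC AUC on held-out data: {auc:.3f}")
```

The same train, predict, and score steps would apply when comparing against another framework's random forest, as long as both models are evaluated on the same held-out rows.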