In this liveProject, you’ll take on the role of a data analyst working for the Jones Family philanthropic foundation. The board of directors is interested in learning about the life expectancy of Americans so that they can better target their charitable spending. To help them in their research, they’ve turned to you. Your challenge in this liveProject is to run a regression analysis on demographic data to find factors related to life expectancy, and answer data-mining questions about the distribution of these demographic variables. To do this, you’ll plan your data-mining and regression analysis following the CRISP-DM model, clean and model your data, assess the accuracy of your findings, and present your results—all with open source tools from the Python ecosystem.
This project is designed for learning purposes and is not a complete, production-ready application or solution.
This liveProject is for intermediate Python programmers who know the basics of regression analysis. To begin this liveProject, you will need to be familiar with:
- Basics of NumPy and pandas
- Basics of Matplotlib
- Basics of Jupyter notebooks
- Basics of statistics and regression analysis
What you will learn
In this liveProject, you’ll learn vital skills for planning and orchestrating a data analysis project. These foundational skills are easy to transfer to almost any data undertaking.
- Collecting, cleaning and exploring data
- Regression modelling and diagnostics
- Evaluating model appropriateness
- Presenting findings as Jupyter notebooks