Time series analysis is an essential tool for data forecasting, allowing data analysts to predict future events and track relationships within their data. In this liveProject, you’ll use the Python data ecosystem and time series analysis to analyze the spread of the COVID-19 virus in different parts of the globe.
Your goal is to make near-future predictions about virus spread based on your available data. You’ll start with an exploratory data analysis of the data you have access to, establishing the kinds of questions it can reasonably answer. You’ll then develop an ARIMA model for time series forecasting. Finally, you’ll develop an interactive Voilà dashboard using ipywidgets that will allow stakeholders to access and understand your analysis.
This project is designed for learning purposes and is not a complete, production-ready application or solution.
The liveProject is for intermediate Python programmers who know the basics of data science. To begin this liveProject, you will need to be familiar with:
- Basics of NumPy and pandas
- Basics of Matplotlib/seaborn/Plotly
- Basics of Jupyter Notebook and ipywidgets
- Basics of machine learning
- Time series analysis
What you will learn
In this liveProject, you will learn predictive data analysis and time series techniques that transfer easily to a wide range of predictive modelling tasks.
- Using pandas for data manipulation and analysis of COVID-19’s spread
- Using NumPy for core scientific computing and data formatting
- Visualizing case numbers and affected countries/regions with Plotly and Matplotlib/seaborn
- Plotting data on an interactive Leaflet map
- Making interactive Jupyter notebooks
- Creating interactive reports with Voilà
- Univariate analysis of individual variables
- Bivariate analysis to understand the relationship between different types of cases
- Data visualization to look at the number of cases for the worst affected countries
- Time series forecasting with ARIMA
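
To give a rough sense of what ARIMA forecasting involves, here is a minimal sketch of an ARIMA(1,1,0) model written in plain Python. It is not the project's own code: the function names (`fit_ar1`, `forecast_arima_110`) and the input series are hypothetical, the numbers are synthetic and not real COVID-19 data, and the coefficients are estimated by simple least squares. In the project itself you would more likely rely on a time series library than on a hand-rolled fit like this one.

```python
def fit_ar1(series):
    """Least-squares fit of y[t] = c + phi * y[t-1] for a list of numbers."""
    if len(series) < 3:
        raise ValueError("need at least three values to fit an AR(1) model")
    x = series[:-1]  # lagged values y[t-1]
    y = series[1:]   # current values y[t]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    if var_x == 0:
        raise ValueError("lagged values are constant; phi is not identifiable")
    cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    phi = cov_xy / var_x
    c = mean_y - phi * mean_x
    return c, phi


def forecast_arima_110(cumulative, steps):
    """Forecast `steps` future values of a cumulative series with ARIMA(1,1,0)."""
    # d = 1: difference once, turning cumulative totals into per-period changes.
    diffs = [b - a for a, b in zip(cumulative, cumulative[1:])]
    # p = 1: model each change as a linear function of the previous change.
    c, phi = fit_ar1(diffs)
    level = cumulative[-1]
    last_diff = diffs[-1]
    forecasts = []
    for _ in range(steps):
        last_diff = c + phi * last_diff  # predict the next change
        level += last_diff               # undo the differencing
        forecasts.append(level)
    return forecasts


# Synthetic cumulative counts, invented purely for illustration.
toy_cumulative = [100, 116, 164, 228, 300, 376, 454, 533]
print(forecast_arima_110(toy_cumulative, steps=3))
```

The differencing step is what the "I" (integrated) in ARIMA refers to: the model is fitted to the changes between observations, and the forecasts are accumulated back onto the last observed total.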