In this liveProject, you’ll use the Julia language to build a regression-based machine learning model that predicts the median house value in a neighborhood. You’ll start with a simple linear regression model, built with Julia’s package for generalized linear models, to establish baseline values for your quality metrics. You’ll then tune and assess a random forest model, and compare the two approaches to pick the one that gives the best results.
This project is designed for learning purposes and is not a complete, production-ready application or solution.
This liveProject is for experienced data scientists and data analysts who are interested in building their skills in Julia. To begin this liveProject, you will need to be familiar with:
- Basics of Jupyter Notebook
- Basics of Julia and intermediate experience with another high-level programming language such as Python or R
- Basics of the GLM.jl, DecisionTree.jl, and HypothesisTests.jl packages
- Basics of plotting libraries
- Basics of the Arrow data format and DataFrames.jl
- Basics of data wrangling
- Basic visualization techniques (scatterplots, histograms)
- Basics of bootstrapping
- Basics of command pipelines
- Basic serialization
- Basic statistical hypothesis testing
What you will learn
In this liveProject, you’ll put Julia into practice to build simple regression machine learning models that are in demand across industries.
- Build linear regression and random forest models
- Evaluate models and explore their output
- Save a model to a file for later reuse
- Transform, manipulate, and plot data