In this liveProject, you’ll use the Julia language to build a classification-based machine learning model that can predict the salary of a customer based on their sociodemographic data. This model will then be used to serve premium advertising to wealthier customers. You’ll build and evaluate XGBoost models with the dedicated Julia XGBoost.jl package, tune their hyperparameters, and assess your model’s capabilities using the ROC curve and measures such as AUC, accuracy, recall, and precision.
This project is designed for learning purposes and is not a complete, production-ready application or solution.
This liveProject is for experienced data scientists and data analysts who are interested in building their skills in Julia. To begin this liveProject, you will need to be familiar with:
- Basics of Jupyter Notebook
- Basics of Julia and intermediate experience with another high-level programming language such as Python or R
- Intermediate usage of the XGBoost.jl package
- Basic usage of plotting libraries
- Basics of the Arrow data format and DataFrames.jl
- Basic data wrangling
- Basic usage of hash functions
- Basic visualization techniques (histograms, barplots)
- Basics of command pipelines
- Basic serialization
- Intermediate measuring of feature importance
You will learn
In this liveProject, you’ll learn to use Julia to build powerful binary classification models.
- Comparing distributions of predefined train and test data
- Building an XGBoost model in Julia
- Evaluating a classification model’s quality
- Analyzing feature importance based on the produced model