Five-Project Series

Hands-on Data Science with Julia

prerequisites: basics of Julia • intermediate scikit-learn • intermediate data wrangling
skills learned: tabular data ingestion and integrity validation • clustering data with k-means and DBSCAN algorithms • calling Python modules from Julia

Łukasz Kraiński and Bogumił Kamiński

5 weeks · 4-6 hours per week average · INTERMEDIATE

Included with a Manning Online subscription

pro $24.99 per month

access to all Manning books, MEAPs, liveVideos, liveProjects, and audiobooks!
choose one free eBook per month to keep
exclusive 50% discount on all purchases

lite $19.99 per month

access to all Manning books, including MEAPs!

team

5, 10, or 20+ seats for your team

whole series

$49.99 (regular price $79.99)

you save $30.00 (38%)

In this series of liveProjects you’ll explore the exciting scientific computing language Julia, and tackle common data science tasks with its robust machine learning ecosystem. These insightful and engaging projects work through each stage of a data science pipeline, from data preprocessing to building and training machine learning models. You’ll step into the role of a data scientist for a real estate company and develop hands-on experience with Julia—whether you’re working step by step or dipping into the tasks most relevant to your career.

These projects are designed for learning purposes and are not complete, production-ready applications or solutions.

here's what's included

Project 1 Data Preprocessing

In this liveProject, you’ll test your data wrangling and data processing skills using the Julia language. You’ll step into the role of a data scientist for a real estate company with a new task from your boss: analyze and clean housing and census data for the marketing and sales teams. You’ll employ the popular Julia package DataFrames.jl, as well as powerful statistics-related libraries, to explore these datasets and prepare them for machine learning.

$18.89 (regular price $29.99)

Project 2 K-means and DBSCAN Clustering

In this liveProject, you’ll use the Julia language and clustering algorithms to analyze sales data and determine groups of products with similar demand patterns. Clustering is a well-established unsupervised learning technique that’s commonly used to discover patterns and relations in data. You’ll apply k-means and DBSCAN clustering techniques to housing sales data for a retail startup, leveraging your basic Julia skills into mastery of this machine learning task.

$18.89 (regular price $29.99)

Project 3 Dimensionality Reduction with PCA, t-SNE and UMAP

In this liveProject, you’ll use the Julia programming language and dimensionality reduction techniques to visualize housing sales data on a scatter plot. This visualization will allow the marketing team to identify links and demand patterns in sales, and dimensionality reduction is also a useful tool for noise reduction and variance analysis. You’ll use the popular PCA algorithm to visualize the sales dataset with overlaid cluster assignments from the k-means and DBSCAN methods, and then extend Julia’s capabilities by calling Python modules through the PyCall.jl package. This extra flexibility will allow you to explore the t-SNE and UMAP algorithms, which perform well on high-dimensional datasets.

$18.89 (regular price $29.99)

Project 4 Regression Using GLM and DecisionTree

In this liveProject, you’ll use the Julia language to build a regression-based machine learning model that can predict median house value in a neighborhood. You’ll start with a simple linear regression model, built with Julia’s GLM.jl package for generalized linear models, to establish baseline values for your quality metrics. You’ll then tune and assess a random forest model and compare the two approaches to pick the one with the best results.

$18.89 (regular price $29.99)

Project 5 Classification with XGBoost

In this liveProject, you’ll use the Julia language to build a classification-based machine learning model that can predict the salary of a customer based on their sociodemographic data. This model will then be used to serve premium advertising to wealthier customers. You’ll build and evaluate XGBoost models with the dedicated XGBoost.jl package, tune their hyperparameters, and assess your model’s capabilities using the ROC curve and measures such as AUC, accuracy, recall, and precision.

$18.89 (regular price $29.99)

choose your plan

pro

monthly: $24.99
annual: $249.99 (only $20.83 per month)

access to all Manning books, MEAPs, liveVideos, liveProjects, and audiobooks!
choose another free product every time you renew
choose twelve free products per year
exclusive 50% discount on all purchases
Hands-on Data Science with Julia project for free

team

monthly: $49.99
annual: $499.99 (only $41.67 per month)

five seats for your team
access to all Manning books, MEAPs, liveVideos, liveProjects, and audiobooks!
choose another free product every time you renew
choose twelve free products per year
exclusive 50% discount on all purchases
Hands-on Data Science with Julia project for free

project authors

Bogumił Kamiński

Bogumił Kamiński is Head of the Decision Analysis and Support Unit and Chairman of the Scientific Council for the Discipline of Economics and Finance at SGH Warsaw School of Economics. He also holds the position of adjunct professor at the Data Science Laboratory at Ryerson University and is affiliated with the Fields Institute (Computational Methods in Industrial Mathematics Laboratory). In the Julia community, he is the owner of the JuliaData organization and a member of the JuliaStats and JuliaLang organizations on GitHub. He also contributes to the community as the top answerer for the [julia] tag on Stack Overflow.

Łukasz Kraiński

Łukasz Kraiński is a research assistant at the Decision Analysis and Support Unit at SGH Warsaw School of Economics. He is a certified cloud engineer with expertise in the Azure and GCP cloud platforms. You can find him speaking about MLOps and AI at tech conferences (MLinPL 2019, PositivTech 2020, Data Driven Innovation 2020). Łukasz is also an active developer and maintainer of Julia packages (CGE.jl, SmartTransitionSim.jl).

Prerequisites

This liveProject is for experienced data scientists and data analysts who are interested in building their skills in Julia. To begin this liveProject, you will need to be familiar with the following:

TOOLS

Basics of Jupyter notebook
Basics of Julia and intermediate knowledge of another high-level programming language such as Python or R

TECHNIQUES

Intermediate data wrangling
Intermediate data visualization
Basics of bootstrapping
Basic usage of command pipelines
Basic usage of functions and control flow
Basics of error and correlation analysis

you will learn

In this liveProject, you’ll learn to use the powerful Julia language and its rapidly developing ecosystem to perform essential data preprocessing tasks.

Tabular data ingestion and integrity validation
Exploratory data analysis using descriptive and graphical techniques
Feature selection and feature engineering
Data cleaning and preprocessing

features

Self-paced: You choose the schedule and decide how much time to invest as you build your project.
Project roadmap: Each project is divided into several achievable steps.
Get Help: While within the liveProject platform, get help from other participants and our expert mentors.
Compare with others: For each step, compare your deliverable to the solutions by the author and other participants.
Book resources: Get full access to select books for 90 days. Permanent access to excerpts from Manning products is also included, as well as references to other resources.