Three-Project Series

ML Feature Engineering and Modeling using Python

prerequisites: intermediate Python and scikit-learn • basics of Jupyter Notebook, pandas, Matplotlib, SQL, and machine learning
skills learned: process and identify useful features to train your model • build a scalable process to score new data • make your features reusable using an ML feature store

Jayesh Patel

3 weeks · 6-8 hours per week average · BEGINNER

Included with a Manning Online subscription

pro $24.99 per month

access to all Manning books, MEAPs, liveVideos, liveProjects, and audiobooks!
choose one free eBook per month to keep
exclusive 50% discount on all purchases
renews monthly, pause or cancel renewal anytime

lite $19.99 per month

access to all Manning books, including MEAPs!

team

5, 10 or 20 seats+ for your team - learn more

whole series

$59.99 $38.99

you save $21.00 (35%)

In this series of liveProjects, you’ll expand your understanding of machine learning feature engineering by building an ML application that predicts and diagnoses diabetes. You’ll work to process ML features, identify useful features to train your model, and build a scalable process to score new data. Once you’ve developed these features, you will redesign your solution to make your features reusable using an ML feature store. Each liveProject in this series is focused on a different aspect of machine learning feature engineering, so you can pick and choose what’s most relevant to your work.

These projects are designed for learning purposes and are not complete, production-ready applications or solutions.

here's what's included

Project 1 Creating Features

In this liveProject, you’ll process raw data to make it ready for a machine learning model that diagnoses diabetes. You’ll use feature engineering techniques to generate ML features from raw data. To make sense of your data, you will profile it, perform exploratory data analysis, analyze independent and dependent variables, and visualize data patterns. You’ll evaluate the correlation between dependent and independent variables to identify relevant features, and generate additional features as needed. Finally, you will apply feature engineering techniques such as treating missing values and outliers to make your features ready for model training.
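To give a feel for the kind of work involved, here is a minimal sketch of missing-value treatment, outlier treatment, and a correlation check with pandas. The toy data, the column names (`glucose`, `bmi`, `outcome`), the `clean_features` helper, and the choice of median imputation with IQR clipping are all assumptions made for illustration; the project itself may use a different dataset and different techniques.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; the column names are illustrative, not the project's dataset.
raw = pd.DataFrame({
    "glucose": [85.0, 90.0, np.nan, 140.0, 150.0, 160.0, 95.0, 900.0],
    "bmi": [22.0, 24.5, 23.0, 31.0, np.nan, 35.0, 25.0, 33.0],
    "outcome": [0, 0, 0, 1, 1, 1, 0, 1],
})


def clean_features(df, feature_cols):
    """Fill missing values with the column median, then clip outliers to the IQR fences."""
    out = df.copy()
    for col in feature_cols:
        out[col] = out[col].fillna(out[col].median())
        q1, q3 = out[col].quantile([0.25, 0.75])
        iqr = q3 - q1
        out[col] = out[col].clip(lower=q1 - 1.5 * iqr, upper=q3 + 1.5 * iqr)
    return out


features = ["glucose", "bmi"]
clean = clean_features(raw, features)

# Pearson correlation of each feature with the dependent variable.
target_corr = clean[features].corrwith(clean["outcome"])
print(target_corr)
```

Clipping keeps every row while limiting the influence of extreme values such as the implausible glucose reading of 900; dropping the row instead is an equally reasonable choice on a larger dataset.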

$29.99 $19.49

Project 2 Train and Score with Raw Data

In this liveProject, you’ll train and evaluate a machine learning model for diagnosing diabetes, and set up a pipeline for your model to run effectively. You’ll start by exploring sample data, processing features, and performing common feature engineering techniques for treating outliers or missing data. After dividing your dataset into training and testing data, you’ll train a logistic regression model using scikit-learn. You will then retrain the model with a different set of features. Finally, you’ll pick a model for scoring and build a scoring pipeline. You will test your scoring process on a scoring dataset.
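The workflow described above can be sketched roughly as follows with scikit-learn. This is only an illustration under stated assumptions: the data is synthetic (generated with `make_classification` as a stand-in for the diabetes dataset), and the pipeline steps (median imputation, standard scaling) are one plausible choice rather than the project's prescribed solution.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the diabetes data (assumption: four numeric features).
X, y = make_classification(n_samples=400, n_features=4, n_informative=3,
                           n_redundant=1, random_state=0)

# Divide the dataset into training and testing data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# Bundling preprocessing with the model means scoring reuses the exact same steps.
model = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000)),
])
model.fit(X_train, y_train)
test_accuracy = model.score(X_test, y_test)

# Scoring: new rows may contain missing values; the fitted pipeline handles them.
new_rows = X_test[:5].copy()
new_rows[0, 1] = np.nan
probabilities = model.predict_proba(new_rows)[:, 1]
print(f"test accuracy: {test_accuracy:.2f}")
```

Retraining with a different set of features then amounts to fitting the same pipeline on a different column subset and comparing the test scores.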

$29.99 $19.49

Project 3 Train and Score with Feature Store

In this liveProject, you’ll train your model and build a scoring pipeline using an ML feature store. You’ll explore a sample dataset for diagnosing diabetes, generate new features and store them in a feature store, train and retrain ML models, and build a scoring process. You’ll employ common feature engineering techniques to train the model, then test and retrain it as needed. You’ll also work on setting up a scoring pipeline, and brainstorm ML development using a feature store. In this project, you will learn how to store the features for a machine learning model so they can be reused in other machine learning projects.
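To make the idea of feature reuse concrete, here is a deliberately tiny sketch of the concept, built on Python's standard `sqlite3` module. It is not the feature store used in the project and not a real product's API: `TinyFeatureStore`, its `put` and `get` methods, and the entity and feature names are all hypothetical. The point is only that features are written once and then read by name for both training and scoring.

```python
import sqlite3


class TinyFeatureStore:
    """Hypothetical in-memory feature store backed by SQLite (illustration only)."""

    def __init__(self):
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute(
            "CREATE TABLE features ("
            " entity_id TEXT, name TEXT, value REAL,"
            " PRIMARY KEY (entity_id, name))"
        )

    def put(self, entity_id, features):
        """Insert or overwrite feature values for one entity."""
        rows = [(entity_id, name, value) for name, value in features.items()]
        self.conn.executemany(
            "INSERT OR REPLACE INTO features VALUES (?, ?, ?)", rows)
        self.conn.commit()

    def get(self, entity_id, names):
        """Return the requested features in the given order; missing ones are None."""
        placeholders = ",".join("?" for _ in names)
        cur = self.conn.execute(
            f"SELECT name, value FROM features "
            f"WHERE entity_id = ? AND name IN ({placeholders})",
            [entity_id, *names],
        )
        found = dict(cur.fetchall())
        return [found.get(name) for name in names]


store = TinyFeatureStore()
store.put("patient_1", {"glucose": 140.0, "bmi": 31.0, "bmi_over_30": 1.0})

# Two different consumers read the same stored features by name.
training_row = store.get("patient_1", ["glucose", "bmi"])
scoring_row = store.get("patient_1", ["bmi_over_30", "glucose", "age"])
print(training_row, scoring_row)
```

A production feature store adds much more (versioning, point-in-time lookups, online and offline serving), but the write-once, read-by-name pattern is the part that lets other ML projects reuse your features.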

$29.99 $19.49

book resources

When you start each of the projects in this series, you'll get full access to the following books for 90 days.

project author

Jayesh Patel

Jayesh Patel is a strategic big data leader and proven architect who has designed complex data processes, architected machine learning pipelines, and developed big data analytics solutions over the past 15+ years. He currently works for Rockstar Games, architecting data-driven big data platforms and artificial intelligence solutions to keep players engaged in Red Dead Redemption II and Grand Theft Auto V. He is an active senior member of the IEEE. His expertise and research in the big data space have been well received at numerous international IEEE conferences. He is an editorial board member of a renowned international journal, and he actively guides and reviews the research work of other scholars and professors around the world. He completed his master’s degree at San Diego State University in 2009.

Prerequisites

These liveProjects are for learners who are familiar with Python, have a general understanding of basic machine learning techniques, have some knowledge of data modeling, and have a basic knowledge of dealing with ML pipelines and models.

TOOLS

Intermediate Python (file processing, data frames, data processing)
Basics of Jupyter Notebook
Basics of Matplotlib
Basics of pandas
Intermediate scikit-learn (pre-processing capabilities, ML models, pipelines)
Basics of SQL

TECHNIQUES

Basic file processing
Intermediate data processing and feature engineering
Intermediate machine learning pipelines
Basic understanding of ML development cycles
Basic understanding of linear regression
Basic understanding of classification with logistic regression

features

Self-paced: You choose the schedule and decide how much time to invest as you build your project.
Project roadmap: Each project is divided into several achievable steps.
Get help: Within the liveProject platform, get help from fellow participants, or book paid sessions with our expert mentors for more support.
Compare with others: For each step, compare your deliverable to the solutions by the author and other participants.
Book resources: Get full access to select books for 90 days. Permanent access to excerpts from Manning products is also included, as well as references to other resources.