Principal Component Analysis

This project is part of the liveProject series Math for Machine Learning
prerequisites
intermediate Python (particularly NumPy) • basics of linear algebra (particularly systems of linear equations and matrices)
skills learned
algorithm optimization with principal component analysis (PCA) • matrix manipulation with NumPy • nuances of scikit-learn's PCA library
Nicole Königstein
1 week · 6-8 hours per week · INTERMEDIATE

Step into the role of data scientist at Finative, an analytics company that uses environmental, social, and governance (ESG) factors to measure companies’ sustainability, a brand-new, eco-focused trend that's changing the way businesses think about investing. To provide its clients with the insights they need to develop their investment strategies, Finative analyzes a high volume of data using advanced natural language processing (NLP) techniques.

Recently, your CEO has decided that Finative should increase its own sustainability. Your task is to develop a method to optimize the runtime for the company’s machine learning models. You’ll apply principal component analysis (PCA) to the data in order to speed up the ML models. To classify handwritten digits and prove your theory that PCA speeds up ML algorithms, you’ll implement logistic regression with scikit-learn. You’ll use the explained variance ratio to gain an understanding of the trade-offs between speed and accuracy. When you’re done, you’ll be able to present your CEO with proof of PCA’s efficiency in optimizing runtime.
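
The workflow described above could be sketched roughly as follows. This is a minimal illustration, not the project's reference solution: it assumes scikit-learn's small built-in 8x8 digits dataset as a stand-in for the project data, and the helper `fit_and_score`, the 95% variance threshold, and the 75/25 split are all hypothetical choices made for the example.

```python
import time

from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler


def fit_and_score(X_train, X_test, y_train, y_test):
    """Hypothetical helper: fit logistic regression, return (fit seconds, test accuracy)."""
    model = LogisticRegression(max_iter=2000)
    start = time.perf_counter()
    model.fit(X_train, y_train)
    elapsed = time.perf_counter() - start
    return elapsed, model.score(X_test, y_test)


# Assumption: the built-in digits dataset (64 pixel features) stands in for the project data.
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Standardize using statistics from the training split only.
scaler = StandardScaler().fit(X_train)
X_train_s, X_test_s = scaler.transform(X_train), scaler.transform(X_test)

# A float n_components keeps the fewest components whose cumulative
# explained variance ratio exceeds that fraction (here 95%).
pca = PCA(n_components=0.95, svd_solver="full").fit(X_train_s)
X_train_p, X_test_p = pca.transform(X_train_s), pca.transform(X_test_s)

time_full, acc_full = fit_and_score(X_train_s, X_test_s, y_train, y_test)
time_pca, acc_pca = fit_and_score(X_train_p, X_test_p, y_train, y_test)
retained = pca.explained_variance_ratio_.sum()

print(f"all features: {X_train_s.shape[1]} dims, {time_full:.3f}s, accuracy {acc_full:.3f}")
print(f"after PCA:    {X_train_p.shape[1]} dims, {time_pca:.3f}s, accuracy {acc_pca:.3f}")
print(f"variance retained by PCA: {retained:.3f}")
```

Timings on a dataset this small are noisy, so any speed-up seen here is only indicative. Lowering the variance threshold shrinks the feature count further and typically costs some accuracy, which is the trade-off the explained variance ratio helps you reason about.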

This project is designed for learning purposes and is not a complete, production-ready application or solution.

book resources

When you start your liveProject, you get full access to the following books for 90 days.

project author

Nicole Königstein

Nicole Königstein currently works as data science and technology lead at impactvise, an ESG analytics company, and as a quantitative researcher and technology lead at Quantmate, an innovative FinTech startup that leverages alternative data as part of its predictive modeling strategy. She’s a regular speaker, sharing her expertise at conferences such as ODSC Europe. In addition, she teaches Python, machine learning, and deep learning, and holds workshops at conferences including the Women in Tech Global Conference.

prerequisites

This liveProject is for ML engineers, intermediate-level Python programmers, and early-stage data scientists who want to gain an understanding of the mathematical foundations of PCA and how they can use this simple yet powerful algorithm in their own projects. To begin this liveProject you’ll need to be familiar with the following:

TOOLS
  • Intermediate Python (declaring variables, loops, branches, working with arrays)
  • How to use Jupyter Notebook
  • Understanding of vectors and matrices
  • Basic familiarity with NumPy (indexing arrays, array creation, and manipulation)
  • Basic familiarity with scikit-learn (how to import and use classes from modules such as sklearn.decomposition)
TECHNIQUES
  • Basic linear algebra
  • Basic statistics
  • Basic data science

you will learn

In this liveProject, you’ll learn to improve the runtime of ML models by using principal component analysis (PCA) to reduce the dimensionality of your data.

  • Understand the fundamental linear algebra techniques used to compute PCA
  • Use NumPy to translate your newly gained mathematical knowledge into code
  • Apply scikit-learn’s PCA library and learn about its nuances
  • Understand the benefits of dimensionality reduction and the trade-offs between speed and accuracy
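
As a rough illustration of the first three points, PCA can be computed by hand with NumPy and then checked against scikit-learn. The sketch below is one common formulation (eigendecomposition of the covariance matrix), not necessarily the approach the project takes; the function name `pca_numpy` and the synthetic data are assumptions made for the example.

```python
import numpy as np
from sklearn.decomposition import PCA


def pca_numpy(X, n_components):
    """Hypothetical helper: PCA via eigendecomposition of the covariance matrix."""
    X_centered = X - X.mean(axis=0)          # PCA works on mean-centered data
    cov = np.cov(X_centered, rowvar=False)   # feature-by-feature covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh returns eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]  # indices of the largest eigenvalues
    components = eigvecs[:, order].T         # one principal axis per row
    explained_ratio = eigvals[order] / eigvals.sum()
    projected = X_centered @ components.T    # coordinates in the reduced space
    return projected, components, explained_ratio


# Assumption: synthetic correlated data, used only to exercise the function.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 5))

projected, components, ratio = pca_numpy(X, n_components=2)
sk = PCA(n_components=2, svd_solver="full").fit(X)

# The explained variance ratios agree with scikit-learn's.
print(np.allclose(ratio, sk.explained_variance_ratio_))
# The principal axes agree up to sign: an eigenvector and its negation
# describe the same direction, so compare absolute values.
print(np.allclose(np.abs(components), np.abs(sk.components_)))
```

The sign ambiguity is one of the nuances worth knowing about: two correct PCA implementations can return components that point in opposite directions, so element-wise comparisons need to account for it.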

features

Self-paced
You choose the schedule and decide how much time to invest as you build your project.
Project roadmap
Each project is divided into several achievable steps.
Get Help
While within the liveProject platform, get help from other participants and our expert mentors.
Compare with others
For each step, compare your deliverable to the solutions by the author and other participants.
book resources
Get full access to select books for 90 days. Permanent access to excerpts from Manning products is also included, as well as references to other resources.
