Methods for Multidimensional Datasets

define functions and classes • use libraries • read documentation
skills learned
PCA method • Mahalanobis method • evaluate algorithms using correlated synthetic anomalies

Preventing operation failures and interruptions is mission-critical at Sigma Corp. The large conglomerate of energy production companies has recently implemented a z-score anomaly detection algorithm that focuses on a single feature. Now that the algorithm has proved its value, members of Sigma have requested additional algorithms that are just as simple to use, but that can handle multidimensional data. As a lead data scientist at Sigma, you’ll implement the Mahalanobis distance (MD) method and the principal component analysis (PCA) method as you build anomaly detection algorithms for multidimensional data. To gauge the performance of your algorithms, you’ll test them against a benchmark dataset as well as synthetic anomalies generated by your own algorithms. When you’re done, you’ll have firsthand experience building anomaly detection algorithms for multidimensional datasets as well as testing anomaly detection algorithms against both benchmark datasets and synthetic anomalies.
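As a rough illustration of the two methods named above, the following minimal sketch scores points by Mahalanobis distance and by PCA reconstruction error using only NumPy. It is not the project's reference solution: the function names (`mahalanobis_scores`, `fit_pca`, `pca_reconstruction_scores`), the synthetic two-feature data, and the choice of reconstruction error as the PCA anomaly score are all assumptions made for this example.

```python
import numpy as np


def mahalanobis_scores(X, mean, cov):
    """Return the Mahalanobis distance of each row of X from `mean`."""
    diff = X - mean
    # Solve cov @ z = diff.T instead of forming the explicit inverse.
    z = np.linalg.solve(cov, diff.T).T
    return np.sqrt(np.einsum("ij,ij->i", diff, z))


def fit_pca(X, n_components):
    """Return the sample mean and the top principal directions (as rows)."""
    mean = X.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]


def pca_reconstruction_scores(X, mean, components):
    """Return each row's distance from its projection onto `components`."""
    diff = X - mean
    projected = diff @ components.T @ components
    return np.linalg.norm(diff - projected, axis=1)


# Hypothetical "normal" data: two strongly correlated features.
rng = np.random.default_rng(0)
cov_true = np.array([[1.0, 0.9], [0.9, 1.0]])
X_train = rng.multivariate_normal([0.0, 0.0], cov_true, size=500)

mean = X_train.mean(axis=0)
cov = np.cov(X_train, rowvar=False)
pca_mean, components = fit_pca(X_train, n_components=1)

# Two points with the same Euclidean norm: the first follows the
# correlation between the features, the second breaks it.
points = np.array([[2.0, 2.0], [2.0, -2.0]])
md = mahalanobis_scores(points, mean, cov)
pca = pca_reconstruction_scores(points, pca_mean, components)

print("Mahalanobis distances:", md)
print("PCA reconstruction errors:", pca)
```

Both scores rank the second point as more anomalous than the first, even though a per-feature z-score would treat the two points identically. That is the motivation for moving from a single-feature method to a multidimensional one. In practice a threshold on either score would still have to be chosen and validated against labeled or synthetic anomalies.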

This project is designed for learning purposes and is not a complete, production-ready application or solution.

project author

Sergio Solórzano

Sergio Solórzano holds a PhD in physics from ETH Zürich, where he specialized in computational physics and published various papers on numerical algorithms for physical simulation and analysis. Currently, he’s a senior researcher and developer at Exeon Analytics, developing systems for anomaly detection in cybersecurity.


This liveProject is for beginner data scientists interested in learning to build multidimensional anomaly detection algorithms. To begin this liveProject, you’ll need to be familiar with the following:

  • Basic Python
  • Basic NumPy
  • Basic Matplotlib (or Seaborn or Bokeh)
  • Basic data science
  • Basic testing


You choose the schedule and decide how much time to invest as you build your project.
Project roadmap
Each project is divided into several achievable steps.
Get help
While within the liveProject platform, get help from other participants and our expert mentors.
Compare with others
For each step, compare your deliverable to the solutions by the author and other participants.
Book resources
Get full access to select books for 90 days. Permanent access to excerpts from Manning products is also included, as well as references to other resources.
