Anomaly Detection

Methods for Multidimensional Datasets

This project is part of the liveProject series Three Anomaly Detection Methods
prerequisites
define functions and classes • use libraries • read documentation
skills learned
PCA method • Mahalanobis method • evaluate algorithms using correlated synthetic anomalies

Preventing operation failures and interruptions is mission-critical at Sigma Corp. The large conglomerate of energy production companies has recently implemented a z-score anomaly detection algorithm that focuses on a single feature. Now that the algorithm has proved its value, members of Sigma have requested additional algorithms that are just as simple to use, but that can handle multidimensional data. As a lead data scientist at Sigma, you’ll implement the Mahalanobis distance (MD) method and the principal component analysis (PCA) method as you build anomaly detection algorithms for multidimensional data. To gauge the performance of your algorithms, you’ll test them against a benchmark dataset as well as synthetic anomalies generated by your own algorithms. When you’re done, you’ll have firsthand experience building anomaly detection algorithms for multidimensional datasets as well as testing anomaly detection algorithms against both benchmark datasets and synthetic anomalies.
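To give a feel for why a single-feature z-score is not enough for multidimensional data, here is a minimal sketch of the Mahalanobis distance idea using NumPy. It is not the project's solution: the data, the function name `mahalanobis_distances`, and the cutoff of 3.0 are all assumptions made for illustration. The synthetic anomaly is chosen so that each coordinate looks ordinary on its own, while the pair breaks the correlation between the two features.

```python
import numpy as np

def mahalanobis_distances(X, mean, cov):
    """Mahalanobis distance of each row of X from `mean` under covariance `cov`."""
    diff = X - mean
    # Solve cov @ z = diff.T rather than forming the inverse explicitly.
    solved = np.linalg.solve(cov, diff.T).T
    return np.sqrt(np.einsum("ij,ij->i", diff, solved))

rng = np.random.default_rng(0)

# Hypothetical "normal" data: two strongly correlated features.
true_cov = np.array([[1.0, 0.8], [0.8, 1.0]])
normal = rng.multivariate_normal([0.0, 0.0], true_cov, size=1000)

# Estimate the mean vector and covariance matrix from the data.
mean = normal.mean(axis=0)
cov = np.cov(normal, rowvar=False)

# Both points sit 1.5 standard deviations out on each axis, so a
# per-feature z-score with a cutoff of 3 would flag neither of them.
typical = np.array([[1.5, 1.5]])    # follows the correlation
anomaly = np.array([[1.5, -1.5]])   # contradicts the correlation

threshold = 3.0  # assumed cutoff, chosen only for this illustration
d_typical = mahalanobis_distances(typical, mean, cov)[0]
d_anomaly = mahalanobis_distances(anomaly, mean, cov)[0]

print(f"typical point: {d_typical:.2f}, flagged: {d_typical > threshold}")
print(f"anomaly point: {d_anomaly:.2f}, flagged: {d_anomaly > threshold}")
```

Because the distance is scaled by the covariance matrix, the point that moves against the correlation ends up much farther from the center than the point that moves along it, even though both have the same per-feature z-scores.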

This project is designed for learning purposes and is not a complete, production-ready application or solution.

project author

Sergio Solórzano

Sergio Solórzano holds a PhD in physics from ETH Zürich, where he specialized in computational physics and published various papers on numerical algorithms for physical simulation and analysis. Currently, he’s a senior researcher and developer at Exeon Analytics, developing systems for anomaly detection in cybersecurity.

prerequisites

This liveProject is for beginner data scientists interested in learning to build multidimensional anomaly detection algorithms. To begin this liveProject, you’ll need to be familiar with the following:

TOOLS
  • Basic Python
  • Basic NumPy
  • Basic Matplotlib (or Seaborn or Bokeh)
  • Basic data science
TECHNIQUES
  • Basic testing

you will learn

In this liveProject, you’ll learn to apply the MD and PCA methods to build algorithms for multidimensional anomaly detection.

  • Assemble a working anomaly detection algorithm for multidimensional data using the Mahalanobis distance (MD) method
  • Apply the principal component analysis (PCA) method to find anomalies
  • Understand concepts including covariance matrix, principal values, and principal components
  • Test your algorithms against non-trivial multidimensional synthetic anomalies
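The concepts in the list above can be sketched in a few lines of NumPy. The example below is an illustration under stated assumptions, not the project's reference implementation: it treats "principal values" as the eigenvalues of the covariance matrix, scores points by their reconstruction error after projecting onto the leading principal components, and uses hypothetical names (`fit_pca`, `reconstruction_error`), made-up data, and an arbitrary cutoff.

```python
import numpy as np

def fit_pca(X, n_components):
    """Return the mean, all principal values (descending), and the
    leading principal components (as columns) of X."""
    mean = X.mean(axis=0)
    cov = np.cov(X - mean, rowvar=False)
    # eigh suits symmetric matrices; it returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]
    return mean, eigvals[order], eigvecs[:, order[:n_components]]

def reconstruction_error(X, mean, components):
    """Distance between each row of X and its projection onto the PCA subspace."""
    centered = X - mean
    projected = centered @ components @ components.T
    return np.linalg.norm(centered - projected, axis=1)

rng = np.random.default_rng(1)

# Hypothetical data: three features lying close to a plane (x3 = x1 + x2).
latent = rng.normal(size=(500, 2))
mixing = np.array([[1.0, 0.0, 1.0],
                   [0.0, 1.0, 1.0]])
normal = latent @ mixing + 0.05 * rng.normal(size=(500, 3))

mean, principal_values, components = fit_pca(normal, n_components=2)

# Assumed cutoff: mean error plus three standard deviations on the normal data.
errors = reconstruction_error(normal, mean, components)
threshold = errors.mean() + 3 * errors.std()

# Synthetic anomaly: x1 and x2 are ordinary, but x3 is far from x1 + x2.
anomaly = np.array([[1.0, 1.0, -2.0]])
anomaly_error = reconstruction_error(anomaly, mean, components)[0]

print("principal values:", np.round(principal_values, 3))
print(f"threshold: {threshold:.3f}, anomaly error: {anomaly_error:.3f}")
```

The smallest principal value is close to zero here because the data barely varies in the direction perpendicular to the plane, which is exactly the direction in which the synthetic anomaly stands out.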

features

Self-paced
You choose the schedule and decide how much time to invest as you build your project.
Project roadmap
Each project is divided into several achievable steps.
Get Help
While within the liveProject platform, get help from other participants and our expert mentors.
Compare with others
For each step, compare your deliverable to the solutions by the author and other participants.
book resources
Get full access to select books for 90 days. Permanent access to excerpts from Manning products is also included, as well as references to other resources.
