Math for Machine Learning

Analyze Reports with Hugging Face

This project is part of the liveProject series Math for Machine Learning
prerequisites
intermediate Python (particularly NumPy and pandas) • matrix multiplication • derivatives and chain rule
skills learned
understand the attention mechanism (Transformers) • build a model pipeline with Hugging Face • extract and prepare data from PDF files
Nicole Königstein
1 week · 6-8 hours per week · INTERMEDIATE

liveProjects give you the opportunity to learn new skills by completing real-world challenges in your local development environment. Solve practical problems, write working code, and analyze real data. With liveProject, you learn by doing. These self-paced projects also come with full liveBook access to select books for 90 days plus permanent access to other select Manning products.

You’re a data scientist at Finative, an environmental, social, and governance (ESG) analytics company that analyzes a high volume of data using advanced natural language processing (NLP) techniques in order to provide its clients insights for sustainable investing. Recently, your CEO has decided that Finative should increase its own financial sustainability. Your task is to classify sustainability reports of a publicly traded company in an efficient and sustainable way.

You’ll learn the fundamental mathematics—including backpropagation, matrix multiplication, and attention mechanisms—of Transformers, empowering you to optimize your model’s performance, improve its efficiency, and handle undesirable model predictions. You’ll use Python’s pdfplumber library to extract text from a sustainability report for quick delivery to your CEO. To further increase efficiency, you’ll save training time by using a language model that’s been pre-trained with ESG data to build a pipeline for the model and classify the sustainability report.
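The extraction and classification workflow described above might look roughly like the following sketch. It assumes the third-party pdfplumber and transformers packages are installed. The helper names, the 200-word chunk size, and the `model_name` argument are hypothetical choices for illustration; the project description does not name the ESG model it uses, so none is filled in here.

```python
from typing import List


def extract_pdf_text(pdf_path: str) -> str:
    """Concatenate the text of every page in a PDF file."""
    import pdfplumber  # imported lazily so the chunking helper works without it

    with pdfplumber.open(pdf_path) as pdf:
        # extract_text() can return None for pages without a text layer
        pages = [page.extract_text() or "" for page in pdf.pages]
    return "\n".join(pages)


def split_into_chunks(text: str, max_words: int = 200) -> List[str]:
    """Split text into chunks of at most max_words words each.

    The chunk size is an assumption: Transformer models accept a limited
    number of tokens, so long reports are classified piece by piece.
    """
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]


def classify_report(pdf_path: str, model_name: str):
    """Classify each chunk of a report with a pre-trained model.

    model_name is a placeholder for whichever ESG-tuned text-classification
    checkpoint you choose on the Hugging Face Hub.
    """
    from transformers import pipeline

    classifier = pipeline("text-classification", model=model_name)
    chunks = split_into_chunks(extract_pdf_text(pdf_path))
    # truncation=True guards against chunks that still exceed the model limit
    return classifier(chunks, truncation=True)
```

Calling `classify_report` with a report path and a model identifier should return one dictionary per chunk, each with a `label` and a `score`. How those per-chunk labels are aggregated into a report-level result is left open here.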

This project is designed for learning purposes and is not a complete, production-ready application or solution.

project author

Nicole Königstein

Nicole Königstein currently works as data science and technology lead at impactvise, an ESG analytics company, and as a quantitative researcher and technology lead at Quantmate, an innovative FinTech startup that leverages alternative data as part of its predictive modeling strategy. She’s a regular speaker, sharing her expertise at conferences such as ODSC Europe. In addition, she teaches Python, machine learning, and deep learning, and holds workshops at conferences including the Women in Tech Global Conference.


This liveProject is for ML engineers, intermediate-level Python programmers, and early-stage data scientists who are familiar with the basics of linear algebra. To begin this liveProject, you’ll need to be familiar with the following:

  • Intermediate Python (declaring variables, loops, branches, working with arrays)
  • How to use Jupyter Notebook
  • Understanding of vectors, matrices, and derivatives
  • Basic familiarity with NumPy (indexing arrays, array creation, and manipulation)
  • Basic familiarity with scikit-learn (how to import and use modules such as sklearn.decomposition)
  • Basic linear algebra
  • Basic calculus
  • Basic data science

you will learn

In this liveProject, you’ll learn the fundamental mathematics of backpropagation, and you’ll get an understanding of the inner workings of a Transformer-based deep learning network in order to effectively and efficiently classify documents.

  • Write NumPy expressions to explain the attention mechanism
  • Easily extract information from PDF files using pdfplumber and prepare it for the model
  • Use a pre-trained large language model to build a pipeline and classify documents
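
The first item above can be illustrated with a minimal NumPy sketch of scaled dot-product attention, the standard formulation used in Transformers. The function names, matrix shapes, and random inputs are assumptions chosen for illustration, not material taken from the project itself.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - np.max(x, axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=axis, keepdims=True)


def scaled_dot_product_attention(Q, K, V):
    """Compute softmax(Q K^T / sqrt(d_k)) V and return it with the weights.

    Q has shape (n_queries, d_k), K has shape (n_keys, d_k),
    and V has shape (n_keys, d_v).
    """
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)      # similarity of each query to each key
    weights = softmax(scores, axis=-1)   # each row sums to 1
    return weights @ V, weights          # weighted average of the value rows


# Toy inputs: 3 queries, 5 keys/values, key dimension 4, value dimension 2
rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 4))
K = rng.normal(size=(5, 4))
V = rng.normal(size=(5, 2))

output, weights = scaled_dot_product_attention(Q, K, V)
print(output.shape)  # (3, 2)
```

Each row of `weights` is a probability distribution over the keys, so every output row is a weighted average of the rows of `V`. Dividing by the square root of `d_k` keeps the scores from growing with the key dimension, which would otherwise push the softmax toward near one-hot weights.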


You choose the schedule and decide how much time to invest as you build your project.
Project roadmap
Each project is divided into several achievable steps.
Get Help
While within the liveProject platform, get help from other participants and our expert mentors.
Compare with others
For each step, compare your deliverable to the solutions by the author and other participants.
book resources
Get full access to select books for 90 days. Permanent access to excerpts from Manning products is also included, as well as references to other resources.