Detect Sentiment with Transformers

This project is part of the liveProject series Math for Machine Learning
prerequisites
intermediate Python (particularly pandas) • random variables from probability • experiments and events from probability
skills learned
conditional probability • fine-tuning a large language model • hyperparameter optimization • monitoring training experiments
Nicole Königstein
1 week · 6-8 hours per week · INTERMEDIATE

Finative, the environmental, social, and governance (ESG) analytics company you work for, analyzes a high volume of data using advanced natural language processing (NLP) techniques to provide its clients with valuable insights about their sustainability. Your CEO has concerns that some of the companies Finative analyzes may be greenwashing: spreading disinformation about their sustainability in order to appear more environmentally conscious than they actually are.

As a data scientist for Finative, your task is to validate your sustainability reports by creating and analyzing them. You’ll compute conditional probability with Bayes’ Theorem, by hand, to better understand your model’s performance through metrics such as recall and precision. You’ll learn an efficient way to prepare your data from different sources and merge it into one dataset, which you’ll use to prepare tweets. To successfully classify the tweets, you’ll use a pre-trained large language model and fine-tune it using the Hugging Face ecosystem as well as hyperopt and Ray Tune. You’ll use TensorBoard and Weights & Biases to analyze and track your experiments, and you’ll analyze the tweets to determine whether enough negative sentiment exists to indicate that the company you analyzed has been greenwashing its data.
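As a rough illustration of the by-hand Bayes' Theorem calculation mentioned above, the following sketch shows how precision can be derived from recall (sensitivity), specificity, and the prior rate of the positive class. The function name `posterior` and all numeric values are hypothetical and chosen only for illustration; they do not come from the project's data.

```python
def posterior(prior, sensitivity, specificity):
    """Return P(actually positive | predicted positive) using Bayes' Theorem.

    prior       -- P(positive), the base rate of the positive class
    sensitivity -- P(predicted positive | positive), also called recall
    specificity -- P(predicted negative | negative)
    """
    # Law of total probability: P(predicted positive)
    evidence = sensitivity * prior + (1.0 - specificity) * (1.0 - prior)
    # Bayes' Theorem: P(positive | predicted positive)
    return sensitivity * prior / evidence


# Hypothetical values: 20% of tweets are negative in sentiment (the class of
# interest), the classifier has 90% recall and 80% specificity.
precision = posterior(prior=0.2, sensitivity=0.9, specificity=0.8)
print(f"precision = {precision:.4f}")
```

Under these assumed numbers, only about half of the tweets flagged as negative would actually be negative, even though recall is high. This is the kind of insight the conditional-probability view of precision and recall is meant to provide.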

This project is designed for learning purposes and is not a complete, production-ready application or solution.

book resources

When you start your liveProject, you get full access to select books for 90 days.

project author

Nicole Königstein

Nicole Königstein currently works as data science and technology lead at impactvise, an ESG analytics company, and as a quantitative researcher and technology lead at Quantmate, an innovative FinTech startup that leverages alternative data as part of its predictive modeling strategy. She’s a regular speaker, sharing her expertise at conferences such as ODSC Europe. In addition, she teaches Python, machine learning, and deep learning, and holds workshops at conferences including the Women in Tech Global Conference.

prerequisites

This liveProject is for ML engineers, intermediate-level Python programmers, and early-stage data scientists who are familiar with the basics of probability. To begin this liveProject, you’ll need to be familiar with the following:

TOOLS
  • Intermediate Python (declaring variables, loops, branches, working with arrays)
  • How to use Jupyter Notebooks and Google Colab
  • Basic familiarity with NumPy (indexing arrays, array creation, and manipulation)
  • Basic familiarity with pandas (how to create and manipulate DataFrames)
TECHNIQUES
  • Probability basics
  • Data science basics

you will learn

In this liveProject, you’ll learn the mathematical principles behind preparing a dataset, fine-tuning a model to classify it, and analyzing the results for media sentiment.

  • Understand conditional probability and Bayes' Theorem
  • Grasp the concepts of sensitivity and specificity
  • Interpret a confusion matrix
  • Optimize the hyperparameters of a Hugging Face Transformer model using Ray Tune and hyperopt
  • Use Weights & Biases (WandB) to track experiment metrics
  • Track a model's performance with TensorBoard
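To make the confusion-matrix concepts in the list above concrete, here is a minimal sketch that counts the four cells of a binary confusion matrix and derives sensitivity, specificity, and precision from them. The helper names and the label lists are hypothetical, and the project itself may well use a library for this instead.

```python
from collections import Counter


def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary classifier."""
    counts = Counter(tp=0, fp=0, tn=0, fn=0)
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            counts["tp" if truth == positive else "fp"] += 1
        else:
            counts["fn" if truth == positive else "tn"] += 1
    return counts


def sensitivity(c):
    # Share of actual positives that the model found (recall).
    return c["tp"] / (c["tp"] + c["fn"])


def specificity(c):
    # Share of actual negatives that the model correctly rejected.
    return c["tn"] / (c["tn"] + c["fp"])


def precision(c):
    # Share of predicted positives that are actually positive.
    return c["tp"] / (c["tp"] + c["fp"])


# Hypothetical labels: 1 = negative-sentiment tweet, 0 = anything else.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]

counts = confusion_counts(y_true, y_pred)
print(dict(counts))
print(sensitivity(counts), specificity(counts), precision(counts))
```

Reading the four counts side by side shows which kind of error dominates: false negatives lower sensitivity, while false positives lower both specificity and precision.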

features

Self-paced
You choose the schedule and decide how much time to invest as you build your project.
Project roadmap
Each project is divided into several achievable steps.
Get Help
While within the liveProject platform, get help from other participants and our expert mentors.
Compare with others
For each step, compare your deliverable to the solutions by the author and other participants.
book resources
Get full access to select books for 90 days. Permanent access to excerpts from Manning products is also included, as well as references to other resources.
