In this liveProject, you’ll use the ALBERT variation of the BERT Transformer to detect occurrences of hate speech in a data set. The ALBERT model uses fewer parameters than BERT, making it more suitable to the unstructured and slang-heavy text of social media. You’ll load this powerful pretrained model using the Hugging Face library and fine-tune it for your specific needs with PyTorch Lightning. Because falsely flagging posts as hate speech can be a serious problem, the success of your model will depend on calculating and optimizing its precision score. Your final product will run as a notebook on a GPU in the Google Colab environment.
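Precision, the metric mentioned above, measures how much of what the model flags as hate speech really is hate speech. As a minimal sketch in plain Python (the function name and toy labels here are illustrative assumptions, not part of the project's code):

```python
# Precision = true positives / (true positives + false positives):
# of everything the model flagged as positive, how much really was.
def precision_score(y_true, y_pred, positive=1):
    tp = sum(1 for t, p in zip(y_true, y_pred) if p == positive and t == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if p == positive and t != positive)
    return tp / (tp + fp) if (tp + fp) else 0.0

# Toy labels: 1 = hate speech, 0 = benign (illustrative only).
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0]
print(precision_score(y_true, y_pred))  # 2 correct out of 3 flagged -> 0.666...
```

A high precision score means few benign posts are falsely tagged, which is exactly the failure mode this project optimizes against.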
This project is designed for learning purposes and is not a complete, production-ready application or solution.
This liveProject is for intermediate Python and NLP practitioners who are interested in implementing pretrained BERT architectures and customizing them to solve real-world NLP problems. To begin this liveProject, you will need to be familiar with:
- Intermediate Python
- Intermediate PyTorch
- Basics of Google Colab
- Basics of machine learning
- Basics of neural networks
- Basics of natural language processing
You will learn
In this liveProject, you will develop hands-on experience in building a text classifier using PyTorch Lightning and Hugging Face. You’ll also get practical experience working on GPUs in the Google Colab environment.
- Working with Jupyter Notebook on Google Colab
- Loading and preprocessing a text data set
- Tokenizing data using pretrained tokenizers
- Creating dataloaders and tensor data sets
- Loading and configuring a pretrained ALBERT model using Hugging Face
- Building and training a text classifier using PyTorch Lightning
- Evaluating the model’s performance by calculating and optimizing its precision score
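The dataloader and tensor data set step from the list above can be sketched with plain PyTorch. The tensor shapes, vocabulary size, and batch size below are assumptions for illustration; in the project itself, the input IDs and attention masks would come from a pretrained ALBERT tokenizer rather than random values.

```python
import torch
from torch.utils.data import TensorDataset, DataLoader

# Stand-ins for tokenizer output: 8 examples, sequence length 16.
# (Real input_ids/attention_mask would come from an ALBERT tokenizer.)
input_ids = torch.randint(0, 30000, (8, 16))
attention_mask = torch.ones(8, 16, dtype=torch.long)
labels = torch.randint(0, 2, (8,))  # 1 = hate speech, 0 = benign

# Bundle the tensors so each index yields one aligned example,
# then batch and shuffle them for training.
dataset = TensorDataset(input_ids, attention_mask, labels)
loader = DataLoader(dataset, batch_size=4, shuffle=True)

for batch_ids, batch_mask, batch_labels in loader:
    print(batch_ids.shape)  # each batch holds 4 sequences of length 16
```

Wrapping the tokenized tensors in a `TensorDataset` keeps inputs, masks, and labels aligned, and the `DataLoader` handles batching and shuffling, which is the shape of data a PyTorch Lightning training loop expects.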