In this liveProject, you’ll step into the role of a Natural Language Processing Specialist working in the Growth Hacking Team of a new video game startup. Your team wants to massively accelerate your company’s early growth by acquiring huge numbers of customers at the lowest possible cost. To help tailor marketing messages, your boss has asked you to map the market and find out how customers evaluate your competitors’ products. Your challenge is to create a sentiment analyzer that will give a deeper understanding of customer feedback and opinions. To do this, you’ll need to download and create a dataset from Amazon reviews, build an algorithm that will determine whether a review is positive or negative, evaluate your analyzer's performance against star ratings, and lay out your findings in a report for your manager.
This project is designed for learning purposes and is not a complete, production-ready application or solution.
This liveProject is for intermediate Python programmers who are familiar with data science. You will need to know the basics of statistics and machine learning. Prior experience with NLP, neural networks, and PyTorch will be useful, but is not essential. You’ll use the Google Colaboratory (Colab) environment for this project to access a free cloud-based GPU. To get the most out of the project, you should be familiar with:
- Python standard library
- Basics of pandas, min. version 1.1.5
- Basics of Jupyter Notebook
- Basics of Colab
- Basics of scikit-learn, min. version 1.0.1
- Basics of machine learning
- Basics of neural networks
What you will learn
In this liveProject, you’ll learn the foundational techniques of an NLP Specialist using the Python data ecosystem. The sentiment analysis skills you’ll learn are all easily transferable to other common NLP projects.
- Creating a data corpus from text reviews
- Sampling from imbalanced data
- Finding sentiment value using NLTK and dictionary-based sentiment analysis tools
- Data evaluation with scikit-learn
- Analyzing reviews using PyTorch and deep learning
- Comparing classifier performance
- Using Transformer-based language models
- Visualizing findings and presenting a formal report
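
To give a feel for the core idea, here is a minimal sketch of dictionary-based sentiment analysis checked against star ratings. It is an illustration only, not the project's solution: the tiny `TOY_LEXICON`, the sample reviews, the helper names, and the rule that 4 or more stars counts as positive are all assumptions made up for this example. In the project itself you would work with real Amazon reviews and full-size tools such as NLTK and scikit-learn.

```python
# Hypothetical toy lexicon: word -> sentiment value.
# A real analyzer would rely on a full lexicon rather than six words.
TOY_LEXICON = {
    "great": 1, "love": 1, "fun": 1,
    "boring": -1, "broken": -1, "waste": -1,
}


def lexicon_score(text):
    """Sum the lexicon values of the words in a review (unknown words count 0)."""
    words = (w.strip(".,!?").lower() for w in text.split())
    return sum(TOY_LEXICON.get(w, 0) for w in words)


def predict_label(text):
    """Label a review 'positive' if its score is above zero, else 'negative'."""
    return "positive" if lexicon_score(text) > 0 else "negative"


def star_label(stars):
    """Assumed mapping from star rating to label: 4+ stars is positive."""
    return "positive" if stars >= 4 else "negative"


# Made-up (review text, star rating) pairs, for illustration only.
reviews = [
    ("Great game, I love it!", 5),
    ("Boring and broken. A waste of money.", 1),
    ("Fun for an hour, then boring, boring, boring.", 2),
    ("Not what I expected, but it grew on me.", 4),
]

predictions = [predict_label(text) for text, _ in reviews]
gold = [star_label(stars) for _, stars in reviews]

# Share of reviews where the lexicon-based label agrees with the star rating.
accuracy = sum(p == g for p, g in zip(predictions, gold)) / len(reviews)
print(f"Agreement with star ratings: {accuracy:.2f}")
```

The last review contains no lexicon words, so the sketch labels it negative even though it has 4 stars. Gaps like this are one reason the project goes on to compare dictionary-based tools with learned classifiers.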