Text summarization is a powerful data science technique. It allows vital high-level information to be automatically extracted from reams of text data, without any slow and expensive human analysis.
In this liveProject, you’ll master text summarization techniques for summarizing news data by building a tool that can extract key information about COVID-19 from news articles using the PyTorch deep learning library. Your challenges will include cleaning your data set, building an attention-based deep learning model, and evaluating the success of your model using TensorBoard and ROUGE scores.
This project is designed for learning purposes and is not a complete, production-ready application or solution.
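To give a feel for the ROUGE evaluation mentioned above, here is a minimal sketch of ROUGE-1 (unigram overlap) in plain Python. It is an illustration under simplifying assumptions, not the project's evaluation code: the function name `rouge_1` is hypothetical, tokenization is plain lowercase whitespace splitting, and no stemming is applied. Dedicated ROUGE packages handle these details differently.

```python
from collections import Counter


def rouge_1(reference: str, candidate: str) -> dict:
    """Unigram-overlap precision, recall, and F1 on whitespace tokens."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Clipped overlap: a word counts only as often as it appears in both texts.
    overlap = sum((ref_counts & cand_counts).values())
    precision = overlap / max(sum(cand_counts.values()), 1)
    recall = overlap / max(sum(ref_counts.values()), 1)
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


# Made-up example sentences, for illustration only.
scores = rouge_1(
    reference="new covid-19 cases fell sharply this week",
    candidate="covid-19 cases fell this week",
)
print(scores)
```

Here every word of the candidate summary appears in the reference, so precision is 1.0, while recall is lower because the candidate misses two reference words.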
book resources
When you start your liveProject, you get full access to the following books for 90 days.
project author
Nahid Alam
Nahid Alam is a machine learning engineer based in San Francisco, with over four years of experience building machine learning applications and AutoML systems that can train and build machine learning models in a distributed way. Nahid's experience ranges from startups and Fortune 500 companies to venture capital and founding startups. She has a master's degree in computer science from Clemson University.
prerequisites
The liveProject is for intermediate Python programmers who know the basics of data science. To begin this liveProject, you will need to be familiar with:
TOOLS
- Intermediate Python
- Basics of pandas
- Basics of NumPy
- Basics of scikit-learn
- Basics of Jupyter Notebook
- Basics of PyTorch
TECHNIQUES
- Basics of machine learning
- Basics of sequence-to-sequence modeling
- Basics of deep learning concepts
you will learn
In this liveProject, you’ll master PyTorch-based text summarization, a useful and easily transferable data science task. Text summarization has many real-world applications, such as legal document summarization, classified document analysis, spam detection, and more.
- Sequence to sequence models with attention techniques
- Text data cleanup techniques
- Feature generation techniques
- Model building using PyTorch
- Model debugging with TensorBoard
- Analyzing model performance tradeoffs
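As a rough illustration of the text data cleanup item in the list above, the following sketch normalizes a scraped news snippet using only the Python standard library. The function name `clean_article` and the specific cleanup steps are assumptions chosen for illustration; the project's actual cleaning pipeline may differ.

```python
import html
import re


def clean_article(text: str) -> str:
    """Strip common web debris from a news snippet and normalize it."""
    text = html.unescape(text)                 # decode entities such as &amp;
    text = re.sub(r"<[^>]+>", " ", text)       # drop HTML tags
    text = re.sub(r"https?://\S+", " ", text)  # drop URLs
    text = re.sub(r"\s+", " ", text)           # collapse runs of whitespace
    return text.strip().lower()


# Made-up input, for illustration only.
raw = "<p>COVID-19 cases &amp; deaths fell.  Read more: https://example.com/story</p>"
cleaned = clean_article(raw)
print(cleaned)
```

Lowercasing reduces vocabulary size but also discards casing cues, so whether to apply it is one of the tradeoffs worth testing against your model's results.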
features
- Self-paced: You choose the schedule and decide how much time to invest as you build your project.
- Project roadmap: Each project is divided into several achievable steps.
- Get help: While within the liveProject platform, get help from other participants and our expert mentors.
- Compare with others: For each step, compare your deliverable to the solutions by the author and other participants.
- Book resources: Get full access to select books for 90 days. Permanent access to excerpts from Manning products is also included, as well as references to other resources.