In this liveProject, you’ll apply pretrained transfer learning models to improve your search engine’s understanding of context. You’ll use BERT (Bidirectional Encoder Representations from Transformers) to create a semantic search engine. BERT excels at many traditional NLP tasks, such as search, summarization, and question answering. It will allow your search engine to find documents whose terms are contextually related to what your user is searching for, rather than relying on exact word matches, as the sketch below illustrates.
This project is designed for learning purposes and is not a complete, production-ready application or solution.
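To make the idea concrete, here is a minimal sketch of context-aware matching. It assumes the sentence-transformers package and the all-MiniLM-L6-v2 model, both illustrative choices rather than the project’s actual setup:

```python
# Minimal sketch: ranking documents by meaning rather than exact word overlap.
# Assumes the sentence-transformers package; the model name is an illustrative
# choice, not necessarily the one used in this liveProject.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

docs = [
    "The physician prescribed a new medication.",
    "The stock market closed higher today.",
]
query = "What did the doctor recommend?"

# Encode the query and documents into dense vectors that capture context.
doc_vecs = model.encode(docs, convert_to_tensor=True)
query_vec = model.encode(query, convert_to_tensor=True)

# Cosine similarity ranks the medical sentence first even though it
# shares no significant words with the query.
print(util.cos_sim(query_vec, doc_vecs))
```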
prerequisites
This liveProject is for intermediate Python programmers familiar with basic string, list, and dictionary manipulation. To begin this liveProject, you will need to be familiar with:
TOOLS
- Intermediate Python
- Basic understanding of conda environments
- Basic PyTorch
TECHNIQUES
- Reading data from and writing to JSON files
- Manipulating tuples, lists, and dictionaries with loops and comprehensions
- Tokenizing, lemmatizing, and cleaning text data for natural language processing
- Basic NumPy array operations
- Basic understanding of tensor operations and machine learning with PyTorch
you will learn
In this liveProject, you will apply multiple libraries to create a semantic search engine with BERT. BERT has revolutionized NLP because it lets users find documents by meaning and context rather than by exact wording alone.
- Vectorize documents with DistilBERT
- Create an inverted index with the Faiss library
- Encode and search documents with the sentence-transformers library (a rough sketch of these three steps follows this list)
- Fine-tune pretrained BERT models for factual question answering (a minimal example also appears below)
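As a rough sketch of the vectorize–index–search pipeline, the snippet below encodes a handful of documents and searches them through a Faiss inverted-file (IVF) index. The model name, cluster count, and tiny corpus are illustrative assumptions, not the project’s actual configuration:

```python
# Rough sketch: encode documents, build a Faiss IVF index, and search it.
# Assumes the faiss and sentence-transformers packages; the model and the
# tiny corpus are placeholders for illustration only.
import faiss
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")
docs = [
    "BERT encodes words in the context of their sentence.",
    "Faiss performs fast similarity search over dense vectors.",
    "JSON is a common format for storing documents.",
]

# Vectorize and L2-normalize so that inner product equals cosine similarity.
vecs = model.encode(docs).astype("float32")
faiss.normalize_L2(vecs)

# Build an inverted-file index: vectors are grouped into clusters, and a
# search only visits the most promising clusters.
dim = vecs.shape[1]
nlist = 1  # number of clusters; tiny here, much larger for a real corpus
quantizer = faiss.IndexFlatIP(dim)
index = faiss.IndexIVFFlat(quantizer, dim, nlist, faiss.METRIC_INNER_PRODUCT)
index.train(vecs)
index.add(vecs)

# Encode the query the same way, then retrieve the nearest documents.
query = model.encode(["How does semantic search find matches?"]).astype("float32")
faiss.normalize_L2(query)
scores, ids = index.search(query, 2)
print([docs[i] for i in ids[0]])
```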
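For the question-answering step, a pretrained model that has already been fine-tuned on a QA dataset can answer factual questions out of the box. A minimal sketch with the Hugging Face transformers pipeline, where the SQuAD-tuned DistilBERT checkpoint is an illustrative choice:

```python
# Minimal sketch: extractive question answering with a pretrained model.
# Assumes the transformers package; the model name is an illustrative choice
# of a DistilBERT checkpoint already fine-tuned on SQuAD.
from transformers import pipeline

qa = pipeline("question-answering", model="distilbert-base-cased-distilled-squad")

result = qa(
    question="What does BERT stand for?",
    context=(
        "BERT (Bidirectional Encoder Representations from Transformers) is a "
        "language model that excels at search, summarization, and question answering."
    ),
)
print(result["answer"], round(result["score"], 3))
```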