AI-Powered Search
Trey Grainger
  • MEAP began September 2019
  • Publication in Early 2021 (estimated)
  • ISBN 9781617296970
  • 325 pages (estimated)
  • printed in black & white
Great search is all about delivering the right results. Today’s search engines are expected to be smart, understanding the nuances of natural language queries, as well as each user’s preferences and context. AI-Powered Search teaches you the latest machine learning techniques to create search engines that continuously learn from your users and your content, to drive more domain-aware and intelligent search. Written by Trey Grainger, the Chief Algorithms Officer at Lucidworks, this authoritative book empowers you to create and deploy search engines that take advantage of user interactions and the hidden semantic relationships in your content to constantly get smarter and automatically deliver better, more relevant search experiences.

About the Technology

The search box has become the de facto user interface for modern data-driven applications. Users expect software to fully understand their search inputs, context, and activity, and to return the right answers every time. Fortunately, you no longer need a massive team manually adjusting relevancy parameters to deliver optimal search results. Using the power of AI, you can develop search solutions that dynamically learn from your content and users, constantly getting smarter and delivering better answers.

About the book

AI-Powered Search is an authoritative guide to applying leading-edge data science techniques to search. It teaches you how to build search engines that automatically understand the intention of a query in order to deliver significantly better results. Author Trey Grainger helped develop numerous algorithms now transforming search, and is an expert on leading techniques for crowdsourced relevancy and semantic search. Working through code in interactive notebooks, you’ll deploy intelligent search systems that deliver real-time personalization and contextual understanding of each user, domain, and query through a self-learning search platform that can tune its own results automatically.
Table of Contents

Part 1: Modern Search Relevance

1 Introducing AI-powered search

1.1 Searching for User Intent

1.1.1 Search Engines

1.1.2 Recommendations Engines

1.1.3 The Information Retrieval Continuum

1.1.4 Semantic Search and Knowledge Graphs

1.1.5 Understanding the Dimensions of User Intent

1.2.1 Apache Lucene: the core search library powering Apache Solr and Elasticsearch

1.2.2 Apache Solr: the open sourced, community-driven, relevance-focused search engine

1.2.3 Elasticsearch: the most used, analytics-focused, full text search engine

1.2.4 Lucidworks Fusion: the out-of-the-box AI-powered search Engine

1.2.5 Apache Spark: the standard for large-scale data processing

1.2.6 Strategy for this book

1.3.1 Targeted Skillsets and Occupations

1.3.2 System Requirements for Running Code Examples

1.5 How does AI-powered search work?

1.5.1 The Core Search Foundation

1.5.2 Reflected Intelligence through Feedback Loops

1.5.3 Curated vs. Black-box AI

1.5.4 Architecture for an AI-powered search engine

1.6 Summary

2 Working with natural language

2.1 The myth of unstructured data

2.1.1 Types of unstructured data

2.1.2 Data types in traditional structured databases

2.1.3 Joins, fuzzy joins, and entity resolution in unstructured data

2.2 The structure of natural language

2.3 Distributional semantics and word embeddings

2.4 Modeling domain-specific knowledge

2.5.1 The challenge of ambiguity (polysemy)

2.5.2 The challenge of understanding context

2.5.3 The challenge of personalization

2.5.4 Challenges interpreting queries vs. documents

2.5.5 Challenges interpreting query intent

2.7 Summary

3 Ranking and Content-based Relevance

3.1 Scoring query and document vectors with cosine similarity

3.1.1 Mapping text to vectors

3.1.2 Calculating similarity between dense vector representations

3.1.3 Calculating similarity between sparse vector representations

3.1.4 Term Frequency (TF): measuring how well documents match a term

3.1.5 Inverse Document Frequency (IDF): measuring the importance of a term in the query

3.1.6 TF-IDF: a balanced weighting metric for text-based relevance

3.2 Controlling the relevance calculation

3.2.1 BM25: Lucene’s default text-similarity algorithm

3.2.2 Functions, functions, everywhere!

3.2.3 Choosing multiplicative vs. additive boosting for relevance functions

3.2.4 Differentiating matching (filtering) vs. ranking (scoring) of documents

3.2.5 Logical matching: weighting the relationships between terms in a query

3.2.6 Separating concerns: filtering vs. scoring

3.3 Implementing user and domain-specific relevance ranking

3.4 Summary

4 Crowdsourced Relevance

4.1 Intro to Crowdsourced relevance

4.2 Working with User Signals

4.2.1 Signals vs. Content

4.2.2 Setting up our product and signals datasets (RetroTech)

4.2.3 Exploring the signals data

4.2.4 Modeling users, sessions, and requests

4.3 Introduction to Reflected Intelligence

4.3.1 What is Reflected Intelligence?

4.3.2 Popularized Relevance through Signals Boosting

4.3.3 Personalized Relevance through Collaborative Filtering

4.3.4 Generalized Relevance through Learning to Rank

4.3.5 Other reflected intelligence models

4.3.6 Crowdsourcing from content

4.4 Summary

Part 2: Learning domain-specific context

5 Knowledge graph learning

6 Learning domain-specific language

7 Interpreting query intent through semantic search

Part 3: Reflected Intelligence

8 Signal boosting models

9 Personalized search

10 Learning to rank

Part 4: The Search Frontier

11 Automated learning with click models

12 Thought vectors and embeddings

13 Emerging AI-powered search paradigms


Appendix A: Running the Code Examples

A.1 Overall Structure of Code Examples

A.2 Pulling the source code

A.3 Building and running the code

A.4 Working with Jupyter

A.5 Working with Docker

What's inside

  • Using reflected intelligence to continually learn and improve search relevancy
  • Natural language search with automatically learned knowledge graphs
  • Semantic search with domain-specific terms, phrases, concepts, and relationships
  • Personalized search utilizing user behavioral signals and learned user profiles
  • Automated Learning to Rank (machine-learned ranking) from user signals
  • Word embeddings, vector search, question answering, image and voice search, and other modern search paradigms

About the reader

For software developers or data scientists familiar with the basics of search engine development.

About the author

Trey Grainger is the Chief Algorithms Officer at Lucidworks, the AI-powered search company that powers hundreds of the world’s leading organizations. Trey co-authored Solr in Action and has over 12 years of experience building semantic search engines, recommendation engines, and real-time analytics systems, and leading related engineering and data science teams.

Manning Early Access Program (MEAP) Read chapters as they are written, get the finished eBook as soon as it’s ready, and receive the pBook long before it's in bookstores.