The knowledge and techniques you need.
Solr in Action is a comprehensive guide to implementing scalable search using Apache Solr. This clearly written book walks you through well-documented examples ranging from basic keyword searching to scaling a system for billions of documents and queries. It will give you a deep understanding of how to implement core Solr capabilities.
foreword
preface
acknowledgments
about this book
Part 1 Meet Solr
1. Introduction to Solr
1.1. Why do I need a search engine?
1.2. What is Solr?
1.3. Why Solr?
1.4. Features overview
1.5. Summary
2. Getting to know Solr
2.1. Getting started
2.2. Searching is what it’s all about
2.3. Tour of the Solr administration console
2.4. Adapting the example to your needs
2.5. Summary
3. Key Solr concepts
3.1. Searching, matching, and finding content
3.2. Relevancy
3.3. Precision and Recall
3.4. Searching at scale
3.5. Summary
4. Configuring Solr
4.1. Overview of solrconfig.xml
4.2. Query request handling
4.3. Managing searchers
4.4. Cache management
4.5. Remaining configuration options
4.6. Summary
5. Indexing
5.1. Example microblog search application
5.2. Designing your schema
5.3. Defining fields in schema.xml
5.4. Field types for structured nontext fields
5.5. Sending documents to Solr for indexing
5.6. Update handler
5.7. Index management
5.8. Summary
6. Text analysis
6.1. Analyzing microblog text
6.2. Basic text analysis
6.3. Defining a custom field type for microblog text
6.4. Advanced text analysis
6.5. Summary
Part 2 Core Solr capabilities
7. Performing queries and handling results
7.1. The anatomy of a Solr request
7.2. Working with query parsers
7.3. Queries and filters
7.4. The default query parser (Lucene query parser)
7.5. Handling user queries (eDisMax query parser)
7.6. Other useful query parsers
7.7. Returning results
7.8. Sorting results
7.9. Debugging query results
7.10. Summary
8. Faceted search
8.1. Navigating your content at a glance
8.2. Setting up test data
8.3. Field faceting
8.4. Query faceting
8.5. Range faceting
8.6. Filtering upon faceted values
8.7. Multiselect faceting, keys, and tags
8.8. Beyond the basics
8.9. Summary
9. Hit highlighting
9.1. Overview of hit highlighting
9.2. How highlighting works
9.3. Improving performance using FastVectorHighlighter
9.4. PostingsHighlighter
9.5. Summary
10. Query suggestions
10.1. Spell-check
10.2. Autosuggesting query terms
10.3. Suggesting document field values
10.4. Suggesting queries based on user activity
10.5. Summary
11. Result grouping/field collapsing
11.1. Result grouping vs. field collapsing
11.2. Skipping duplicate documents
11.3. Returning multiple documents per group
11.4. Grouping by functions and queries
11.5. Paging and sorting grouped results
11.6. Grouping gotchas
11.7. Efficient field collapsing with the collapsing query parser
11.8. Summary
12. Taking Solr to production
12.1. Developing a Solr distribution
12.2. Deploying Solr
12.3. Hardware and server configuration
12.4. Data acquisition strategies
12.5. Sharding and replication
12.6. Solr core management
12.7. Managing clusters of servers
12.8. Querying and interacting with Solr
12.9. Monitoring Solr’s performance
12.10. Upgrading between Solr versions
12.11. Summary
Part 3 Taking Solr to the next level
13. SolrCloud
13.1. Getting started with SolrCloud
13.2. Core concepts
13.3. Distributed indexing
13.4. Distributed search
13.5. Collections API
13.6. Basic system-administration tasks
13.7. Advanced topics
13.8. Summary
14. Multilingual search
14.1. Why linguistic analysis matters
14.2. Stemming vs. lemmatization
14.3. Stemming in action
14.4. Handling edge cases
14.5. Available language libraries in Solr
14.6. Searching content in multiple languages
14.7. Language identification
14.8. Summary
15. Complex query operations
15.1. Function queries
15.2. Geospatial search
15.3. Pivot faceting
15.4. Referencing external data
15.5. Cross-document and cross-index joins
15.6. Big data analytics with Solr
15.7. Summary
16. Mastering relevancy
16.1. The impact of relevancy tuning
16.2. Debugging the relevancy calculation
16.3. Relevancy boosting
16.4. Pluggable Similarity class implementations
16.5. Personalized search and recommendations
16.6. Creating a personalized search experience
16.7. Running relevancy experiments
16.8. Summary
Appendix A: Working with the Solr codebase
Appendix B: Language-specific field type configurations
Appendix C: Useful data import configurations
index
About the book
Whether you're handling big (or small) data, managing documents, or building a website, you need to search your content quickly and discover meaning in it. Apache Solr is your tool: a ready-to-deploy, Lucene-based, open-source, full-text search engine. Solr can scale across many servers to enable real-time queries and data analytics across billions of documents.
Solr in Action teaches you to implement scalable search using Apache Solr. This easy-to-read guide balances conceptual discussions with practical examples to show you how to implement all of Solr's core capabilities. You'll master topics like text analysis, faceted search, hit highlighting, result grouping, query suggestions, multilingual search, advanced geospatial and data operations, and relevancy tuning.
What's inside
- How to scale Solr for big data
- Rich real-world examples
- Solr as a NoSQL data store
- Advanced multilingual, data, and relevancy tricks
- Coverage of versions through Solr 4.7