Lucene in Action, Second Edition
Erik Hatcher, Otis Gospodnetić, and Michael McCandless
MEAP Release: May 2008
Softbound print: November 2009 (est.) | 475 pages
ISBN: 1933988177
Pre-Order options*
Order now and start reading Lucene in Action, Second Edition today through MEAP.
- MEAP + Ebook only - $27.50
- MEAP + Print book (includes Ebook) when available - $44.99
Every purchase includes a free ebook of the previous edition!
For more information, please see the MEAP FAQs page.
About MEAP Release Date Estimates
Table of Contents, MEAP Chapters & Resources
1. Meet Lucene - FREE
2. Indexing - AVAILABLE
3. Adding search to your application - AVAILABLE
4. Analysis - AVAILABLE
5. Advanced search techniques - AVAILABLE
6. Extending search - AVAILABLE
7. Parsing common document formats - AVAILABLE
8. Tools and extensions - AVAILABLE
9. Lucene ports - AVAILABLE
10. Administration and performance tuning - AVAILABLE
11. Case studies
Appendix A Installing Lucene
Appendix B Lucene index format - AVAILABLE
Appendix C Resources
Appendix D Lucene Contrib Benchmark - AVAILABLE
DESCRIPTION
When Lucene first hit the scene five years ago, it was nothing short of amazing. By using this open-source, highly scalable, super-fast search engine, developers could integrate search into applications quickly and efficiently. A lot has changed since then—search has grown from a "nice-to-have" feature into an indispensable part of most enterprise applications. Lucene now powers search in diverse companies including Akamai, Netflix, LinkedIn, Technorati, HotJobs, Epiphany, FedEx, Mayo Clinic, MIT, New Scientist Magazine, and many others.
Some things remain the same, though. Lucene still delivers high-performance search features in a disarmingly easy-to-use API. It's still a single compact JAR file (less than 1 MB!). Due to its vibrant and diverse open-source community of developers and users, Lucene is relentlessly improving, with evolutions to APIs, significant new features such as payloads, and a huge (as much as 8x) increase in indexing speed with Lucene 2.3.
And with clear writing, reusable examples, and unmatched advice on best practices, Lucene in Action is still the definitive guide to developing with Lucene.
What's been updated in the Second Edition?
- Updating and deleting documents using IndexWriter
- Using the different LockFactory, DeletionPolicy, MergePolicy, and MergeScheduler implementations that have been factored out
- Using the new autoCommit option in IndexWriter
- Understanding simplifications to Lucene's locking
- Adding payloads to your index and using them with BoostingTermQuery
- Using Function queries
- Using FieldSelector to speed up loading of stored fields
- Using IndexReader.reopen() to efficiently open a new reader from an existing one
- Measuring performance using the "benchmark" contrib package
- Tuning the indexing or searching speed
- Using threads to gain concurrency
- Managing usage of resources such as memory, disk, and file descriptors
- Making a backup copy of your index without pausing indexing
- Debugging common problems
Lucene in Action, Second Edition, completely revises and updates the best-selling first edition and remains the authoritative book on Lucene. This book shows you how to index your documents, including types such as MS Word, PDF, HTML, and XML. It introduces you to searching, sorting, and filtering, and covers the numerous changes to Lucene since the first edition. All source code has been updated to the current Lucene 2.3 APIs.
About the Author
Erik Hatcher, one of the original Lucene in Action authors, is a committer on the Ant, Lucene, and Tapestry open-source projects, and coauthor of Manning's award-winning Java Development with Ant.
Otis Gospodnetić is a coauthor of the first edition of Lucene in Action. He has been involved with Lucene since 2000 and is also an active member of the Apache Solr, Nutch, and Mahout development teams, as well as the Lucene Project Management Committee. Otis is a founder of Sematext, a software development and consulting company focused on Lucene, Solr, Nutch, and Hadoop.
Michael McCandless has been building search engines for over a decade. In 1999 he founded iPhrase, a startup providing enterprise search software written in Python and C. When IBM acquired iPhrase in 2005, he became interested in Lucene and started contributing patches, becoming a committer in 2006 and a PMC member in 2008. Michael has a B.S., M.S., and Ph.D. from MIT.
About the Early Access Version
This Early Access version of Lucene in Action, Second Edition enables you to receive new chapters as they are being written. You can also interact with the authors to ask questions, provide feedback and errata, and help shape the final manuscript on the Author Forum.


