Taming Text

How to Find, Organize, and Manipulate It

Grant S. Ingersoll, Thomas S. Morton, and Andrew L. Farris
Foreword by Liz Liddy

December 2012
ISBN 9781933988382
320 pages

Included with a Manning Online subscription

printed in black & white

available in Chinese, Korean



Taming Text is a hands-on, example-driven guide to working with unstructured text in the context of real-world applications. This book explores how to automatically organize text using approaches such as full-text search, proper name recognition, clustering, tagging, information extraction, and summarization. The book guides you through examples illustrating each of these topics, as well as the foundations upon which they are built.

about the book

There is so much text in our lives that we are practically drowning in it. Fortunately, there are innovative tools and techniques for managing unstructured information that can throw the smart developer a much-needed lifeline. You'll find them in this book.

Taming Text is a practical, example-driven guide to working with text in real applications. This book introduces you to useful techniques like full-text search, proper name recognition, clustering, tagging, information extraction, and summarization. You'll explore real use cases as you systematically absorb the foundations upon which they are built.

Written in a clear and concise style, this book avoids jargon, explaining the subject in terms you can understand without a background in statistics or natural language processing. Examples are in Java, but the concepts can be applied in any language.

what's inside

When to use text-taming techniques
Important open-source libraries like Solr and Mahout
How to build text-processing applications

about the authors

Grant Ingersoll is an engineer, speaker, and trainer, a Lucene committer, and a cofounder of the Mahout machine-learning project. Thomas Morton is the primary developer of OpenNLP and Maximum Entropy. Drew Farris is a technology consultant, software developer, and contributor to Mahout, Lucene, and Solr.

Takes the mystery out of very complex processes.

From the Foreword by Liz Liddy, Dean, iSchool, Syracuse University

Text analysis and processing as it should be: clear, practical, and open source!

David Weiss, Carrot Search s.c.

Shows how to unlock and exploit information locked up in text documents.

Rick Wagner, Red Hat

Teaches text concepts with examples ... makes text search easy.

Doug Warren, Java Web Services

A great overview of tools and techniques for text processing.

Julien Nioche, DigitalPebble, Ltd.
