Voice-First Development
Designing, Developing, and Deploying Conversational Interfaces
Ann Thymé-Gobbel, Ph.D. and Charles R. Jankowski Jr., Ph.D.
  • MEAP began November 2018
  • Publication in Spring 2019 (estimated)
  • ISBN 9781617295461
  • 350 pages (estimated)
  • printed in black & white

A very good book on the unique challenges you face when trying to build programs using voice technology.

William Wade
Voice-commanded applications are everywhere, running on smart speakers like the Amazon Echo and Google Home, digital assistants like Apple’s Siri, speech-based automotive chatbots, and even novelties like the Alexa-enabled Big Mouth Billy Bass. In Voice-First Development, authors Ann Thymé-Gobbel and Charles Jankowski draw on more than three decades of experience in voice-related development and research to bring you up to speed with a host of voice-controlled applications. This engaging guide focuses on end-to-end voice app development, concrete best practices, and how to avoid common pitfalls. With practical instruction, real-world examples, and lots of code samples, this book is perfect for developers ready to create fully functioning voice solutions that users will love!
Table of Contents

Part 1: Voice-First Foundations

1 Voice-first development components

1.1 Voice-first, voice-only, and conversational everything

1.2 Introduction to voice technology components

1.2.1 Speech to text

1.2.2 Natural-language understanding

1.2.3 Dialog management

1.2.4 Natural-language generation

1.2.5 Text to speech

1.3 Introduction to the phases of voice-first development

1.3.1 Plan

1.3.2 Design

1.3.3 Build

1.3.4 Test

1.3.5 Deploy & Assess

1.3.6 Iterate

1.4 Hope is not a strategy — but to plan & execute is

1.5 Summary

2 Keeping voice in mind

2.1 Why voice is different

2.2 Hands-on: A pre-coding thought experiment

2.3 Voice dialog and its participants

2.3.1 Human voice

2.3.2 Computer voice

2.3.3 Human-computer voice dialog

2.4 Summary

3 Running a voice-first application — and noticing issues

Part 2: Planning Voice-First Interactions

4 Defining your vision: Building What, How, and Why for Whom

5 From discovery to UX and UI design: Tools of the voice-first trade

Part 3: Building Voice-First Interactions

6 Applying human 'rules of dialog' to reach voice-first dialog resolution

7 Using disambiguation methods to resolve incomplete and ambiguous requests

8 Conveying reassurance through confirmation, consistency, and wording

9 Addressing findability through navigation and global commands

10 Creating robust recognition coverage for speech-to-word resolution

11 Ensuring shared understanding through parsing and intent resolution

12 Using accuracy strategies to avoid misunderstandings

13 Using error strategies to recover from miscommunication

14 Using world knowledge to improve interpretation and experience

15 Incorporating personalization and customization for broader user appeal

16 Using context and proactive behavior for smarter dialogs

17 Using speaker identity for privacy and security

18 Meeting user expectations through persona and voice

19 Addressing limitations through modality, location, and ecosystems

Part 4: Verifying and Deploying Voice-First Interactions

20 Finding and understanding the data that tells you what’s working

21 How users tell you what to improve

22 Voice-first discovery revisited


Appendix A: Future directions and other in-depth topics

Appendix B: Checklists

Appendix C: Documentation templates and samples

Appendix D: References and sources

Appendix E: Terminology

About the technology

New platforms and tools make voice apps easier to create than ever before. The unfortunate downside is a flood of sub-par apps that leave users frustrated with easily avoidable bugs, design flaws, and installation glitches. To build voice apps, you need intermediate-level skills in a language like Python or JavaScript along with a solid command of how voice-to-machine interactions work. Being voice-first means leveraging knowledge about other modes, like chat, while incorporating voice-specific knowledge into the process. Like any other application style, voice-centric software requires a proven strategy of planning, designing, building, testing, deploying, and assessing until you get it right.

About the book

Voice-First Development is your personal roadmap to developing successful voice applications. In this insightful guide, you’ll get a solid foundation in modern voice technologies and get your feet wet writing your first speech interaction apps. As you progress, you’ll devise an effective plan for balancing business and product requirements, technology dependencies, and user needs. Through interesting and practical examples, you’ll be immersed in design-informed development with code and techniques that address various characteristics of voice-first interactions. Finally, you’ll ensure your apps succeed by adopting Ann and Charles’s techniques for testing and debugging. This practical tutorial teaches you an effective strategy of just-in-time, actionable steps and tips for making great voice apps, no matter the scope, topic, or users!

What's inside

  • Planning, building, verifying, and deploying voice apps
  • Applying human rules of dialog
  • Accuracy strategies for avoiding misunderstandings
  • Using world knowledge to improve user experiences
  • Error strategies for recovering from miscommunications
  • Using context for smarter dialogs
  • Pitfalls and how to avoid them
  • Real-world examples and code samples in JavaScript and Python

About the reader

For developers with intermediate JavaScript or Python skills.

About the authors

Ann Thymé-Gobbel and Charles Jankowski have worked in speech recognition and natural language understanding for over 30 years. Ann is currently the Voice UI/UX Design Leader at Sound United. She holds a Ph.D. in cognitive science and linguistics from UC San Diego. Charles is currently Director of NLP Application at CloudMinds Technologies. He holds S.B., S.M., and Ph.D. degrees from M.I.T. Together, Ann and Charles created a multi-modal conversational natural language interface to assist acute and chronic care patients.

Manning Early Access Program (MEAP): Read chapters as they are written, get the finished eBook as soon as it’s ready, and receive the pBook long before it’s in bookstores.

The pro tips in it will help you avoid the most common pitfalls, and you will create a very solid and chatty application.

Davide Cadamuro

Unlike other books that focus too much on the technical aspects, this book will teach you the underlying concepts of voice systems first.

Peter Friese

This book would save you time by using the experience of the authors to avoid common mistakes and guide you in the process you should follow.

Jose San Leandro