Voice Applications for Alexa and Google Assistant
Dustin Coates
  • MEAP began December 2017
  • Publication in July 2019 (estimated)
  • ISBN 9781617295317
  • 264 pages (estimated)
  • printed in black & white

The book has absolutely priceless details about making Voice UI and chat UI.

Tiklu Ganguly

There's always someone to talk to! Voice-controlled devices like Amazon Alexa and Google Assistant are everywhere, and the apps that control them are getting more powerful. Whether you're jamming to Spotify, Googling facts, chatting with friends, or reordering supplies from Amazon, great voice apps change how you interact with the web. Voice Applications for Alexa and Google Assistant teaches you how to design, build, and share voice apps.

Table of Contents

1 Introduction to voice first

1.1 What is voice first?

1.2 Designing for voice UIs

1.3 Anatomy of a voice command

1.3.1 Waking the voice-first device

1.3.2 Introducing natural language processing

1.3.3 How speech becomes text

1.3.4 Intents are the functions of a skill

1.3.5 Training the NLU with sample utterances

1.3.6 Plucking pertinent information from spoken text

1.4 The fulfillment code that ties it all together

1.5 Telling the device what to say

1.6 Summary

2 Building a call and response skill on Alexa

2.1 Skill metadata

2.1.1 Interaction model

2.2 The interaction model

2.2.1 Building the intent

2.2.2 Slots

2.3 Fulfillment

2.3.1 Hosted endpoint

2.3.2 AWS Lambda

2.3.3 Coding the fulfillment

2.3.4 WellRestedIntent

2.4 Summary

3 Designing a voice user interface

3.1 Voice user interface fundamentals

3.2 The cooperative principle

3.2.1 Quantity

3.2.2 Quality

3.2.3 Relation

3.2.4 Manner

3.3 VUI planning

3.4 Variety

3.5 Summary

4 Using entity resolution and built-in intents to extend an Alexa skill

4.1 Alexa Skills Kit CLI

4.1.1 Creating an Alexa skill project

4.2 Entity resolution

4.2.1 Fulfillment

4.2.2 Built-in intents

4.2.3 LaunchRequest

4.3 Invoking the skill locally

4.4 Summary

5 Making a conversational Alexa skill

5.1 Creating a conversation

5.1.1 State management

5.1.2 Per-state handlers

5.1.3 Handling the unhandled

5.2 Maintaining long-term information

5.3 Putting it all together

5.3.1 New intents

5.3.2 New utterances

5.3.3 New fulfillment

5.3.4 Correcting a mistake

5.4 Summary

6 VUI and conversation best practices

6.1 Conversations and context

6.1.1 A skill with context

6.2 Intercepting responses and requests

6.2.1 Response interceptors

6.2.2 Request interceptors

6.3 Unit testing

6.4 Summary

7 Using conversation tools to add meaning and usability

7.1 Discourse markers

7.2 Controlling the application’s way of speaking with Speech Synthesis Markup Language

7.2.1 Breaks and pauses

7.2.2 Prosody

7.2.3 amazon:effect

7.2.4 w, say-as

7.2.5 Phoneme

7.3 Embedding audio

7.4 Summary

8 Directing conversation flow

8.1 Guiding user interaction

8.2 Dialog interface

8.2.1 Setting up the dialog model

8.2.2 Slot filling

8.2.3 Intent confirmation

8.2.4 Dialog model fulfillment

8.3 Handling errors

8.4 Summary

9 Building for Google Assistant

9.1 Setting up the application

9.2 Building the interaction model

9.2.1 Building an intent

9.2.2 Testing with the simulator

9.2.3 Parameters and entities

9.2.4 Adding entities

9.2.5 Using parameters in intents

9.3 Fulfillment

9.3.1 The code

9.3.2 Deployment

9.3.3 Changing the invocation name

9.4 Summary

10 Going multimodal

10.1 Multi-modal in actions

10.1.1 Little difference between spoken and displayed text

10.2 Surface capabilities

10.3 Multi-surface conversations

10.4 Summary

11 Push interactions

11.1 Routine suggestions

11.1.1 Storing user data

11.1.2 Action suggestion for a routine

11.2 Daily updates

11.2.1 Developer control of daily updates

11.3 Push notifications

11.4 Implicit invocation

11.5 Summary

12 Building for Actions on Google with the Actions SDK

12.1 App planning

12.2 Action package

12.3 Fulfillment

12.3.1 Parsing input with regular expressions

12.3.2 Handling the unexpected

12.4 Summary


Appendix A: Adding an AWS IAM Profile

Appendix B: Connecting DynamoDB to Lambda function

Appendix C: Glossary

About the Technology

Voice assistants have taken off, with "voice-first" devices like the Amazon Echo and Google Home found in millions of homes. Voice-enabled devices, and the apps that control them, are an exciting new field for UI designers and web developers. To create your own voice "skills," you'll need to learn some new device toolkits, the basics of voice UI design, and some emerging best practices for building and deploying on these diverse platforms.

About the book

Voice Applications for Alexa and Google Assistant is your guide to designing, building, and implementing voice-based applications for Amazon Alexa or Google Assistant! Inside, you'll learn how to build your own "skills" (the voice app term for actions the device can perform) from scratch. After an overview of voice UIs and how they work, you'll build a voice-powered sleep tracker to monitor sleeping patterns. Every chapter introduces a new topic: you'll build a call-and-response skill so your app knows when you talk to it, store the information in a database so your app can track and monitor sleep patterns, and enable account linking so you can retrieve historical data. Building on the basics, you'll dig deeper as you master the art of building a multi-use conversational flow and even learn how to automatically display cards with stats from the previous night.

Along with the running example, this carefully crafted tutorial includes smaller projects you can take on to practice your new techniques. You'll also discover a trove of best practices and tips that will streamline the app development process.

What's inside

  • Designing a voice interaction model
  • Fulfilling skills via a serverless platform like AWS Lambda
  • Connecting a skill to a database
  • Building a skill that can connect to a user account
  • Handling errors, disambiguation, and conversations in a voice skill

About the reader

Written for JavaScript developers interested in building voice-enabled applications. No prior experience required!

About the author

Dustin A. Coates is a developer who focuses on voice and conversational applications. He's currently the Voice Search Lead at Algolia and is also a Google Developers Expert for Assistant as well as co-host of the VUX World podcast.

Manning Early Access Program (MEAP) Read chapters as they are written, get the finished eBook as soon as it’s ready, and receive the pBook long before it's in bookstores.
