An interview with Nathan Smith, director of Smart Home UX at Amazon, who is leading a team of developers working on new smart home features for Alexa
Analysts predict rising demand for smart home devices. Users are growing accustomed to smart gadgets, trusting them with an increasing number of household tasks.
Last year, Amazon launched an API for interacting with its smart assistant Alexa, and the company sees great potential in the development of this product. Today, Alexa can be used to manage your smart home: closing gates, turning lights on and off, interacting with security cameras, and so on. One of the most important tasks in the development of this assistant lies in making voice controls more natural, enabling the user to give commands in an arbitrary manner, as if they were talking to another person.
VentureBeat interviewed Nathan Smith, who heads the department working on new smart home features for Alexa, to discuss the key principles at the core of the assistant’s capabilities and the improvement of the user experience.
VentureBeat: I thought we could start with a high-level overview of Amazon’s approach to the “smart home” and voice interactions and then dive into some of the ideas you and your team are pursuing to make managing connected devices easier with Alexa. That sound good?
Nathan Smith: Sure. We think the smart home is in a period of mass adoption and expansion right now. Classically, it has comprised much more tech-forward earlier adopters, but we’re past that. There are now more than 60,000 products that work with Alexa from 7,400 different manufacturers, and a trend we’re seeing is that Alexa is democratizing control of these devices.
One of the things I’m most excited about this year is a new feature that uses machine learning and artificial intelligence to help Alexa understand not just what you say, but what you actually mean, and then provide a simple user experience around that.
The problem we’re solving came from customer feedback as we were onboarding people who didn’t necessarily have context concerning which smart devices were named what around their house. We ran into this over and over again — people were having trouble remembering the names of devices, which was only exacerbated as they added more devices to their homes.
What we’ve done is make Alexa a little bit more human-like. If you ask Alexa something like “Hey, Alexa, turn on the Sofa Lights” but the lights you’re trying to turn on are called Living Room Lights and Alexa is uncertain about which you mean, she’ll helpfully suggest “Oh, you know, did you mean Living Room Lights?”
This technology, which allows people to speak more casually in their homes and go beyond the strict syntax that Alexa previously understood, helps in a lot of different real-world use cases. One is words that have similar transcriptions and another is mixed characters, like when people add emojis to their [own] or their devices’ names [in the Alexa smartphone app]. It can resolve words without being strict about the exact pronunciation, and it can even help in multilingual cases. If you’re using a mix of names across different languages, Alexa can learn from that.
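The "did you mean" behavior Smith describes can be sketched, very roughly, as string similarity plus two confidence thresholds: act when confident, ask a clarifying question when uncertain, and give up otherwise. The device names, thresholds, and scoring below are illustrative assumptions, not Amazon's implementation, which presumably relies on learned semantic models rather than simple character overlap:

```python
from difflib import SequenceMatcher

# Hypothetical device names a user might have configured in the Alexa app.
DEVICE_NAMES = ["Living Room Lights", "Garage Door", "Coffee Maker"]

def resolve_device(utterance, accept=0.75, suggest=0.45):
    """Map a spoken device name to an action.

    Scores the utterance against each known device name and either
    acts directly, proposes the closest match ("did you mean ...?"),
    or reports that nothing plausible was found.
    """
    scored = [(SequenceMatcher(None, utterance.lower(), name.lower()).ratio(), name)
              for name in DEVICE_NAMES]
    score, best = max(scored)
    if score >= accept:
        return ("execute", best)   # confident: just perform the task
    if score >= suggest:
        return ("clarify", best)   # uncertain: suggest the closest match
    return ("fail", None)          # no plausible match

# "Sofa Lights" is close enough to prompt a suggestion, not an action.
print(resolve_device("Sofa Lights"))
```

The two thresholds encode the trade-off Smith mentions: only act silently when the match is near-certain, and otherwise get a ground truth from the customer with a clarifying question.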
The context is that we’re trying to build toward a world where Alexa understands you in a much more natural way, rather than training people to talk in Alexa’s terms. If we have a pretty good idea of what you’re saying, we’ll simply perform the intended task, but what we’re evolving toward is a model where Alexa gets ground truths from customers. We don’t want to take power away from customers, so we ask a clarifying question if we’re not 100% certain about something, but we also want Alexa to be helpful in ambiguous cases.
We started rolling out this feature in the U.S. at the end of December and recently expanded it to Canada, Australia, the U.K., and India. In terms of early results, when Alexa prompts a customer with a suggestion, they’re accepting it 80-90% of the time, on average.
VentureBeat: Which other factors does Alexa take into account when determining how to respond to a command, misspoken or not?
Smith: Gathering ground truths and assimilating them into semantic and behavioral models that learn from you in a very human way — the way a child would ask questions about the world — underpins the machine learning side [of Alexa]. What our models really do is layer on signals in terms of device state and behavioral signals — like which devices are usually switched on at which times — in addition to environmental signals, like date and time. The models use all of these to generate suggestions.
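The signal layering Smith describes could be sketched as a weighted combination of a semantic score, a behavioral signal, and the current device state. Everything here is an assumption for illustration (the signal names, the weights, and the linear scoring are invented, not Amazon's model):

```python
from dataclasses import dataclass

@dataclass
class Signals:
    """Illustrative signals of the kind Smith describes (names are assumptions)."""
    semantic_score: float  # how closely the utterance matches this device's name
    usually_on_now: float  # behavioral: fraction of days the device is on at this hour
    currently_on: bool     # device state as reported by the device

def suggestion_score(s, w_sem=0.5, w_beh=0.3, w_state=0.2):
    """Layer the signals into a single score; the weights are arbitrary."""
    # A device that is currently off but usually on now is a likely target
    # for a "turn on" command.
    state = 0.0 if s.currently_on else 1.0
    return w_sem * s.semantic_score + w_beh * s.usually_on_now + w_state * state

# Two candidate devices for "turn on the lights" in the evening.
living_room = Signals(semantic_score=0.9, usually_on_now=0.8, currently_on=False)
garage = Signals(semantic_score=0.6, usually_on_now=0.1, currently_on=True)
best = max([("Living Room Lights", living_room), ("Garage Lights", garage)],
           key=lambda kv: suggestion_score(kv[1]))
print(best[0])
```

In practice such signals would feed a trained model rather than hand-set weights, but the sketch shows how device state, behavior, and environment can jointly rank candidate suggestions.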
There’s a lot more work to do, and we think that we can expand the reach of this sort of helpfulness to other scenarios. We’re seeing more and more customers from different walks of life and different technology backgrounds using smart home devices with Alexa, and this is a first step to taking bleeding-edge technology and using it to help simplify the customer experience.
VentureBeat: AI and machine learning are obviously at the core of Alexa, from its language processing and understanding to the way it intelligently routes commands to the right Alexa skill. What are some of the other challenges you and your team are solving with AI? What has it enabled you to achieve?
Smith: At the feature level, there’s Hunches, where Alexa provides information based on what it knows from connected sensors or devices. When you say a command such as “Alexa, good night,” it checks whether your garage lights are still on and whether they’re usually off at that time of day, and that informs the response. Alexa will say something like “Good night. You know, by the way, I noticed that your garage lights are on. Would you like me to turn them off for you?” and give customers helpful feedback at certain stages of smart home routines without requiring them to dig into a bunch of app screens.
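The Hunches check Smith outlines boils down to comparing current device state against learned habits for the time of day. The sketch below is a guess at the shape of that logic, with hypothetical device names, states, and habit data (none of this reflects Amazon's actual code):

```python
from datetime import datetime

# Hypothetical snapshot of device state and learned habits.
DEVICE_STATE = {"garage lights": "on", "porch lights": "off"}
USUALLY_OFF_AFTER = {"garage lights": 21, "porch lights": 23}  # hour of day

def good_night_hunches(now):
    """Return follow-up prompts for a "good night" routine.

    Flags devices that are still on even though they are usually
    off by this time of day, mirroring the behavior Smith describes.
    """
    prompts = []
    for device, state in DEVICE_STATE.items():
        if state == "on" and now.hour >= USUALLY_OFF_AFTER[device]:
            prompts.append(f"I noticed that your {device} are on. "
                           f"Would you like me to turn them off for you?")
    return prompts

# At 10:30 p.m. the garage lights (usually off by 9 p.m.) trigger a prompt.
print(good_night_hunches(datetime(2019, 7, 1, 22, 30)))
```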
These features use machine learning techniques enabled by Amazon Web Services. We run these real-time capabilities at scale on the SageMaker platform, which has given us the ability to iterate a lot more quickly.
VentureBeat: It seems, as you said a moment ago, that smart home adoption is on the rise, perhaps driven in part by cheaper connected devices, like Philips’ recently announced Bluetooth-compatible Hue series. What are some of the other ways you’re making onboarding simpler for first-time buyers?
Smith: We’ve been working really hard on that for a while now, and one of the things we’re most excited about is this ability to have a zero-touch setup. Last year, we announced Wi-Fi Simple Setup, which lets you quickly configure Amazon Wi-Fi devices like the Amazon Smart Plug. Basically, you plug it in and then Alexa will say “Hey, I found your new device.” There’s no other setup necessary. We’re bringing that same experience to Bluetooth Low Energy light bulbs like the new Philips Hue products, and we’re really working to expand the usage of this technology broadly.
As for configuration post-setup, once you get a device talking to Alexa, we released a couple of features at the end of last year that help you do some of the other setup and context-gaining by voice that you might need to have a fully natural interaction with Alexa. We want customers to be able to do things like put their devices in rooms so that when they refer to one device in a set of several, Alexa targets the right device.
That’s why, last year, we rolled out a more contextually sensitive setup experience. If you say “Alexa, turn on the lights,” she can walk you through, by voice, setting up a room and putting lights in it. We’ve seen customers really take to this because it doesn’t get in the way of controlling the device for the first time.
VentureBeat: I’m sure you have to account for different Alexa device form factors, right? I’m talking about an Echo Dot versus an Echo Show.
Smith: We think of it as a mesh among the different modalities — among the app, voice, and screen — because each has different strengths. Voice is really great when you’re trying to do something hands-free, but not great when you’re trying to do something quietly. That’s where we lean on screen-based interactions.
What we’re really excited about is ensuring that, as more diverse customers start to use Alexa, we’re keeping up with their needs and not looking backward and saying “OK, how do we teach these customers the sort of patterns of the past?” Instead, we’re using technology like machine learning to look forward and learn from them.
The key is using the technique that’s right for the type of problem, whether it’s examining a behavioral pattern or trying to establish semantic similarity with ground truths, and then tuning a meta-model that takes those individual signals into account, producing a user experience that’s helpful instead of one that makes assumptions.