In search of things new and useful.
I’m starting to get really interested in voice as a major computing interface. Prior to my Apple Watch, I rarely used Siri. But I find I’ve been using voice more and more, and my experiences with the Echo and the rise of more apps that utilize a conversational interface make me pretty excited about this new communication medium. A couple of thoughts and questions:
Interface: Voice is a funny interface. It’s super flexible in some ways, but really limited in others. It’s potentially insanely convenient, but also really clunky. I think that there are two shifts that will usher in more mainstream adoption of voice as an interface (in addition to improvements in the actual language processing and AI). The first is convenience and speed. I think that consumer adoption of new interfaces like these exhibits a sort of convenience tipping point. 20% more convenient drives next to zero usage, but perhaps 50% more convenient drives massive adoption. I’m experiencing this with my watch. The delta between pulling out my phone to make a request and making a request on my watch (and then having the request fulfilled more naturally on my watch) is getting me over the hump. Navigation is a great example of this, and is even getting me to occasionally switch off of Google Maps to enjoy this benefit. I haven’t actually made a complete switch yet, but I can see it happening if the mapping software were better.
The second trend is that we are going to see new use cases for voice input that will be narrower and more forgiving but way more convenient. Messaging apps and pseudo-human-powered services are making me think about this. Part of the frustration of voice is the unpredictability of it. Does it actually understand what I’m saying? How many mistakes until I totally give up on the medium? You could see voice being integrated in more constrained environments, like in a narrow app with a messaging interface where the meaning behind a response could be easier to parse. Or, if there is actually a human on the other end, that parsing may not need to occur at all. Using voice as an input to give specific instructions to an Uber driver or a Taskrabbit seems like a potential no-brainer. Finally, the nice thing about Apple, Amazon, and others moving into this area is that they are training us to use semi-natural language to communicate with machines, shedding the massive negative bias that anyone who has ever dealt with voice prompts while trying to call an airline customer service number can attest to.
Social: One question I have is how voice plays into messaging and social services. I’m curious how often voice input is used in major messaging services across different demographics. I actually suspect that it’s surprisingly low, even though it’s pretty widely featured (one of only three main options on Line, WhatsApp, etc.). I’m curious whether this is because voice actually doesn’t work well, or if it’s just that those apps are built so much around text that voice feels unnatural to use. Or maybe I’m wrong and the use of voice is increasing on these services (if anyone knows, please tell me). I’m curious to see what types of voice-native social networks might emerge in the coming years and whether one really will exist that is reasonably separate from video, text, or photos. One big impediment to voice is that it’s super awkward for the recipient in many cases. But maybe there will be a social dynamic that is unlocked by that constraint as well.
Age: I’m curious how impressions of voice as an interface differ between age groups. I actually think that my age group is probably going to be the most negative. We find other input mechanisms too natural, and have too much distaste left over from early speech-based interfaces, to feel any need for it. My kids find our Amazon Echo absolutely magical. I haven’t seen them as excited about a piece of technology since they learned how to make Netflix work on my iPhone 🙂 I’d love feedback here, especially from readers of a younger demographic.
I have mostly questions and no answers at this point. But it’s an area I’m getting pretty excited about.