Machine learning continues to drive improvements in voice technology. Error rates in voice and speech recognition are continually dropping. Eventually, our technology may get to the point where it hears and responds as well as humans.

 

Narrow AI vs. General Intelligence

Narrow AI (artificial intelligence), also known as weak AI, is where we are in today’s consumer space. It is AI that is limited to performing a single task, such as telling us the weather, playing chess, or analyzing data. It can be brittle, failing in unexpected ways when an interaction falls outside the application’s limits. General intelligence, on the other hand, is AI with cross-domain capabilities, with the potential to be at or beyond human levels. We are nowhere near this level of AI, even though movies and cartoons might have us think otherwise.

 

VUI in more places

Voice User Interface (VUI) is showing up in more and more places. Not only do our phones have voice-enabled options, but VUI is also showing up in various Internet of Things (IoT) devices. The lines will start to blur as talking to your refrigerator from your car or phone becomes possible.

A few key figures:

  • 1 in 2 smartphone users use voice technology on their phones
  • 41% of people using voice search have only started in the last 6 months
  • Voice search is projected to exceed 50% of all searches by 2021
  • Smart speaker adoption is increasing, with sales growth up by 137% from Q3 of 2017 to Q3 of 2018

Adoption is accelerating, as is the technology supporting it.

There will continue to be opportunities to incorporate VUI into our lives. As the technology improves, so will users' comfort with it.

 

Where are we headed?

VUI error rates will continue to improve to the point where accuracy is nearly on par with human interaction, and adoption will continue to increase. User expectations will shift from treating voice as a novelty to expecting a good VUI experience, with 42% saying that voice-activated devices have quickly become “essential” to their lives. We would expect a gap to open between brands that have successfully navigated this shift and established themselves in the space and those that are left behind.

That said, VUI is not an all-or-nothing proposition. Instead, users will expect VUI to be one component of many in a multimodal experience, with devices that both receive voice input and show images. Additionally, users will expect voice to be one of many options available to them, with interfaces that let them easily switch to whichever mode they prefer for the problem at hand. This may limit the need for a full-blown voice experience initially, allowing companies to dip their toes in with a limited first experience.

 

Next week: Why should you care about VUI?