Older automated attendants (which route phone calls to the correct extension) and interactive voice response systems (which conduct more complicated transactions over the phone) can respond to the pressing of keypad buttons via DTMF tones, but those with a full voice user interface allow callers to speak requests and responses without having to press any buttons.
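As a rough illustration of the keypad signalling mentioned above, the sketch below synthesizes a DTMF tone (the sum of one row frequency and one column frequency) and recovers the pressed key with the Goertzel algorithm. The function names, the 8 kHz sample rate, and the 50 ms tone length are assumptions chosen for the example, not details of any particular attendant or IVR product.

```python
import math

# Standard DTMF frequencies: each key is one row tone plus one column tone.
ROW_FREQS = (697, 770, 852, 941)      # Hz, one per keypad row
COL_FREQS = (1209, 1336, 1477, 1633)  # Hz, one per keypad column
KEYPAD = ("123A", "456B", "789C", "*0#D")


def dtmf_frequencies(key):
    """Return the (row, column) frequency pair for a single keypad symbol."""
    if len(key) == 1:
        for r, row in enumerate(KEYPAD):
            c = row.find(key)
            if c != -1:
                return ROW_FREQS[r], COL_FREQS[c]
    raise ValueError(f"not a DTMF key: {key!r}")


def synthesize(key, duration=0.05, sample_rate=8000):
    """Generate samples for one key press (assumed 50 ms at 8 kHz)."""
    low, high = dtmf_frequencies(key)
    n = int(duration * sample_rate)
    return [
        0.5 * (math.sin(2 * math.pi * low * i / sample_rate)
               + math.sin(2 * math.pi * high * i / sample_rate))
        for i in range(n)
    ]


def goertzel_power(samples, freq, sample_rate=8000):
    """Estimate signal power at one frequency using the Goertzel recurrence."""
    coeff = 2 * math.cos(2 * math.pi * freq / sample_rate)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev2 ** 2 + s_prev ** 2 - coeff * s_prev * s_prev2


def detect_key(samples, sample_rate=8000):
    """Pick the strongest row and column tones and map them back to a key."""
    row = max(range(4), key=lambda r: goertzel_power(samples, ROW_FREQS[r], sample_rate))
    col = max(range(4), key=lambda c: goertzel_power(samples, COL_FREQS[c], sample_rate))
    return KEYPAD[row][col]
```

For example, `detect_key(synthesize("5"))` recovers the key from the generated samples. A deployed detector would also need to reject speech and noise, which this sketch does not attempt.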
However, advances in technologies like text-to-speech, speech-to-text, natural language processing, and cloud services contributed to the mass adoption of these types of interfaces.
Designing a good VUI requires interdisciplinary expertise in computer science, linguistics and human factors psychology, all of which are expensive and hard to come by.
Even with advanced development tools, constructing an effective VUI requires an in-depth understanding of both the tasks to be performed and the target audience that will use the final system.
In contrast, a VUI designed for a small group of power users (including field service workers) should focus more on productivity and less on help and guidance.
Such applications should streamline the call flows, minimize prompts, eliminate unnecessary iterations and allow elaborate "mixed initiative dialogs", which enable callers to enter several pieces of information in a single utterance and in any order or combination.
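One simple way to picture a mixed-initiative dialog is as slot filling: the system extracts whatever pieces of information appear in an utterance, in any order, and prompts only for what is still missing. The sketch below assumes a hypothetical funds-transfer task and uses regular expressions as a stand-in for a real speech recognizer and grammar; the slot names, patterns, and prompts are all invented for illustration.

```python
import re

# Hypothetical slots for a funds-transfer task; the patterns stand in for a real grammar.
SLOT_PATTERNS = {
    "amount": re.compile(r"\b(\d+(?:\.\d+)?)\s*dollars?\b"),
    "source": re.compile(r"\bfrom\s+(?:my\s+)?(checking|savings)\b"),
    "destination": re.compile(r"\bto\s+(?:my\s+)?(checking|savings)\b"),
}

PROMPTS = {
    "amount": "How much would you like to transfer?",
    "source": "Which account should the money come from?",
    "destination": "Which account should the money go to?",
}


def fill_slots(utterance, slots=None):
    """Return a copy of `slots` extended with every slot found in the utterance."""
    filled = dict(slots or {})
    text = utterance.lower()
    for name, pattern in SLOT_PATTERNS.items():
        match = pattern.search(text)
        if match and name not in filled:
            filled[name] = match.group(1)
    return filled


def next_prompt(slots):
    """Return the prompt for the first missing slot, or None when all are filled."""
    for name in SLOT_PATTERNS:
        if name not in slots:
            return PROMPTS[name]
    return None


# A caller who supplies everything at once is never prompted again.
complete = fill_slots("Transfer 50 dollars from checking to savings")
# A partial utterance leads to a single follow-up question about the source account.
partial = fill_slots("Move 25.50 dollars to checking")
follow_up = next_prompt(partial)
```

Because each slot is matched independently, word order does not matter, which is the property that lets power users skip a fixed sequence of prompts.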
On the flip side, speech is perfect for handling quick and routine transactions, like changing the status of a work order, completing a time or expense entry, or transferring funds between accounts.
Early applications for VUI included voice-activated dialing of phones, either directly or through a (typically Bluetooth) headset or vehicle audio system.
In 2007, a CNN business article reported that voice command was an industry worth over a billion dollars and that companies like Google and Apple were trying to create speech recognition features.
Additionally, Google has created a speech synthesis (text-to-speech) engine called Pico TTS and Apple released Siri.
[3] Both Apple Mac and Windows PC provide built-in speech recognition features in their latest operating systems.
Microsoft integrated voice commands into their operating systems to provide a mechanism for people who want to limit their use of the mouse and keyboard, but still want to maintain or increase their overall productivity.
In addition to the built-in speech recognition software for each mobile phone's operating system, a user may download third-party voice command applications from each operating system's application store: Apple App Store, Google Play, Windows Phone Marketplace (initially Windows Marketplace for Mobile), or BlackBerry App World.
[20] Currently, technology allows a driver to issue voice commands to both a portable GPS unit, such as a Garmin, and a car manufacturer's navigation system.
VoiceDraw allows users to "paint" strokes on a digital canvas by modulating vowel sounds, which are mapped to brush directions.
Other approaches include adopting non-verbal sounds to augment touch-based interfaces (e.g. on a mobile phone) to support new types of gestures that wouldn't be possible with finger input alone.
[31] These errors tend to be especially prevalent when the speech content uses technical vocabulary (e.g. medical terminology) or unconventional spellings such as musical artist or song names.
[33] Most voice assistants today are capable of executing single commands very well but are limited in their ability to manage dialogue beyond a narrow task or a couple of turns in a conversation.
[34] Privacy concerns are raised by the fact that voice commands are available to the providers of voice-user interfaces in unencrypted form, and can thus be shared with third parties and be processed in an unauthorized or unexpected manner.
[35][36] In addition to the linguistic content of recorded speech, a user's manner of expression and voice characteristics can implicitly contain information about his or her biometric identity, personality traits, body shape, physical and mental health condition, sex, gender, moods and emotions, socioeconomic status and geographical origin.