A Reddit-Powered Database Will Inform Google’s Voice Recognition Technology
The dataset was conceived to overcome the 'speech divide' introduced by strong accents and dialects
Voice-controlled interfaces such as Siri and Alexa offer a seamless, intuitive way to issue simple commands without a screen, making them especially useful for the tech-averse. But when thick accents are introduced, the competence of these devices is called into question; what good is a universal tool that can’t be used by everyone?
The issue is more widespread than commonly acknowledged. In the United States, where these machines are attuned to local pronunciations, everything runs smoothly, but the devices often fail to accommodate speakers from Europe and the Middle East. In fact, computer scientists identified this ‘speech divide’ some time ago, noting the existence of a ‘machine voice’ often adopted by individuals with accents in an attempt to bypass the technology’s inability to interpret their natural speech.