As we all begin to look forward to the Interactive section of South by Southwest in March, PSFK has identified five key trends that readers should be monitoring during the festival. We have named one of these trends ‘Sonic Interface.’
The pervasiveness of electronics in virtually every aspect of daily life has prompted designers to rethink the way that people interact with their devices and the world around them. The next evolution of natural user interfaces sees voice and audio recognition technologies capable of reacting to spoken commands and audio cues, enabling people to perform a wider variety of instant, hands-free operations–from searching for information to surfing the channels.
To explore this idea further, we spoke with David Jones, EVP Marketing at Shazam Entertainment.
What do you think is driving this trend of ‘sonic interface’?
Audio content recognition is not just about hands-free, although this is a fantastic leap forward for all consumers in so many use cases. Whether applied to pre-recorded or live audio or video content, technologies like Shazam’s audio content recognition can simply be a faster, easier way to start a journey or get something done. It’s all about reducing friction and overcoming consumer inertia, and there is a ton of it out there.
When a consumer uses Shazam to identify a song that’s playing, that’s something that you couldn’t do before Shazam came along, or that could literally take you hours to figure out. On top of that, you can buy the track or album in an instant, literally just a password away at most on the iPhone. It’s so easy; that’s why 8% of people who use Shazam with music go on to buy the track or the album, making Shazam the largest mobile affiliate for iTunes, selling over $100 million in digital goods per year through our partners.
When it comes to Shazam for TV, the payoff for consumers is different. You’re not trying to identify what you’re watching (in most cases); it’s about interactivity and engagement. Again, Shazam makes it easy and fast to identify what you’re watching, whether it’s the Super Bowl, a TV show, or a TV ad, and instantly and automatically the consumer is immersed in a curated, rich, satisfying experience. For TV shows and events, it’s about going deeper to experience more: getting more information and engaging with additional bonus content, sometimes exclusive content. For TV ad campaigns, it’s about providing more information, special offers, finding the nearest location to buy the product, or, best of all, shopping from your couch.
But even if it’s all available online today, and, if you’re lucky, at mobile-optimized sites like the ones NBC Universal has built for all of its shows, Shazam’s audio content recognition is essentially competing with:
- A consumer remembering what they want, launching a browser, and typing in a Google search or a URL [Whew, I am already tired just thinking of that.]
- A telephone number that you have to remember or write down, and then actually take the time to call.
- An SMS short code (more common in Europe and Asia)
- And even occasionally you’ll see a QR code on-screen, where you have to find and launch an app, then line up and aim the camera at the code, trying to capture it.
All of this takes time. Many people don’t do these things because of the effort involved. Imagine if audio content recognition (ACR) could streamline this process and get you where you want to go more quickly. That’s what’s exciting about this. And it’s not just that you made things easier and faster for those who were going to do it “the hard way…the classic way,” but that there is an uplift that will come from consumers who wouldn’t do it before but will do it now that it is so easy. That’s additional consumers and engagement that would not otherwise have happened, simply due to friction and inertia. These are previously lost opportunities for engagement, interactivity and conversion; they are new, thanks to technology like Shazam’s.
Future advances in this technology point to emerging applications in areas ranging from security to personal health care. In your opinion, in which areas could we see big change using sonic control?
Entertainment and Advertising are two massive industries that can absolutely benefit from these new technologies.
Interactive television using Audio Content Recognition (ACR) technology can be used for special promotions and hidden content – such as exclusive videos and interviews – by providing a click-through to a brand mobile web experience where viewers can interact directly with branded content. Because the tagged item remains in the viewer’s tag list, they can revisit it and share it with friends long after the television show or ad has ended, extending the viewing experience beyond the 30-second commercial or the 30- or 60-minute television program.
What talks and events should PSFK readers be looking out for in Austin in terms of ‘sonic interface’?
To hear more about how audio content recognition technology is driving consumer engagement with television shows and events, I’d recommend “TVEngagement: Does Social Media Drive TV Ratings?”, which will be held on Friday, March 9, 2:00PM – 3:00PM, at the Omni Downtown, Capital Ballroom. Come hear what MTV Networks, Bravo Network, Food Network, and Shazam Entertainment have to share on this topic.