True story: On Monday afternoon, I said to my watch, “Siri, set a timer for 35 minutes.” Two rooms away, my phone replied, “I’ll set a timer for 10 minutes.”
It was an ironic coda to a morning spent watching the keynote for Apple’s (AAPL) Worldwide Developers Conference (WWDC). WWDC is the last of three big developer conferences in the past month, and what it had in common with Microsoft Build and Google I/O was a cheery insistence that consumer-facing technology has gotten to the point where we can talk to our computers and expect them to actually do what we say.
It’s a bold notion, given that we still live in a society where GetHuman exists to help us avoid dealing with phone trees.
On the heels of three separate keynotes where presenters calmly and cheerily queried software agents and got precise and correct responses to their requests, I found myself increasingly skeptical about how this is going to play out for consumers who don’t have the benefit of being tech professionals in a demo environment. There are a few concerns I have yet to see addressed:
How will voice-activated interfaces work for people who speak with regional accents or whose vocabulary is tightly tied to specific dialects? There are at least 24 distinct regional dialects in the U.S., and our population has the privilege of skewing the data pools most tech companies collect to train voice-activated assistants like Apple’s Siri, Microsoft’s Cortana or Google Assistant.
We also have the privilege that all of those dialects are, presumably, dialects of English. Imagine if you’re in South Africa, with its 11 official languages. How does voice recognition work there? Does it serve all of those languages with equal facility? Now imagine you live in India, where 22 different languages give rise to approximately 720 distinct dialects.
The point is, voice-based interfaces are only as good as the data pools they’re based on, and those data pools are not going to be perfect reflections of the people who live in a society and use technology. Moreover, developing those data pools costs time and money—resources corporations are likely to allocate based on expected return on investment, i.e., who can afford the product? Profit margins are not pegged to equal access.
What about people with speech impediments? Or people whose speech alters after a medical condition? Approximately 25 to 40 percent of stroke patients lose some speech capacity—how well will Cortana or Siri work then? Ideally, any good speech interface will also have a tactile or keyboard alternative that lets a user opt out of voice as the primary interface.
How do you protect your privacy? This is a growing concern, in part because companies have been unwilling to make direct promises like “We’re not passively listening to your conversations,” and in part because they’re not transparent about how much of your conversation is collected as data, what’s stored, what’s attached to specific user accounts, how securely the data is stored, or how the data is sold to third parties.
While these devices are supposed to have cues to wake them up (“Hey, Alexa” or “Hey, Cortana,” for example), they’re fairly short on goodbyes. I have yet to hear one say, “Are you done? I’d like to stop listening.”
How much transparency will we get from our voice assistants? Any time I ask Siri, “What data do you send to Apple?” I get the response, “Who, me?” That’s not cute. Whenever I ask, “Do you save the questions I ask you?” I get the reply, “I can’t answer that.” Again, not cute. I’d like Siri to come back with a concise explanation of what it sends to Apple’s servers and how long my queries stay there.
So what? Tech interfaces that fail because they cannot recognize the speaker’s primary language, parse an accent, or handle a speech impediment are tech interfaces that do not work, period.
And any tech that obscures what it does with your assets—and make no mistake, your voice commands, queries and requests all constitute data, which is an asset—is tech that can’t be trusted.
I don’t make a point of culling life lessons from Harry Potter books, but there’s a moment in Harry Potter and the Chamber of Secrets where a character is chided, “What have I always told you? Never trust anything that can think for itself if you can’t see where it keeps its brain.” That’s a good ground rule for any smart assistant.
Who cares? People who keep trying to make smart appliances and smart homes happen, mostly because smart appliances and smart assistants are a license to collect data for f-r-e-e from users, all day every day. And data is very valuable—it can be repackaged, bundled and sold to vendors who want to target very specific market segments.
But mostly you should care. You should care that you don’t know what Siri knows about you. You should care that an interface most tech giants are pushing as the next step in computing is neither universally accessible nor transparent. You may not be able to see Cortana’s brain, but you can sure use your own to decide how much of your personal data you want to throw into a black box so you can get movie listings read back to you.
Want more? There’s a whole archive of So What, Who Cares? newsletters at tinyletter.com/lschmeiser. In addition to the news analysis, there are also fun pop culture recommendations.