
"Smartphones Could Make Illiteracy Unimportant"

Kids today will grow up thinking a keyboard is some antediluvian tool like an abacus or butter churn, which they might encounter only because it’s nailed to a wall of a TGI Fridays.
Voice is taking over as the way we interact with technology and input words. Actually, it was supposed to have taken over a long time ago. Back in 1998, I wrote a column for USA Today saying that “speech-recognition technology looks ready to change the world,” though I also noted that when I tried to say “two turntables and a microphone” into the latest and greatest speech-recognition software, it thought I said something like “two torn labels and an ice cream cone.” Turns out that was about 20 years too soon.
But the technology works now. Microsoft, Google, Amazon, IBM, China’s Baidu and a handful of startups have been driving hard to build artificial intelligence software that can understand nuanced speech and reply coherently. Late last year, Microsoft said its speech-recognition technology had caught up to human understanding. Its “word error rate” got down to 5.9 percent, about the same as people who had transcribed the same conversation—and much better than the word error rate in any conversation between a parent and his or her teenage son.
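For the curious, "word error rate" is a simple yardstick: count the word substitutions, deletions and insertions needed to turn the machine's transcript into a human reference transcript, then divide by the length of the reference. Here is a minimal sketch in Python, using the strings from my 1998 anecdote; it illustrates the metric, not Microsoft's actual evaluation code.

```python
# Minimal sketch of word error rate (WER): the word-level edit distance
# between a reference transcript and a machine transcript, divided by the
# number of words in the reference. Illustrative only.

def word_error_rate(reference: str, hypothesis: str) -> float:
    ref, hyp = reference.split(), hypothesis.split()
    # Dynamic-programming edit distance over words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution (or match)
    return d[len(ref)][len(hyp)] / len(ref)

print(word_error_rate("two turntables and a microphone",
                      "two torn labels and an ice cream cone"))  # 1.2, i.e. 120 percent
```

Six edits against a five-word reference works out to a 120 percent error rate; at 5.9 percent, a system flubs roughly one word in seventeen.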
Google’s speech-recognition technology is learning human languages at a rapid clip. In August, it added 30 new ones, including Azerbaijani and Javanese, bringing the total to 119. IBM’s Watson technology has become well known for interacting with humans—you’ve probably seen the commercial showing Watson talking with Bob Dylan. OK, it’s an ad. But even implying that a machine can comprehend what Dylan is saying is groundbreaking.
Companies are lining up to get ready for a flood of speech-driven commerce. The main reason Amazon wants to get Alexa into your home is so you’ll get used to shopping by just speaking to the thing. In August, Google and Walmart announced a partnership that will allow users of the Google Home gadget to use speech to buy directly from the world’s biggest retailer. “We are trying to help customers shop in ways that they may have never imagined,” said Marc Lore, CEO of Walmart eCommerce U.S. (Lore joined Walmart when it bought the online retailer he founded, Jet.com.) All around retail, chatbot shopping through apps from the likes of WeChat, Kik and Hipmunk is the new hot thing. Most shopping bots today are text-based but are moving toward speech. According to ComScore, half of all searches will be voice searches by 2020—and search is most consumers’ first step toward buying.
Ever since Apple introduced Siri in 2011, we’ve come to expect our phones and apps to comprehend spoken queries, which is an underappreciated, monumental achievement after so many decades of trying. It’s like the turning point in the 1910s, when people started to expect that airplanes would actually fly. IBM demonstrated the first voice-recognition machine, called Shoebox, at the 1962 World’s Fair in Seattle. The device could understand all of 16 words—the numbers zero to nine and instructions like “plus” and “minus.” To let you know it understood you, Shoebox would do simple math and print the result.
In the 1970s, the U.S. military’s research arm, the Defense Advanced Research Projects Agency, or DARPA, funded a massive speech-recognition program that got the total number of words understood by a machine up to about 1,000—still far from practical yet roughly equivalent to our current president’s vocabulary. In the 1980s, James Baker, a professor at Carnegie Mellon University, co-founded Dragon Systems, based on his speech-recognition research. In 1990, Dragon’s first consumer dictation product cost $9,000 and mostly just frustrated users. In 1998, when I stopped in at IBM Research to check on progress in the field, speech recognition still wasn’t good enough for everyday use.
Why has the technology suddenly gotten so good? The onslaught since 2007 of mobile devices and cloud computing has allowed massive data centers operated by giants such as Google and Amazon to learn language from hundreds of billions of conversations around the world. Every time you ask something of an Alexa or a Watson, the system learns a little more about how people say stuff. Because the software can learn, no one has to punch in data about every slang word or accent. The software will keep improving, and soon it will understand our speech better than the typical human does.
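That division of labor, a cheap device doing the listening and a giant's data center doing the understanding, shows up even in hobbyist code. A minimal sketch, assuming Python and the open-source SpeechRecognition package; the audio file name is made up, and the recognition itself is handed off to Google's servers rather than done on your own machine.

```python
# Minimal sketch: the client only captures audio; a cloud speech service
# does the heavy lifting. Assumes the open-source SpeechRecognition
# package is installed; "question.wav" is a made-up example file.
import speech_recognition as sr

recognizer = sr.Recognizer()
with sr.AudioFile("question.wav") as source:
    audio = recognizer.record(source)  # read the whole clip into memory

try:
    # Ships the audio off to Google's web speech API and returns text.
    print(recognizer.recognize_google(audio))
except sr.UnknownValueError:
    print("Hmm, I'm not sure.")  # the service couldn't make sense of it
```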
And that could radically change the world. Shopping may be an early application, but the technology can even alter the way we think. A couple of generations learned to think with a keyboard and mouse—a tactile experience. “The creative process is changed,” a Dragon executive named Joel Gould told me back in 1998, anticipating changes. “You’ll have to learn to think with your mouth.” In a way, it’s taking us back to the way our brains were meant to work—the way people thought and created for thousands of years before pens and typewriters and word processors. Homer didn’t need to type to conjure up The Iliad.
In a speech-processing world, illiteracy no longer has to be a barrier to a decent life. Google is aggressively adding languages from developing nations because it sees a path to consumers it could never before touch: the 781 million adults who can’t read or write. By just speaking into a cheap phone, this swath of the population could do basic things like sign up for social services, get a bank account or at least watch cat videos.
The technology will affect things in odd, small ways too. One example: At a conference not long ago, I listened to the head of Amazon Music, Steve Boom, talk about the impact Alexa will have on the industry. New bands are starting to realize they must have a name people can pronounce, unlike MGMT or Chvrches. When I walked over to my Alexa and asked it to play “Chu-ver-ches,” it gave up and played “Pulling Mussels (From the Shell)” by Squeeze.
In fact, as good as the technology is today, it still has a lot to learn about context. I asked Alexa, “What is ‘two turntables and a microphone’?” Instead of replying with anything about Beck, she just said, “Hmm, I’m not sure.” But at least she didn’t point me to the nearest ice cream cone.

