The tech industry is doing its best to topple the Tower of Babel.
Last month, Skype, Microsoft’s video calling service, initiated simultaneous translation between English and Spanish speakers. Not to be outdone, Google will soon announce updates to its translation app for phones. Google Translate now offers written translation of 90 languages and the ability to hear spoken translations of a few popular languages. In the update, the app will recognize when someone is speaking a popular language and automatically turn the speech into written text.
Certainly, the technology of turning one tongue into another can still be downright terrible, or “downright herbal,” as I purportedly said on a test of Skype. The service also required a headset and worked best if a speaker paused to hear what the other person had said. The experience was a little as if two telemarketers were using walkie-talkies.
Sebastian Cuberos, on the laptop, demonstrated a newly announced Microsoft Skype program that simultaneously translated Spanish and English with his interviewer, seated at his desk, in San Francisco. Credit: Jim Wilson/The New York Times
But those complaints are churlish compared with what also seemed like a fundamental miracle: Within minutes, I was used to the process and talking freely with a Colombian man about his wife, children and life in Medellín (or “Made A,” as Skype first heard it, though it later got it right). The single biggest thing that separates us, our language, had started to disappear.
Those language mistakes are a critical part of how online products get better. The services improve with use, as so-called machine learning by computers examines outcomes and adjusts performance. It is how the online spell check feature became dependable, and how search, map directions and many other online services progress.
“The program learns as you using the conversations,” is how Sebastian Cuberos, my new friend from Colombia, put it during our Skype call. “At this time, is pretty good.” The grammar isn’t perfect, but you know what he means.
Just a few thousand people are using the service on Skype. As it learns from them, it will bring in more of the nearly 40,000 people waiting to try the Spanish-English service. Even in these early days, it raises the possibility of social studies classes with children in the United States and Mexico, or journalism where you can live chat with a family in Syria.
Google says its Translate app has been installed more than 100 million times on Android phones, most of which could receive the upgrade. “We have 500 million active users of Translate every month, across all our platforms,” said Macduff Hughes, the engineering director of Google Translate. With 80 to 90 percent of the web in just 10 languages, he added, translation becomes a critical part of learning for many people.
Automatic translation of web pages into some major languages is already a feature on Google’s Chrome browser. People using the browser can render a page that is in English into, say, Korean. There are also 140 languages in which it is possible to change things like Gmail settings.
It is possible to set your email to languages like Klingon, Pirate and Elmer Fudd. Other options, like Cherokee, are more serious, and Google aspires to eventually have these as full translation languages. Google will also soon announce a service that enables you to hold your phone up to a foreign street sign and create an automatic translation on the screen.
Microsoft’s Bing Translation engine is used on Twitter and Facebook. Facebook, which already fosters communication across the borders of language by operating the world’s largest photo sharing service, has its own translation efforts as well. Microsoft has also signed up thousands of people to a waiting list for Skype to offer other simultaneously translated languages, like Chinese and Russian.
Feeding the “corpus,” as linguistics engineers call their database of language, has become critical for some countries as well as for the sake of machine learning. Google, which uses human translation to initiate its service, recently added Kazakh after a government official went on television to ask people to help out. “People can ask very, very strongly that we put their language on the service,” Mr. Hughes said.
Still, some experts worry as machines look more deeply at individual uses of meaning through things like intonation and humor. What will it mean if, as with our search terms and our Facebook “likes,” these become fodder for advertisers and law enforcement?
“The technology is potentially magical, but the threats are real too,” said Kelly Fitzsimmons, co-founder of the Hypervoice Consortium, which researches the future of communication. “What would it mean to have a corpus of conversations after there is regime change, and a new government doesn’t like what you said?”
Currently, Ms. Fitzsimmons said, just 1 percent of consumers consent to having their data recorded overtly. That is what people do when they help machine learning of translation, however, or when they use voice-based assistants like Siri. She thinks individuals will become better at managing their own privacy, and not outsourcing it to the providers of services. But for now, all kinds of information are surrendered for convenience.
Olivier Fontana, director of product marketing in the Skype project, says conversations are broken up into separate files before people check a translation for quality. “There is no way to know who said what,” he said. “The N.S.A. couldn’t make sense of this.”
Mr. Hughes said Google was also careful about what it did with voice, in part because of potential issues around biometric security in case voice recognition replaced passwords. Besides, he said, “there is something to be said for having your translator be different – if I speak Chinese, I’d have a woman’s voice, so people know it’s a translation.”