Meta creates an AI translation system for a mostly spoken language

Meta has developed an artificial intelligence translation system that can convert spoken Hokkien into spoken English, bringing us one step closer to realizing Star Trek’s Universal Translator. CEO Mark Zuckerberg shared a video on Facebook showcasing the technology with Meta software developer Peng-Jen Chen. In it, the two speak in English and Hokkien, and Meta’s AI system translates their speech aloud.

The demo is pretty impressive, but much like Meta’s VR legs, the video was most likely edited for illustrative purposes, and the actual product isn’t quite as smooth. Translation AI is typically trained on text, with researchers feeding the system large amounts of written material to learn from.

But there are more than 3,000 languages that are primarily spoken and have no widely used writing system, which makes them difficult to incorporate into such training.

Hokkien is one such language. Spoken by more than 45 million people in mainland China, Taiwan, Malaysia, Singapore, and the Philippines, it has no official, standardized script.

“Our team first translated English or Hokkien into Mandarin text, then translated it into Hokkien or English, both with human annotators and automatically,” Meta researcher Juan Pino said. “We then added the paired sentences to the data we used to train the AI model.”

Because the language lacks a standard script, Hokkien speakers who have to write down information tend to do so phonetically, with considerable variation between writers. In addition, there is very little recorded data of Hokkien-to-English translations, and there is a shortage of professional human translators.

Of course, filtering a sentence through multiple languages can distort its meaning, as anyone who has played with Google Translate knows. Meta also collaborates with Hokkien speakers to validate translations and publishes its models, data and research as open source for other researchers to use.
