Translating tech better
Siri, Alexa and Google keep getting better at translating spoken words to text messages and scouring huge troves of information for answers to complex questions. But that progress is mostly limited to English and the world’s other dominant languages.
A recent study found that most progress in the branch of artificial intelligence known as natural language processing has been restricted to a tiny subset of the world’s 6,500 languages.
That’s a problem since language technology is increasingly important in communication, education, medicine and elsewhere.
AI researchers who speak some of those overlooked languages are banding together to develop their own approaches to language technology, said Damián Blasi, a co-author of the study who researches linguistic diversity at the Harvard Data Science Initiative.
The study found that while Dutch and Swahili both have tens of millions of speakers, there are hundreds of scientific reports on natural language processing in the Western European language and only about 20 on the East African one.
Grassroots research coalitions such as the pan-African project known as Masakhane have been working to build better language datasets and make other improvements so that Swahili and dozens of other widely spoken languages are better understood by AI systems.