Thanks to advances in speech and natural language processing, there is hope that one day you will be able to ask your virtual assistant what the best salads are. In the meantime, it is possible to ask your home device to play a song or to open on voice command, a feature that is already available on many devices.
If you speak Moroccan, Algerian, Egyptian, Sudanese, or any other dialect of Arabic, which vary greatly from region to region and some of which are mutually unintelligible, that is another matter. If your native language is Arabic, Finnish, Mongolian, Navajo, or any other language with a high degree of morphological complexity, you may feel left out.
This complexity sparked Ahmed Ali’s interest in finding a solution. He is a senior engineer in the Arabic Language Technologies team at the Qatar Computing Research Institute (QCRI), part of Qatar Foundation’s Hamad Bin Khalifa University, and the founder of ArabicSpeech, “a platform that exists to benefit Arabic speech science and speech technologies.”
Ali became fascinated by the idea of talking to cars, appliances, and gadgets many years ago while at IBM. “Can we build a machine that understands different dialects: an Egyptian pediatrician to automate a prescription, a Syrian teacher to help children get the most out of their lessons, or a Moroccan chef describing the best way to make couscous?” he says. However, the algorithms that power those machines cannot tell apart the roughly 30 varieties of Arabic, let alone make sense of them. Today, most speech recognition tools work only in English and a handful of other languages.
The coronavirus pandemic has further strengthened the reliance on voice technologies, as natural language processing technologies have helped people comply with stay-at-home guidelines and physical distancing measures. However, while we have been using voice commands to help with e-commerce purchases and manage our households, the future holds many more applications.
Millions of people around the world use massive open online courses (MOOCs) for their open access and unlimited participation. Speech recognition is one of the key features in MOOCs, where students can search within specific parts of a course’s spoken content and enable translation through subtitles. Speech technology makes it possible to digitize lectures so that spoken words are displayed as text in university classrooms.
According to a recent article in Speech Technology magazine, the voice and speech recognition market is expected to reach $26.8 billion by 2025, as millions of consumers and companies around the world come to rely on voice bots not only to interact with their appliances or vehicles but also to improve customer service, drive health-care innovation, and improve accessibility and inclusion for those with hearing, speech, or motor impairments.
In a 2019 survey, Capgemini predicted that by 2022, more than two-thirds of consumers would choose voice assistants over visits to stores or bank branches; a share that could rise sharply, given the home-based, physically distanced life and commerce that the pandemic has imposed on the world for more than a year and a half.
However, these tools fail to reach vast parts of the world. For those 30 varieties of Arabic and their millions of speakers, this is a substantial missed opportunity.
Arabic for machines
English- or French-speaking voice bots are far from perfect. However, training machines to understand Arabic is particularly difficult for a number of reasons. Here are three well-known challenges:
- Lack of diacritics. Arabic dialects are vernacular, that is, primarily spoken. Most of the available text is non-diacritized, meaning that it lacks marks such as the acute (´) or grave (`) that indicate the sound values of letters. Therefore, it is difficult to determine where the vowels go.
- Lack of resources. There is a shortage of labeled data for the different Arabic dialects. Collectively, they lack standardized orthographic rules that dictate how to write the language, including norms for spelling, hyphenation, word breaks, and emphasis. These resources are crucial for training computer models, and their scarcity has held back the development of Arabic speech recognition.
- Morphological complexity. Arabic speakers engage in a lot of code switching. For example, in the parts of North Africa colonized by the French (Morocco, Algeria, and Tunisia), the dialects include many borrowed French words. As a result, there is a large number of so-called out-of-vocabulary words, which speech recognition technologies cannot understand because the words are not Arabic.
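To make the diacritics problem concrete, here is a minimal Python sketch. It is an illustration only, not part of any QCRI system; it assumes the Arabic short-vowel marks are the Unicode code points U+064B through U+0652, and the helper name `strip_diacritics` is hypothetical. Two differently vocalized words, kataba ("he wrote") and kutub ("books"), collapse to the same three letters once the marks are removed, which is how most Arabic text is actually written.

```python
# Arabic short-vowel marks (tanwin, fatha, damma, kasra, shadda, sukun),
# assumed here to be the code points U+064B through U+0652.
HARAKAT = {chr(cp) for cp in range(0x064B, 0x0653)}


def strip_diacritics(word: str) -> str:
    """Return the word with its short-vowel marks removed (hypothetical helper)."""
    return "".join(ch for ch in word if ch not in HARAKAT)


# kaf + fatha, ta + fatha, ba + fatha: kataba, "he wrote"
kataba = "\u0643\u064E\u062A\u064E\u0628\u064E"
# kaf + damma, ta + damma, ba: kutub, "books"
kutub = "\u0643\u064F\u062A\u064F\u0628"

# Both words reduce to the same bare consonant skeleton (kaf, ta, ba),
# so a model that sees only undiacritized text cannot tell them apart
# without additional context.
same_skeleton = strip_diacritics(kataba) == strip_diacritics(kutub)
print(same_skeleton)
```

A recognizer trained on such text has to recover the missing vowels from context, which is part of why diacritized training data is so valuable.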
“But the field is moving fast,” says Ali. It is a collaborative effort among many researchers to make it move even faster. Ali’s Arabic Language Technologies team is leading the ArabicSpeech project to bring together Arabic translations and the dialects native to each region. For example, Arabic can be divided into four regional dialects: North African, Egyptian, Gulf, and Levantine. However, given that dialects do not respect borders, the division can become as fine-grained as one dialect per city; for example, a native Egyptian speaker can distinguish the Alexandrian dialect from that of a fellow citizen from Aswan (about 1,000 km away on the map).
Building a professional future for all
At present, machines are almost as accurate as human transcribers, thanks in large part to advances in deep neural networks, a subfield of machine learning that relies on algorithms inspired by how the human brain works, biologically and functionally. Until recently, however, speech recognition has been somewhat pieced together. The technology has a history of relying on separate modules for acoustic modeling, pronunciation lexicons, and language modeling; all modules that need to be trained separately. More recently, researchers have been training models that convert acoustic features directly to text, which makes it possible to optimize all components for the final task.
Despite this progress, Ali still cannot give voice commands to most devices in his native Arabic. “It’s 2021, and I can’t communicate with many machines in my dialect,” he says. “I mean, I now have a device that can understand my English, but machine recognition of multi-dialect Arabic speech has not happened yet.”
Making this happen is the focus of Ali’s work, which has culminated in the first transformer for Arabic speech recognition and its dialects; one that has performed very well so far. Called the QCRI Advanced Transcription System, the technology is used by the broadcasters Al-Jazeera, DW, and the BBC to transcribe online content.
There are a few reasons why Ali and his team have been successful in building these speech engines right now. In particular, he says, “There is a need for language resources. We need to build up the resources to be able to train the model.” As Ali says, “We have good infrastructure, good modules, and we have data that represents reality.”
Researchers from QCRI and Kanari AI recently developed models that can achieve human parity in Arabic broadcast news. The system demonstrates the impact of subtitling Al-Jazeera’s daily reports. Although the human word error rate (WER) is estimated at approximately 5.6%, the study showed that the human WER for Arabic is considerably higher and can reach 10%, owing to the morphological complexity of the language and the lack of standard orthographic rules in dialectal Arabic. Thanks to recent advances in deep learning and end-to-end architectures, the Arabic speech recognition engine is capable of outperforming native speakers on broadcast news.
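For readers unfamiliar with the metric, word error rate is conventionally computed as the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and the system's output, divided by the number of words in the reference. The Python sketch below illustrates that standard definition; the function name `word_error_rate` is illustrative, and this is not the evaluation code used in the study.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j words of the hypothesis.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words so far
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return prev[-1] / len(ref)


# One wrong word in a ten-word reference gives a WER of 10%.
reference = "one two three four five six seven eight nine ten"
hypothesis = "one two three four five six seven eight nine eleven"
wer = word_error_rate(reference, hypothesis)
print(f"{wer:.0%}")
```

By this measure, a 10% WER means roughly one word in ten is transcribed incorrectly, which gives a sense of how hard dialectal Arabic is even for human transcribers.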
Although Modern Standard Arabic speech recognition seems to work well, researchers from QCRI and Kanari AI are busy testing the limits of dialectal processing and achieving strong results. Since no one speaks Modern Standard Arabic at home, attention to dialect is what we need for our voice assistants to understand us.
This content was written by Qatar Computing Research Institute, Hamad Bin Khalifa University, a member of Qatar Foundation. It was not written by MIT Technology Review’s editorial staff.