Arabic is spoken by approximately 400 million people throughout the world. It is therefore one of the most widespread languages, yet, curiously enough, only 1% of online content is in Arabic. This points to potential digital exclusion for many people living in the Middle East and North Africa.
In light of this, several open source AI models specific to the Arabic language have recently been developed. Dedicated work is needed to build reliable models, because they must understand and respond in approximately 30 major dialects, plus their local and cultural variants.
This work, in addition to representing an important step in the digitalization of Arabic-speaking countries, could stimulate the technological evolution of artificial intelligence globally. Let's see why, and how the topic can be of interest to us at work.
Jais: The open source AI for the Arabic language
Jais is one of the most advanced language models developed specifically for Arabic. Born from the collaboration between the Mohamed bin Zayed University of Artificial Intelligence (MBZUAI), G42 and Cerebras Systems, it uses 13 billion parameters, enabling a deep understanding of the peculiarities of the Arabic language, including its dialectal variants. Jais' ability to understand both modern Arabic and regional dialects sets it apart from general-purpose AI models, making it a valuable tool for public and private institutions in the region.
Its availability on platforms such as Hugging Face allows developers, researchers and academics to use the model to create custom applications. With this many parameters, Jais is a large open source AI model, but it is not yet comparable in size to the largest ones.
There are in fact open source AI models such as Falcon 180B, developed in the United Arab Emirates, which has 180 billion parameters. Let's take a closer look.
Falcon 180B: The Arab answer to GPT and BLOOM
Falcon 180B is one of the largest open source AI models ever released. It was developed by the Technology Innovation Institute (TII), based in the United Arab Emirates, as part of its efforts to boost research on artificial intelligence in the region. It is a milestone for open source AI in the Arab world, then. However, unlike Jais, it is less specific to Arabic.
Falcon 180B is indeed more powerful and supports many more languages, but it is not suited to accurately handling the specificity of Arabic dialects and variants.
ALLaM and the collaboration with IBM
Another model that opens the Arab world to the global artificial intelligence market is ALLaM. This model is interesting because it is developed by the Saudi Data and Artificial Intelligence Authority (SDAIA) in collaboration with IBM. It is therefore available on the watsonx.ai platform and, in addition to representing a further step towards the democratization of artificial intelligence in the Arabic language, it brings the Arab world one step closer to the world's largest AI companies.
This model supports various applications, from government agencies to private companies. In addition, it offers tools for training and customizing the model according to specific needs.
ALLaM is available in both a 13 billion parameter and a 7 billion parameter version.
How can these Arab open source AIs be of interest to Italian professionals?
For Italian professionals who are experts in software development, machine learning algorithms and data science, these models represent a unique opportunity to contribute to the development of innovative technologies in emerging markets.
From the point of view of digital marketing, these AIs can be used to improve understanding of local preferences and personalize advertising campaigns. Several Arabic-speaking countries are, in fact, undergoing strong technological expansion, so there are investments that open up new markets and collaborations with local or multinational startups.
Italian professionals could also leverage their experience in cybersecurity and cloud computing in these markets.