
Arabizi

Arabizi is an informal way of writing spoken Arabic using Latin letters and numbers (such as '3' for ع and '7' for ح), widely used in the Gulf Cooperation Council (GCC) countries and Egypt for texting, WhatsApp, and social media instead of Arabic script.

Arabizi emerged because early mobile phones and keyboards did not support Arabic script well, and the habit stuck: many young and urban users in the GCC and Egypt now default to Arabizi in casual digital messaging, even though Arabic keyboards are widely available today. Numbers stand in for Arabic letters that have no close Latin equivalent, chosen for visual resemblance: '3' for ع (ain), '7' for ح (ha), '9' for ص (sad). The rest is transliterated phonetically, so the same word can be spelled several different ways depending on the writer's dialect and personal habit (e.g. 'izzayak', 'ezayak', or 'ezayk' for 'how are you' in Egyptian Arabic).
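
The digit substitutions described above can be sketched in Python. This is a minimal illustration, not a full transliterator: it covers only the three digit-letter pairs named in this entry, and the function name replace_arabizi_digits and the "digit must touch a Latin letter" rule are assumptions made for the example. The rule keeps standalone numbers (prices, times, order IDs) from being rewritten, at the cost of mishandling strings such as "3pm".

```python
import re

# Only the three pairs mentioned in this entry; real Arabizi uses more.
DIGIT_TO_ARABIC = {"3": "ع", "7": "ح", "9": "ص"}

# Heuristic (an assumption for this sketch): a digit is a letter stand-in
# only when it is directly next to a Latin letter.
_DIGIT_IN_WORD = re.compile(r"(?<=[A-Za-z])[379]|[379](?=[A-Za-z])")


def replace_arabizi_digits(text: str) -> str:
    """Swap in-word 3/7/9 for the Arabic letters they stand for."""
    return _DIGIT_IN_WORD.sub(lambda m: DIGIT_TO_ARABIC[m.group()], text)


if __name__ == "__main__":
    # In-word digits are replaced; the standalone number is left untouched.
    print(replace_arabizi_digits("7abibi ma3a order 379"))
```

The output is a hybrid of Arabic and Latin characters, so it is useful only as a first pass; the Latin letters still need phonetic transliteration.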

For a business AI agent, Arabizi is a real support burden rather than an edge case. A WhatsApp AI agent serving GCC or Egyptian customers will routinely receive messages that mix Arabizi, English, Modern Standard Arabic (MSA), and dialect in a single sentence, and a model trained only on formal Arabic text will frequently misread such messages or fail to respond to them. Handling Arabizi well requires either a language model with genuine exposure to Arabizi in training or a normalization step that maps common Arabizi patterns back to Arabic script before intent detection. Generic chatbot platforms that are not built Arabic-first routinely overlook this requirement.
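
One possible shape for such a normalization step is sketched below in Python. It is an illustration under stated assumptions, not a production normalizer: the names skeleton, normalize_message, and SKELETON_TO_ARABIC are hypothetical, and the lexicon holds a single entry built from the 'izzayak' / 'ezayak' / 'ezayk' example. The idea is to reduce each Latin token to a consonant skeleton (collapse doubled letters, drop vowels) so that spelling variants land on the same key, then replace known keys with Arabic script and pass everything else through unchanged.

```python
import re

VOWELS = set("aeiou")

# Hypothetical lexicon: consonant skeleton -> Arabic-script form.
# One entry only, taken from the spelling variants cited in this entry.
SKELETON_TO_ARABIC = {"zyk": "إزيك"}

_TOKEN = re.compile(r"[A-Za-z0-9]+")


def skeleton(token: str) -> str:
    """Lowercase, collapse doubled characters, then drop Latin vowels."""
    collapsed = re.sub(r"(.)\1+", r"\1", token.lower())
    return "".join(ch for ch in collapsed if ch not in VOWELS)


def normalize_message(message: str) -> str:
    """Replace known Arabizi tokens with Arabic script; keep the rest as is."""
    def swap(match: re.Match) -> str:
        token = match.group()
        return SKELETON_TO_ARABIC.get(skeleton(token), token)

    return _TOKEN.sub(swap, message)


if __name__ == "__main__":
    for variant in ("izzayak", "ezayak", "ezayk"):
        # All three spellings reduce to the same skeleton key.
        print(variant, "->", skeleton(variant))
    print(normalize_message("Ezayak, where is my order?"))
```

Digits such as 3, 7, and 9 are kept in the skeleton, so they act as consonants. Skeleton matching will collide for unrelated words that share consonants, so a real system would need a much larger lexicon, dialect-aware rules, or a trained transliteration model; the normalized text would then be handed to whatever intent detector the agent already uses.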
