Vol. 1: Abugidas

The what3words Language Lab have worked with a number of writing scripts over the years. Most of our languages (22 out of 36 as of July 2019) are written using the Latin script, but the other 14 represent some very different ways of writing.

The Latin script, as everyone knows, is an alphabet – a group of consonants and vowels that are strung together to make words. Most words consist of a combination of consonants and vowels (unless you’re Welsh!), and there are countless possible ways to group them together.

While Europe is dominated by Latin script languages, the picture is very different elsewhere in the world. Our South Asian language package currently includes Hindi, Bengali, Tamil, Telugu and Marathi; we’re working on Urdu, Kannada, Malayalam and Gujarati, and plan to release more in the months to come. Almost all of these languages are written using an abugida – a writing script where each consonant comes with an inherent vowel sound. In other words, the Hindi consonant with a ‘k’ sound is actually ‘ka’, because the ‘a’ just naturally occurs alongside it:

क = ka

All Hindi consonants have an inherent ‘a’, regardless of which one it is, so the ‘n’ sound, for example, is written न and pronounced na. You can then string these characters together to build longer words:

कनक = kanaka (meaning: ‘gold’)

However, you can modify the vowel sound if you want, by adding an extra sign to the consonant. For instance, to change the short ‘a’ to a long ‘aa’, you add a vertical bar on the right-hand side of the consonant:

का = kaa

Other vowel signs might attach themselves to the consonant on the left, above or underneath:

कि = ki के = ke कु = ku कृ = kri

…and can then be used to build more complex words:

कलाकृति = kalaakriti (meaning: ‘artwork’)
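For readers who like to peek under the hood, here is a minimal Python sketch (our own illustration, not part of any what3words tooling) that lists the Unicode code points behind the word above, using only the standard unicodedata module. The helper name describe is made up for this example. One thing it shows: the ‘i’ sign is stored after its consonant, even though it is drawn on the left.

```python
import unicodedata

# The word from the text, spelled out as code points so every sign is visible:
# KA, LA, VOWEL SIGN AA, KA, VOWEL SIGN VOCALIC R, TA, VOWEL SIGN I
ARTWORK = "\u0915\u0932\u093e\u0915\u0943\u0924\u093f"  # कलाकृति


def describe(word):
    """Return (code point, Unicode name, is_mark) for each code point in word.

    is_mark is True for dependent signs (Unicode categories Mn/Mc), which is
    how the vowel signs in this word are classified.
    """
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch).startswith("M"))
        for ch in word
    ]


for code_point, name, is_mark in describe(ARTWORK):
    kind = "sign" if is_mark else "letter"
    print(code_point, kind, name)
```

Four of the seven code points are consonant letters and three are vowel signs; the first consonant in each pair that carries no sign keeps its inherent ‘a’.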

Many South Asian languages take this even further, including vowel signs that display on both the left and right of the consonant, as in this example from Tamil:

க = ka, and கா = kaa, but:

கொ = ko (the ‘o’ sound looks like two separate characters to an English eye, but it is a single vowel and is typed with just one keystroke). Displayed on its own, it includes what we’ve affectionately termed a ‘dotty circle’ to denote how it attaches itself to the consonant: ◌ொ
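The ‘one vowel, two visible parts’ idea can be checked with a short Python sketch. This is our own illustration using the standard unicodedata module; the variable names are invented for the example. It relies on the fact that Unicode encodes the Tamil ‘o’ sign as a single code point (U+0BCA) whose canonical decomposition is the left-hand ‘e’ sign followed by the right-hand ‘aa’ sign.

```python
import unicodedata

# கொ as it is normally stored: TAMIL LETTER KA + TAMIL VOWEL SIGN O
ko = "\u0b95\u0bca"

# NFD normalisation splits the two-part vowel sign into its visible halves.
ko_parts = unicodedata.normalize("NFD", ko)

stored_names = [unicodedata.name(ch) for ch in ko]
split_names = [unicodedata.name(ch) for ch in ko_parts]

print(len(ko), stored_names)
print(len(ko_parts), split_names)
```

So the consonant plus its vowel is two code points as typed, and three once the vowel sign is split into the piece drawn on the left and the piece drawn on the right.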

In Tamil you can also ‘kill’ the vowel completely, by adding a dot above the consonant:

க் = k, meaning you can create words like மக்கள் = makkal (meaning: ‘people’). If the word were written மககள, it would be pronounced makakala.

Other South Asian languages don’t display this ‘killer’ character but show its presence by creating consonant ‘clusters’ instead, for example in this sound in Bengali:

ষ্ট = sht. You can make out the two separate consonants ষ (sh) and ট (t). The Bengali ‘killer’ character actually has to be typed between them in order to construct the cluster, but then disappears before your very eyes as the consonants merge together. The cluster can be seen in words such as মুষ্টি (mushti, meaning ‘fist’), where the ‘u’ and ‘i’ vowels might look quite familiar from their Hindi counterparts!

On the other hand, Telugu sometimes stacks consonants on top of each other:

మజ్జిగ = majjiga (meaning: ‘buttermilk’), where you can see జ (j) appearing twice to create the double consonant. Again, we needed to type a hidden ‘killer’ character in between the two j’s.
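The hidden ‘killer’ is easy to find once you look at the underlying code points. The sketch below is our own illustration in Python, and the function name killer_positions is made up for the example. It assumes the ‘killer’ is the character Unicode calls a virama (its official name ends in ‘SIGN VIRAMA’ in each of these scripts) and reports where it sits in the Tamil, Bengali and Telugu words used above.

```python
import unicodedata

# Words from the text, spelled out as code points so nothing stays hidden.
MAKKAL = "\u0bae\u0b95\u0bcd\u0b95\u0bb3\u0bcd"   # Tamil மக்கள்
MUSHTI = "\u09ae\u09c1\u09b7\u09cd\u099f\u09bf"   # Bengali মুষ্টি
MAJJIGA = "\u0c2e\u0c1c\u0c4d\u0c1c\u0c3f\u0c17"  # Telugu మజ్జిగ


def killer_positions(word):
    """Return the indices of code points whose Unicode name ends in 'SIGN VIRAMA'."""
    return [
        i for i, ch in enumerate(word)
        if unicodedata.name(ch, "").endswith("SIGN VIRAMA")
    ]


for word in (MAKKAL, MUSHTI, MAJJIGA):
    print(word, "code points:", len(word), "killer at:", killer_positions(word))
```

In the Tamil word the killer shows up on screen as a dot, twice; in the Bengali and Telugu words it is stored between the two consonants in exactly the same way, but the font merges or stacks them instead of drawing it.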

Overall, working on these languages (and many others that work in similar ways – there are in fact more abugidas than alphabets in the world) has been a huge learning experience, one that has terrified and thrilled in equal measure. Imagine typing a word and suddenly, through nothing you were aware of doing, characters disappear or climb inside one another to create entirely new (and rather beautiful) signs! The joy of an abugida…


NB: some aspects of these languages may have been simplified in order to better illustrate the point being made. No disrespect is intended!