Why is a language underrepresented online?

You can use these questions to help kick off this discussion thread:

  • Is language underrepresented online because of challenges accessing the internet?  The lack of a functioning keyboard?  Is there a lack of terminology in the language to allow users to participate online?
  • Is language underrepresented online because of non-technology related challenges such as speakers’ lack of confidence with the language (in general and in written form)?  Or, challenges around understanding different dialects?
  • From your experience, what are the reasons behind the underrepresentation of languages online?

Share your experiences, thoughts, ideas and questions by adding a comment below or replying to an existing comment!

Some challenges for our languages on line

A language is everything to a person. Beyond its role in communication, it conveys knowledge, skills, and traditions, and expresses cultural identity.

Therefore, no language should be neglected in a globalized world where ICT plays a very important role in every field. In spite of some efforts here and there, the report shows that, at the moment, African languages are especially under-represented online.

Overall, this scarcity of our languages online can be explained by the great difficulty of Internet connectivity in our countries.

Computer and Internet connection challenges:

  • Not all citizens can afford a computer, and Internet access through smartphones, which would make things easier, is extremely expensive in our countries.
  • Not all citizens can afford a home Internet connection, and an hour of Internet access in a cybercafé is relatively expensive when the work means spending hours and hours there.
  • Because of these difficulties, the Internet is for the moment seen as something for those with substantial financial means.

The low ICT skill level of those who should be working on the languages:

Most of those who master these languages, who speak and write them, belong to a generation that predates the development of ICT. Many do not know how to use computers or are unfamiliar with many Internet concepts, and most, either for lack of time or because they do not want to become learners again, are no longer willing to make an effort in this direction. In my view, this should not be so, because a Bambara proverb says: «Mɔgɔ tɛ kɔrɔ kalan ma, mɔgɔ tɛ kodɔn kalan ma, don o don, tulo bɛ taa kalanso» ("We are never too old to learn, we are never too educated to learn, because every day the ear goes to school."). These older generations, who master our languages and know the nuances of their meanings, need to attend the school of technology, accompanied by the younger generations, to steer our languages onto the road of ICT.

Difficulties working with the keyboard in our languages:

Keyboards for writing our languages exist today, but they are difficult to use for those who have to work with them. There are keyboards that can be installed and virtual keyboards that make it possible to write in our languages online. But many people have not mastered how to use them.

For example, with the keyboard I have on my computer, it is necessary to switch between French and Bambara. It is also necessary to know that q = ɲ, ù = ŋ, µ = ɔ, $ = ɛ (the 4 special characters in Bambara). (If these characters do not display properly on your screen, please use Firefox, Chrome or Opera.) Training users is an important solution to this keyboard problem.
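To make the key correspondences above concrete, here is a minimal Python sketch of that mapping. It is only an illustration: a real keyboard is installed as an operating-system layout or a virtual keyboard, and the names `BAMBARA_KEYS` and `remap` are made up for this example.

```python
# Key-to-character correspondences quoted in the post above
# (keys on a French keyboard -> Bambara special characters).
# The names BAMBARA_KEYS and remap are hypothetical, chosen for this sketch.
BAMBARA_KEYS = {"q": "ɲ", "ù": "ŋ", "µ": "ɔ", "$": "ɛ"}


def remap(typed: str, mapping: dict = BAMBARA_KEYS) -> str:
    """Replace each mapped key in `typed` with its Bambara character.

    Characters that are not in the mapping are left unchanged.
    """
    return typed.translate(str.maketrans(mapping))


if __name__ == "__main__":
    # Typing "t$" while the mapping is active yields the word "tɛ".
    print(remap("t$"))
```

While such a mapping is active, the four remapped keys no longer produce their French characters, which is one reason it is necessary to switch between the two layouts.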

The influence of the developed languages on ours:

Many people are not interested in our languages in this respect because they have already mastered foreign languages. Some even have a complex about expressing themselves and working in their mother tongue, because they find the foreign languages they already use sufficient.

The lack of confidence in the richness of our languages' terminology:

One reason for many people's resistance in this area lies in terminology. During my work developing and promoting my mother tongue online, I have heard several times: «What you want is not possible, because our languages are poor in terminology.» This is an idea I have never shared, because a language does not develop by itself; it is up to its speakers to develop it, and why not adapt our languages to present-day realities?

For example, we can say: telefɔni kɔfɛcikanlaselan (to denote a telephone answering machine).

In this way, we can create and suggest new words to be discussed in online forums, which also helps enrich our languages.

Lack of a real policy for developing and promoting our languages online:

To fell a tree properly, it is necessary to attack its root. Allow me to call the cat by its name. I think the solution to all of this lies in a real political commitment in this field. There are too many conferences and workshops about our languages and ICT, but implementation remains a problem. When the governments of the various countries join hands to set up a real system for promoting our languages online, this will settle everything.

In spite of all these difficulties, I notice that individuals are making efforts here and there. Today, my only source of motivation is that I work, with my quite limited knowledge, because I love my mother tongue and want it to be developed and promoted online on blogs and social networks like Facebook, Twitter... and why not give other people the opportunity to learn our languages online via English and French?

Through my mother tongue, I want to reassert my cultural identity and the traditions and culture of my country, and to create spaces that allow my parents, who never went to school, to get information online and to exchange knowledge and experience with others. For that purpose, I think I do not have to wait until all the means are in place to begin. The means will join us along the way and make things easier.

I cannot finish without thanking all those who, morally, materially, and financially, have helped and continue to help and encourage me in this activity, which I began with my quite limited knowledge of the subject.

I am especially grateful to those on other continents who take an interest in our languages.

Hi Boukary,

Thank you for your comments.  I am seeing these issues come up in my research into the Irish and Sámi communities online.  Two of the issues you raise are particularly relevant - they are those that relate to the sense that a majority or dominant language is sufficient for communication purposes, and the requirement for policy development.

The case of the Irish language is particularly interesting here.  In the first instance, a process of language shift has been occurring in Ireland over the past 800 years.  As a result, the Irish language has lost out to the English language in terms of its functionality in society when it comes to culture, economics and politics.  Irish language speakers need to be particularly persistent to establish their language as a viable means of communication, in both the private and public spheres.

On the other hand, there has been a policy of compulsory education in Irish in Ireland for the past ninety years.  As a result of this, over 1.6 million Irish people claim to have some competence speaking Irish.  This might sound like a success, but the most recently available statistics show that only 83,000 use the language on a daily basis outside of the education system.  This goes to show the extent to which a process of language shift has occurred in Ireland.

It also demonstrates that government policy alone is insufficient for a language to survive.  What is necessary is that people persist in speaking and using the language in as many social contexts as possible.  This is why it is important to use endangered languages online and particularly in social media.  It is important to communicate in these languages and for this communication to be visible to other speakers.  This encourages the use of the language and can expand the social contexts in which language use is considered relevant.



Use of the language in social settings

I think you've hit the nail on the head in terms of what's necessary to proliferate the language in "the real world." The thing that seems to have been successful in North Carolina Cherokee is a multi-pronged approach: We are aiming to marshal the resources of the elder speakers via monthly speakers' gatherings at a community center, having them not only come together to develop words, but also just to chat and enjoy each other's company.  We have also built a "language nest" called the New Kituwah Academy which begins immersing students in Cherokee as early as 6 months old.  This creates a generation of young speakers who have the potential to become creative forces in the language later in life.  They are able to learn from the elders while developing new words & places for language use.  If I may gush for a minute, it's astounding to watch them play on the playground and in the classroom and do it in Cherokee (ᏣᎳᎩ).  Currently what seems to be missing is that bridge generation - the speakers who did not learn the language from their parents but who want their children to be able to speak it.  I think if we can establish good second-language learning programs and have places in the community where the language is acknowledged to be preferred, we can bring second-language learners into the fold as well.

As far as online content, I think generating it has to have a clear motivation behind it and an address of the audience for whom it is intended.  Because we have a need for our second graders to learn to read chapter books in Cherokee, the New Kituwah Academy has begun to take steps to achieve that goal.  Similarly, the Cherokee Nation in Oklahoma has anticipated the need (/desire) of their fifth and sixth graders to text/blog/video-chat/search Google in ᏣᎳᎩ, and has taken steps to achieve that.  So now those are tools for content creation.  As far as what kind of content, I think the most engaging content that people consume on the internet is entertainment and social media.  Entertainment is an area that definitely needs to be improved upon, but I think it can be done if you can motivate the right people.  Roy Boney & Joseph Erb in the Cherokee Nation have been doing amazing projects with videos and animation which hopefully will progress as the immersion school children mature.  As far as social media, a community translation project is underway for Facebook, and several groups exist to promote the use of the language online.  

While all of these projects likely seem overwhelming, it is incredibly important to remember that Rome wasn't built in a day, and anything you do - any small effort, can be a beginning.  What I think is really important though, is not to put the cart before the horse - you have to have enough speakers and/or learners in order to generate content.  Really, because of the collaborative nature of the internet, anyone's offering can be thought of as a starting point and expanded upon.  To take it online though, you have to have a firm & solid base in the community (/communities)

mobile Cherokee

The Cherokee Nation has been doing a remarkable job of getting their language resources out there. The addition of the keyboard in iOS devices is particularly impressive. I believe that this is one of the most important steps. The ability to text and email on the go is major. As long as a language is difficult to use on devices, young people are less likely to embrace it.

Moving beyond the domain of education

Duit duit! Your point about the use of Irish beyond educational contexts resonates with me. It is only recently that ‘ōlelo Hawai‘i, the Hawaiian language, has started to make strides outside of the classroom. I believe much of this has to do with the fairly recent (1999) graduation of our first immersion students and their entry into the workforce. We've seen a modest increase in Hawaiian language media (TV and radio mostly), and though I have no hard figures or research to back it, I've encountered more people using Hawaiian within the family and groups of friends than I have in the past. Previously, when I encountered someone using Hawaiian on our island, by and large, I knew them or their family, so encountering more individuals with whom I am not acquainted is a good sign.

Education through the medium of Hawaiian was made illegal in 1898, and that law remained on the books until the mid 1980s. Hawaiian immersion education was then established and grew quickly. The growth of our immersion schools, about 12 preschools and a similar number of K-12 sites, has flattened out, and about 2,000 students receive their education in Hawaiian. That number has been static for about a decade.

Our College of Hawaiian Language, Ka Haka ‘Ula O Ke‘elikōlani, offers several degrees: a B.A. in Hawaiian Language, a B.A. in Linguistics, a graduate certificate in Indigenous Teacher Education, an M.A. in Hawaiian Language and Literature, an M.A. in Indigenous Language and Culture Education, and a Ph.D. in Hawaiian and Indigenous Language and Culture Revitalization. All upper-division undergraduate classes and most graduate classes (except for some in the Ph.D. program) are conducted through the Hawaiian language.

While Hawaiian is an official language of the state government here, we frequently encounter resistance to strengthening its use. Ballots are available in many immigrant languages, but not Hawaiian. The reasoning is that there are no Hawaiian speaking monoglots - all speakers of Hawaiian also speak English. Other initiatives, such as using Hawaiian in state departmental letterheads, have failed because of our state's bean counters claiming it will cost too much.

The Association of Hawaiian Civic Clubs recently passed a resolution that urges the state to require students to take the equivalent of one full year of Hawaiian language instruction in order to graduate from high school. Most of my colleagues and I are wary of such an action. We barely have enough teachers to support the level of instruction that we have, and such a move could be devastating to our immersion program. Also, I spent a month in Ireland, three weeks of it at Oideas Gael, learning Gaeilge and interviewing people about the language. I was shocked at how much ill-will there was toward the language, including among teachers who were there simply because their jobs required it. I realize the history of compulsory Irish and what has happened here in Hawai‘i are quite different, but by and large the language does have widespread support, and I would question any move toward compulsory Hawaiian language education, feeling that the money and resources are better spent (at least at this time) on those who choose to make it the language of their homes and workplaces.

For an interesting juxtaposition of languages, here is my blog from our trip to Eire in 2002, though it is written in Hawaiian:


Slán go fóill!


Hi Boukary and Niamh,

Thank you for your comments and for raising the question of one dominant language being sufficient for communication purposes. While for obvious reasons it facilitates communication between groups who may otherwise have different languages and gives them a common language to communicate in, completely ignoring under-represented languages and not providing the resources for them to exist online risks alienating groups. While one may learn another language easily, especially if it is the dominant one in the area in which one lives and works, not being able to communicate and read in one's mother tongue can have detrimental effects - not least of which is forgetting the language or getting disconnected from one's roots.

Going back to your comment Boukary about the keyboards not being in other languages, it is so easy nowadays to simply change the language on our computers and use cheap covers made with the alphabet or symbols of the local language. These exist for a number of languages, as far as I am aware, but how might we be able to make them available to more people? Such a small change could facilitate the creation of documents, blogs and websites of all sorts, which could then be published online instantly. And as people see that there are pages up in their language, they may feel closer to the language and their community, no matter where they are in the world.

Going around keyboards

ana_svoren wrote:

Hi Boukary and Niamh,

Going back to your comment Boukary about the keyboards not being in other languages, it is so easy nowadays to simply change the language on our computers and use cheap covers made with the alphabet or symbols of the local language. These exist for a number of languages, as far as I am aware, but how might we be able to make them available to more people? Such a small change could facilitate the creation of documents, blogs and websites of all sorts, which could then be published online instantly. And as people see that there are pages up in their language, they may feel closer to the language and their community, no matter where they are in the world.

I agree with this observation. Keyboards can be a physical/mechanical barrier, but there are many ways of bridging the gap. For Songhay we proposed different layouts to members of our mailing list and put one version at a time on the website (www.songhay.org -> "Claviers"). We got some feedback and suggestions for a future layout. Users can download this for a start while we consider other steps: 1) to have keyboards with character stickers as an affordable intermediary solution, 2) to have keyboards with original keys, ideally through cross-border collaboration between countries with the same language(s).

Hi Niamh, 

It would be great to read your work. I'm working on the use and potential of social media in conjunction with traditional media such as broadcasting, in terms of how it can amplify the discussions/content.

Have you seen any examples for Irish / Sami, where the traditional media have 'bought into' social media and made those languages more visible?

Also do you think the internet makes bilingual/multilingual communication too black and white? I find that language use tends to be one or the other with little of the more natural, oral code-switching and mixing going on. The mainly text based nature of social media seems to limit the way that people use their natural bilingualism and the clash of languages (especially when non-speakers see minority language use 'impinging' on their Facebook / Twitter stream) is often a place for tension, and therefore a place for speakers without confidence in their written minority language to perhaps defer.

The importance of content

When blogs were first invented, the impulse of many early bloggers in all parts of the world was to write in English because this was the surest way to reach an audience. Even on Global Voices, our goal was once to discover "bridge bloggers" who wrote primarily in English for an international audience. In the past 6-7 years this has changed enormously, to the point where many more bloggers outside of English speaking countries are writing in their own languages and joining more local conversations. Accordingly, our focus at Global Voices has shifted to translation (in multiple directions). I'm always struck by how quickly things change online, from the adoption of online networks like Facebook in new language communities to the explosion of citizen media in languages that were previously invisible on the internet.

Underrepresented languages face enormous challenges in the "real world" that are also mirrored online. But it also seems like there could be a huge opportunity to bypass some of the social and political barriers that maybe prohibit their use in daily life or national media. Nonetheless, people use languages to communicate something that is useful or entertaining to them. If they don't find this online in one language, then they will just switch to another. Aside from technical barriers, that is probably the main thing that needs to be addressed for underrepresented languages to spread online: There has to be unique, relevant and entertaining content online in underrepresented languages for people to begin naturally communicating in them. Most people will simply not do it out of principle.

This may present a sort of chicken-and-egg problem (which came first?) but there is always hope that with the right interventions and gradual growth of internet access things could change very quickly.

I think the issue of content is immensely important. The idea of having interesting, engaging material online, that is not focused specifically on the language itself, is vital to capturing the interest especially of young technology users. In the Mayan languages, we have lots of pedagogical stuff online (teaching resources, dictionaries, language lessons), but I don't think this has been of much help or interest in terms of building a broad-base of speakers of these languages working and interacting online in their languages.

Kara Andrade and Erik Sundelof in Guatemala have been working to create citizen journalism environments, which capture some of the best aspects  of social networking together with encouraging interesting online content (www.hablaguate.com), and I think this moves in the right direction, for us at least.

At the same time the issue Solana raises of content curation is extremely important and something we struggle with. In addition to creating online spaces for the use of indigenous languages, it is also important that those voices reach a wide audience - which requires editorial and translation capacity on the back end, something that we struggle to find the time (and funding) for.

What kinds of subjects do Welsh speakers discuss online?

>> There has to be unique, relevant and entertaining content online in underrepresented languages for people to begin naturally communicating in them. Most people will simply not do it out of principle.

Indeed, but much of the "unique, relevant and entertaining content" that Welsh speakers tend to read is based either around the subject of Wales, Welsh politics, or Welsh language culture. Content about things other than the Welsh language is nearly always seen as pointless to read in Welsh, when it is probable that somebody, probably 100 times as many people, is doing it in much more depth and with much more funding behind it in English. So we get closed in an inward looking loop, where much of what is discussed in depth in Welsh media is...Welsh language issues and Welsh culture.

So, for instance, I've been tracking the trending topics on the Welsh language tweet aggregator Umap Cymraeg for the last six months and more often than not, news of national or global importance is not discussed in Welsh (on Twitter at least, though it would seem that research on blogs and Facebook groups would back this up). The latest Welsh language television programme nearly always gets more discussion. I'm hoping to gather the evidence for this, although it may just indicate the ways that people use Twitter bilingually, and that people wish to join discussions about global events on a global level and therefore require a language that can reach globally. 

Which brings us to how people perceive their audience when publishing on the web too. I think that many see it as being a massive potential audience, whereas in reality most of their readers will be from their physical networks and would probably prefer to receive their communication in their minority language. So there's this huge psychological barrier to publishing in a language where there is always a perceived potential for more eyeballs on work in the majority language (though this applies to languages such as French also in some respects!). 

Talking about the language in the language

...does seem to be a bit of a problem.  If you only have that as your common focal point, it does seem like kind of an artificial exercise.  I guess you'd have to find things that people have a common interest in & figure out ways to discuss that, but then again as you say, those things are probably being discussed in more depth in other languages.  Maybe you could take something that's being talked about in a more widely used language and write opinions about it?  By writing it in Welsh it might open the door for some discussion on who holds the "Welsh perspective" on whatever the issue happens to be, and you might get some mileage out of that.  Are people able to write editorials in Welsh?  I'm just thinking that would probably open the door for more discussion among speakers. 

Welsh editorials

Most of the international news reporting that takes place in Welsh comes through either the BBC in television news reports (very factual, no editorialising) or the online daily news site http://Golwg360.com, whose stories are mostly translations from the wires. We have no daily newspaper, the weeklies have weak reach, and even then discussion of world events is limited. So this deficiency in the mainstream media probably correlates with the lack of discussion in Welsh on other online civic media.

Welsh tweets/blogs

Hi Rhodri,

First, thanks for joining in the discussion, we're glad to have you here.

I really like your idea of following trending topics in Welsh; I'll be very curious to hear any conclusions you're able to draw from that.    Also, I've crawled every Welsh-language blog post on blogspot.com for Indigenous Blogs (19000 of them) - it might be interesting to try something similar with that medium and compare with the results for tweets.  We could maybe train a statistical classifier to label blog posts as being "about Wales/Welsh" or not, to automate things a bit.
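To give a rough idea of what such a classifier could look like, here is a minimal sketch of a naive Bayes text classifier in plain Python. Everything in it is an assumption for illustration: the class name, the label names, and especially the four training "posts", which are invented placeholder word lists and not real blog data. A real experiment would need hand-labelled Welsh posts and proper tokenization.

```python
import math
from collections import Counter


def tokenize(text):
    """Very naive tokenizer: lowercase and split on whitespace."""
    return text.lower().split()


class NaiveBayes:
    """Multinomial naive Bayes with add-one smoothing (illustrative sketch)."""

    def __init__(self):
        self.word_counts = {}        # label -> Counter of word frequencies
        self.doc_counts = Counter()  # label -> number of training posts
        self.vocab = set()

    def train(self, labelled_posts):
        for text, label in labelled_posts:
            self.doc_counts[label] += 1
            counts = self.word_counts.setdefault(label, Counter())
            for word in tokenize(text):
                counts[word] += 1
                self.vocab.add(word)

    def classify(self, text):
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, counts in self.word_counts.items():
            # log prior + sum of smoothed log likelihoods
            score = math.log(self.doc_counts[label] / total_docs)
            denom = sum(counts.values()) + len(self.vocab)
            for word in tokenize(text):
                score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented placeholder training data, NOT real blog posts.
TOY_POSTS = [
    ("eisteddfod cymraeg wales language", "about_wales"),
    ("welsh assembly cardiff language policy", "about_wales"),
    ("football recipe weather holiday", "other"),
    ("film music travel recipe", "other"),
]

model = NaiveBayes()
model.train(TOY_POSTS)

if __name__ == "__main__":
    print(model.classify("language policy in wales"))
    print(model.classify("holiday recipe"))
```

With a few hundred hand-labelled posts in place of the toy list, the same structure could label the rest of a crawl automatically, and the share of "about Wales/Welsh" posts could then be compared between blogs and tweets.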

Comparative study

Thanks for the welcome Kevin. This discussion has been fantastic and I think we would benefit from a more permanent discussion along these lines.

Re: the research - I would be very interested in discussing a project along those lines. I'll be in touch when I make a start on the trends analysis. 

Neologisms: a major theoretical barrier for Mayan languages

Mayan languages (Guatemala, Belize, Mexico) face a major theoretical barrier when developing online tools and platforms, in that the various Mayan languages are relatively impoverished with respect to technical lexical items that can be used to describe the kinds of major central concepts that need to be represented online.

A tension easily and naturally develops between more highly educated "elite" speakers of the languages (who tend to be those employed by language policy organizations) who advocate for the wholesale creation of neologisms to cover all of these concepts and the large base of native speakers who might have an interest in online resources in their language but who might prefer an emphasis on code-switching to cover unfamiliar terminology rather than employing neologisms.

Both parties need to be at the table when solutions are being designed and implemented.  

"elite" speakers/base of native speakers

Good point.  It's conceivable that keeping these groups separate would result in slightly different forms of the language that could end up at odds with each other, or that one form would be thought of as more "artificial."  Loan words are not really the enemy, especially if they're being phonologized.  It's the lack of generational transmission that creates a problem.

Exactly, although the historical trend in Guatemala goes in the opposite direction. Language maintenance efforts here have always been tied really strongly to a pedagogical focus on neologisms and 'pure' discourse. So, although we and others are advocating for more relaxed and inclusive methodologies (which we think are especially important when thinking about social media and other web-based networking), it is an uphill battle.

Thanks for your comments and perspective. Interestingly in our case it is the young, educated elite who push hardest for the neologisms and pure speech, whereas grandma is quite happy to codeswitch all day long. 

I’ve had the privilege to work with Cherokee speakers in developing new terminology for localization of software (a process we’re still in the middle of) at Cherokee Nation.  One of the things that I have found that helps speakers who are unfamiliar with technology terminology is to explain the idea that many of the concepts of technology can often be thought of as a language of their own.  One example was the term "image."  In tech talk, imaging a computer, in layman’s terms, is making a copy of the system so it can be imaged onto another hard drive - basically making a copy.  But that’s not what the term "image" means in a regular context.  Once speakers understand that analogy, it is easier for them to make a similar analogy in the Cherokee language. Of course, the bigger issue at play is whether these new words, once developed, will actually be actively used.  That remains to be seen for us.

It’s been useful having second language learners interacting with elder speakers in this realm.  A fascinating term that has been developed, to me, is email: lightning paper or ᎠᎾᎦᎵᏍᎩ ᎪᏪᎵ (in the Western Cherokee dialect). 


Re: Neologisms

Roy, I've noticed that same kind of phenomenon in NC at the language symposia.  Basically you just have to explain what the deeper meaning of the term is in the source language & then try and get the speaker to explain it back to you in Cherokee.  The issue then is, okay, they can explain it, but what does the "short form" become - what does the "word" end up being?  In Cherokee it's always like you have a ton of explanation, but it's hard to boil down to one term.  Then if the term is long, you wonder if people are going to use it, because it feels more like an explanation than a lexical item.  Then again though, I guess if people are willing to say ᎠᏓᏴᎳᏘᏍᎩ for "TV," they probably will use whatever term they come up with as long as it strikes other speakers as comprehensible and clever.  (I notice the speakers tend to like terms that sound witty/maybe a little funny. "ᎡᎳᏗᎢ" for "Lowe's," or "ᎤᏃᎴ ᏧᎾᎵᏍᏓᏴᏗ" for "Wendy's" ;-D)

ICT terminology for indigenous languages

Peter Rohloff wrote:
A tension easily and naturally develops between more highly educated "elite" speakers of the languages (who tend to be those employed by language policy organizations) who advocate for the wholesale creation of neologisms to cover all of these concepts and the large base of native speakers who might have an interest in online resources in their language but who might prefer an emphasis on code-switching to cover unfamiliar terminology rather than employing neologisms.

Both parties need to be at the table when solutions are being designed and implemented.  

Specifically on the subject of ICT terminology, this is something that we've been working on with the African Network for Localization (ANLoc). As has been mentioned by several people in this forum, the terminology through which people access internet and other communications technology can be a substantial barrier.

I think of all my friends in Tanzania who are literate in Swahili, but not in English - when they have access to a cell phone, they are confronted by "send" and "message" and "call" in English. People are highly motivated to learn these basic terms for basic cell phones, so we see, eg, the verb "message" having made its way into Swahili - nitakumessage, alinimessage, etc. However, more complicated technology, such as smartphones, involves much more terminology that must be understood in order to use any of the advanced features, such as the camera, the radio, or the web browser. Without the terminology for a camera, a son living in the city will never be able to send his mom a photo of her new granddaughter, and without the terminology for audio playback, the grandmother will never be able to use the phone to listen to a community radio program about organic ways to keep insects from eating her tomatoes (assuming the price of a smartphone becomes affordable, as is likely). People can easily master 10 or 20 terms like "send" from a foreign language, but it is unrealistic to expect them to learn the several thousand terms that underlie the general ICT environment.

At ANLoc, we identified 2500 key ICT terms, which we defined in English. Then we gave these to teams for 10 African languages, and asked them to provide equivalents, along with definitions in their languages. This project can be seen at http://www.it46.se/glossmaster/ .

In all honesty, the project, although a major success in terms of meeting its metrics, was only partly successful in terms of developing terminology sets that resonate with the language communities, so we are now in the process of revising the system in order to learn from the initial round. The big problem is that terminology is not something that exists in the wild waiting to be harvested. You will not find an existing term in Swahili, or Cherokee, or Hawaiian, for "browser," for example. Heck, 20 years ago you wouldn't have found that term in English. The question then becomes, how do you decide on a term that the language community will recognize and choose to use from this day forward? What we learned is that terms developed by a few language "experts" can often fall flat. The experts can propose a term that nobody in the user community understands, or perhaps the community is already using a loan word (such as "faili" in Swahili derived from English "file") and has no interest in switching to something that happens to have an indigenous pedigree.

We ran an experiment with Swahili where we took the terms that the "experts" could not agree on, and asked for community input. In some cases, the community feedback showed that one of the proposed terms was well understood. In other cases, we learned that the terms derived by the experts were completely inadequate. Sometimes the community members proposed terms that were obviously better than anything the experts had come up with, and sometimes the community was at least able to propose better ways of looking at the issue - such as seeing "cache" as "temporary storage" instead of the literal metaphor of a "treasure trove" that the experts were fixated on. From the 150 terms we put up for discussion (out of a list of about 1500, the rest of which were fairly clear), the community input process left us with only about 10 that were really difficult, for which the "experts" needed to dig deep into the essence of the concept to construct a term that would make sense to users down the line. "Mode" was one of those terms, and "tablet computer" was another.

It is important to note that the end result was not a "democratic" process, in which the term with the most community votes was deemed the term to use for Swahili from this day forward. The "experts" still had the final say, because they are the people who had the combination of linguistic training and technical knowledge about the ICT domain. For example, community members might prefer a term for a particular concept that already has a well-established usage within the field in another context, whereas the experts would recognize a potential conflict and therefore choose an acceptable but less popular choice for the newer concept. Community input is invaluable, but in the end the final decision must be left to practitioners, just like nobody asked you if you like "browser."

Keeping all this in mind, we are now in the process of revising the software into a next generation that has a refined community input model. We will be using this system in two ways, first to build ICT terminology for additional languages, and second to build terminology sets in other domains (e.g. health or engineering) that can be expanded to any language. The premise is that any language can be productive in any field, as long as a set of terms exists that is agreed upon and used by members of the community. (Community in the sense of the subset of people who speak a language who are involved with a particular domain - grandma on the farm needs to be able to understand terms about tomatoes and pesticides, but not necessarily about the aluminum mine where her grandson works, while her grandson needs to know mining terminology but might not care about insect control.)

Unfortunately, the software for the new community participation system is not yet finished, so I cannot invite new groups to try it out. Maybe by the end of this week the system will have progressed to the point where I can post a demo, but I'm not the programmer and I don't want to make promises on his behalf. The larger point, though, is that terminology truly is a barrier to access to ICT - but also that there is an attempt at a solution that benefits from the real world experience of 10 African languages, and that we hope to open up to other interested language groups in the near future.

Back translations

Thanks for your thoughts on this subject Martin.  I'm a fan of the ICT glossary project.  One dream I've had for a while is a big terminology grid just like that one, but augmented with back-translations of each term into English (or French or other widely-used languages).  Terminology development is a huge opportunity for indigenous language groups to share ideas and "metaphors" - there's no reason why our computing metaphors need to come from English at all.   With a big grid like this, if I were trying to coin a term for "keyboard" in my language, I could go to that row in the grid and see what dozens of other indigenous languages have done.  In Irish, it's a "finger board", for example - maybe that works better in your language than a metaphor based on a western musical instrument (keyboard, clavier, etc.)   Or even the word for "computer":  Irish uses "ríomhaire" which is essentially "computer".  French famously uses "ordinateur" = "sorter".   But quite a few indigenous American languages have adopted the metaphor of a "brain" or "artificial brain" which is an altogether better way of thinking about it, I'd say (maybe having to make up new technical terms in 2011 is actually an advantage - better than being stuck with bad outdated terms from the 1950s!).

Another favorite of mine: Yiddish uses "shleptop" for "laptop" (English speakers will recognize the Yiddish "shlep" which is used in English as a verb for lugging something around).   No "laps" involved.  In Irish it's a "knee computer" - I don't know if that's better or worse but we don't have a common word for "lap"!

As a final example, I recall that Martin and I had an exchange on Facebook over a term for Swahili where the metaphor in Irish worked a lot better than the English one (I think the term was "de-interlacing" - Irish is based on the idea of "weaving" as opposed to "lacing"); another good example of sharing across languages, language groups, and continents!
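To make the "terminology grid" idea a little more concrete, here is a minimal sketch in Python of how such a grid could be stored and queried. The names `Coinage`, `GRID` and `metaphors_for` are made up for illustration and have nothing to do with the actual ANLoc software; the only entries are the ones mentioned in this post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Coinage:
    """One cell of the grid: a term plus its literal back-translation."""
    language: str
    term: str
    back_translation: str  # literal gloss in English


# Hypothetical grid: one row per English concept, one cell per language.
# Only examples quoted in this thread are included.
GRID = {
    "computer": [
        Coinage("Irish", "ríomhaire", "computer"),
        Coinage("French", "ordinateur", "sorter"),
    ],
    "laptop": [
        Coinage("Yiddish", "shleptop", "something you shlep (lug) around"),
    ],
}


def metaphors_for(concept):
    """Return (language, term, back-translation) for every coinage in a row."""
    return [(c.language, c.term, c.back_translation) for c in GRID.get(concept, [])]


for language, term, gloss in metaphors_for("computer"):
    print(f"{language}: {term} = {gloss}")
```

Someone coining a new term would read across a row and compare the back-translations rather than the terms themselves, which is the whole point of adding that extra column.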

The African Network for

The African Network for Localization Project is absolutely interesting, and while I am sure that it has been a huge undertaking, I am sure it has helped many, many users. I was just wondering, for users who cannot simply go online to download these lists or simply do not have computers, are there any printed versions that come with cellphones or cellphone lines in these countries? I am just wondering how this great work can be used to reach the most people. And the successes that have been experienced in Africa can surely be taken to other parts of the world.

accessing the anloc term lists

Ana, good question about how to access the term lists. The completed lists are available here http://www.it46.se/glossmaster/ , but as you point out, that's not so useful to people who have zero or limited web access. I'm afraid there's not much we can do in the way of getting the lists printed, unfortunately, because that would require a printing budget, which just does not exist. Regarding mobile access, we actually have a prototype system for serving dictionary lookups via SMS, but we've gotten stuck at the level of reaching agreements with the telecom companies that would need to transmit the messages.

The next generation of our terminology software will include full search facilities so that terms and definitions can be looked up in any language in the system. That should be much more useful than the original version, which only looks up on the English side (our original programmer was building with terminologists in mind, not the end terminology user). The multilingual search is fully functional in the development version. Once we go live, it will make it possible for users with web access to look up the terms in their indigenous interface and find (a) the definition in their own language, and (b) the translations in English and French, and other African languages, along with the definitions in those languages.

For the moment, though, the only access is to download the full CSV file from the website and open it up in a spreadsheet program. Not optimal, but thankfully soon to be improved...
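For anyone comfortable with a little scripting, a downloaded term list can also be searched from either language side without a spreadsheet. The following is only a sketch in Python: the column names `en` and `sw` and the single sample row are assumptions for illustration (the real file's layout may well differ), and `load_rows` and `lookup` are made-up helper names.

```python
import csv
import io

# Stand-in for the downloaded file. The column names "en" and "sw" are
# assumed for this sketch; check the header row of the real CSV file.
SAMPLE_CSV = """en,sw
file,faili
"""


def load_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def lookup(rows, query):
    """Return every row in which any column matches the query exactly,
    ignoring case and surrounding spaces, so the search works from the
    English side or the other-language side."""
    q = query.strip().casefold()
    return [row for row in rows
            if any(value.strip().casefold() == q for value in row.values())]


rows = load_rows(SAMPLE_CSV)
print(lookup(rows, "faili"))
print(lookup(rows, "File"))
```

With a real file you would replace `SAMPLE_CSV` with the contents read from disk (opened with `encoding="utf-8"`), and adjust the column names to whatever the header row actually says.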

Thank you Martin. I had this

Thank you Martin. I had this image of the cellphones being sold with these lists, perhaps funded by a government that wanted more people to be able to fully utilize the technology, or even by NGOs focused on empowering these groups. Not being able to use the cellphone or go online is just as bad as not having one. It is surprising the telecom companies are not cooperative; I would think they would see the benefit to them too of having more people using their services...

Thanks to you at least this has started and in a major way and hopefully more cooperation on the part of the telecom companies and governments will follow very soon.

Terminology, Neologisms

Peter Rohloff wrote:

A tension easily and naturally develops between more highly educated "elite" speakers of the languages (who tend to be those employed by language policy organizations) who advocate for the wholesale creation of neologisms to cover all of these concepts and the large base of native speakers who might have an interest in online resources in their language but who might prefer an emphasis on code-switching to cover unfamiliar terminology rather than employing neologisms.

Thank you Peter for raising this, which I know is an issue for many languages, including my own language of Irish.   There is an official terminology board in Ireland that coins new terms in all domains and publishes them online, but these are not always embraced by everyday users of the language, especially native speakers from the strongest Irish-speaking parts of the country.  Speakers from Conamara in County Galway are famous for liberally sprinkling their Irish with English terms (sometimes Gaelicised: "babysitteáil"), as opposed to using the "official" terms.  This would be fine, except that I've heard from several native speakers from the area who are reluctant to use some of the software we've translated into Irish exactly because they don't understand the technical terminology used in the menus, error messages, etc.  Indeed, we only have about 600 daily users of Firefox in Irish (user interface, menus, etc.), as compared to around 4000 Firefox users who have installed the Irish spellchecker but use the browser in English.   So lots of people using and writing in the language but the great majority are uncomfortable with the menus in Irish for whatever reason.  

What's the solution?  Martin has given us some good ideas in his post above.  I'd really like to hear others from the group. 

In our case, our team of translators has freely diverged from the official terms when they are especially unclear or unmotivated.   We make an effort to coin terms that combine simple, easy-to-understand elements that are native to the language.  For "pop-up window", we coined "preabfhuinneog"; from "preab" which is a common word for "jump" (e.g. if frightened), plus "fuinneog" for "window".    We also take the attitude that the Irish translation of a piece of software can always be better and clearer than the English original; so sometimes it makes sense not to create terminology at all.  If a dialog box has two buttons "Save" and "Cancel", we'll translate those as "Save" and "Don't Save".   "Spellcheck-as-you-type" becomes "Live spelling", etc.

This question came up with Michael Bauer in an interview on the Indigenous Tweets blog.  Michael has translated a lot of software into Scottish Gaelic, and has had to coin a lot of his own terminology:

Michael Bauer wrote:

There's not much that you cannot translate into Gaelic but the challenge is translating it in such a way that a non-technical user of the language can find their way around without having to resort to the dictionary all the time which tends to turn people off. But it can be done with a bit of forethought and a healthy approach to using loanwords. For example, when I was translating Firefox, we had to tackle the term 'export', quite a good example of subtle language engineering. There are several terms in dictionaries for the verb 'export' but they all try to carry the meaning by using native roots, for example 'às-mhalairt' – literally 'out-trade'. That sort of word sometimes works but in this instance it leaves most native speakers confused. So after some debate we settled on a new term, 'às-phortaich' or 'out-port' because it gives non-technical users more clues as to the meaning and that seems to have worked very well. The other aspect of this involves a bit of best practice in translation – when you get volunteers who translate software they often stick too close to the original language which results in really bad translations which put off end-users but if you get it right, it makes the localised versions much more readily acceptable to everyone, including non-technical native speakers.

Edmond Kachale (Chichewa language) is another featured practitioner whose thoughts I'd like to hear.  He is a strong advocate for using native elements of his language in creating new terms, even if they are perceived as "hard".


I'll share one other link, to a list of Irish language computing terms that was started by Caoimhín Ó Donnaíle in the early 1990's.  You'd need to speak Irish to really appreciate it (Niamh), but in any case this is evidence that a lot of the terms we take for granted now came not from the terminology board, but through a more organic process - you'll see many alternate terms for the same English concept, some survived and others have passed into oblivion (some thankfully so!).

In our case, our team of

kscanne wrote:
In our case, our team of translators has freely diverged from the official terms when they are especially unclear or unmotivated.   We make an effort to coin terms that combine simple, easy-to-understand elements that are native to the language.  For "pop-up window", we coined "preabfhuinneog"; from "preab" which is a common word for "jump" (e.g. if frightened), plus "fuinneog" for "window".    We also take the attitude that the Irish translation of a piece of software can always be better and clearer than the English original; so sometimes it makes sense not to create terminology at all.  If a dialog box has two buttons "Save" and "Cancel", we'll translate those as "Save" and "Don't Save".   "Spellcheck-as-you-type" becomes "Live spelling", etc.

I think this is great stuff. We recently did a neologisms project in Kaqchikel to work over prior neologisms from a decade or so ago. One of the main things we did was to 'undo' a lot of neologisms that were previously developed with an explicit ideology at that time of being 'elegant.' We think that elegance is a terrible and useless criterion to apply to these sorts of projects; the result was a whole bunch of data mining of colonial Kaqchikel documents, for example, and very obscure 'one word' neologism solutions - even though the language has a rich synthetic tradition, which makes more sense from the standpoint of generating new words that are immediately intelligible and liked by native speakers.

I love the "preabfhuinneog" example which seems to be along these lines - deeply pragmatic, and also educational.

In all of our neologisms work now we have definitely moved in this direction, employing the 'complex noun phrase' morphology in Kaqchikel to generate descriptive (sometimes a bit lengthy) neologisms which are readily apparent.

We are doing preliminary localizations of some of these on a few websites, which you can check out:



On terminology Development

kscanne wrote:
Thank you Peter for raising this, which I know is an issue for many languages, including my own language of Irish.   There is an official terminology board in Ireland that coins new terms in all domains and publishes them online, but these are not always embraced by everyday users of the language, especially native speakers from the strongest Irish-speaking parts of the country.


What's the solution?
Edmond Kachale (Chichewa language) is another featured practitioner whose thoughts I'd like to hear.  He is a strong advocate for using native elements of his language in creating new terms, even if they are perceived as "hard".

I think there are a few problems that localisers/translators do overlook:

  • The problem is that most of the localizers are multilingual. When they are creating terminology they try to think of it in the source language, say English, and try to find an equivalent in their target language. I think there is more danger in that, because you force the target language to develop towards the source language (mostly English). One thing that has to be borne in mind is that language is just one of the elements of a given culture/civilisation, so forcing one language to follow the trends of another brings a lot of confusion altogether.

I would like to give an example of a case of terminology development in Chichewa: Some language body wanted to create terminologies for emerging technologies. One of the words under consideration was "mouse" (the computer device). The team wanted to adapt the term "mbewa", which is the Chichewa term for a "mouse" (the rodent). This brought a lot of confusion, as people argued that the mouse is taboo in one of the indigenous tribes here and hence cannot be used for a computer gadget that will be used by all. Up to now, they haven't reached a conclusion in their argument.

  • Another problem is that we forget that the terminologies in the source language (e.g. English) do emerge naturally. They are not often forced by some board or technical grouping. Some are even created by one person who at the time did not think as deeply as we do thereafter. Some words are naturally "casual". This now gives us a new perspective from which to look at terminology. Sometimes, having board meetings to argue about terms like these would not make sense.

Of course, some people view my translations as complex. But I view localization from a different perspective. I look at it as a way of re-vitalizing terms that are fading away and, at the same time, as a way of ushering in new terminologies to enrich the language. With this in mind, I have a careful way of developing translations.

When I am developing terminologies I concentrate on the functionality. It is easier to explain a function to a native speaker than to invent yet another transliteration, which does not make sense in the long run. Sometimes, I even look at the etymological meanings of the term in the source language. I try to think outside the box, without being entangled in the original meaning of the word. I try not to be too academic about the term, without sacrificing the rigour of terminology development. For example, I translated mouse as mlozo (a guide) and pointer as namlondola (a wand), since the guide guides a wand on the screen.

One issue we came across in

One issue we came across in localizing an operating system (which due to NDA I can’t name here), was the issue of curse/offensive words.  The Cherokee language doesn’t have "curse words" in the westernized sense.  True, we may be able to say things that may sound offensive or whatnot, but it all depends on context.  We were asked to develop a list of, for lack of a better term, "naughty words" that were to be avoided by our translators.  It took a bit for some monolingual English speaking people to understand the notion that by themselves, Cherokee words aren’t inherently bad.  It was a humorous discussion.  LOL

Based on my experience

Based on my experience working with fluent Cherokee elders, a big problem with creating online content is input of the language.  Cherokee is written with a syllabary, so one way of typing that is available is phonetic typing, similar to how many Japanese users type their language using a QWERTY-based hardware keyboard.  We also have a one-key, one-stroke keyboard input which utilizes the upper case and lower case shift option, since Cherokee has a large number of characters (85).  www.languagegeek.com has many keyboard layouts and fonts for indigenous languages.

For the more enterprising, you can create your own custom keyboard layouts by using free software.  One for the Mac is Ukelele (http://scripts.sil.org/cms/scripts/page.php?site_id=nrsi&item_id=ukelele).  I’ve developed a Cherokee syllabary diacritic marking keyboard using this software.  Apple keyboard layouts are simply XML files, so no software is actually needed if one knows XML well enough.  However, Ukelele provides a GUI for those who do not know XML.  Microsoft has the Microsoft Keyboard Layout Creator (http://msdn.microsoft.com/en-us/goglobal/bb964665).

It’s also important to note that if you create your own keyboard, you should use the Unicode code point values for your key inputs.  For example, if you made a key that you wanted to type Ḁ (which is Latin Capital Letter A with Ring Below, but may be used in your language as a nasalized vowel sound, for example), you must make sure that your keyboard layout includes the Unicode code point U+1E00.  I hope I’m not getting too technical, but having Unicode-compatible fonts and keyboards is of paramount importance when dealing with language material online.
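If you want to double-check which code point a key on your custom layout actually produces, a few lines of Python will tell you. This is just a small sketch using the standard `unicodedata` module; the helper name `describe` is made up for illustration.

```python
import unicodedata


def describe(ch):
    """Return the code point and official Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


# Paste whatever your key types between the quotes; here it is written
# as an escape so the example does not depend on your font.
key_output = "\u1E00"
print(describe(key_output))  # U+1E00 LATIN CAPITAL LETTER A WITH RING BELOW

# A layout might instead send a plain "A" followed by a combining ring
# below (U+0325). It looks identical on screen but is two code points.
lookalike = "A\u0325"
print([describe(ch) for ch in lookalike])

# NFC normalization composes that sequence into the single code point,
# which is useful before comparing or searching text.
print(unicodedata.normalize("NFC", lookalike) == key_output)
```

Checking this matters because a search or spell checker that expects one form will not match the other unless the text has been normalized first.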

If anyone wants to read about

If anyone wants to read about the challenges of language input online, Microsoft’s Michael Kaplan has a blog about it.  He recently wrote a post about his work with us here at Cherokee Nation. 



Another potential reason that languages may be under-represented online could be a reluctance of individual speakers to use their languages in online situations for a variety of reasons.  One reason I have discovered recently in my research project is a sense of guilt about excluding people who don't speak the language from a conversation.

This is something I have always been aware of in the case of Irish in Ireland, where socially a group will switch a conversation to English if one non-Irish speaking person is present.  I think this can happen, and is happening, online too.

When you communicate with a wide group of people, for example on Twitter or Facebook, you will exclude some people if you communicate in a language which not everyone understands.  In most cases of minority or lesser-used languages, speakers interact online with people from language groups other than their own.  Sometimes the majority of those they interact with will speak languages other than their own.  This can lead to a sense of guilt about excluding those who do not use the chosen language, and therefore a reluctance to use the language.

This may sound trivial but I think it could be a significant challenge in some cases, particularly in instances where languages are seriously endangered.  We have a saying in Irish 'Beatha teanga í a labhairt' (a language needs to be spoken in order to survive).

I would be interested in hearing if other people experience similar situations.  As there is no technical solution to the problem, I wonder do people have ideas about or experience of implementing other solutions.

Míle buíochas!

guilt and fear

I too have witnessed this phenomenon of switching to English. Within the Anishinaabe culture it is described as being inclusive or polite. It is very difficult to overcome. I suspect we will have to change this mindset as our language depends on our fluent speakers talking to us even if that makes us uncomfortable. We have to work through it. 

I imagine that second language learners like myself, and perhaps even fluent speakers, are shy about writing online because of our fear of making grammatical or spelling mistakes in front of our friends. Sometimes I am shy about mistakes and other days I charge ahead without much concern. On those days, if I make a mistake I hope someone corrects me on Facebook so that my fellow language learners can learn from my mistake. The more often I practice, the less painful the mistakes.

Learners and semi-speakers

Hi Tessa, thanks for joining the discussion!

Regarding the "fear" issue; I'm aware of this myself since Irish is my second language.  Two quick points: first, I think that the short posts (tweets, status updates) and informality of the language on social media sites help with this.  Learners and non-fluent speakers can use the words and phrases they know - it's happening on Twitter for sure.  Second, I've helped develop spell checkers for about 25 different languages over the last 10+ years; this may seem like a crazy way for someone to spend their time, but I put a lot of stock in simple tools like spell checkers exactly because they can help learners (or even fluent speaking elders who aren't comfortable with the written language) get past this fear.

standardization across dialects 4 spell check and user-interface

Hi Kevin, Thanks for the input. :)

I agree that informal status updates on Facebook and Twitter are useful tools to promote our language in social media, and they help people to overcome their fear of writing. Within my social network, I notice fluent speakers from different dialects reverting to English to explain what they wrote in our language, Anishinaabemowin (Nishnaabemwin). This is because not everyone uses the same writing system. Some people write phonetically, others use a system called the double vowel, and we often unintentionally mix the two. There is also a syllabic writing system but that isn't being used within my social network.  

We also face the challenge of trying to communicate across Anishinaabemowin dialects. In some cases, there is only a 40% overlap of common terms between our dialects. As a language learner it is often difficult for me to distinguish between spelling errors and differences in dialectical terminology. As I continue to learn Anishinaabemowin online I wonder about the ability of second language learners like myself to be able to functionally communicate across dialects, especially when I see fluent speakers revert to English to explain what they wrote. 

One new speaker advised me to stick with one dialect and master it before attempting to learn the others. I think that is wise advice. However, it is challenging when a significant portion of my daily language input comes from status updates from different dialects.

I think a spell-checker would be useful for those who adopt one writing system, but unless the Anishinaabeg come to agreement about common terminology, I think spell-checkers in multiple dialects would be necessary. Has this been the case in the communities you have worked with?

The discussion regarding the process of having Google, Facebook, and other software programs adopt our language for its user-interface brought to mind the significance of standardization of common terms across our dialects. I think having an Anishinaabemowin user-interface would help to reinforce language use. I am curious about what impact it has had in other language communities.   

Thank Niamh for raising the

Thank you, Niamh, for raising the point of minority groups excluding speakers of other languages in their conversations and communications online. This is most interesting, especially when these groups know that their language is disappearing. What it seems to come down to is whether it is more important to preserve one's language (an integral part of one's culture) or to be able to communicate with all people at all times. Does this then mean we are all to adopt just one common language? It really is a difficult situation, but I do strongly believe that people should not feel guilty about sharing things with their own people in their own language. Cross-cultural communication is so very important, but so is preserving one's own culture and hence language. I too look forward to reading about people's experiences in this.

Re: Guilt

Indeed, the tension between wanting to use an under-represented language and wanting to be understood by everyone present is a huge challenge. I've also observed innumerable times that even Rangi friends who knew that I understood and spoke Rangi would switch to Swahili when I entered the room. That's where, I think, the importance of online spaces dedicated to a particular language comes in (as I wrote elsewhere, the set-up of a FB group for Rangi has been quite a success, and even though every once in a while someone may switch to Swahili, they are quickly brought back to Rangi by another group member pointing to the title of the group, "Tʉlʉʉsɨke Kɨlaangi" = Let's talk Rangi).

Another aspect of this, imho, is that awareness of multilingualism has to be raised. Earlier this year, I attended a meeting of ANLoc (African Network for Localization) and everyone spoke English. Of course, there wasn't a single African language which everyone in the room would have understood but conversely, there wasn't a single person in the room who didn't speak and understand at least one African language. In such settings, bilingual opportunities need to be seized. Either the speaker says everything twice (e.g. once in Hausa and once in French) which of course makes the speech longer (but maybe has the added side effect of the speaker trying to be more concise), or the presenter talks in one language but gives a handout in another, or the PowerPoint is in another.

Something similar can be done not only in meetings but online too. Hivyo ndivyo ambavyo tungeweza kufanya mtandaoni pia. For example, I'm writing most of my FB status updates in Rangi (with an English translation of course). Kwa mfano naandika habari zangu za Facebook kwa Kirangi (pamoja na tafsiri ya Kiingereza). Let's increase our use of two languages simultaneously! Tutumie zaidi lugha mbili wakati mmoja!

PS: The previous paragraph was English-Swahili, not English-Rangi.


Hi Niamh,

   Glad you raised this issue.  I see it happen all the time among Irish-speaking friends, and even in real-world "social spaces" which are supposed to be strictly Irish-speaking (e.g. Club Chonradh na Gaeilge in Dublin).  This is, I think, one of the major advantages of online communities and social media - they allow the construction of monolingual spaces which would otherwise be impossible (because of speakers being physically scattered) or difficult to maintain (because of the politeness issue when awash in a sea of majority-language speakers).  

   Oliver points this out in relation to his Rangi Facebook group, for example.  Lots of similar Facebook groups are popping up as well in multiple languages.  Also, Facebook allows you to maintain lists of friends, and this works nicely if you post status updates in multiple languages; if I post in Irish I can choose to direct that only to my list of Irish-speaking friends (I don't actually do that; speaking Irish is part of who I am so I expect my friends to accept that).  I should say also that monolingual online spaces for minority languages have been around for a long time; for example, the email list Gaeilge-A has been running since the 1990's and has a strict Irish-only rule.

   Twitter is trickier.  The idea behind Indigenous Tweets is to encourage more people to tweet in their native language, but there is reluctance to do so since some people are afraid they will lose followers who don't speak the language.   When this is a concern, I've been encouraging them to create a second account and tweet in their native language from that account.  It's a bit of a pain to switch between accounts through the web interface, but several mobile clients (iOS for example) make it quite easy.

   It might be that very few people are using these solutions (Facebook groups, Facebook friend lists, multiple Twitter accounts), and I'm overstating the ability of sites like these to encourage the use of minority languages. Research on this question would help - I hope Niamh will keep us updated on her work! 


Hi Makadebinesikwe, Kevin, Ana and Oliver,

Go raibh míle maith agaibh! There is a lot to think about here and I am hoping to address some of these issues in my research project. 

So far, I can see that the participants in my project are using Facebook groups to create environments where Irish or Sámi are the only languages spoken.  I think this is useful and it works in terms of encouraging people to stick to using the language.  It also ensures that users get in contact with wider circles of people with an interest in and ability to speak the language than their immediate friends. 

However, I'm not sure that segregating languages in this way is the only way to go.  I think it is important to use minority languages in other contexts in social media as well.  danah boyd (http://www.danah.org/) has written about the fact that contexts collapse in social media.  Simply put, you are connected to people from different contexts in your life and cannot manage your communication with these people in the same way as you would in separate offline contexts.   I think it is important for users of minority languages to make sure that their languages are present where these contexts collapse.  This is important because it makes the fact that these languages are part of our lives and important to us visible to others in our circles of friends and followers. 

I think Twitter is an interesting platform in this regard.  The more open structure of the platform means that you can easily come in contact with other speakers of your language without having to form a separate group.  It also means that those who follow you are aware that you communicate with other people in a minority language.  This can in turn have the positive effect of creating awareness of the language community in a broader context.

I think the question of how to deal with this problem of politeness or guilt is difficult.  Although I think it is useful to have separate spaces where the languages can be used, I think it is also important to use the languages in broader contexts.  The question then is, how to encourage people to do this and not to worry about being impolite or feeling guilty. 


Some great points here, and thanks for the useful reference to boyd. 

I think it is important for users of minority languages to make sure that their languages are present where these contexts collapse.  This is important because it makes the fact that these languages are part of our lives and important to us visible to others in our circles of friends and followers.

One question on this point that interests me is when a language moves from its speakers not having the confidence to be present in these spaces to being one where speakers feel at ease with, say, tweeting in more than one language in the same feed without feeling that they are alienating too many people. I feel that Welsh Twitter use is reaching some kind of tipping point, where there is a critical mass of people using Welsh on the platform that makes it 'safe' to use the language and know that you can be heard by a substantial number of people. But as I've said elsewhere, the subject matter is still mainly kept to the local or culturally specific, so there may be willingness to use more than one language but a wish to be part of a global discussion at the same time, which is completely natural and positive of course.

Because there is no Funding!

Another major barrier that hasn't come up yet in our discussions, but that I think should be explicitly addressed, is the issue of funding. In the end, many great ideas don't get implemented because it is really hard to fund localization efforts and other technical implementations. At Wuqu' Kawoq we constantly run into the problem of major donors in this area being unwilling to fund 'salary' - we have tried to point out that localization projects are 5% hardware and infrastructure and 95% personnel time - but largely to no avail. 

Our primary current approach is to work in personnel time as overhead and draw on our general operating fund to get the necessary work done, when it needs to get done. This has been a reasonably successful approach, but still with the underlying frustration of wishing major donors would "get this" and be more willing to put up for personnel. 

What approaches do others use? 


Hi Peter,

I agree that funding is most definitely an issue - but it raises interesting questions - should people wait for funding or take the initiative and work voluntarily?  Where should funding come from?  How might it affect the sustainability of these continuous uphill projects?

We in the Irish language community have been extremely lucky that Kevin Scannell and a relatively small number of others have put in a huge amount of voluntary effort to localise the software and social media that we use.  I see this is not happening to the same extent in the Sámi community, and I presume that it is also an issue in other communities. 

While it may be possible to apply to a number of sources for funding for such projects, in most cases the time to act is now (or yesterday, or earlier still).  In addition, some of the projects have no definite end point.  Most software projects update continuously.  Should funding be provided for the initial effort and then volunteer work be depended on? How might this affect the motivation/recruitment/coordination of volunteers?

It may be the case that raising awareness about the possibility of contributing to localisation and other projects would help to overcome some obstacles.  In the case of Facebook for example, it is possible that their translation project attracted more users to the platform than would have joined otherwise.  Users of Irish and Sámi were happy to contribute voluntarily to translating the site interface to their languages.

I want to be clear that I am not in any way advocating working for free as the only solution to these problems.  I do however think it may be possible to coordinate voluntary contributions from large numbers of people to overcome some issues.  I also think in most cases that this might be necessary, as funding isn't always easy to come by.

What do you think?


Fundraising or volunteerism - can we do both?

Niamh Ní Bhroin wrote:

While it may be possible to apply to a number of sources for funding for such projects, in most cases the time to act is now (or yesterday, or earlier still).  In addition, some of the projects have no definite end point.  Most software projects update continuously.  Should funding be provided for the initial effort and then volunteer work be depended on? How might this affect the motivation/recruitment/coordination of volunteers?

Interesting conversation Peter, Niamh, Kevin and Michael!  I agree that it's important to engage volunteers in order to make a sustainable project and community around this work.  But I can also see the point that the promotion and preservation of these languages online is worth funding and people's time is worth monetary compensation.  I wonder if there could be something in between? 

The open source development effort that I am most familiar with is Drupal (the CMS that this website is built upon).  Drupal itself is free and open source.  There are many volunteers that maintain code and modules, write documentation, test and strategize.  But there are also Drupal developers that are happy to be paid for this time and effort.  Their hard work is then put back into the platform to be shared and built upon by the community (paid or voluntary).  Nonprofits, businesses, governments all contribute to the funding of this platform because they are purchasing a product at the end of the day.  Is there a potential for this in your translation/localization work? Just an idea.

I also wanted to plug our January dialogue (the page is not posted yet) on fundraising for human rights work.  It would be great to have your ideas and experiences on this topic shared in that dialogue!  Stay tuned!


Hi Peter,

   Niamh has already done a good job of answering for me (go raibh míle maith agat!) but I'll add some comments.  Not only do I believe that it's possible to do software translation and content generation (e.g. Wikipedia) on a volunteer-only basis, I've been arguing for a long time that this is the only way it can be done for indigenous and minority languages without lots of financial resources.

   There are just two of us who have done the great majority of open source software translations into Irish (Séamus Ó Ciardhuáin and I) - well over 1 million words translated between us.  There is a full range of free software now available in Irish.   There are many other open source localization teams, even for some "bigger" languages, that are made up of just 1-3 committed individuals.  So there's no doubt it's possible to do - the trick is finding people who (1) are fluent in the language, (2) are willing to give up their free time and/or sleep to do this work, and (3) have access to computers and enough technical knowledge to do the work.   I won't lie - finding people has been exceptionally hard for us over the years, which is why the steady-state has been a team of two.  That said, we've had many people "step up" and help on a short-term basis when the work was too much for the two of us (notably in the leadup to our first release of OpenOffice.org in 2006). 

This leads into why I think volunteerism is the only sustainable approach.  Niamh raised the key point - software changes over time, and so there needs to be a team committed to updating and maintaining translations for the long term.  Under a salaried model, this means maintaining funding or grant support for the long term, which is just not realistic in most situations.  And we've seen this model fail in exactly this way (without calling anyone out publicly, I'll just note that there have been several software translations produced with funding from a grant, but then never updated after the funding ran out, and so virtually useless).

Essentially what we're doing with open source translation is leveraging the infrastructure already in place with organizations like Mozilla.  They have deep pockets, teams of full-time (salaried) employees working as localization support engineers, etc.  All we need to do is find the time and energy to provide the translations!

developing countries

kscanne wrote:

There are just two of us who have done the great majority of open source software translations into Irish (Séamus Ó Ciardhuáin and I) - well over 1 million words translated between us.  There is a full range of free software now available in Irish.   

Your work is absolutely amazing, and very inspiring. I do agree that volunteers are really important, and perhaps sustainable in some contexts, like Irish. 

However, if we are talking about endangered languages in developing countries, the context is a bit different. In particular, all of the volunteers are really poor and, most of the time, just struggling to feed their families. So it is harder to talk about volunteerism in that setting, and our practice is to try to pay salary, or at least honoraria for piece work, whenever possible. 


I don't think it's funding

Well, don't get me wrong, long term funding would be very nice even though few bodies provide that sort of funding. I'll partly echo Kevin on his volunteerism model, partly because it's his fault that I'm now the team of 1 doing everything from Firefox to LibreOffice into Scots Gaelic.

There's lots of factors that you can link to the failure to compete successfully in cyberspace for a small language - funding, access to technology of whatever flavour, time, electricity - the number of factors is legion. The fact that it does sporadically happen, though, suggests to me that the one thing that's lacking is not a silver bullet solving one of those problems but some planning of how to overcome these issues on the whole. Each of these problems, or even combinations thereof, can be overcome, but as long as we operate in our own little pigeonholes (by choice or circumstance), doing projects pretty much on an ad-hoc basis, the work is not very sustainable. Whether it's Kevin and Ciarán doing Mozilla or a public body in Scotland shelling out money for a once-off OpenOffice project, I bet most happen out of pure chance. Someone happened to have some funding and thought it was a good idea. You run into someone in a Dublin pub (/me looks at Kevin) or you ask a silly question on a forum online. But we speakers of small tongues rarely sit down and say, "right, we need to make software available in our language to prevent loss of speakers etc. Let's build a sustainable plan for the next 100 years. What are our main problems? How do we tackle them?"

Think about it the other way: in any country, poor or rich, can it be that impossible to build a model which pays a small permanent team of 2+ to do nothing but grind out software in language X? You adapt the solution to your situation, but it requires more than 1 person to sit down and make a plan.

In our case, I'm working on setting up a foundation in collaboration with a university to provide at least 2 FTEs working on this stuff. But it doesn't have to be that. Maybe you pay your team in eggs and bacon. But we need to move away from tackling everything on a day to day basis and plan long term cause this isn't something we can solve in a day.

No, it's funding

Much of what we've been talking about this week involves a few people taking the time to produce resources that can be used by an entire language community. It's not rocket science to produce those resources - it is effort. In order to have a lot of content in a language, you need people to sit down and write blogs and Wikipedia entries. In order to have an ICT environment in a language, you need people to sit down and develop an agreed terminology and use it to localize software. In order to facilitate language learning, you need people to sit down and produce teaching tools. Etc. These are not mysterious techniques - the myriad methods and material developed over the years to teach Spanish can largely be adapted to teach Quechua (taking into account cultural specificity in illustrations, appropriate reading content...), people can write Wikipedia entries in their language as soon as the Mediawiki platform has been localized and their Wikipedia has been activated, groups can work on terminology development based on the terms they encounter in their independent localization projects or taking advantage of the system we'll soon be opening up through Kamusi. I'm not downplaying the many innovative approaches that we've heard about in this forum, but I am saying that an innovation that works for Cherokee is likely to be one that can work for Sami, and one that works for Welsh may well work for Bambara - these are replicable approaches in which the most important factor in their implementation is human labor.

And human labor has a cost. Some people are in a position to absorb that cost - for example, receiving a university salary that includes language work within the job description, or having a good job in another field that enables language development as an evening hobby. For many people, though, giving labor to a project is an either/or proposition - either I do something to earn money to feed my family, or I develop online instructional material for my language.

If funds are available, then the equation changes. I can do something to feed my family, by taking on the job of producing online instructional material for my language, using replicable models from other languages so I don't have to reinvent the wheel.

At the Kamusi Project, we are in the process of systematizing the production of dictionaries across languages - in the near future we'll be launching a software platform that will build toward an interlinked multilingual dictionary that will be, toot toot, massively cool and enormously useful. Of relevance to this discussion is that each entry has a price. It takes a person some amount of time to produce a complete dictionary entry, averaging about 8 entries an hour. We just need one other data point - what is a fair hourly wage? - and we can calculate the cost of producing a single dictionary term in a language. We can also calculate the cost of producing, say, 10,000 terms. The plan is, as soon as the software is completed, to approach funders and offer them the opportunity to support dictionary development to the depth of their desire. The idea is simple - pay for the work in advance, and it will be done once, done well, and done forever. The people doing the work will be happy because they will be able to pay the rent at the same time they create resources for their language, and the language user community will be happy because suddenly there will be a free resource available for them to look up words in their language, find glosses to any other participating language, and make use of the data in applications that can be further expanded, such as machine translation or on-the-fly browser lookups. 
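To make that arithmetic concrete, here is a minimal sketch of the calculation. The rate of 8 entries an hour is the figure mentioned above; the hourly wage is purely a placeholder assumption for illustration, not a real Kamusi figure, and the function names are hypothetical.

```python
# Hypothetical illustration of the per-entry cost arithmetic.
# ENTRIES_PER_HOUR is the approximate rate quoted in the post; the hourly
# wage used below is an assumed placeholder, not a real project figure.

ENTRIES_PER_HOUR = 8


def cost_per_entry(hourly_wage, entries_per_hour=ENTRIES_PER_HOUR):
    """Labor cost of one dictionary entry at the given hourly wage."""
    return hourly_wage / entries_per_hour


def hours_for_terms(n_terms, entries_per_hour=ENTRIES_PER_HOUR):
    """Hours of work needed to produce n_terms entries."""
    return n_terms / entries_per_hour


def cost_for_terms(n_terms, hourly_wage, entries_per_hour=ENTRIES_PER_HOUR):
    """Total labor cost of producing n_terms entries."""
    return n_terms * cost_per_entry(hourly_wage, entries_per_hour)


if __name__ == "__main__":
    assumed_wage = 10.0  # assumption: 10 currency units per hour
    print("cost per entry:", cost_per_entry(assumed_wage))
    print("hours for 10,000 terms:", hours_for_terms(10_000))
    print("cost for 10,000 terms:", cost_for_terms(10_000, assumed_wage))
```

With the assumed wage of 10 per hour, one entry comes to 1.25, and 10,000 terms take 1,250 hours and cost 12,500. Swap in whatever wage is fair for a given language community and the rest follows.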

Having such comprehensive online dictionaries for currently under-resourced languages will, I hope, help encourage people to think of and work toward solidifying their languages as viable and important within the realm of contemporary technology. But these dictionaries, and many other resources that take time and effort to produce, will not create themselves. Someone has to pay for it, either the person doing the labor by foregoing other income activities, or people with cash who value the existence of language resources and are willing to support their construction. That's funding. That's what it will take for many languages where volunteerism is an impossibly utopian ideal. Let's not be shy about acknowledging it.


One consideration when trying to increase usage/awareness online is the nature/shape of the community.  If you have a group of people who regularly see each other, who are excited to work together, & who share an enthusiasm for not just the language, but also for other common themes, you have a good starting formula for increased use.  When people feel secure in their social group & have a common set of interests that would unite them regardless of the language they are using, they will likely be more inclined to work towards the goal of language revitalization.  It's not so much about how you can alter the community structure, in some ways, as about how you can pinpoint & take advantage of the structures that are already in place.  If you have a sewing club, for example, ask if they would be willing to come together once a week to chat about sewing in the heritage language, record it, and upload it as a podcast.  Or ask if one of the members would be willing to write a summary of their weekly meeting in the language on a blog.  One way to increase participation in a language is to see where participation for something already exists & build on top of that.  Of course, that's depending on how well-received that idea is by the speakers. In such a situation, it might be a wise idea to consult speakers one on one and ask if they would consult each other.  The key, I think, is not necessarily to pull a mountain up by the top, but build one from the bottom.

My thoughts

I am part of the "Jaqi Aru" Aymara Virtual Community in El Alto-Bolivia. 

JaqiAru (which means "the language of the human" or "the language of the people" in the native language of Aymara) is a bilingual and trilingual Aymara community (university students, professionals and university graduates) in El Alto, Bolivia, who are committed to preserving and promoting the use of their native Aymara tongue and the Aymara culture on the internet, through the creation of content using digital media tools.

The community of JaqiAru has been working online and meeting in person to participate in a wide variety of activities related to their stated goals of promoting and valuing their native language and culture. The team has established 5 main activities towards accomplishing their goal:


In my experience, we have faced many problems.

- A language is underrepresented because of the governments of the countries. They do not support minority languages, and the current generations do not realize the importance of the language.

- We have faced a lot of problems: internet access (right now I am writing from an internet cafe), tools, and financial problems (it is really hard to work with volunteers).

- What is important for us as a community is that we have been acquiring many new ideas for strengthening our language on the internet.


On the other hand, I am really glad to hear all the comments with very nice experiences.



