Estimates of the number of languages spoken in Africa range from 1,000 to 2,500, depending on the definitions used. Monolingual states do not really exist on this continent, since languages usually spread across borders. The number of languages per country varies from 2 or 3 in Burundi and Rwanda to more than 400 in Nigeria. Multilingualism is indeed ubiquitous in Sub-Saharan African societies. To support the development and use of languages, many institutions and organizations have been created, often under the auspices of UNESCO or the African Union. In summary, the major issues faced by these initiatives are:
the development and standardization of linguistic resources in many languages, not only the higher-resourced ones,
the introduction of national languages in the digital space through the creation and dissemination of content in those languages,
the multilingual access to digital resources.
If equipped with linguistic and computer resources, languages that have a written form can be integrated into the products of major players in the digital world, who are attracted by a market with great economic potential. For instance, mobile phone manufacturers offer a growing number of models with textual and graphical interfaces in African languages. Nevertheless, using written or textual interfaces requires literacy. According to Denis Gikunda, director of the development program in African languages at Google, one of the key points in developing the online market in Africa is ensuring that applications talk to Africans in the true sense of the word. Several UNESCO publications refer explicitly to speech synthesis (and recognition) as a technological facilitator; for instance: “The illiteracy rate remains high: the use of voice interfaces is relevant”.
The present moment is therefore very favorable to the development of a market for speech technologies in African languages. People access ICT mainly through mobile phones (and keyboards), and the need for voice services can be found in all sectors, from high-priority ones (health, food) to more recreational ones (games, social media). Meeting this need requires overcoming the language barrier, which is what we propose in this project. It involves two main aspects: the fundamentals of speech analysis (phonetic and linguistic description of languages, dialectology) and speech technologies for African languages, namely automatic speech recognition (ASR) and text-to-speech synthesis (TTS).
The ALFFA project is truly interdisciplinary, since it brings together not only technology experts (LIA, LIG, VOXYGEN) but also fieldwork linguists and phoneticians (DDL). Such a partnership is very important because we want to draw on the strong experience of field linguists in data collection, as well as their knowledge of dialectal and regional differences, which are particularly important in Africa. In the project, the ASR and TTS technologies developed will be used to build micro speech services for mobile phones in Africa (for instance, a phone service to consult the “price of commodities” or to provide “voice reporting for information systems”). The ALFFA project will also strengthen the interactions between a start-up (Voxygen) and academics on the fundamental aspects of African languages, in order to start deploying prototypes and services on a continent where the telecom market has strong potential. In addition, the project will help the academic partners attain international leadership in the domain of speech processing and analysis for African languages, which will reinforce their (already large) collaboration network on this continent. To this end, subcontracting is planned within the framework of the ALFFA project in order to set up sustainable collaborations with local actors (academics, NGOs) in Africa.