Digital Revolution for Under-Resourced Languages (DigRevURL 2017)
The digital revolution is having an increasingly large impact on many aspects of our daily lives and is creating new ways of communicating and socializing. There is no doubt that speech and language technology is one of the key enabling technologies of the digital revolution. Many institutions and organizations have started to develop and standardize spoken and written resources in many languages, both to provide multilingual access to digital resources and to promote national languages in the digital space. Many speech and language technologies, such as speech recognition and speech translation, have also been developed and are readily available in everyday life.
However, despite all the hype about how the digital revolution could connect the world, it currently covers only about five percent of the world’s languages. This is because most of these activities have been conducted only for the few languages for which large resources are available. There are, however, more than 6000 languages, and most of them have not yet been covered. Even languages with millions of speakers can lack the resources needed to build speech and language technologies.
This special session aims to accelerate research activities on under-resourced languages and to provide a forum for linguistic and speech technology researchers, as well as academic and industrial counterparts, to share achievements and challenges in all areas related to natural language processing and spoken language processing of under-resourced languages, mainly those used in South, Southeast, and West Asia; North and Sub-Saharan Africa; and North and Eastern Europe. In particular, as Interspeech 2017 will be held in Sweden, we highly encourage submissions on under-resourced languages from the Nordic, Uralic, and Slavic regions.
The theme of this special session is the digital revolution for under-resourced languages. Topics include, but are not limited to:
- Linguistic and cognitive studies
- Resource acquisition for text and speech corpora
- Zero-resource speech technologies
- Cross-lingual/multi-lingual acoustic and lexical modeling
- Code-switched lexical modeling
- Speech-to-text and speech-to-speech translation
- Speech recognition, text-to-speech synthesis, and dialog systems
- Applications of spoken language technologies for under-resourced languages