What is the Glossed Audio Corpus of Ainu Folklore?

The last decade has been marked by an increase in global awareness of language endangerment and by the emergence of language documentation as a separate field that focuses on building multi-purpose corpora of data from endangered languages.

Ainu is now gravely endangered. Originally, Ainu was not a written language, but thanks to accumulated efforts to record it that started more than a century ago, the language, culture and oral literature have been well documented. Thus, Ainu studies will continue, and they are most likely to thrive when presented on a wider international scale. This will strengthen the connection of Ainu studies to parallel endangered-language communities elsewhere.

A Glossed Audio Corpus of Ainu Folklore is the first fully glossed and annotated digital collection of Ainu folktales with translations into Japanese and English. Most materials were recorded by Hiroshi Nakagawa between 1977 and 1983 with a very talented speaker and storyteller, Mrs. Kimi Kimura (1900-1988, born in Penakori Village, upper district of the Saru River), whose proficiency in Ainu considerably surpassed her proficiency in Japanese. The abundance, repertoire and tempo of the folktales are outstanding.

To provide a safe long-term repository of language materials, the audio files were deposited with the Endangered Languages Archive of SOAS, University of London, along with other outcomes of the project “Documentation of the Saru Dialect of Ainu” (2007-2009; principal investigator: Anna Bugaeva) funded by the Endangered Languages Documentation Programme (out of the Rausing Foundation). The deposit (http://hdl.handle.net/2196/00-0000-0000-0001-E77F-A) includes 23 folktales, viz. 20 uepeker ‘prosaic folktales’ and 3 kamuy yukar ‘divine epics’; the total recording time is about 7 hours and the total number of Ainu words is 44,717.

In fiscal year 2015, we released 10 glossed folktales (8 uepeker ‘prosaic folktales’ and 2 kamuy yukar ‘divine epics’) with a total recording time of about 3 hours.

This portion of the corpus was created as part of two NINJAL Collaborative Research Projects, “Typological and Historical/Comparative Research on the languages of the Japanese Archipelago and its Environs” (project leader: John Whitman; the Ainu research group leader: Anna Bugaeva) and “Documentation and Transmission of Endangered Languages and Dialects in Japan” (project leader: Nobuko Kibe), and was funded by the FY2015 Grant for Publication of Project Outcomes.

In fiscal year 2017, we released 13 glossed folktales (12 uepeker ‘prosaic folktales’ and 1 kamuy yukar ‘divine epic’) with a total recording time of about 4 hours. We gratefully acknowledge the funding received towards the development of the corpus from the NINJAL research project “Endangered Languages and Dialects in Japan” (project leader: Nobuko Kibe).

Ainu texts were transcribed by Hiroshi Nakagawa (Chiba University, professor; NINJAL project member) and Anna Bugaeva (NINJAL project associate professor; Ainu research group leader). Translations into Japanese were carried out by Hiroshi Nakagawa and into English by Anna Bugaeva (with the assistance of Sarah Rumme). English and Japanese glossing (morphological annotation) was done by Miki Kobayashi (Chiba University PhD student; NINJAL adjunct researcher; NINJAL project member) under the supervision of Anna Bugaeva.

In fiscal year 2019, we released 7 glossed folktales (3 uepeker ‘prosaic folktales’ and 4 kamuy yukar ‘divine epics’; about 1 hour) recorded by Anna Bugaeva from 1999 to 2000 with a native speaker of the Chitose dialect of Ainu, Mrs. Ito Oda (1908-2000), and previously published with translations into English in Bugaeva, Anna (2004) Grammar and Folklore Texts of the Chitose Dialect of Ainu (Idiolect of Ito Oda). ELPR A2-045, Suita: Osaka Gakuin University. Translations into Japanese and Japanese glossing were carried out by Yoshimi Yoshikawa (Chiba University PhD student; NINJAL adjunct researcher) under the supervision of Anna Bugaeva (Tokyo University of Science/NINJAL).

In fiscal year 2020, we were pleased to release 8 glossed folktales (6 uepeker ‘prosaic folktales’ and 2 kamuy yukar ‘divine epics’; 96 minutes) recorded by Anna Bugaeva from 1999 to 2000 with a native speaker of the Chitose dialect of Ainu, Mrs. Ito Oda (1908-2000), and previously published with translations into English in Bugaeva, Anna (2004) Grammar and Folklore Texts of the Chitose Dialect of Ainu (Idiolect of Ito Oda). ELPR A2-045, Suita: Osaka Gakuin University. Translations into Japanese and Japanese glossing were carried out by Yoshimi Yoshikawa (Chiba University PhD student; NINJAL adjunct researcher) under the supervision of Anna Bugaeva (Tokyo University of Science/NINJAL). The total number of Ainu words in all of Ito Oda’s texts is 14,285, which brings the whole corpus to 59,002 words. This outcome would not have been possible without the help of Shirō Akasegawa (Lago Institute of Language), who built the online system. We gratefully acknowledge the funding received towards the development of the corpus from the NINJAL research project “Endangered Languages and Dialects in Japan” (project leader: Nobuko Kibe).

We truly hope that the corpus will be useful to the Ainu people who are now in the process of revitalizing their language and culture, to the international community of linguists and cultural anthropologists, and to all people who are interested in the Ainu language and oral literature, which are an integral part of human intellectual heritage.

Hiroshi Nakagawa, Anna Bugaeva, Miki Kobayashi, and Yoshimi Yoshikawa

Tokyo, February 22, 2021