Ruumiandmete rakendamine keeleteaduses (rurake)


Estonian Dialect Corpus

The corpus is based on dialect recordings and contains phonetically transcribed texts, dialect texts in simplified transcription, morphologically tagged texts, a database containing information about informants and recordings, and syntactically parsed texts.

Archives of Estonian Dialects and Kindred Languages

The University of Tartu Archives of Estonian Dialects and Kindred Languages (AEDKL) consist of fieldwork recordings and written materials of Estonian dialects and Finno-Ugric languages.

Shiny applications

Various interactive applications based on data from the Estonian Dialect Corpus and the dialect atlases by Andrus Saareste. The applications are built with RStudio’s Shiny package.

Rurake on GitHub

Andrus Saareste’s digitized maps and Python scripts for working with the Estonian Dialect Corpus.

Applying Spatial Data in Linguistics

The project "Ruumiandmete rakendamine keeleteaduses" (Applying Spatial Data in Linguistics) was initially intended to add geographical information to the existing Corpus of Estonian Dialects (CED) and to digitize the dialect maps of "Väike Eesti murdeatlas" ("Small Atlas of Estonian Dialects") by Andrus Saareste. It has since grown into an undertaking that also aims to create resources for combining and analyzing different data sources with a spatial dimension, and to bring dialect data closer to people who do not actively work in the field.

The team


Data Management Specialist (University of Tartu Library)


Junior Researcher of Applied Dialectology


Junior Researcher of Automatic Language Processing


Specialist of Geolinguistics


Here you'll find the latest project updates.