Computational Methods Meet Parallel Data: Approaches for Comparative Analysis of Passives in European Languages

Atle Grønn

University of Oslo


The RuN-Euro corpus and its applications

The RuN-Euro corpus is a parallel corpus that originally consisted of Norwegian and Russian literary texts (the RuN Corpus, developed at the University of Oslo in 2008–2010; see https://www.hf.uio.no/iln/tjenester/kunnskap/sprak/korpus/flersprakligekorpus/run/). Parallel texts in other European languages were later added, yielding the extended RuN-Euro Corpus (2010–). For an overview of the languages and texts, see http://www.nevmenandr.net/run/index.php. The texts are aligned at the sentence level and tagged for grammatical information at the word level. The RuN-Euro corpus uses the Glossa interface developed by the Text Laboratory at the University of Oslo.
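To make the notions of sentence-level alignment and word-level tagging concrete, here is a minimal Python sketch of how such data could be represented and searched. It is an illustration only and does not reflect the actual RuN-Euro data format or the Glossa query language: the names Token, AlignedPair, find_pairs and surface, as well as the tag labels, are hypothetical, and the toy entries are short fragments of examples (1) and (2) discussed under translation studies, not full aligned sentences.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Token:
    form: str   # surface form as it appears in the text
    lemma: str  # dictionary form
    tag: str    # grammatical tag (labels here are invented for illustration)


@dataclass(frozen=True)
class AlignedPair:
    source: Tuple[Token, ...]  # tokens of the Russian segment
    target: Tuple[Token, ...]  # tokens of the aligned translation


def find_pairs(corpus: List[AlignedPair], lemma: str) -> List[AlignedPair]:
    """Return the aligned pairs whose source side contains the given lemma."""
    return [p for p in corpus if any(t.lemma == lemma for t in p.source)]


def surface(tokens: Tuple[Token, ...]) -> str:
    """Join the surface forms of a token sequence into a readable string."""
    return " ".join(t.form for t in tokens)


# Toy corpus: two hand-built fragment pairs (assumed, simplified alignment).
corpus = [
    AlignedPair(
        source=(Token("в", "в", "PREP"),
                Token("нескольких", "несколько", "NUM"),
                Token("вершках", "вершок", "NOUN")),
        target=(Token("a", "a", "DET"),
                Token("few", "few", "ADJ"),
                Token("vershoks", "vershok", "NOUN")),
    ),
    AlignedPair(
        source=(Token("вроде", "вроде", "PREP"),
                Token("косоворотки", "косоворотка", "NOUN")),
        target=(Token("a", "a", "DET"),
                Token("traditional", "traditional", "ADJ"),
                Token("Russian", "Russian", "ADJ"),
                Token("shirt", "shirt", "NOUN")),
    ),
]

# Look up a realia term by lemma and show how it was rendered in translation.
for pair in find_pairs(corpus, "косоворотка"):
    print(surface(pair.source), "->", surface(pair.target))
```

Searching by lemma rather than by surface form is what the word-level tagging makes possible: the inflected form косоворотки is found through its dictionary form, and the alignment then gives direct access to the translator's rendering.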

In this presentation, I will discuss concrete applications of the RuN-Euro parallel corpus in three areas: translation studies, lexicography and grammar.

Translation studies. The problem of realia. In a recent PhD thesis, Alla Kharina (2019) studies the translation of realia terms from Russian into Norwegian and English, using data from the RuN Corpus. It proves fruitful to compare translation strategies on a continuum from foreignization (1) to domestication (2), here illustrated with two examples from English translations of Russian novels:

  1. … шар сорвался вперед, просвистев в нескольких вершках надо мной. [Akunin]

    … the sphere went hurtling forward, whistling by only a few vershoks above me.

  2. … одежда на нем тоже какая-то старомодная, что-то вроде косоворотки [Tsypkin]

    … his clothes are also out-of-date, […] what looks like a traditional Russian shirt.

Kharina provides a detailed classification with several intermediate categories between full foreignization and full domestication (=omission).

Lexicography. I will discuss certain phenomena that occur frequently in fiction texts, such as pragmatic particles (Mozharovskaya 2017) and idioms (Prušková 2013). I will argue that even a parallel corpus of limited size can give us interesting insights into these phenomena.

Grammar. The RuN parallel corpus is particularly well suited for the study of lexico-grammatical categories in Russian, such as Aktionsarten expressed by verbal prefixes (Roos 2012) and converbs (Krave 2011). The translation of these constructions into English and Norwegian sheds light on the meaning of the prefixes and on the morphology used in the converb constructions.



Grønn A. & Marijanović I. Russian in Contrast: form, meaning and parallel corpora. Oslo Studies in Language, 2(1), 2010.

Kharina A. Realia in Literary Translation. A quantitative and qualitative study of Russian realia in Norwegian and English translations. PhD thesis, University of Oslo, 2019.

Krave M.F. Converbs in Contrast: Russian converb constructions and their English and Norwegian counterparts. PhD thesis, University of Oslo, 2011.

Mozharovskaya M. Русские субъективно-модальные частицы же и ведь и способы их отражения в норвежском языке. Master's thesis, University of Oslo, 2017.

Prušková J. Kroppslige idiomer i russisk, norsk og tsjekkisk. Master's thesis, University of Oslo, 2013.

Roos T. Изучение семантики начинательности и чистовидовой функции приставок за- и по- на основе сопоставительного анализа русского и норвежского языка. Master's thesis, University of Oslo, 2012.