Natalia Levshina (MPI for Psycholinguistics)

Parallel corpora and big questions in linguistics

Parallel corpora are indispensable in contrastive linguistics and translation studies. In my talk, I will discuss the contribution of parallel corpus research to answering big linguistic questions about universal functional biases of communication, mechanisms of grammaticalization and cultural variation. I will also show how parallel corpora can help us to create semantic maps – a popular tool for typological research. These ideas will be illustrated by the following case studies:

For all these case studies, I will use data from the ParTy corpus of film subtitles, applying multivariate statistical methods (random forests, multidimensional scaling, hierarchical cluster analysis) and graph-theoretical approaches to model cross-linguistic variation.


Brown, R., & Gilman, A. 1960. The Pronouns of Power and Solidarity. In: T. A. Sebeok (ed.), Style in Language, 253–276. Cambridge, MA: MIT Press.

Dixon, R. M. W. 2000. A typology of causatives: form, syntax and meaning. In: R. M. W. Dixon & A. Y. Aikhenvald (eds.), Changing valency: Case studies in transitivity, 30–83. Cambridge: Cambridge University Press.

Haiman, J. 1983. Iconic and economic motivation. Language 59(4): 781–819.

Haspelmath, M. 2003. The geometry of grammatical meaning: Semantic maps and cross-linguistic comparison. In: M. Tomasello (ed.), The New Psychology of Language: Cognitive and Functional Approaches to Language Structure, Vol. 2, 211–242. Mahwah, NJ: Lawrence Erlbaum Associates.

Haspelmath, M. 2008. Frequency vs. iconicity in explaining grammatical asymmetries. Cognitive Linguistics 19(1): 1–33.

Levshina, N. 2015. European analytic causatives as a comparative concept: Evidence from a parallel corpus of film subtitles. Folia Linguistica 49(2): 487–520.

Levshina, N. 2017. A multivariate study of T/V forms in European languages based on a parallel corpus of film subtitles. Research in Language 15(2): 153–172.

Soares da Silva, A. 2012. Stages of grammaticalization of causative verbs and constructions in Portuguese, Spanish, French and Italian. Folia Linguistica 46(2): 513–552.

Talmy, L. 2000. Toward a Cognitive Semantics. Cambridge, MA: MIT Press.

van der Auwera, J. 2013. Semantic maps, for synchronic and diachronic typology. In: A. G. Ramat, C. Mauri, & P. Molinelli (eds.), Synchrony and Diachrony: A dynamic interface, 153–176. Amsterdam/Philadelphia: John Benjamins.