Russian Learner Corpora Research: State of the Art and Call for Action

Authors

Keywords:

Corpus linguistics, Learner corpus research, Corpus-based research, Russian language corpora, Second language acquisition, Heritage language acquisition

Abstract

As Russian language corpora and corpus-analytic tools have become more widely available and easier to use, the field of Russian language education has begun to employ corpus linguistics as an approach to understanding the dynamics of language development in users of Russian as a second and heritage language. The paper provides a brief overview of the current state of learner corpus research as a field and explores the benefits of applying corpus linguistics methods and instruments to the study of Russian. It reviews pertinent issues in corpus design, compilation, and annotation; offers an overview of existing Russian language corpora; and reports on currently available corpus-based studies of Russian as a second/heritage language. The paper concludes with a call to the field to explore the benefits of corpus-based approaches to the study of Russian.

Published

2022-12-19

How to Cite

Kisselev, O. (2022). Russian Learner Corpora Research: State of the Art and Call for Action. Bakhtiniana. Revista De Estudos Do Discurso, 18(1), Port. 8–31 / Eng. 8. Retrieved from https://revistas.pucsp.br/index.php/bakhtiniana/article/view/55747

Issue

Section

Articles