Russian Learner Corpora Research: State of the Art and Call for Action
Keywords:
Corpus linguistics, Learner corpus research, Corpus-based research, Russian language corpora, Second language acquisition, Heritage language acquisition
Abstract
With the growing availability and user-friendliness of Russian language corpora and corpus-analytic tools, the field of Russian language education has recently begun to employ corpus linguistics as an approach to understanding the dynamics of language development in users of Russian as a second and heritage language. The paper provides a brief overview of the current state of learner corpus research as a field and explores the benefits of applying corpus linguistics methods and instruments to the study of Russian. The paper reviews pertinent issues in corpus design, compilation, and annotation; offers an overview of existing Russian language corpora; and reports on currently available corpus-based studies of Russian as a second/heritage language. The paper concludes with a call to the field to explore the benefits of corpus-based approaches to the study of Russian.
License
Copyright (c) 2022 Bakhtiniana. Revista de Estudos do Discurso
This work is licensed under a Creative Commons Attribution 4.0 International License.
The authors grant the journal all copyrights relating to the work published. The concepts expressed in signed articles are the absolute and exclusive responsibility of their authors.