Abstract
This study investigates cross-linguistic influence ('transfer') in Norwegian interlanguage using predictive data mining techniques, with a focus on lexical transfer. The impetus for the work came from a series of studies (Jarvis & Crossley 2012) that explore the 'detection-based approach' to language transfer.
The following research questions are addressed:
1. Can data mining techniques be used to identify the L1 background of Norwegian language learners on the basis of their use of lexical features of the target language?
2. If so, what are the best predictors of L1 background?
3. And can those predictors be traced to cross-linguistic influence?
The study utilizes data from Norsk andrespråkskorpus (ASK), the Norwegian Second Language Corpus housed at the University of Bergen, and draws on resources from the ASKeladden project. The source data consists of texts written by 1,736 second language learners of Norwegian from ten different L1 backgrounds, and a control corpus of 200 texts written by native speakers. Word frequencies computed from this data are analysed using multivariate statistical methods that include analysis of variance and linear discriminant analysis, and the results are subjected to contrastive analysis.
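As a rough illustration of the first analytical step described above, the sketch below computes per-text relative word frequencies and ranks words by a one-way analysis of variance F statistic across L1 groups. It is a minimal sketch under stated assumptions, not the procedure used in the study: the function names (`relative_frequencies`, `anova_f`, `rank_features`), the per-1,000-token normalisation, and the toy sentences are all invented for illustration and do not come from ASK. Words with a high F value would be candidate inputs to a discriminant analysis.

```python
from collections import Counter


def relative_frequencies(text, vocabulary):
    """Return each vocabulary word's frequency per 1,000 tokens of `text`."""
    tokens = text.lower().split()
    counts = Counter(tokens)
    total = len(tokens) or 1  # guard against empty texts
    return [1000 * counts[word] / total for word in vocabulary]


def anova_f(groups):
    """One-way ANOVA F statistic for a list of groups of observations."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    if ss_within == 0:
        # No variation inside the groups: perfectly separated or identical.
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def rank_features(texts_by_l1, vocabulary):
    """Rank vocabulary words by how strongly their frequency varies across L1 groups."""
    scores = []
    for i, word in enumerate(vocabulary):
        groups = [
            [relative_frequencies(text, vocabulary)[i] for text in texts]
            for texts in texts_by_l1.values()
        ]
        scores.append((word, anova_f(groups)))
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Invented toy data (hypothetical, not corpus material): two texts per L1 group.
toy_texts = {
    "dutch": ["jeg skal dra", "vi skal se"],
    "russian": ["jeg har bil", "hun har bok"],
}
ranking = rank_features(toy_texts, ["skal", "jeg"])
for word, f_value in ranking:
    print(word, f_value)
```

In this toy setup the word that occurs in only one group ranks first. A real analysis would need far more texts per group and a proper tokeniser, and would pass the selected frequencies to a classifier such as linear discriminant analysis with cross-validation.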
The combination of discriminant analysis and contrastive analysis produces all three types of evidence called for by Jarvis (2000) in his methodological requirements for language transfer research: intragroup homogeneity, intergroup heterogeneity and cross-language congruity. Well-known transfer effects, such as the tendency for Russian learners to omit indefinite articles, are confirmed, and other, more subtle patterns of learner language are revealed, such as the tendency amongst Dutch learners to overuse the modal verb skal to a far greater extent than other learners. In addition to confirming the reality of lexical transfer, these results provide abundant material for further research, while the methodology employed can be harnessed in many areas of linguistic research.