Image by Brett Jordan from Unsplash

This study was carried out with funding from the Research Committee of the College of Arts, Social Sciences and Celtic Studies at the University of Galway. I would like to thank Ms Cristina García Sánchez for her work as an assistant researcher on this project, Ms Beatriz Rubio Martínez for her advice and insights about corpus research and analysis, Dr Nadia Albaladejo García for her tireless support and help with the Leaving Cert questionnaires, and Ms Raquel Rodríguez Fernández for being an inspiration and for sharing with this project the primary sources collected for her Ph.D. dissertation. I would also like to acknowledge the support and participation of the members of the ELE in Éirínn Community of Practice, a cooperative community of teachers of Spanish in Ireland. For more information about this group and its projects, go to: If you want to join us, please email:

This open press book has been reformatted by the automated pressbook creator. For an easier-to-read PDF version of the monograph, you can email us at the email address above.


NOTE: This project has been developed as a pilot study and there are many limitations to its scientific scope. Its main objective is to provide a guide to common errors for teachers of Spanish as a foreign language in the Irish formal and informal educational context. I hope that it will serve as a precursor to further academic research and to the establishment of a formal computerized corpus of learners of Spanish in Ireland.

It is human to err; and the only final and deadly error, among all our errors, is denying that we have ever erred.

GK Chesterton

A learner corpus is a rich collection of data extracted from the utterances made by students of an L2. Applied linguists have been compiling such corpora so that they can be analysed at different language levels and, conversely, so that different levels of interlanguage can be better described within our communities of learners. Although Spanish has been taught in Ireland for a long time, the Spanish produced by Irish learners has not received the attention that it deserves. This small case study aims to inspire more extensive studies and to arrive at a more localized description of non-native Spanish levels in Ireland. Describing a linguistic level from non-native utterances is not an easy task, but it is a useful tool for teachers and learners alike. Learner corpora have filled this gap as ‘systematic computerized collections of texts produced by language learners’ (Nesselhauf, 2004: 125), serving as samples that help peers and teachers to define, at the very least, the limits of a level. Students have benefited from these non-native examples because they serve as realistic goals for language learning, as opposed to the unrealistic goal set by constant exposure to authentic native language, or the scripted, fictional non-native dialogues found in some of the Spanish textbooks on the market.

These collections of samples generally stem from real learner language, whether spontaneous or directed. Once the samples have been analysed, the results can be applied to curriculum design, the creation of materials, and the development of dictionaries or learner support. The best-known learner corpus is the International Corpus of Learner English (ICLE), based at the University of Louvain. Spanish learner corpora are still rare, but their number is growing across different institutions. The University of Santiago de Compostela, in conjunction with the Instituto Cervantes, is developing the Corpus de Aprendices de Español como Lengua Extranjera (CAES). The Spanish Learner Language Oral Corpora (SPLLOC) is being developed by three universities in the UK (Newcastle, Southampton and York). Ainciburu (2010) offers detailed descriptions of several projects that are building Spanish learner corpora, namely the CEDEL2 project of the WOSLAC research group at the Universidad Autónoma de Madrid and the Spanish Learner Corpus and Exercises (SLCE) at the University of Texas. In Spain, the University of Alcalá is developing a written Corpus para el Análisis de errores de aprendices de E/LE (CORANE), which covers learners with a variety of native languages. In Brazil, the USP Multilingual Learner Corpus (MLC) works with Spanish too. Overall, these projects have shown that learner corpora facilitate the study of written and oral learner language in context, accounting for learners' communicative competence and for sociolinguistic variables. An advantage of corpus studies is that they can be replicated and tested from a linguistic and a localised pedagogical perspective (Granger, Gilquin and Meunier, 2015). Ireland has a relatively small population and, within it, a small number of learners of Spanish. The common curriculum in all its provinces, except the parts of Ulster that follow the UK teaching curriculum, would allow for easier comparisons than in other English-speaking countries that have built learner corpora.

In terms of methodological tools, the majority of the learner corpora available have relied on corpus linguistics, contrastive analysis and error analysis. These methods of analysis reveal areas where learners of a certain level tend to underuse or overuse certain linguistic features compared with native-language users. These deviant utterances have traditionally been labelled as errors. Error tagging is, therefore, inherent to learner corpora. Whenever we engage in this activity, we need to create an error annotation guideline, which involves a clearly defined taxonomy of errors alongside its digital or non-digital tags. Good practice suggests that these taxonomies should be based on the observable data and on well-defined linguistic categories, or standardized, to minimize the subjectivity involved in the process (Díaz-Negrillo and Fernández-Domínguez, 2006: 85). However, there is no agreement on which taxonomy is best, and different systems are in use across different projects. This small-scale project uses these systems as a reference to develop its own taxonomy, which accounts for errors that did not fit into the categories developed in earlier projects.
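The idea of overuse and underuse mentioned above can be illustrated with a minimal sketch in Python. It is not the method used in this project, which relied on manual tagging; it only assumes that both a learner sample and a native sample are available as lists of tokens. The function names (`relative_frequencies`, `usage_ratio`) and the two toy samples are invented for illustration.

```python
from collections import Counter


def relative_frequencies(tokens, per=1000):
    """Return the frequency of each token per `per` tokens of the sample."""
    counts = Counter(tokens)
    total = len(tokens)
    return {word: count * per / total for word, count in counts.items()}


def usage_ratio(learner_tokens, native_tokens, feature):
    """Compare how often `feature` occurs in learner and native samples.

    A ratio above 1 suggests overuse by learners and a ratio below 1
    suggests underuse. Returns None when the feature does not occur in
    the native sample, because no ratio can be computed.
    """
    learner = relative_frequencies(learner_tokens).get(feature, 0.0)
    native = relative_frequencies(native_tokens).get(feature, 0.0)
    if native == 0:
        return None
    return learner / native


# Invented toy samples (hypothetical, not taken from the project's data).
learner_sample = "yo soy estudiante y yo vivo en galway".split()
native_sample = "soy estudiante y vivo en galway pero yo no trabajo".split()

ratio = usage_ratio(learner_sample, native_sample, "yo")
print(f"Learner/native ratio for 'yo': {ratio}")
```

In this toy example the ratio for the subject pronoun "yo" is above 1, which would point to overuse in the learner sample. Samples this small cannot support any real conclusion, and a proper study would also need a statistical test of the difference.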

A good system involves consistency, usability and flexibility. A few new, less traditional tags have been added to indicate the differences between Irish learners and other learners of Spanish. Digital tools were not used for error tagging because our primary sources were handwritten texts and oral recordings. The research assistant, Ms Cristina García Sánchez, and the principal investigator, Dr Pilar Alderete, agreed on a simple analogue colour-coding system for the written samples and classical sound tables for spoken errors. The linguistic areas that this project has targeted are spelling, grammar and a few lexical items, but phonetic tags have been included to illustrate some of the issues that Irish learners encounter with pronunciation. These taxonomies are not exhaustive because the scope of this research is very limited. In order to continue developing this research, I would like to invite other institutions in Ireland to collect more written and oral data at this level. Some of the grammatical tags could then be refined, enabling a better description of the difficulties that Irish learners of Spanish face at A1/A2 oral levels.