The relatively recent digitization of large textual corpora (roughly speaking, collections of written or oral texts gathered for some purpose) has opened a new era for the study of language use and, through it, of society with its changes and challenges. While linguistics has seen the development of a new branch, corpus linguistics, other disciplines such as history, literature or the social sciences can also benefit greatly from such resources. Corpora have therefore become an integral part of research in many fields, and their usefulness is now beginning to be recognized in education as well. Both the use of corpora, whether annotated or unannotated, and the practice of linguistic annotation serve to deepen learners’ knowledge of language structures.

Corpus annotation can be limited to basic aspects such as identifying lemmata (the basic, dictionary form of each word) or parts of speech (whether a word is a verb, a noun, an adjective, etc.). It can also be more complex and involve the analysis of morphology, syntax and sometimes even semantics and pragmatics. In other words, annotating a corpus is a way of enriching it with information and, at the same time, a way of learning, whether the annotator is an experienced scholar or a first-year student. To my knowledge, however, the use of annotation as a learning method is very rare, at least in the teaching of ancient languages such as Ancient Greek and Latin. There are some exceptions, though, and one scholar has even developed an entire course based on annotation.

Vanessa Gorman, professor of Ancient History at the University of Nebraska-Lincoln, has not only pioneered the use of annotated corpora in groundbreaking research (see, e.g., her co-authored paper on questions of authorship, “Approaching Questions of Text Reuse in Ancient Greek Using Computational Syntactic Stylometry”), but has also used annotation to teach ancient languages in an innovative way. Her online, open-access course “Reading Ancient Greek in the Digital Age” introduces learners to the basics of the language and prepares them to read (relatively easy) Ancient Greek prose. What sets the course apart is its use of annotation tools and of the digital environment for teaching the language. Both teaching and practice rely on Perseids, a free, user-friendly online platform that enables people to carry out morpho-syntactic annotation of (mainly Ancient Greek and Latin) texts. This is not a trivial task, as every single word needs to be precisely recognized, described and related to the other words in the sentence and to the wider context of the passage.

Learners are thus confronted from the very beginning not only with the challenge of recognizing and describing the changing forms of words in a sentence (Ancient Greek has a rather complex morphology), but also with the question of how words relate to one another to form a meaningful sentence (syntax and semantics). In this way students learn the deeper elements of grammar, a knowledge that is transferable to the study of other languages, as Gorman herself underlines. She also familiarizes learners early on with grammatical metalanguage, which is very useful when they need to consult reference tools such as dictionaries and specialized grammars. In line with current digital practices, her course also offers a collaborative way of learning, as learners annotate sentences together while thinking aloud. The fact that learners are introduced to annotation from the very beginning is, I believe, important, as the relatively complex rules of annotation are learnt together with the language. Interestingly, Gorman has achieved some promising results using this method of teaching.
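To make the task concrete, here is a minimal sketch in Python of the kind of information a morpho-syntactic annotation records for each word. It is only an illustration: the `Token` fields, the relation labels and the short Latin example sentence are assumptions made for this post and do not reproduce the data format or the tag set used by Perseids.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Token:
    """One annotated word (hypothetical structure, for illustration only)."""
    idx: int             # position of the word in the sentence, starting at 1
    form: str            # the word as it appears in the text
    lemma: str           # dictionary form
    pos: str             # part of speech
    morph: str           # morphological description
    head: Optional[int]  # idx of the governing word; None for the root
    relation: str        # syntactic relation to the governing word


# "puella rosam amat" ('the girl loves the rose'), annotated by hand.
sentence = [
    Token(1, "puella", "puella", "noun", "nominative singular feminine", 3, "subject"),
    Token(2, "rosam", "rosa", "noun", "accusative singular feminine", 3, "object"),
    Token(3, "amat", "amo", "verb", "3rd singular present active indicative", None, "predicate"),
]


def dependents(tokens: List[Token], head_idx: int) -> List[Token]:
    """Return the words that depend directly on the word with index head_idx."""
    return [t for t in tokens if t.head == head_idx]


def describe(tokens: List[Token]) -> List[str]:
    """Render each word with its analysis and the word that governs it."""
    by_idx = {t.idx: t for t in tokens}
    lines = []
    for t in tokens:
        governor = by_idx[t.head].form if t.head is not None else "ROOT"
        lines.append(f"{t.form} ({t.lemma}, {t.pos}, {t.morph}) --{t.relation}--> {governor}")
    return lines


for line in describe(sentence):
    print(line)
```

Even in this three-word sentence the annotator has to decide on a lemma, a part of speech, a full morphological description and a syntactic head for every word, which is exactly the kind of explicit reasoning that the annotation-based approach asks of learners.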

Gorman’s experience shows that annotation is a powerful resource for collaboration not only at research level, but also in educational environments. Research and teaching do not necessarily have to be closely linked, but they can share some tools, and there are many in-between situations in which, e.g., students collaborate in research projects and at the same time acquire knowledge and skills that will last a lifetime. For example, over the course of 18 months (2019-2020) the SNSF-funded Swiss project “WoPoss. A World of Possibilities. Modal pathways over an extra-long period of time: the diachrony of modality in the Latin language” offered some students the chance to work as annotators. Modality (here, the semantic notions of possibility, necessity and volition) is a notoriously challenging research field, as scholars sometimes have to deal with very subtle distinctions in the contextual uses of verbs such as possum (‘be able, can, etc.’), debeo (‘to owe, must’), and so on. In this project, collaborators corrected the automatic annotation of lemmata, parts of speech and morphological analysis, and carried out (guided) semantic annotation of modal passages. During their time as assistants, the hired students, all classicists without specific linguistic or digital training, learned the basics of the theory of modality and became acquainted with fine nuances of meaning. They annotated in a user-friendly annotation environment (Inception) in which their analysis was visualized as they went along. As none of them had been trained in linguistics, they were able to look at the passages with fresh eyes, and at the same time they developed a sensitivity to the shades of meaning of (Latin) texts.

These experiences show that fruitful exchanges between research and teaching can be and have been established in the use of corpus linguistics and, in particular, in the practice of annotation.

Author(s) of this blog post

Francesca Dell’Oro is SNSF assistant professor in Historical Linguistics at the University of Neuchâtel and Associate Fellow of the Center for Hellenic Studies of Harvard University. As the PI of the WoPoss project, she investigates with her team the emergence and development of modal constructions in Latin and the Romance languages, adopting a typological perspective and using corpus linguistic methods. She is also interested in the development of new and experimental methods to teach (ancient) languages.