The added value of diachronic treebanks for historical linguistics

Luraghi S.; Passarotti M.
2018-01-01

Abstract

Over the last few decades, the wide diffusion of digital technology and the growing ease of transferring information via the Internet have placed an enormous amount of textual data at scholars' disposal. The vastly increased availability of primary sources has radically changed the everyday work of scholars in the humanities, who can now access, query, and process a wealth of empirical evidence as never before. This development also encompasses ancient languages. In the 1980s and 1990s, the first aim was to digitize textual data and make them available on CD-ROM and online. Later, the need for linguistic annotation gave rise to projects aimed at building corpora enhanced with increasingly complex layers of metalinguistic information, such as part-of-speech tagging and syntactic annotation, which made it possible to run precise queries for particular linguistic phenomena. Several of these syntactically annotated corpora, or treebanks, have now reached a mature state and provide representative selections of texts for several diachronic stages of the language in question. These resources allow a new approach to diachronic studies of syntactic phenomena, an area in which scholars previously had to content themselves with data on a much smaller scale.

Use this identifier to cite or link to this document: https://hdl.handle.net/11571/1309306