Corpus Patterns for Semantic Processing
JEZEK, ELISABETTA
2015-01-01
Abstract
This tutorial presents a corpus-driven, pattern-based empirical approach to meaning representation and computation. Patterns in text are everywhere, but techniques for identifying and processing them are still rudimentary. Patterns are not merely syntactic but syntagmatic: each pattern identifies a lexico-semantic clause structure consisting of a predicator (verb or predicative adjective) together with open-ended lexical sets of collocates in different clause roles (subject, object, prepositional argument, etc.). If NLP is to make progress in identifying and processing text meaning, pattern recognition and collocational analysis will play an essential role, because "Many, if not most meanings, require the presence of more than one word for their normal realization. ... Patterns of co-selection among words, which are much stronger than any description has yet allowed for, have a direct connection with meaning" (J. M. Sinclair, 1998). The tutorial presents methods for building patterns from corpus evidence using machine learning. It discusses some possible applications of pattern inventories and invites discussion of others. It is intended for an audience with varied backgrounds but a common interest in corpus linguistics and in computational models for meaning-related tasks in NLP. We report on methodologies for building resources for semantic processing and on their contribution to NLP tasks. The goal is to give the audience a working understanding of the methodology used to acquire corpus patterns and of their utility in NLP applications.