From Pattern Dictionary to Patternbank
JEZEK, ELISABETTA
2010-01-01
Abstract
One of the many ways in which the work of Patrick Hanks has contributed to our understanding of the organization of lexical knowledge and of the phenomenon of meaning modulation in language is the study of the complex relation between the ontological classification proposed for words and their distributional/syntagmatic behavior. More specifically, this work examines how types actually behave in context and how this behavior can be modeled in a type system that is consistent with the conceptual organization unveiled by language use. This issue touches the foundations of ontological representation and, more generally, the complex interplay between semantics and cognition (Jackendoff 2002). In this paper we focus on the tension between the semantic types (STs) assigned by verbs to their arguments and their extensional definition, that is, the paradigmatic set of words that may fill the different argument positions (lexical set, LS). This is a tension that work within the Pattern Dictionary of English Verbs (PDEV) project, coordinated by Hanks, has substantially contributed to identifying, sharpening and problematizing. After reviewing Hanks' insights on this phenomenon (section 1), we argue that the analysis of the mismatch between STs and LSs, aimed at building a corpus-based ontology for word sense disambiguation (Hanks et al. 2007), can be improved by extending the Corpus Pattern Analysis (CPA) technique used in PDEV so that it includes the annotation of verb patterns onto the corpus instances that instantiate them (section 2). This produces a resource (the "Patternbank") that allows one not only to see the patterns of each verb and to retrieve the relevant contexts (as in the initial PDEV architecture) but also to see how the elements of the patterns (the semantic types associated with the argument positions) map specifically onto the elements of the context (the words that actually instantiate the types in context).
A closer look at the benefits of pattern annotation reveals that it can be useful for capturing and studying linguistic phenomena related not only to the semantics/ontology interface but also to the semantics/syntax interface (syntactic alternations, argument dropping) (2.1), as well as for several NLP applications. To date, an annotation effort has already been initiated within PDEV. Here, we report the first steps that we have taken with the aim of building a "Patternbank" for Italian (2.2-2.3), starting from the Italian implementation of the Pattern Dictionary project.
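As a rough illustration of the mapping described in the abstract, the following Python sketch shows one way a pattern-annotated corpus instance could be represented, and how lexical sets could then be read off the annotations. It is not the format used in PDEV or in the Italian Patternbank: the class names, the slot labels, the pattern for "attend" and the three sentences are all assumptions invented for this example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Pattern:
    """A verb pattern: each argument position is assigned a semantic type (ST)."""
    verb: str
    slots: dict  # argument position -> semantic type


@dataclass
class AnnotatedInstance:
    """A corpus sentence with the pattern annotated onto its words."""
    sentence: str
    pattern: Pattern
    fillers: dict  # argument position -> word filling it in this sentence


def lexical_sets(instances):
    """Collect, per (verb, position, ST), the words observed in that position."""
    sets = defaultdict(set)
    for inst in instances:
        for position, word in inst.fillers.items():
            semantic_type = inst.pattern.slots[position]
            sets[(inst.pattern.verb, position, semantic_type)].add(word)
    return dict(sets)


# Hypothetical pattern and invented sentences, for illustration only.
ATTEND = Pattern(verb="attend", slots={"subject": "Human", "object": "Event"})

INSTANCES = [
    AnnotatedInstance("The minister attended the meeting.", ATTEND,
                      {"subject": "minister", "object": "meeting"}),
    AnnotatedInstance("Few students attended the lecture.", ATTEND,
                      {"subject": "students", "object": "lecture"}),
    # "school" is not a Human noun by ontology, yet it fills the Human slot:
    # the kind of ST/LS mismatch that the annotation makes visible.
    AnnotatedInstance("The whole school attended the funeral.", ATTEND,
                      {"subject": "school", "object": "funeral"}),
]

if __name__ == "__main__":
    for key, words in lexical_sets(INSTANCES).items():
        print(key, sorted(words))
```

Keeping the fillers on each instance, rather than only a list of contexts per pattern, is what would let the lexical set of a semantic type be derived from the corpus instead of being stipulated in advance.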