In this paper, we present the results of two small-scale experiments aimed at verifying whether part of the Corpus Pattern Analysis (CPA) procedure, developed by Hanks (2004) to manually extract recurrent language patterns from texts, can be automated using LLMs. Specifically, we examine the performance of ChatGPT and Gemini on the task of semantic type tagging of arguments in 150 Italian sentences realising 30 verb patterns (5 sentences per pattern). We run two experiments. In the first, we prompt ChatGPT to use the CPA ontology (about 200 hierarchically organized semantic types) in the annotation task; we provide the model with 5 sentences per pattern and ask it to assign the most specific type to the argument(s) of each sentence. In the second, we prompt both ChatGPT and Gemini to perform the task without the ontology, and ask the models to assign a single label to the argument(s) of the 5 sentences. Both experiments are performed in a zero-shot setting. We evaluate the results using the existing Italian T-PAS pattern resource as a benchmark. Our results show that LLMs perform comparably well on both concrete and abstract type tagging and can therefore be used in a pilot study to support analysts in acquiring verb patterns from text.
Leveraging LLMs for Semantic Type Annotation of Verbs’ Arguments
Jezek; E. Errico
2026-01-01


