Subcategorization of Italian Verbs with LLMs and T-PAS

Jezek E.
2024-01-01

Abstract

This study explores the application of Large Language Models (LLMs) to verb subcategorization in Italian, focusing on the identification and classification of syntactic patterns in sentences. While LLMs have made lexical analysis more implicit, explicit argument structure identification remains crucial in domain-specific contexts. The research leverages T-PAS, a rich lexical resource for Italian verbs, to fine-tune the open multilingual model Mistral 7B using the Iterative Reasoning Preference Optimization (IRPO) technique. This approach aims to enhance the recognition and extraction of verbal patterns from Italian sentences, addressing challenges in resource quality, coverage, and frame extraction methods. By combining curated lexical-semantic resources with neural language models, this work contributes to improving verb subcategorization tasks, particularly for the Italian language, and demonstrates the potential of LLMs in refining linguistic analysis tools.
2024
Proceedings of the Tenth Italian Conference on Computational Linguistics (CLiC-it 2024)
Dell'Orletta F., Lenci A., Montemagni S., Sprugnoli R.
Language & Linguistics
AI, Robotics & Automatic Control
Anonymous experts
English
contribution
CLiC-it 2024 – Tenth Italian Conference on Computational Linguistics
4 - 6 December 2024
Pisa
International
Electronic
1
6
6
9791221070606
CEUR Workshop Proceedings
NLP, T-PAS, Verb Subcategorization, Mistral, CLiC-it
https://ceur-ws.org/Vol-3878/99_main_long.pdf
no
none
Simonetti, L.; Jezek, E.; Vetere, G.
273
info:eu-repo/semantics/conferenceObject
3
4 Contribution in Conference Proceedings (Proceeding)::4.1 Contribution in conference proceedings

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11571/1514655