Niklaus, Joël (2024). Decoding Legalese Without Borders: Multilingual Evaluation of Language Models on Long Legal Texts. (Thesis). Universität Bern, Bern
Text: 24niklaus_j.pdf - Thesis (26MB). Available under License Creative Commons: Attribution (CC-BY 4.0).
Abstract
Pretrained transformers have sparked an explosion of research in Natural Language Processing (NLP). Scaling up transformer-based language models in size, compute, and data has led to impressive emergent capabilities that, a mere three years ago, before the launch of GPT-3, were considered unattainable in so short a time. These advances catapulted the previously niche field of legal NLP into the mainstream, at the latest when GPT-4 passed the bar exam. Products based on GPT-4 and other Large Language Models (LLMs) are entering the market at an increasing pace, many of them targeting the legal field. This dissertation contributes to two key areas of NLP for legal text: resource curation and detailed model analysis. First, we curate an extensive set of multilingual legal datasets, train a variety of language models on them, and establish comprehensive benchmarks for evaluating LLMs in the legal domain. Second, we conduct a multidimensional analysis of model performance, focusing on properties such as explainability and calibration in the context of Legal Judgment Prediction. We introduce novel evaluation frameworks and find that, while our trained models exhibit high performance and better calibration than human experts, they do not necessarily offer improved explainability. Furthermore, we investigate the feasibility of re-identification in anonymized legal texts and conclude that large-scale re-identification using LLMs is not currently feasible. For future work, we propose exploring domain adaptation and instruction tuning to improve language model performance on legal benchmarks, and we advocate a detailed examination of dataset overlaps and model interpretability. Additionally, we emphasize the need to extend datasets to unexplored legal tasks and underrepresented jurisdictions, aiming for more comprehensive coverage of the global legal landscape in NLP resources.
Item Type: | Thesis |
---|---|
Dissertation Type: | Cumulative |
Date of Defense: | 24 January 2024 |
Subjects: | 000 Computer science, knowledge & systems; 300 Social sciences, sociology & anthropology > 340 Law; 400 Language > 410 Linguistics |
Institute / Center: | 08 Faculty of Science |
Depositing User: | Hammer Igor |
Date Deposited: | 08 Mar 2024 13:42 |
Last Modified: | 30 Mar 2024 20:12 |
URI: | https://boristheses.unibe.ch/id/eprint/4944 |