
Title |
|---|
CaseSumm: A Large-Scale Dataset for Long-Context Summarization from U.S. Supreme Court Opinions. North American Chapter of the Association for Computational Linguistics (NAACL), 2024 |
BetterBench: Assessing AI Benchmarks, Uncovering Issues, and Establishing Best Practices. Neural Information Processing Systems (NeurIPS), 2024 |
Robots in the Middle: Evaluating LLMs in Dispute Resolution. International Conference on Legal Knowledge and Information Systems (JURIX), 2024 |
Objection Overruled! Lay People can Distinguish Large Language Models from Lawyers, but still Favour Advice from an LLM. International Conference on Human Factors in Computing Systems (CHI), 2024 |
Factcheck-Bench: Fine-Grained Evaluation Benchmark for Automatic Fact-checkers. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023 |