
arXiv: 2511.17808
PoETa v2: Toward More Robust Evaluation of Large Language Models in Portuguese

IEEE Access, 2025
21 November 2025
Thales Sales Almeida
Ramon Pires
Hugo Queiroz Abonizio
Main: 9 pages · Bibliography: 5 pages · Appendix: 18 pages · 7 figures · 5 tables
Abstract

Large Language Models (LLMs) exhibit significant variations in performance across linguistic and cultural contexts, underscoring the need for systematic evaluation in diverse languages. In this work, we present the most extensive evaluation of LLMs for the Portuguese language to date. Leveraging our newly introduced PoETa v2 benchmark -- a comprehensive suite of over 40 tasks in Portuguese -- we assess more than 20 models covering a broad spectrum of training scales and computational resources. Our study reveals how computational investment and language-specific adaptation impact performance in Portuguese, while also analyzing performance gaps in comparison to equivalent tasks in English. Through this benchmark and analysis, PoETa v2 lays the groundwork for future research on Portuguese language modeling and evaluation. The benchmark is available at this https URL.
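As a rough illustration of how scores from a multi-task benchmark like the one described might be aggregated into per-task accuracies and an overall macro average, here is a minimal sketch. The task names, predictions, and gold labels below are illustrative assumptions, not drawn from PoETa v2 itself.

```python
# Hypothetical aggregation of multi-task benchmark results.
# Task names and records are made up for illustration.
from collections import defaultdict

def aggregate_scores(records):
    """Compute per-task accuracy and a macro (unweighted) average.

    `records` is an iterable of (task, prediction, gold) triples.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for task, pred, gold in records:
        total[task] += 1
        if pred == gold:
            correct[task] += 1
    per_task = {t: correct[t] / total[t] for t in total}
    macro = sum(per_task.values()) / len(per_task)
    return per_task, macro

# Illustrative records for two hypothetical Portuguese tasks
records = [
    ("task_rte", "entailment", "entailment"),
    ("task_rte", "none", "entailment"),
    ("task_qa", "C", "C"),
    ("task_qa", "B", "C"),
]
per_task, macro = aggregate_scores(records)
print(per_task)   # {'task_rte': 0.5, 'task_qa': 0.5}
print(macro)      # 0.5
```

A macro average weights every task equally regardless of how many examples it contains, which is a common choice when tasks in a suite differ widely in size.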
