Building High-Quality Datasets for Portuguese LLMs: From Common Crawl Snapshots to Industrial-Grade Corpora

10 September 2025
Thales Sales Almeida
Rodrigo Nogueira
Hélio Pedrini
arXiv 2509.08824 (abs, PDF, HTML), GitHub (871★)

Papers citing "Building High-Quality Datasets for Portuguese LLMs: From Common Crawl Snapshots to Industrial-Grade Corpora"

2 / 2 papers shown
PoETa v2: Toward More Robust Evaluation of Large Language Models in Portuguese
IEEE Access, 2025
Thales Sales Almeida
Ramon Pires
Hugo Queiroz Abonizio
Rodrigo Nogueira
Hélio Pedrini
21 Nov 2025
BRoverbs -- Measuring how much LLMs understand Portuguese proverbs
Thales Sales Almeida
Giovana K. Bonás
João Guilherme Alves Santos
10 Sep 2025