ResearchTrend.AI
Whose Language Counts as High Quality? Measuring Language Ideologies in Text Data Selection


25 January 2022
Suchin Gururangan
Dallas Card
Sarah K. Drier
E. K. Gade
Leroy Z. Wang
Zeyu Wang
Luke Zettlemoyer
Noah A. Smith
arXiv: 2201.10474

Papers citing "Whose Language Counts as High Quality? Measuring Language Ideologies in Text Data Selection"

1 / 1 papers shown
Title
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
Leo Gao
Stella Biderman
Sid Black
Laurence Golding
Travis Hoppe
...
Horace He
Anish Thite
Noa Nabeshima
Shawn Presser
Connor Leahy
AIMat
220
1,508
0
31 Dec 2020