Large Language Models are not Models of Natural Language: they are Corpus Models

13 December 2021
Csaba Veres

Papers citing "Large Language Models are not Models of Natural Language: they are Corpus Models"

8 / 8 papers shown
A Review of the Challenges with Massive Web-mined Corpora Used in Large Language Models Pre-Training
Michał Perełkiewicz
Rafał Poświata
48
1
0
10 Jul 2024
Evaluating Large Language Models on the GMAT: Implications for the Future of Business Education
Vahid Ashrafimoghari
Necdet Gurkan
Jordan W. Suchow
ELM
37
6
0
02 Jan 2024
The Quo Vadis of the Relationship between Language and Large Language Models
Evelina Leivada
Vittoria Dentella
Elliot Murphy
38
3
0
17 Oct 2023
Position: Key Claims in LLM Research Have a Long Tail of Footnotes
Anna Rogers
A. Luccioni
53
19
0
14 Aug 2023
The Linguistic Blind Spot of Value-Aligned Agency, Natural and Artificial
Travis LaCroix
33
3
0
02 Jul 2022
Deduplicating Training Data Makes Language Models Better
Katherine Lee
Daphne Ippolito
A. Nystrom
Chiyuan Zhang
Douglas Eck
Chris Callison-Burch
Nicholas Carlini
SyDa
242
599
0
14 Jul 2021
Measuring Coding Challenge Competence With APPS
Dan Hendrycks
Steven Basart
Saurav Kadavath
Mantas Mazeika
Akul Arora
...
Collin Burns
Samir Puranik
Horace He
D. Song
Jacob Steinhardt
ELM
AIMat
ALM
208
631
0
20 May 2021
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
Leo Gao
Stella Biderman
Sid Black
Laurence Golding
Travis Hoppe
...
Horace He
Anish Thite
Noa Nabeshima
Shawn Presser
Connor Leahy
AIMat
282
2,007
0
31 Dec 2020