2505.16078
Small Language Models in the Real World: Insights from Industrial Text Classification
21 May 2025
Lujun Li
Lama Sleem
Niccolo Gentile
Geoffrey Nichil
Radu State
LLMAG
Papers citing "Small Language Models in the Real World: Insights from Industrial Text Classification"
23 / 23 papers shown
Chain of Draft: Thinking Faster by Writing Less
Silei Xu
Wenhao Xie
Lingxiao Zhao
Pengcheng He
AI4TS
LRM
525
175
0
25 Feb 2025
Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference
Benjamin Warner
Antoine Chaffin
Benjamin Clavié
Orion Weller
Oskar Hallström
...
Tom Aarsen
Nathan Cooper
Griffin Adams
Jeremy Howard
Iacopo Poli
463
402
0
18 Dec 2024
Gemma 2: Improving Open Language Models at a Practical Size
Gemma Team
Morgane Riviere
Shreya Pathak
Pier Giuseppe Sessa
Cassidy Hardin
...
Noah Fiedel
Armand Joulin
Kathleen Kenealy
Robert Dadashi
Alek Andreev
VLM
MoE
OSLM
632
1,625
0
31 Jul 2024
Small Language Models are Good Too: An Empirical Study of Zero-Shot Classification
Pierre Lepagnol
Thomas Gerald
Sahar Ghannay
Christophe Servan
Sophie Rosset
219
22
0
17 Apr 2024
How do Large Language Models Handle Multilingualism?
Yiran Zhao
Wenxuan Zhang
Guizhen Chen
Kenji Kawaguchi
Lidong Bing
LRM
381
139
0
29 Feb 2024
LLaMA: Open and Efficient Foundation Language Models
Hugo Touvron
Thibaut Lavril
Gautier Izacard
Xavier Martinet
Marie-Anne Lachaux
...
Faisal Azhar
Aurelien Rodriguez
Armand Joulin
Edouard Grave
Guillaume Lample
ALM
PILM
8.9K
18,046
0
27 Feb 2023
How Good Are GPT Models at Machine Translation? A Comprehensive Evaluation
Amr Hendy
M. Abdelrehim
Amr Sharaf
Vikas Raunak
Mohamed Gabr
Hitokazu Matsushita
Young Jin Kim
Mohamed Afify
Hany Awadalla
ELM
LM&MA
AI4CE
287
548
0
18 Feb 2023
Self-Consistency Improves Chain of Thought Reasoning in Language Models
International Conference on Learning Representations (ICLR), 2023
Xuezhi Wang
Jason W. Wei
Dale Schuurmans
Quoc Le
Ed H. Chi
Sharan Narang
Aakanksha Chowdhery
Denny Zhou
ReLM
BDL
LRM
AI4CE
2.7K
5,693
0
21 Mar 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Neural Information Processing Systems (NeurIPS), 2022
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
2.4K
15,070
0
28 Jan 2022
Which Student is Best? A Comprehensive Knowledge Distillation Exam for Task-Specific BERT Models
Made Nindyatama Nityasya
Haryo Akbarianto Wibowo
Rendi Chevi
Radityo Eko Prasojo
Alham Fikri Aji
180
7
0
03 Jan 2022
NER-BERT: A Pre-trained Model for Low-Resource Entity Tagging
Zihan Liu
Feijun Jiang
Yuxiang Hu
Chen Shi
Pascale Fung
304
43
0
01 Dec 2021
The Power of Scale for Parameter-Efficient Prompt Tuning
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021
Brian Lester
Rami Al-Rfou
Noah Constant
VPVLM
1.5K
5,049
0
18 Apr 2021
Prefix-Tuning: Optimizing Continuous Prompts for Generation
Annual Meeting of the Association for Computational Linguistics (ACL), 2021
Xiang Lisa Li
Percy Liang
656
5,287
0
01 Jan 2021
Language Models are Few-Shot Learners
Neural Information Processing Systems (NeurIPS), 2020
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
...
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
BDL
2.0K
53,198
0
28 May 2020
Longformer: The Long-Document Transformer
Iz Beltagy
Matthew E. Peters
Arman Cohan
RALM
VLM
715
4,928
0
10 Apr 2020
BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension
Annual Meeting of the Association for Computational Linguistics (ACL), 2020
M. Lewis
Yinhan Liu
Naman Goyal
Marjan Ghazvininejad
Abdel-rahman Mohamed
Omer Levy
Veselin Stoyanov
Luke Zettlemoyer
AIMat
VLM
870
12,171
0
29 Oct 2019
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Journal of Machine Learning Research (JMLR), 2020
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
AIMat
1.6K
24,131
0
23 Oct 2019
RoBERTa: A Robustly Optimized BERT Pretraining Approach
Yinhan Liu
Myle Ott
Naman Goyal
Jingfei Du
Mandar Joshi
Danqi Chen
Omer Levy
M. Lewis
Luke Zettlemoyer
Veselin Stoyanov
AIMat
4.0K
28,140
0
26 Jul 2019
Large-Scale Multi-Label Text Classification on EU Legislation
Annual Meeting of the Association for Computational Linguistics (ACL), 2019
Ilias Chalkidis
Manos Fergadiotis
Prodromos Malakasiotis
Ion Androutsopoulos
AILaw
215
246
0
05 Jun 2019
Evolutionary Data Measures: Understanding the Difficulty of Text Classification Tasks
Edward Collins
Nikolai Rozanov
M. Kaptein
198
31
0
05 Nov 2018
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM
SSL
SSeg
3.0K
109,193
0
11 Oct 2018
Generative and Discriminative Text Classification with Recurrent Neural Networks
Dani Yogatama
Chris Dyer
Wang Ling
Phil Blunsom
281
210
0
06 Mar 2017
Convolutional Neural Networks for Sentence Classification
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2014
Yoon Kim
AILaw
VLM
1.6K
13,965
0
25 Aug 2014