Small Language Models in the Real World: Insights from Industrial Text Classification

21 May 2025

Papers citing "Small Language Models in the Real World: Insights from Industrial Text Classification"

23 / 23 papers shown

Chain of Draft: Thinking Faster by Writing Less

525

175

25 Feb 2025

Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference

...

463

402

18 Dec 2024

Gemma 2: Improving Open Language Models at a Practical Size

Gemma Team

Morgane Riviere

...

632

1,625

31 Jul 2024

Small Language Models are Good Too: An Empirical Study of Zero-Shot Classification

219

17 Apr 2024

How do Large Language Models Handle Multilingualism?

381

139

29 Feb 2024

LLaMA: Open and Efficient Foundation Language Models

...

8.9K

18,046

27 Feb 2023

How Good Are GPT Models at Machine Translation? A Comprehensive Evaluation

Young Jin Kim

287

548

18 Feb 2023

Self-Consistency Improves Chain of Thought Reasoning in Language Models

International Conference on Learning Representations (ICLR), 2022

2.7K

5,693

21 Mar 2022

Chain-of-Thought Prompting Elicits Reasoning in Large Language Models

Neural Information Processing Systems (NeurIPS), 2022

2.4K

15,070

28 Jan 2022

Which Student is Best? A Comprehensive Knowledge Distillation Exam for Task-Specific BERT Models

Made Nindyatama Nityasya

Haryo Akbarianto Wibowo

Rendi Chevi

Radityo Eko Prasojo

Alham Fikri Aji

180

03 Jan 2022

NER-BERT: A Pre-trained Model for Low-Resource Entity Tagging

304

01 Dec 2021

The Power of Scale for Parameter-Efficient Prompt Tuning

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021

1.5K

5,049

18 Apr 2021

Prefix-Tuning: Optimizing Continuous Prompts for Generation

Annual Meeting of the Association for Computational Linguistics (ACL), 2021

Xiang Lisa Li

Percy Liang

656

5,287

01 Jan 2021

Language Models are Few-Shot Learners

Neural Information Processing Systems (NeurIPS), 2020

...

2.0K

53,198

28 May 2020

Longformer: The Long-Document Transformer

Iz Beltagy

Matthew E. Peters

Arman Cohan

RALM VLM

715

4,928

10 Apr 2020

BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension

Annual Meeting of the Association for Computational Linguistics (ACL), 2019

Luke Zettlemoyer

870

12,171

29 Oct 2019

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

Journal of Machine Learning Research (JMLR), 2019

Sharan Narang

1.6K

24,131

23 Oct 2019

RoBERTa: A Robustly Optimized BERT Pretraining Approach

Luke Zettlemoyer

4.0K

28,140

26 Jul 2019

Large-Scale Multi-Label Text Classification on EU Legislation

Annual Meeting of the Association for Computational Linguistics (ACL), 2019

Ilias Chalkidis

Manos Fergadiotis

Prodromos Malakasiotis

Ion Androutsopoulos

AILaw

215

246

05 Jun 2019

Evolutionary Data Measures: Understanding the Difficulty of Text Classification Tasks

Edward Collins

Nikolai Rozanov

M. Kaptein

198

05 Nov 2018

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

3.0K

109,193

11 Oct 2018

Generative and Discriminative Text Classification with Recurrent Neural Networks

281

210

06 Mar 2017

Convolutional Neural Networks for Sentence Classification

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2014

Yoon Kim

AILaw VLM

1.6K

13,965

25 Aug 2014