2004.08900
Cited By
The Cost of Training NLP Models: A Concise Overview
19 April 2020
Or Sharir
Barak Peleg
Y. Shoham
Papers citing
"The Cost of Training NLP Models: A Concise Overview"
50 / 106 papers shown
Confidential LLM Inference: Performance and Cost Across CPU and GPU TEEs
Marcin Chrapek
Marcin Copik
Etienne Mettaz
Torsten Hoefler
117
4
0
23 Sep 2025
SLM-Bench: A Comprehensive Benchmark of Small Language Models on Environmental Impacts--Extended Version
Nghiem Thanh Pham
Tung Kieu
Duc-Manh Nguyen
Son Ha Xuan
Nghia Duong-Trung
Danh Le-Phuoc
205
4
0
21 Aug 2025
Spark Transformer: Reactivating Sparsity in FFN and Attention
Chong You
Kan Wu
Zhipeng Jia
Lin Chen
Srinadh Bhojanapalli
...
Felix X. Yu
Prateek Jain
David Culler
Henry M. Levy
Sanjiv Kumar
294
5
0
07 Jun 2025
QUPID: Quantified Understanding for Enhanced Performance, Insights, and Decisions in Korean Search Engines
Ohjoon Kwon
Changsu Lee
Jihye Back
Lim Sun Suk
Inho Kang
Donghyeon Jeon
361
1
0
12 May 2025
Harnessing uncertainty when learning through Equilibrium Propagation in neural networks
Jonathan Peters
Philippe Talatchian
258
0
0
28 Mar 2025
Fewer May Be Better: Enhancing Offline Reinforcement Learning with Reduced Dataset
International Conference on Learning Representations (ICLR), 2025
Yiqin Yang
Quanwei Wang
Chenghao Li
Hao Hu
Chengjie Wu
...
Dianyu Zhong
Ziyou Zhang
Qianchuan Zhao
Chongjie Zhang
Xu Bo
OffRL
304
1
0
26 Feb 2025
TAPT: Test-Time Adversarial Prompt Tuning for Robust Inference in Vision-Language Models
Computer Vision and Pattern Recognition (CVPR), 2024
Xin Wang
Kai-xiang Chen
Jiaming Zhang
Yue Yu
Jiabo He
AAML
VPVLM
VLM
439
20
0
20 Nov 2024
Understanding Adam Requires Better Rotation Dependent Assumptions
Tianyue H. Zhang
Lucas Maes
Alexia Jolicoeur-Martineau
Damien Scieur
Simon Lacoste-Julien
Charles Guille-Escuret
346
9
0
25 Oct 2024
Adaptive Data Optimization: Dynamic Sample Selection with Scaling Laws
International Conference on Learning Representations (ICLR), 2024
Yiding Jiang
Allan Zhou
Zhili Feng
Sadhika Malladi
J. Zico Kolter
307
42
0
15 Oct 2024
Fortify Your Foundations: Practical Privacy and Security for Foundation Model Deployments In The Cloud
Marcin Chrapek
Anjo Vahldiek-Oberwagner
Marcin Spoczynski
Scott Constable
Mona Vij
Torsten Hoefler
375
6
0
08 Oct 2024
Taylor Unswift: Secured Weight Release for Large Language Models via Taylor Expansion
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Guanchu Wang
Yu-Neng Chuang
Ruixiang Tang
Shaochen Zhong
Jiayi Yuan
...
Zirui Liu
Vipin Chaudhary
Shuai Xu
James Caverlee
Helen Zhou
PILM
495
4
0
06 Oct 2024
On Tables with Numbers, with Numbers
Konstantinos Kogkalidis
S. Chatzikyriakidis
LMTD
496
3
0
12 Aug 2024
AdvQDet: Detecting Query-Based Adversarial Attacks with Adversarial Contrastive Prompt Tuning
ACM Multimedia (MM), 2024
Xin Wang
Kai-xiang Chen
Jiabo He
Zhineng Chen
Yue Yu
Yu-Gang Jiang
AAML
361
11
0
04 Aug 2024
DeepCodeProbe: Towards Understanding What Models Trained on Code Learn
Vahid Majdinasab
Amin Nikanjam
Foutse Khomh
318
2
0
11 Jul 2024
Optimizing Language Model's Reasoning Abilities with Weak Supervision
Yongqi Tong
Sizhe Wang
Dawei Li
Yifan Wang
Simeng Han
Zi Lin
Chengsong Huang
Jiaxin Huang
Jingbo Shang
LRM
ReLM
423
14
0
07 May 2024
Analyzing the Role of Semantic Representations in the Era of Large Language Models
North American Chapter of the Association for Computational Linguistics (NAACL), 2024
Zhijing Jin
Yuen Chen
Fernando Gonzalez
Jiarui Liu
Jiayi Zhang
Julian Michael
Bernhard Schölkopf
Mona T. Diab
280
17
0
02 May 2024
LLeMpower: Understanding Disparities in the Control and Access of Large Language Models
Vishwas Sathish
Hannah Lin
Aditya K Kamath
Anish Nyayachavadi
252
9
0
14 Apr 2024
BioMedLM: A 2.7B Parameter Language Model Trained On Biomedical Text
Elliot Bolton
Abhinav Venigalla
Michihiro Yasunaga
David Leo Wright Hall
Betty Xiong
...
R. Daneshjou
Jonathan Frankle
Abigail Z. Jacobs
Michael Carbin
Christopher D. Manning
LM&MA
MedIm
357
124
0
27 Mar 2024
Improving Sampling Methods for Fine-tuning SentenceBERT in Text Streams
International Conference on Pattern Recognition (ICPR), 2024
C. M. Garcia
A. L. Koerich
A. Britto
J. P. Barddal
261
1
0
18 Mar 2024
GenOL: Generating Diverse Examples for Name-only Online Learning
Minhyuk Seo
Diganta Misra
Seongwon Cho
Minjae Lee
Jonghyun Choi
Seon Joo Kim
SyDa
447
13
0
16 Mar 2024
Knowledge Conflicts for LLMs: A Survey
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Rongwu Xu
Zehan Qi
Zhijiang Guo
Cunxiang Wang
Hongru Wang
Yue Zhang
Wei Xu
1.3K
240
0
13 Mar 2024
Copyleft for Alleviating AIGC Copyright Dilemma: What-if Analysis, Public Perception and Implications
Xinwei Guo
Yujun Li
Yafeng Peng
Xuetao Wei
216
3
0
19 Feb 2024
A Sober Look at LLMs for Material Discovery: Are They Actually Good for Bayesian Optimization Over Molecules?
Agustinus Kristiadi
Felix Strieth-Kalthoff
Marta Skreta
Pascal Poupart
Alán Aspuru-Guzik
Geoff Pleiss
430
52
0
07 Feb 2024
Efficient Prompt Caching via Embedding Similarity
Hanlin Zhu
Banghua Zhu
Jiantao Jiao
RALM
240
13
0
02 Feb 2024
The Compute Divide in Machine Learning: A Threat to Academic Contribution and Scrutiny?
T. Besiroglu
S. Bergerson
Amelia Michael
Lennart Heim
Xueyun Luo
Neil Thompson
320
29
0
04 Jan 2024
Train 'n Trade: Foundations of Parameter Markets
Neural Information Processing Systems (NeurIPS), 2023
Tzu-Heng Huang
Harit Vishwakarma
Frederic Sala
AIFin
233
4
0
07 Dec 2023
Collaboration or Corporate Capture? Quantifying NLP's Reliance on Industry Artifacts and Contributions
Will Aitken
Mohamed Abdalla
K. Rudie
Catherine Stinson
270
4
0
06 Dec 2023
ASPEN: High-Throughput LoRA Fine-Tuning of Large Language Models with a Single GPU
Proceedings of the VLDB Endowment (PVLDB), 2023
Zhengmao Ye
Dengchun Li
Jingqi Tian
Tingfeng Lan
Jie Zuo
...
Hui Lu
Yexi Jiang
Jian Sha
Ke Zhang
Mingjie Tang
361
7
0
05 Dec 2023
Refine, Discriminate and Align: Stealing Encoders via Sample-Wise Prototypes and Multi-Relational Extraction
European Conference on Computer Vision (ECCV), 2023
Shuchi Wu
Chuan Ma
Kang Wei
Xiaogang Xu
Ming Ding
Yuwen Qian
Tao Xiang
295
1
0
01 Dec 2023
Generalisable Agents for Neural Network Optimisation
Kale-ab Tessera
C. Tilbury
Sasha Abramowitz
Ruan de Kock
Omayma Mahjoub
Benjamin Rosman
Sara Hooker
Arnu Pretorius
AI4CE
241
0
0
30 Nov 2023
Efficient Transformer Knowledge Distillation: A Performance Review
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Nathan Brown
Ashton Williamson
Tahj Anderson
Logan Lawrence
VLM
184
11
0
22 Nov 2023
Show Your Work with Confidence: Confidence Bands for Tuning Curves
Nicholas Lourie
Kyunghyun Cho
He He
227
3
0
16 Nov 2023
Exploring Dataset-Scale Indicators of Data Quality
Ben Feuer
Chinmay Hegde
241
1
0
07 Nov 2023
KAKURENBO: Adaptively Hiding Samples in Deep Neural Network Training
Truong Thao Nguyen
Balazs Gerofi
Edgar Josafat Martinez-Noriega
François Trahay
Mohamed Wahib
287
2
0
16 Oct 2023
"A Nova Eletricidade: Aplicações, Riscos e Tendências da IA Moderna" -- "The New Electricity": Applications, Risks, and Trends in Current AI
A. Bazzan
Anderson R. Tavares
André G. Pereira
C. R. Jung
Jacob Scharcanski
J. Carbonera
Luís C. Lamb
Mariana Recamonde Mendoza
T. L. T. D. Silveira
V. P. Moreira
216
0
0
08 Oct 2023
Beyond Labeling Oracles: What does it mean to steal ML models?
Avital Shafran
Ilia Shumailov
Murat A. Erdogdu
Nicolas Papernot
AAML
423
5
0
03 Oct 2023
Challenges and Opportunities of Using Transformer-Based Multi-Task Learning in NLP Through ML Lifecycle: A Survey
Lovre Torbarina
Tin Ferkovic
Lukasz Roguski
Velimir Mihelčić
Bruno Šarlija
Z. Kraljevic
265
8
0
16 Aug 2023
Using Artificial Populations to Study Psychological Phenomena in Neural Models
AAAI Conference on Artificial Intelligence (AAAI), 2023
Jesse Roberts
Kyle Moore
Drew Wilenzick
Doug Fisher
303
8
0
15 Aug 2023
RAI Guidelines: Method for Generating Responsible AI Guidelines Grounded in Regulations and Usable by (Non-)Technical Roles
Marios Constantinides
Edyta Bogucka
Daniele Quercia
Susanna Kallio
Mohammad Tahaei
320
29
0
27 Jul 2023
Improving Retrieval-Augmented Large Language Models via Data Importance Learning
Xiaozhong Lyu
Stefan Grafberger
Samantha Biegel
Shaopeng Wei
Meng Cao
Sebastian Schelter
Ce Zhang
RALM
218
22
0
06 Jul 2023
Mirage: Towards Low-interruption Services on Batch GPU Clusters with Reinforcement Learning
International Conference for High Performance Computing, Networking, Storage and Analysis (SC), 2023
Qi-Dong Ding
Pengfei Zheng
Shreyas Kudari
Shivaram Venkataraman
Zhao-jie Zhang
VLM
OffRL
234
6
0
25 Jun 2023
Document Image Cleaning using Budget-Aware Black-Box Approximation
Ganesh Tata
Katyani Singh
E. V. Oeveren
Nilanjan Ray
AAML
185
0
0
22 Jun 2023
Lost in Translation: Large Language Models in Non-English Content Analysis
Gabriel Nicholas
Aliya Bhatia
ELM
301
65
0
12 Jun 2023
Evaluating the Social Impact of Generative AI Systems in Systems and Society
Irene Solaiman
Zeerak Talat
William Agnew
Lama Ahmad
Dylan K. Baker
...
Marie-Therese Png
Shubham Singh
A. Strait
Lukas Struppek
Arjun Subramonian
ELM
EGVM
563
161
0
09 Jun 2023
On Optimal Caching and Model Multiplexing for Large Model Inference
Banghua Zhu
Ying Sheng
Lianmin Zheng
Clark W. Barrett
Sai Li
Jiantao Jiao
410
28
0
03 Jun 2023
DoWG Unleashed: An Efficient Universal Parameter-Free Gradient Descent Method
Neural Information Processing Systems (NeurIPS), 2023
Ahmed Khaled
Konstantin Mishchenko
Chi Jin
ODL
479
44
0
25 May 2023
Automated Tensor Model Parallelism with Overlapped Communication for Efficient Foundation Model Training
IEEE Transactions on Parallel and Distributed Systems (TPDS), 2023
Shengwei Li
Zhiquan Lai
Yanqi Hao
Weijie Liu
Ke-shi Ge
Xiaoge Deng
Dongsheng Li
KaiCheng Lu
207
11
0
25 May 2023
Annotation Imputation to Individualize Predictions: Initial Studies on Distribution Dynamics and Model Predictions
London Lowmanstone
Ruyuan Wan
Risako Owan
Jaehyung Kim
Luan Tuyen Chau
303
1
0
24 May 2023
MoMo: Momentum Models for Adaptive Learning Rates
International Conference on Machine Learning (ICML), 2023
Fabian Schaipp
Ruben Ohana
Michael Eickenberg
Aaron Defazio
Robert Mansel Gower
407
21
0
12 May 2023
INGENIOUS: Using Informative Data Subsets for Efficient Pre-Training of Language Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
H. S. V. N. S. K. Renduchintala
Krishnateja Killamsetty
S. Bhatia
Milan Aggarwal
Ganesh Ramakrishnan
Rishabh K. Iyer
Balaji Krishnamurthy
AIFin
166
4
0
11 May 2023