The Cost of Training NLP Models: A Concise Overview

19 April 2020
Or Sharir
Barak Peleg
Y. Shoham

Papers citing "The Cost of Training NLP Models: A Concise Overview"

50 / 104 papers shown
QUPID: Quantified Understanding for Enhanced Performance, Insights, and Decisions in Korean Search Engines
Ohjoon Kwon
Changsu Lee
Jihye Back
Lim Sun Suk
Inho Kang
Donghyeon Jeon
40
0
0
12 May 2025
Harnessing uncertainty when learning through Equilibrium Propagation in neural networks
Jonathan Peters
Philippe Talatchian
37
0
0
28 Mar 2025
Fewer May Be Better: Enhancing Offline Reinforcement Learning with Reduced Dataset
Yiqin Yang
Quanwei Wang
Chenghao Li
Hao Hu
Chengjie Wu
...
Dianyu Zhong
Ziyou Zhang
Qianchuan Zhao
Chongjie Zhang
Xu Bo
OffRL
47
0
0
26 Feb 2025
TAPT: Test-Time Adversarial Prompt Tuning for Robust Inference in Vision-Language Models
Xin Wang
Kai-xiang Chen
Jiaming Zhang
Jingjing Chen
Xingjun Ma
AAML
VPVLM
VLM
83
1
0
20 Nov 2024
Understanding Adam Requires Better Rotation Dependent Assumptions
Lucas Maes
Tianyue H. Zhang
Alexia Jolicoeur-Martineau
Ioannis Mitliagkas
Damien Scieur
Simon Lacoste-Julien
Charles Guille-Escuret
32
3
0
25 Oct 2024
Adaptive Data Optimization: Dynamic Sample Selection with Scaling Laws
Yiding Jiang
Allan Zhou
Zhili Feng
Sadhika Malladi
J. Zico Kolter
39
15
0
15 Oct 2024
Fortify Your Foundations: Practical Privacy and Security for Foundation Model Deployments In The Cloud
Marcin Chrapek
Anjo Vahldiek-Oberwagner
Marcin Spoczynski
Scott Constable
Mona Vij
Torsten Hoefler
35
1
0
08 Oct 2024
Taylor Unswift: Secured Weight Release for Large Language Models via Taylor Expansion
Guanchu Wang
Yu-Neng Chuang
Ruixiang Tang
Shaochen Zhong
Jiayi Yuan
...
Zirui Liu
V. Chaudhary
Shuai Xu
James Caverlee
Xia Hu
PILM
76
1
0
06 Oct 2024
On Tables with Numbers, with Numbers
Konstantinos Kogkalidis
S. Chatzikyriakidis
LMTD
24
0
0
12 Aug 2024
AdvQDet: Detecting Query-Based Adversarial Attacks with Adversarial Contrastive Prompt Tuning
Xin Wang
Kai-xiang Chen
Xingjun Ma
Zhineng Chen
Jingjing Chen
Yu-Gang Jiang
AAML
38
3
0
04 Aug 2024
DeepCodeProbe: Towards Understanding What Models Trained on Code Learn
Vahid Majdinasab
Amin Nikanjam
Foutse Khomh
40
1
0
11 Jul 2024
Optimizing Language Model's Reasoning Abilities with Weak Supervision
Yongqi Tong
Sizhe Wang
Dawei Li
Yifan Wang
Simeng Han
Zi Lin
Chengsong Huang
Jiaxin Huang
Jingbo Shang
LRM
ReLM
36
8
0
07 May 2024
Analyzing the Role of Semantic Representations in the Era of Large Language Models
Zhijing Jin
Yuen Chen
Fernando Gonzalez
Jiarui Liu
Jiayi Zhang
Julian Michael
Bernhard Schölkopf
Mona T. Diab
57
4
0
02 May 2024
LLeMpower: Understanding Disparities in the Control and Access of Large Language Models
Vishwas Sathish
Hannah Lin
Aditya K Kamath
Anish Nyayachavadi
26
4
0
14 Apr 2024
BioMedLM: A 2.7B Parameter Language Model Trained On Biomedical Text
Elliot Bolton
Abhinav Venigalla
Michihiro Yasunaga
David Leo Wright Hall
Betty Xiong
...
R. Daneshjou
Jonathan Frankle
Percy Liang
Michael Carbin
Christopher D. Manning
LM&MA
MedIm
32
50
0
27 Mar 2024
Improving Sampling Methods for Fine-tuning SentenceBERT in Text Streams
C. M. Garcia
A. L. Koerich
A. Britto
J. P. Barddal
24
0
0
18 Mar 2024
Just Say the Name: Online Continual Learning with Category Names Only via Data Generation
Minhyuk Seo
Diganta Misra
Seongwon Cho
Minjae Lee
Jonghyun Choi
CLL
35
7
0
16 Mar 2024
Knowledge Conflicts for LLMs: A Survey
Rongwu Xu
Zehan Qi
Zhijiang Guo
Cunxiang Wang
Hongru Wang
Yue Zhang
Wei Xu
198
92
0
13 Mar 2024
Copyleft for Alleviating AIGC Copyright Dilemma: What-if Analysis, Public Perception and Implications
Xinwei Guo
Yujun Li
Yafeng Peng
Xuetao Wei
25
2
0
19 Feb 2024
A Sober Look at LLMs for Material Discovery: Are They Actually Good for Bayesian Optimization Over Molecules?
Agustinus Kristiadi
Felix Strieth-Kalthoff
Marta Skreta
Pascal Poupart
Alán Aspuru-Guzik
Geoff Pleiss
30
21
0
07 Feb 2024
Efficient Prompt Caching via Embedding Similarity
Hanlin Zhu
Banghua Zhu
Jiantao Jiao
RALM
21
9
0
02 Feb 2024
The Compute Divide in Machine Learning: A Threat to Academic Contribution and Scrutiny?
T. Besiroglu
S. Bergerson
Amelia Michael
Lennart Heim
Xueyun Luo
Neil Thompson
15
11
0
04 Jan 2024
Train 'n Trade: Foundations of Parameter Markets
Tzu-Heng Huang
Harit Vishwakarma
Frederic Sala
AIFin
24
2
0
07 Dec 2023
Collaboration or Corporate Capture? Quantifying NLP's Reliance on Industry Artifacts and Contributions
Will Aitken
Mohamed Abdalla
K. Rudie
Catherine Stinson
25
0
0
06 Dec 2023
ASPEN: High-Throughput LoRA Fine-Tuning of Large Language Models with a Single GPU
Zhengmao Ye
Dengchun Li
Jingqi Tian
Tingfeng Lan
Jie Zuo
...
Hui Lu
Yexi Jiang
Jian Sha
Ke Zhang
Mingjie Tang
91
7
0
05 Dec 2023
Refine, Discriminate and Align: Stealing Encoders via Sample-Wise Prototypes and Multi-Relational Extraction
Shuchi Wu
Chuan Ma
Kang Wei
Xiaogang Xu
Ming Ding
Yuwen Qian
Tao Xiang
13
0
0
01 Dec 2023
Generalisable Agents for Neural Network Optimisation
Kale-ab Tessera
C. Tilbury
Sasha Abramowitz
Ruan de Kock
Omayma Mahjoub
Benjamin Rosman
Sara Hooker
Arnu Pretorius
AI4CE
20
0
0
30 Nov 2023
Efficient Transformer Knowledge Distillation: A Performance Review
Nathan Brown
Ashton Williamson
Tahj Anderson
Logan Lawrence
VLM
17
5
0
22 Nov 2023
Show Your Work with Confidence: Confidence Bands for Tuning Curves
Nicholas Lourie
Kyunghyun Cho
He He
13
2
0
16 Nov 2023
Exploring Dataset-Scale Indicators of Data Quality
Ben Feuer
Chinmay Hegde
19
1
0
07 Nov 2023
KAKURENBO: Adaptively Hiding Samples in Deep Neural Network Training
Truong Thao Nguyen
Balazs Gerofi
Edgar Josafat Martinez-Noriega
François Trahay
M. Wahib
24
1
0
16 Oct 2023
"A Nova Eletricidade": Aplicações, Riscos e Tendências da IA Moderna -- "The New Electricity": Applications, Risks, and Trends in Current AI
A. Bazzan
Anderson R. Tavares
André G. Pereira
C. R. Jung
Jacob Scharcanski
J. Carbonera
Luís C. Lamb
Mariana Recamonde Mendoza
T. L. T. D. Silveira
V. P. Moreira
29
0
0
08 Oct 2023
Beyond Labeling Oracles: What does it mean to steal ML models?
Avital Shafran
Ilia Shumailov
Murat A. Erdogdu
Nicolas Papernot
AAML
24
4
0
03 Oct 2023
Challenges and Opportunities of Using Transformer-Based Multi-Task Learning in NLP Through ML Lifecycle: A Survey
Lovre Torbarina
Tin Ferkovic
Lukasz Roguski
Velimir Mihelčić
Bruno Šarlija
Z. Kraljevic
19
5
0
16 Aug 2023
Using Artificial Populations to Study Psychological Phenomena in Neural Models
Jesse Roberts
Kyle Moore
Drew Wilenzick
Doug Fisher
19
6
0
15 Aug 2023
RAI Guidelines: Method for Generating Responsible AI Guidelines Grounded in Regulations and Usable by (Non-)Technical Roles
Marios Constantinides
Edyta Bogucka
Daniele Quercia
Susanna Kallio
Mohammad Tahaei
25
9
0
27 Jul 2023
Improving Retrieval-Augmented Large Language Models via Data Importance Learning
Xiaozhong Lyu
Stefan Grafberger
Samantha Biegel
Shaopeng Wei
Meng Cao
Sebastian Schelter
Ce Zhang
RALM
24
14
0
06 Jul 2023
Mirage: Towards Low-interruption Services on Batch GPU Clusters with Reinforcement Learning
Qi-Dong Ding
Pengfei Zheng
Shreyas Kudari
Shivaram Venkataraman
Zhao-jie Zhang
VLM
OffRL
8
3
0
25 Jun 2023
Document Image Cleaning using Budget-Aware Black-Box Approximation
Ganesh Tata
Katyani Singh
E. V. Oeveren
Nilanjan Ray
AAML
15
0
0
22 Jun 2023
Lost in Translation: Large Language Models in Non-English Content Analysis
Gabriel Nicholas
Aliya Bhatia
ELM
13
35
0
12 Jun 2023
Evaluating the Social Impact of Generative AI Systems in Systems and Society
Irene Solaiman
Zeerak Talat
William Agnew
Lama Ahmad
Dylan K. Baker
...
Marie-Therese Png
Shubham Singh
A. Strait
Lukas Struppek
Arjun Subramonian
ELM
EGVM
31
104
0
09 Jun 2023
On Optimal Caching and Model Multiplexing for Large Model Inference
Banghua Zhu
Ying Sheng
Lianmin Zheng
Clark W. Barrett
Michael I. Jordan
Jiantao Jiao
23
17
0
03 Jun 2023
DoWG Unleashed: An Efficient Universal Parameter-Free Gradient Descent Method
Ahmed Khaled
Konstantin Mishchenko
Chi Jin
ODL
22
22
0
25 May 2023
Automated Tensor Model Parallelism with Overlapped Communication for Efficient Foundation Model Training
Shengwei Li
Zhiquan Lai
Yanqi Hao
Weijie Liu
Ke-shi Ge
Xiaoge Deng
Dongsheng Li
KaiCheng Lu
11
10
0
25 May 2023
Annotation Imputation to Individualize Predictions: Initial Studies on Distribution Dynamics and Model Predictions
London Lowmanstone
Ruyuan Wan
Risako Owan
Jaehyung Kim
Dongyeop Kang
19
1
0
24 May 2023
MoMo: Momentum Models for Adaptive Learning Rates
Fabian Schaipp
Ruben Ohana
Michael Eickenberg
Aaron Defazio
Robert Mansel Gower
30
10
0
12 May 2023
INGENIOUS: Using Informative Data Subsets for Efficient Pre-Training of Language Models
H. S. V. N. S. K. Renduchintala
Krishnateja Killamsetty
S. Bhatia
Milan Aggarwal
Ganesh Ramakrishnan
Rishabh K. Iyer
Balaji Krishnamurthy
AIFin
17
3
0
11 May 2023
FedSOV: Federated Model Secure Ownership Verification with Unforgeable Signature
Wenyuan Yang
Gongxi Zhu
Yuguo Yin
Hanlin Gu
Lixin Fan
Qiang Yang
Xiaochun Cao
FedML
11
6
0
10 May 2023
Automatic Gradient Descent: Deep Learning without Hyperparameters
Jeremy Bernstein
Chris Mingard
Kevin Huang
Navid Azizan
Yisong Yue
ODL
16
17
0
11 Apr 2023
CILIATE: Towards Fairer Class-based Incremental Learning by Dataset and Training Refinement
Xuan Gao
Juan Zhai
Shiqing Ma
Chao Shen
Yufei Chen
Shiwei Wang
10
2
0
09 Apr 2023