A Mathematical Exploration of Why Language Models Help Solve Downstream Tasks

7 October 2020 · arXiv:2010.03648
Nikunj Saunshi, Sadhika Malladi, Sanjeev Arora

Papers citing "A Mathematical Exploration of Why Language Models Help Solve Downstream Tasks"

21 / 21 papers shown
Implicit Geometry of Next-token Prediction: From Language Sparsity Patterns to Model Representations
Yize Zhao, Tina Behnia, V. Vakilian, Christos Thrampoulidis · 20 Feb 2025
Do we really have to filter out random noise in pre-training data for language models?
Jinghan Ru, Yuxin Xie, Xianwei Zhuang, Yuguo Yin, Zhihui Guo, Zhiming Liu, Qianli Ren, Yuexian Zou · 10 Feb 2025
Semantic Captioning: Benchmark Dataset and Graph-Aware Few-Shot In-Context Learning for SQL2Text
Ali Al-Lawati, Jason Lucas, Prasenjit Mitra · 06 Jan 2025 [LMTD]
Unintentional Unalignment: Likelihood Displacement in Direct Preference Optimization
Noam Razin, Sadhika Malladi, Adithya Bhaskar, Danqi Chen, Sanjeev Arora, Boris Hanin · 11 Oct 2024
Investigating the Impact of Model Complexity in Large Language Models
Jing Luo, Huiyuan Wang, Weiran Huang · 01 Oct 2024
An Overview on Language Models: Recent Developments and Outlook
Chengwei Wei, Yun Cheng Wang, Bin Wang, C.-C. Jay Kuo · 10 Mar 2023
On the Provable Advantage of Unsupervised Pretraining
Jiawei Ge, Shange Tang, Jianqing Fan, Chi Jin · 02 Mar 2023 [SSL]
Same Pre-training Loss, Better Downstream: Implicit Bias Matters for Language Models
Hong Liu, Sang Michael Xie, Zhiyuan Li, Tengyu Ma · 25 Oct 2022 [AI4CE]
Emergent Abilities of Large Language Models
Jason W. Wei, Yi Tay, Rishi Bommasani, Colin Raffel, Barret Zoph, ..., Tatsunori Hashimoto, Oriol Vinyals, Percy Liang, J. Dean, W. Fedus · 15 Jun 2022 [ELM, ReLM, LRM]
AANG: Automating Auxiliary Learning
Lucio Dery, Paul Michel, M. Khodak, Graham Neubig, Ameet Talwalkar · 27 May 2022
The Mechanism of Prediction Head in Non-contrastive Self-supervised Learning
Zixin Wen, Yuanzhi Li · 12 May 2022 [SSL]
Empirical Evaluation and Theoretical Analysis for Representation Learning: A Survey
Kento Nozawa, Issei Sato · 18 Apr 2022 [AI4TS]
Can Unsupervised Knowledge Transfer from Social Discussions Help Argument Mining?
Subhabrata Dutta, Jeevesh Juneja, Dipankar Das, Tanmoy Chakraborty · 24 Mar 2022
Understanding Contrastive Learning Requires Incorporating Inductive Biases
Nikunj Saunshi, Jordan T. Ash, Surbhi Goel, Dipendra Kumar Misra, Cyril Zhang, Sanjeev Arora, Sham Kakade, A. Krishnamurthy · 28 Feb 2022 [SSL]
Self-Supervised Representation Learning: Introduction, Advances and Challenges
Linus Ericsson, H. Gouk, Chen Change Loy, Timothy M. Hospedales · 18 Oct 2021 [SSL, OOD, AI4TS]
On the Surrogate Gap between Contrastive and Supervised Losses
Han Bao, Yoshihiro Nagano, Kento Nozawa · 06 Oct 2021 [SSL, UQCV]
Cross-lingual Transfer for Text Classification with Dictionary-based Heterogeneous Graph
Nuttapong Chairatanakul, Noppayut Sriwatanasakdi, Nontawat Charoenphakdee, Xin Liu, T. Murata · 09 Sep 2021
Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing
Pengfei Liu, Weizhe Yuan, Jinlan Fu, Zhengbao Jiang, Hiroaki Hayashi, Graham Neubig · 28 Jul 2021 [VLM, SyDa]
Why Do Pretrained Language Models Help in Downstream Tasks? An Analysis of Head and Prompt Tuning
Colin Wei, Sang Michael Xie, Tengyu Ma · 17 Jun 2021
Can Pretext-Based Self-Supervised Learning Be Boosted by Downstream Data? A Theoretical Analysis
Jiaye Teng, Weiran Huang, Haowei He · 05 Mar 2021 [SSL]
The Rediscovery Hypothesis: Language Models Need to Meet Linguistics
Vassilina Nikoulina, Maxat Tezekbayev, Nuradil Kozhakhmet, Madina Babazhanova, Matthias Gallé, Z. Assylbekov · 02 Mar 2021