ResearchTrend.AI
Unveiling and Harnessing Hidden Attention Sinks: Enhancing Large Language Models without Training through Attention Calibration

22 June 2024
Zhongzhi Yu
Zheng Wang
Yonggan Fu
Huihong Shi
Khalid Shaikh
Yingyan Celine Lin

Papers citing "Unveiling and Harnessing Hidden Attention Sinks: Enhancing Large Language Models without Training through Attention Calibration"

9 / 9 papers shown
Attention Mechanisms Perspective: Exploring LLM Processing of Graph-Structured Data
Zhong Guan
Likang Wu
Hongke Zhao
Ming He
Jianpin Fan
GNN
25
0
0
04 May 2025
House of Cards: Massive Weights in LLMs
Jaehoon Oh
Seungjun Shin
Dokwan Oh
35
1
0
02 Oct 2024
Chain of LoRA: Efficient Fine-tuning of Language Models via Residual Learning
Wenhan Xia
Chengwei Qin
Elad Hazan
46
52
0
08 Jan 2024
PILLOW: Enhancing Efficient Instruction Fine-tuning via Prompt Matching
Zhenting Qi
Xiaoyu Tan
Shaojie Shi
Chao Qu
Yinghui Xu
Yuan Qi
ALM
32
10
0
09 Dec 2023
GPT4AIGChip: Towards Next-Generation AI Accelerator Design Automation via Large Language Models
Yonggan Fu
Yongan Zhang
Zhongzhi Yu
Sixu Li
Zhifan Ye
Chaojian Li
Cheng Wan
Ying Lin
35
59
0
19 Sep 2023
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
301
11,730
0
04 Mar 2022
Multitask Prompted Training Enables Zero-Shot Task Generalization
Victor Sanh
Albert Webson
Colin Raffel
Stephen H. Bach
Lintang Sutawika
...
T. Bers
Stella Biderman
Leo Gao
Thomas Wolf
Alexander M. Rush
LRM
203
1,651
0
15 Oct 2021
The Power of Scale for Parameter-Efficient Prompt Tuning
Brian Lester
Rami Al-Rfou
Noah Constant
VPVLM
278
3,784
0
18 Apr 2021
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
220
4,424
0
23 Jan 2020