2501.18795
Rope to Nope and Back Again: A New Hybrid Attention Strategy
30 January 2025
Bowen Yang, Bharat Venkitesh, Dwarak Talupuru, Hangyu Lin, David Cairuz, Phil Blunsom, Acyr Locatelli
Papers citing "Rope to Nope and Back Again: A New Hybrid Attention Strategy"
5 / 5 papers shown
Title
RATTENTION: Towards the Minimal Sliding Window Size in Local-Global Attention Models
Bailin Wang, Chang Lan, Chong-Jun Wang, Ruoming Pang
17, 0, 0
18 Jun 2025
Hardware-Efficient Attention for Fast Decoding
Ted Zadouri, Hubert Strauss, Tri Dao
77, 2, 0
27 May 2025
Mechanistic Interpretability of GPT-like Models on Summarization Tasks
Anurag Mishra
MILM
43, 0, 0
20 May 2025
Language Models, Graph Searching, and Supervision Adulteration: When More Supervision is Less and How to Make More More
Arvid Frydenlund
LRM
179, 0, 0
13 Mar 2025
Baichuan-M1: Pushing the Medical Capability of Large Language Models
Binghai Wang, Haizhou Zhao, Huozhi Zhou, Liang Song, Mingyu Xu, ..., Yan Zhang, Yifei Duan, Yuyan Zhou, Zhi-Ming Ma, Zhikai Wu
LM&MA, ELM, AI4MH
123, 10, 0
18 Feb 2025