Title |
---|
![]() N-Gram Induction Heads for In-Context RL: Improving Stability and Reducing Data Needs Ilya Zisman Alexander Nikulin Andrei Polubarov Nikita Lyubaykin Vladislav Kurenkov Andrei Polubarov Igor Kiselev Vladislav Kurenkov |
![]() DAPE V2: Process Attention Score as Feature Map for Length Extrapolation Chuanyang Zheng Yihang Gao Han Shi Jing Xiong Jiankai Sun ...Xiaozhe Ren Michael Ng Xin Jiang Zhenguo Li Yu Li |
![]() OLMo: Accelerating the Science of Language Models Dirk Groeneveld Iz Beltagy Pete Walsh Akshita Bhagia Rodney Michael Kinney ...Jesse Dodge Kyle Lo Luca Soldaini Noah A. Smith Hanna Hajishirzi |