ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2404.03828
  4. Cited By
Outlier-Efficient Hopfield Layers for Large Transformer-Based Models

Outlier-Efficient Hopfield Layers for Large Transformer-Based Models

4 April 2024
Jerry Yao-Chieh Hu
Pei-Hsuan Chang
Haozheng Luo
Hong-Yu Chen
Weijian Li
Wei-Po Wang
Han Liu
ArXivPDFHTML

Papers citing "Outlier-Efficient Hopfield Layers for Large Transformer-Based Models"

26 / 26 papers shown
Title
Fast and Low-Cost Genomic Foundation Models via Outlier Removal
Fast and Low-Cost Genomic Foundation Models via Outlier Removal
Haozheng Luo
Chenghao Qiu
Maojiang Su
Zhihan Zhou
Zoe Mehta
Guo Ye
Jerry Yao-Chieh Hu
Han Liu
AAML
53
0
0
01 May 2025
Video Latent Flow Matching: Optimal Polynomial Projections for Video Interpolation and Extrapolation
Video Latent Flow Matching: Optimal Polynomial Projections for Video Interpolation and Extrapolation
Yang Cao
Zhao-quan Song
Chiwun Yang
VGen
41
2
0
01 Feb 2025
Active-Dormant Attention Heads: Mechanistically Demystifying
  Extreme-Token Phenomena in LLMs
Active-Dormant Attention Heads: Mechanistically Demystifying Extreme-Token Phenomena in LLMs
Tianyu Guo
Druv Pai
Yu Bai
Jiantao Jiao
Michael I. Jordan
Song Mei
13
9
0
17 Oct 2024
Bypassing the Exponential Dependency: Looped Transformers Efficiently Learn In-context by Multi-step Gradient Descent
Bypassing the Exponential Dependency: Looped Transformers Efficiently Learn In-context by Multi-step Gradient Descent
Bo Chen
Xiaoyu Li
Yingyu Liang
Zhenmei Shi
Zhao-quan Song
67
18
0
15 Oct 2024
HSR-Enhanced Sparse Attention Acceleration
HSR-Enhanced Sparse Attention Acceleration
Bo Chen
Yingyu Liang
Zhizhou Sha
Zhenmei Shi
Zhao-quan Song
73
17
0
14 Oct 2024
Fine-grained Attention I/O Complexity: Comprehensive Analysis for
  Backward Passes
Fine-grained Attention I/O Complexity: Comprehensive Analysis for Backward Passes
Xiaoyu Li
Yingyu Liang
Zhenmei Shi
Zhao-quan Song
Yufa Zhou
42
15
0
12 Oct 2024
A Tighter Complexity Analysis of SparseGPT
A Tighter Complexity Analysis of SparseGPT
Xiaoyu Li
Yingyu Liang
Zhenmei Shi
Zhao-quan Song
58
20
0
22 Aug 2024
Advanced AI Framework for Enhanced Detection and Assessment of Abdominal
  Trauma: Integrating 3D Segmentation with 2D CNN and RNN Models
Advanced AI Framework for Enhanced Detection and Assessment of Abdominal Trauma: Integrating 3D Segmentation with 2D CNN and RNN Models
Liheng Jiang
Xuechun yang
Chang Yu
Zhizhong Wu
Yuting Wang
35
18
0
23 Jul 2024
Do LLMs dream of elephants (when told not to)? Latent concept
  association and associative memory in transformers
Do LLMs dream of elephants (when told not to)? Latent concept association and associative memory in transformers
Yibo Jiang
Goutham Rajendran
Pradeep Ravikumar
Bryon Aragam
CLL
KELM
21
6
0
26 Jun 2024
Empirical Guidelines for Deploying LLMs onto Resource-constrained Edge
  Devices
Empirical Guidelines for Deploying LLMs onto Resource-constrained Edge Devices
Ruiyang Qin
Dancheng Liu
Zheyu Yan
Zhaoxuan Tan
Zixuan Pan
Zhenge Jia
Meng-Long Jiang
Ahmed Abbasi
Jinjun Xiong
Yiyu Shi
40
10
0
06 Jun 2024
Decoupled Alignment for Robust Plug-and-Play Adaptation
Decoupled Alignment for Robust Plug-and-Play Adaptation
Haozheng Luo
Jiahao Yu
Wenxin Zhang
Jialong Li
Jerry Yao-Chieh Hu
Xingyu Xing
Han Liu
31
10
0
03 Jun 2024
Enhancing Jailbreak Attack Against Large Language Models through Silent
  Tokens
Enhancing Jailbreak Attack Against Large Language Models through Silent Tokens
Jiahao Yu
Haozheng Luo
Jerry Yao-Chieh Hu
Wenbo Guo
Han Liu
Xinyu Xing
23
18
0
31 May 2024
Conv-CoA: Improving Open-domain Question Answering in Large Language
  Models via Conversational Chain-of-Action
Conv-CoA: Improving Open-domain Question Answering in Large Language Models via Conversational Chain-of-Action
Zhenyu Pan
Haozheng Luo
Manling Li
Han Liu
LRM
30
10
0
28 May 2024
Beyond Scaling Laws: Understanding Transformer Performance with
  Associative Memory
Beyond Scaling Laws: Understanding Transformer Performance with Associative Memory
Xueyan Niu
Bo Bai
Lei Deng
Wei Han
20
6
0
14 May 2024
BiSHop: Bi-Directional Cellular Learning for Tabular Data with
  Generalized Sparse Modern Hopfield Model
BiSHop: Bi-Directional Cellular Learning for Tabular Data with Generalized Sparse Modern Hopfield Model
Chenwei Xu
Yu-Chao Huang
Jerry Yao-Chieh Hu
Weijian Li
Ammar Gilani
H. Goan
Han Liu
32
19
0
04 Apr 2024
Uniform Memory Retrieval with Larger Capacity for Modern Hopfield Models
Uniform Memory Retrieval with Larger Capacity for Modern Hopfield Models
Dennis Wu
Jerry Yao-Chieh Hu
Teng-Yun Hsiao
Han Liu
27
28
0
04 Apr 2024
Massive Activations in Large Language Models
Massive Activations in Large Language Models
Mingjie Sun
Xinlei Chen
J. Zico Kolter
Zhuang Liu
57
64
0
27 Feb 2024
DNABERT-S: Learning Species-Aware DNA Embedding with Genome Foundation
  Models
DNABERT-S: Learning Species-Aware DNA Embedding with Genome Foundation Models
Zhihan Zhou
Weimin Wu
Harrison Ho
Jiayi Wang
Lizhen Shi
R. Davuluri
Zhong Wang
Han Liu
37
9
0
13 Feb 2024
Differentially Private Attention Computation
Differentially Private Attention Computation
Yeqi Gao
Zhao-quan Song
Xin Yang
28
19
0
08 May 2023
The Closeness of In-Context Learning and Weight Shifting for Softmax
  Regression
The Closeness of In-Context Learning and Weight Shifting for Softmax Regression
Shuai Li
Zhao-quan Song
Yu Xia
Tong Yu
Tianyi Zhou
15
32
0
26 Apr 2023
Context-enriched molecule representations improve few-shot drug
  discovery
Context-enriched molecule representations improve few-shot drug discovery
Johannes Schimunek
Philipp Seidl
Lukas Friedrich
Daniel Kuhn
F. Rippmann
Sepp Hochreiter
G. Klambauer
42
26
0
24 Apr 2023
CLOOB: Modern Hopfield Networks with InfoLOOB Outperform CLIP
CLOOB: Modern Hopfield Networks with InfoLOOB Outperform CLIP
Andreas Fürst
Elisabeth Rumetshofer
Johannes Lehner
Viet-Hung Tran
Fei Tang
...
David P. Kreil
Michael K Kopp
G. Klambauer
Angela Bitto-Nemling
Sepp Hochreiter
VLM
CLIP
185
101
0
21 Oct 2021
Informer: Beyond Efficient Transformer for Long Sequence Time-Series
  Forecasting
Informer: Beyond Efficient Transformer for Long Sequence Time-Series Forecasting
Haoyi Zhou
Shanghang Zhang
J. Peng
Shuai Zhang
Jianxin Li
Hui Xiong
Wan Zhang
AI4TS
161
3,799
0
14 Dec 2020
Open-Ended Multi-Modal Relational Reasoning for Video Question Answering
Open-Ended Multi-Modal Relational Reasoning for Video Question Answering
Haozheng Luo
Ruiyang Qin
Chenwei Xu
Guo Ye
Zening Luo
35
4
0
01 Dec 2020
Modern Hopfield Networks and Attention for Immune Repertoire
  Classification
Modern Hopfield Networks and Attention for Immune Repertoire Classification
Michael Widrich
Bernhard Schafl
Hubert Ramsauer
Milena Pavlović
Lukas Gruber
...
Johannes Brandstetter
G. K. Sandve
Victor Greiff
Sepp Hochreiter
G. Klambauer
171
117
0
16 Jul 2020
ImageNet Large Scale Visual Recognition Challenge
ImageNet Large Scale Visual Recognition Challenge
Olga Russakovsky
Jia Deng
Hao Su
J. Krause
S. Satheesh
...
A. Karpathy
A. Khosla
Michael S. Bernstein
Alexander C. Berg
Li Fei-Fei
VLM
ObjD
279
39,083
0
01 Sep 2014
1