ResearchTrend.AI
© 2025 ResearchTrend.AI, All rights reserved.

T-MAC: CPU Renaissance via Table Lookup for Low-Bit LLM Deployment on Edge

25 June 2024
Jianyu Wei, Shijie Cao, Ting Cao, Lingxiao Ma, Lei Wang, Yanyong Zhang, Mao Yang

Papers citing "T-MAC: CPU Renaissance via Table Lookup for Low-Bit LLM Deployment on Edge"

8 / 8 papers shown
Scaling Up On-Device LLMs via Active-Weight Swapping Between DRAM and Flash
Fucheng Jia, Zewen Wu, Shiqi Jiang, Huiqiang Jiang, Qianxi Zhang, Y. Yang, Yunxin Liu, Ju Ren, Deyu Zhang, Ting Cao
11 Apr 2025

Sustainable LLM Inference for Edge AI: Evaluating Quantized LLMs for Energy Efficiency, Output Accuracy, and Inference Latency
E. J. Husom, Arda Goknil, Merve Astekin, Lwin Khin Shar, Andre Kåsen, S. Sen, Benedikt Andreas Mithassel, Ahmet Soylu
04 Apr 2025

Binary Neural Networks for Large Language Model: A Survey
Liangdong Liu, Zhitong Zheng, Cong Wang, Tianhuang Su, Z. Yang
26 Feb 2025

Bitnet.cpp: Efficient Edge Inference for Ternary LLMs
J. Wang, Hansong Zhou, Ting Song, Shijie Cao, Yan Xia, Ting Cao, Jianyu Wei, Shuming Ma, Hongyu Wang, Furu Wei
17 Feb 2025

Large Language Model Inference Acceleration: A Comprehensive Hardware Perspective
Jinhao Li, Jiaming Xu, Shan Huang, Yonghua Chen, Wen Li, ..., Jiayi Pan, Li Ding, Hao Zhou, Yu Wang, Guohao Dai
06 Oct 2024

EfficientQAT: Efficient Quantization-Aware Training for Large Language Models
Mengzhao Chen, Wenqi Shao, Peng Xu, Jiahao Wang, Peng Gao, Kaipeng Zhang, Yu Qiao, Ping Luo
10 Jul 2024

Delta-CoMe: Training-Free Delta-Compression with Mixed-Precision for Large Language Models
Bowen Ping, Shuo Wang, Hanqing Wang, Xu Han, Yuzhuang Xu, Yukun Yan, Yun Chen, Baobao Chang, Zhiyuan Liu, Maosong Sun
13 Jun 2024

BitDistiller: Unleashing the Potential of Sub-4-Bit LLMs via Self-Distillation
Dayou Du, Yijia Zhang, Shijie Cao, Jiaqi Guo, Ting Cao, Xiaowen Chu, Ningyi Xu
16 Feb 2024