ResearchTrend.AI
TinyLlama: An Open-Source Small Language Model

4 January 2024
Peiyuan Zhang
Guangtao Zeng
Tianduo Wang
Wei Lu
    ALM
    LRM

Papers citing "TinyLlama: An Open-Source Small Language Model"

50 / 261 papers shown
Lossless Acceleration of Large Language Models with Hierarchical Drafting based on Temporal Locality in Speculative Decoding
Sukmin Cho
S. Choi
T. Hwang
Jeongyeon Seo
Soyeong Jeong
Huije Lee
Hoyun Song
Jong C. Park
Youngjin Kwon
51
0
0
08 Feb 2025
MergeME: Model Merging Techniques for Homogeneous and Heterogeneous MoEs
Yuhang Zhou
Giannis Karamanolakis
Victor Soto
Anna Rumshisky
Mayank Kulkarni
Furong Huang
Wei Ai
Jianhua Lu
MoMe
101
0
0
03 Feb 2025
Nearly Lossless Adaptive Bit Switching
Haiduo Huang
Zhenhua Liu
Tian Xia
Wenzhe Zhao
Pengju Ren
MQ
53
0
0
03 Feb 2025
Evaluating Small Language Models for News Summarization: Implications and Factors Influencing Performance
Borui Xu
Yao Chen
Zeyi Wen
Weiguo Liu
Bingsheng He
64
1
0
02 Feb 2025
Vision-centric Token Compression in Large Language Model
Ling Xing
Alex Jinpeng Wang
Rui Yan
J. Tang
VLM
60
0
0
02 Feb 2025
TAID: Temporally Adaptive Interpolated Distillation for Efficient Knowledge Transfer in Language Models
Makoto Shing
Kou Misaki
Han Bao
Sho Yokoi
Takuya Akiba
VLM
57
1
0
28 Jan 2025
Irrational Complex Rotations Empower Low-bit Optimizers
Zhen Tian
Wayne Xin Zhao
Ji-Rong Wen
MQ
41
0
0
22 Jan 2025
Empowering Large Language Models in Wireless Communication: A Novel Dataset and Fine-Tuning Framework
Yushen Lin
Ruichen Zhang
Wenqi Huang
Kaidi Wang
Z. Ding
Daniel K. C. So
Dusit Niyato
59
0
0
17 Jan 2025
On the uncertainty principle of neural networks
Jun-Jie Zhang
Dong-xiao Zhang
Jian-Nan Chen
L. Pang
Deyu Meng
57
2
0
17 Jan 2025
Adaptive Batch Size Schedules for Distributed Training of Language Models with Data and Model Parallelism
Tim Tsz-Kit Lau
Weijian Li
Chenwei Xu
Han Liu
Mladen Kolar
82
0
0
30 Dec 2024
Deploying Foundation Model Powered Agent Services: A Survey
Wenchao Xu
Jinyu Chen
Peirong Zheng
Xiaoquan Yi
Tianyi Tian
...
Quan Wan
Haozhao Wang
Yunfeng Fan
Qinliang Su
Xuemin Shen
AI4CE
112
1
0
18 Dec 2024
Learning to Reason via Self-Iterative Process Feedback for Small Language Models
Kaiyuan Chen
Jin Wang
Xuejie Zhang
LRM
ReLM
74
2
0
11 Dec 2024
VLsI: Verbalized Layers-to-Interactions from Large to Small Vision Language Models
Byung-Kwan Lee
Ryo Hachiuma
Yu-Chiang Frank Wang
Y. Ro
Yueh-Hua Wu
VLM
81
0
0
02 Dec 2024
Align-KD: Distilling Cross-Modal Alignment Knowledge for Mobile Vision-Language Model
Qianhan Feng
Wenshuo Li
Tong Lin
Xinghao Chen
VLM
67
0
0
02 Dec 2024
OmniFlow: Any-to-Any Generation with Multi-Modal Rectified Flows
Shufan Li
Konstantinos Kallidromitis
Akash Gokul
Zichun Liao
Yusuke Kato
Kazuki Kozuka
Aditya Grover
VGen
90
5
0
02 Dec 2024
CATP-LLM: Empowering Large Language Models for Cost-Aware Tool Planning
Duo Wu
J. Wang
Yuan Meng
Yanning Zhang
Le Sun
Zhi Wang
117
0
0
25 Nov 2024
Bi-Mamba: Towards Accurate 1-Bit State Space Models
Shengkun Tang
Liqun Ma
H. Li
Mingjie Sun
Zhiqiang Shen
Mamba
70
3
0
18 Nov 2024
LLäMmlein: Compact and Competitive German-Only Language Models from Scratch
Jan Pfister
Julia Wunderle
Andreas Hotho
23
1
0
17 Nov 2024
SoftLMs: Efficient Adaptive Low-Rank Approximation of Language Models using Soft-Thresholding Mechanism
Priyansh Bhatnagar
Linfeng Wen
Mingu Kang
34
0
0
15 Nov 2024
FRUGAL: Memory-Efficient Optimization by Reducing State Overhead for Scalable Training
Philip Zmushko
Aleksandr Beznosikov
Martin Takáč
Samuel Horváth
37
0
0
12 Nov 2024
LLM-NEO: Parameter Efficient Knowledge Distillation for Large Language Models
Runming Yang
Taiqiang Wu
Jiahao Wang
Pengfei Hu
Ngai Wong
Yujiu Yang
74
0
0
11 Nov 2024
Privacy Risks of Speculative Decoding in Large Language Models
Jiankun Wei
Abdulrahman Abdulrazzag
Tianchen Zhang
Adel Muursepp
Gururaj Saileshwar
33
2
0
01 Nov 2024
Improving Few-Shot Cross-Domain Named Entity Recognition by Instruction Tuning a Word-Embedding based Retrieval Augmented Large Language Model
Subhadip Nandi
Neeraj Agrawal
24
0
0
01 Nov 2024
MoD: A Distribution-Based Approach for Merging Large Language Models
Quy-Anh Dang
Chris Ngo
MoMe
VLM
21
0
0
01 Nov 2024
MESS+: Energy-Optimal Inferencing in Language Model Zoos with Service Level Guarantees
Ryan Zhang
Herbert Woisetschläger
Shiqiang Wang
Hans-Arno Jacobsen
21
0
0
31 Oct 2024
Multilingual Pretraining Using a Large Corpus Machine-Translated from a Single Source Language
Jiayi Wang
Yao Lu
Maurice Weber
Max Ryabinin
Yihong Chen
Raphael Tang
Pontus Stenetorp
LRM
39
1
0
31 Oct 2024
Mobility-LLM: Learning Visiting Intentions and Travel Preferences from Human Mobility Data with Large Language Models
Letian Gong
Yan Lin
Xinyue Zhang
Yiwen Lu
Xuedi Han
Yichen Liu
S. Guo
Youfang Lin
Huaiyu Wan
33
4
0
29 Oct 2024
Transferable Post-training via Inverse Value Learning
Xinyu Lu
Xueru Wen
Y. Lu
Bowen Yu
Hongyu Lin
Haiyang Yu
Le Sun
Xianpei Han
Yongbin Li
17
1
0
28 Oct 2024
Relaxed Recursive Transformers: Effective Parameter Sharing with Layer-wise LoRA
Sangmin Bae
Adam Fisch
Hrayr Harutyunyan
Ziwei Ji
Seungyeon Kim
Tal Schuster
KELM
68
5
0
28 Oct 2024
Computational Bottlenecks of Training Small-scale Large Language Models
Saleh Ashkboos
Iman Mirzadeh
Keivan Alizadeh
Mohammad Hossein Sekhavat
Moin Nabi
Mehrdad Farajtabar
Fartash Faghri
16
0
0
25 Oct 2024
Bielik 7B v0.1: A Polish Language Model -- Development, Insights, and Evaluation
Krzysztof Ociepa
Łukasz Flis
Krzysztof Wróbel
Adrian Gwoździej
Remigiusz Kinas
20
1
0
24 Oct 2024
Scaling up Masked Diffusion Models on Text
Shen Nie
Fengqi Zhu
Chao Du
Tianyu Pang
Qian Liu
Guangtao Zeng
Min-Bin Lin
Chongxuan Li
AI4CE
45
13
0
24 Oct 2024
MiniPLM: Knowledge Distillation for Pre-Training Language Models
Yuxian Gu
Hao Zhou
Fandong Meng
Jie Zhou
Minlie Huang
65
5
0
22 Oct 2024
A Systematic Study of Cross-Layer KV Sharing for Efficient LLM Inference
You Wu
Haoyi Wu
Kewei Tu
34
3
0
18 Oct 2024
BenTo: Benchmark Task Reduction with In-Context Transferability
Hongyu Zhao
Ming Li
Lichao Sun
Tianyi Zhou
28
0
0
17 Oct 2024
Learning from Imperfect Data: Towards Efficient Knowledge Distillation of Autoregressive Language Models for Text-to-SQL
Qihuang Zhong
Kunfeng Chen
Liang Ding
Juhua Liu
Bo Du
Dacheng Tao
29
0
0
15 Oct 2024
ControlMM: Controllable Masked Motion Generation
Ekkasit Pinyoanuntapong
Muhammad Usama Saleem
Korrawe Karunratanakul
Pu Wang
Hongfei Xue
C. L. P. Chen
Chuan Guo
Junli Cao
J. Ren
Sergey Tulyakov
VGen
29
4
0
14 Oct 2024
Reverse Modeling in Large Language Models
S. Yu
Yuanchen Xu
Cunxiao Du
Yanying Zhou
Minghui Qiu
Q. Sun
Hao Zhang
Jiawei Wu
29
2
0
13 Oct 2024
CAMPHOR: Collaborative Agents for Multi-input Planning and High-Order Reasoning On Device
Yicheng Fu
R. Anantha
Jianpeng Cheng
LRM
LLMAG
21
2
0
12 Oct 2024
Generation with Dynamic Vocabulary
Yanting Liu
Tao Ji
Changzhi Sun
Yuanbin Wu
Xiaoling Wang
30
0
0
11 Oct 2024
KV Prediction for Improved Time to First Token
Maxwell Horton
Qingqing Cao
Chenfan Sun
Yanzi Jin
Sachin Mehta
Mohammad Rastegari
Moin Nabi
AI4TS
30
1
0
10 Oct 2024
VerifierQ: Enhancing LLM Test Time Compute with Q-Learning-based Verifiers
Jianing Qi
Hao Tang
Zhigang Zhu
OffRL
LRM
18
4
0
10 Oct 2024
MoE++: Accelerating Mixture-of-Experts Methods with Zero-Computation Experts
Peng Jin
Bo Zhu
Li Yuan
Shuicheng Yan
MoE
27
4
0
09 Oct 2024
Exploring the Readiness of Prominent Small Language Models for the Democratization of Financial Literacy
Tagore Rao Kosireddy
Jeffrey D. Wall
Evan Lucas
24
1
0
09 Oct 2024
Personal Intelligence System UniLM: Hybrid On-Device Small Language Model and Server-Based Large Language Model for Malay Nusantara
Azree Nazri
Olalekan Agbolade
Faisal Aziz
20
0
0
09 Oct 2024
Root Defence Strategies: Ensuring Safety of LLM at the Decoding Level
Xinyi Zeng
Yuying Shang
Yutao Zhu
Jingyuan Zhang
Yu Tian
AAML
63
2
0
09 Oct 2024
QERA: an Analytical Framework for Quantization Error Reconstruction
Cheng Zhang
Jeffrey T. H. Wong
Can Xiao
G. Constantinides
Yiren Zhao
MQ
35
0
0
08 Oct 2024
TPP-LLM: Modeling Temporal Point Processes by Efficiently Fine-Tuning Large Language Models
Zefang Liu
Yinzhu Quan
21
0
0
02 Oct 2024
Mitigating Training Imbalance in LLM Fine-Tuning via Selective Parameter Merging
Yiming Ju
Ziyi Ni
Xingrun Xing
Zhixiong Zeng
Hanyu Zhao
Siqi Fan
Zheng Zhang
MoMe
24
2
0
01 Oct 2024
Fisher Information-based Efficient Curriculum Federated Learning with Large Language Models
Ji Liu
Jiaxiang Ren
Ruoming Jin
Zijie Zhang
Yang Zhou
P. Valduriez
Dejing Dou
FedML
19
1
0
30 Sep 2024