TinyLlama: An Open-Source Small Language Model

4 January 2024
Peiyuan Zhang
Guangtao Zeng
Tianduo Wang
Wei Lu
    ALM
    LRM

Papers citing "TinyLlama: An Open-Source Small Language Model"

50 / 261 papers shown
Unveiling and Mitigating Bias in Mental Health Analysis with Large Language Models
Yuqing Wang
Yun Zhao
Sara Alessandra Keller
A. D. Hond
Marieke M. van Buchem
Malvika Pillai
Tina Hernandez-Boussard
AI4MH
15
2
0
17 Jun 2024
HARE: HumAn pRiors, a key to small language model Efficiency
Lingyun Zhang
Bin Jin
Gaojian Ge
Lunhui Liu
Xuewen Shen
Mingyong Wu
Houqian Zhang
Yongneng Jiang
Shiqi Chen
Shi Pu
ALM
27
0
0
17 Jun 2024
Self-training Large Language Models through Knowledge Detection
Wei Jie Yeo
Teddy Ferdinan
Przemyslaw Kazienko
Ranjan Satapathy
Erik Cambria
38
9
0
17 Jun 2024
On Giant's Shoulders: Effortless Weak to Strong by Dynamic Logits Fusion
Chenghao Fan
Zhenyi Lu
Wei Wei
Jie Tian
Xiaoye Qu
Dangyang Chen
Yu Cheng
MoMe
44
5
0
17 Jun 2024
GEB-1.3B: Open Lightweight Large Language Model
Jie Wu
Yufeng Zhu
Lei Shen
Xuqing Lu
ALM
21
0
0
14 Jun 2024
MobileAIBench: Benchmarking LLMs and LMMs for On-Device Use Cases
Rithesh Murthy
Liangwei Yang
Juntao Tan
Tulika Awalgaonkar
Yilun Zhou
...
Zuxin Liu
Ming Zhu
Huan Wang
Caiming Xiong
Silvio Savarese
57
5
0
12 Jun 2024
The Impact of Initialization on LoRA Finetuning Dynamics
Soufiane Hayou
Nikhil Ghosh
Bin Yu
AI4CE
34
10
0
12 Jun 2024
Prompt-Based Length Controlled Generation with Multiple Control Types
Renlong Jie
Xiaojun Meng
Lifeng Shang
Xin Jiang
Qun Liu
26
6
0
12 Jun 2024
OLMES: A Standard for Language Model Evaluations
Yuling Gu
Oyvind Tafjord
Bailey Kuehl
Dany Haddad
Jesse Dodge
Hannaneh Hajishirzi
ELM
35
13
0
12 Jun 2024
Open-LLM-Leaderboard: From Multi-choice to Open-style Questions for LLMs Evaluation, Benchmark, and Arena
Aidar Myrzakhan
Sondos Mahmoud Bsharat
Zhiqiang Shen
ELM
33
26
0
11 Jun 2024
Language Models Resist Alignment
Jiaming Ji
Kaile Wang
Tianyi Qiu
Boyuan Chen
Jiayi Zhou
Changye Li
Hantao Lou
Yaodong Yang
37
1
0
10 Jun 2024
Revisiting Catastrophic Forgetting in Large Language Model Tuning
Hongyu Li
Liang Ding
Meng Fang
Dacheng Tao
CLL
KELM
45
15
0
07 Jun 2024
LipGER: Visually-Conditioned Generative Error Correction for Robust Automatic Speech Recognition
Sreyan Ghosh
Sonal Kumar
Ashish Seth
Purva Chiniya
Utkarsh Tyagi
R. Duraiswami
Dinesh Manocha
41
0
0
06 Jun 2024
RoboMamba: Multimodal State Space Model for Efficient Robot Reasoning and Manipulation
Jiaming Liu
Mengzhen Liu
Zhenyu Wang
Lily Lee
Kaichen Zhou
Pengju An
Senqiao Yang
Renrui Zhang
Yandong Guo
Shanghang Zhang
LM&Ro
LRM
Mamba
32
5
0
06 Jun 2024
DeepStack: Deeply Stacking Visual Tokens is Surprisingly Simple and Effective for LMMs
Lingchen Meng
Jianwei Yang
Rui Tian
Xiyang Dai
Zuxuan Wu
Jianfeng Gao
Yu-Gang Jiang
VLM
22
8
0
06 Jun 2024
LLMEmbed: Rethinking Lightweight LLM's Genuine Function in Text Classification
Chun Liu
Hongguang Zhang
Kainan Zhao
Xinghai Ju
Lin Yang
39
2
0
06 Jun 2024
Pruner-Zero: Evolving Symbolic Pruning Metric from scratch for Large Language Models
Peijie Dong
Lujun Li
Zhenheng Tang
Xiang Liu
Xinglin Pan
Qiang-qiang Wang
Xiaowen Chu
48
22
0
05 Jun 2024
Defending Large Language Models Against Attacks With Residual Stream Activation Analysis
Amelia Kawasaki
Andrew Davis
Houssam Abbas
AAML
KELM
27
2
0
05 Jun 2024
Loki: Low-Rank Keys for Efficient Sparse Attention
Prajwal Singhania
Siddharth Singh
Shwai He
S. Feizi
A. Bhatele
32
13
0
04 Jun 2024
SpecExec: Massively Parallel Speculative Decoding for Interactive LLM Inference on Consumer Devices
Ruslan Svirschevski
Avner May
Zhuoming Chen
Beidi Chen
Zhihao Jia
Max Ryabinin
23
12
0
04 Jun 2024
Sparsity-Accelerated Training for Large Language Models
Da Ma
Lu Chen
Pengyu Wang
Hongshen Xu
Hanqi Li
Liangtai Sun
Su Zhu
Shuai Fan
Kai Yu
LRM
31
0
0
03 Jun 2024
Probing Language Models for Pre-training Data Detection
Zhenhua Liu
Tong Zhu
Chuanyuan Tan
Haonan Lu
Bing Liu
Wenliang Chen
21
10
0
03 Jun 2024
Joint Embeddings for Graph Instruction Tuning
Vlad Argatu
Aaron Haag
Oliver Lohse
31
0
0
31 May 2024
Shotluck Holmes: A Family of Efficient Small-Scale Large Language Vision Models For Video Captioning and Summarization
Richard Luo
Austin Peng
Adithya Vasudev
Rishabh Jain
34
2
0
31 May 2024
Would I Lie To You? Inference Time Alignment of Language Models using Direct Preference Heads
Avelina Asada Hadji-Kyriacou
Ognjen Arandjelović
20
1
0
30 May 2024
Improve Student's Reasoning Generalizability through Cascading Decomposed CoTs Distillation
Chengwei Dai
Kun Li
Wei Zhou
Song Hu
LRM
39
3
0
30 May 2024
Beyond Imitation: Learning Key Reasoning Steps from Dual Chain-of-Thoughts in Reasoning Distillation
Chengwei Dai
Kun Li
Wei Zhou
Song Hu
LRM
36
5
0
30 May 2024
Large Language Model Pruning
Hanjuan Huang
Hao-Jia Song
H. Pao
33
0
0
24 May 2024
A Comparative Analysis of Distributed Training Strategies for GPT-2
Ishan Patwardhan
Shubham Gandhi
Om M. Khare
Amit Joshi
Suraj Sawant
27
1
0
24 May 2024
Bayesian WeakS-to-Strong from Text Classification to Generation
Ziyun Cui
Ziyang Zhang
Wen Wu
Wen Wu
Chao Zhang
31
1
0
24 May 2024
AstroPT: Scaling Large Observation Models for Astronomy
Michael J. Smith
Ryan J. Roberts
E. Angeloudi
M. Huertas-Company
38
1
0
23 May 2024
Super Tiny Language Models
Dylan Hillier
Leon Guertler
Cheston Tan
Palaash Agrawal
Ruirui Chen
Bobby Cheng
45
3
0
23 May 2024
Dense Connector for MLLMs
Huanjin Yao
Wenhao Wu
Taojiannan Yang
Yuxin Song
Mengxi Zhang
Haocheng Feng
Yifan Sun
Zhiheng Li
Wanli Ouyang
Jingdong Wang
MLLM
VLM
32
16
0
22 May 2024
Disperse-Then-Merge: Pushing the Limits of Instruction Tuning via Alignment Tax Reduction
Tingchen Fu
Deng Cai
Lemao Liu
Shuming Shi
Rui Yan
MoMe
45
13
0
22 May 2024
TinyLLaVA Factory: A Modularized Codebase for Small-scale Large Multimodal Models
Junlong Jia
Ying Hu
Xi Weng
Yiming Shi
Miao Li
...
Baichuan Zhou
Ziyu Liu
Jie Luo
Lei Huang
Ji Wu
30
9
0
20 May 2024
Layer-Condensed KV Cache for Efficient Inference of Large Language Models
Haoyi Wu
Kewei Tu
MQ
41
17
0
17 May 2024
Thinking Fair and Slow: On the Efficacy of Structured Prompts for Debiasing Language Models
Shaz Furniturewala
Surgan Jandial
Abhinav Java
Pragyan Banerjee
Simra Shahid
Sumita Bhatia
Kokil Jaidka
38
8
0
16 May 2024
OpenBA-V2: Reaching 77.3% High Compression Ratio with Fast Multi-Stage Pruning
Dan Qiao
Yi Su
Pinzheng Wang
Jing Ye
Wen Xie
...
Wenliang Chen
Guohong Fu
Guodong Zhou
Qiaoming Zhu
Min Zhang
MQ
32
0
0
09 May 2024
MiniGPT-3D: Efficiently Aligning 3D Point Clouds with Large Language Models using 2D Priors
Yuan Tang
Xu Han
Xianzhi Li
Qiao Yu
Yixue Hao
Long Hu
Min Chen
24
14
0
02 May 2024
HLSTransform: Energy-Efficient Llama 2 Inference on FPGAs Via High Level Synthesis
Andy He
Darren Key
Mason Bulling
Andrew Chang
Skyler Shapiro
Everett Lee
26
1
0
29 Apr 2024
HateTinyLLM : Hate Speech Detection Using Tiny Large Language Models
Tanmay Sen
Ansuman Das
Mrinmay Sen
36
4
0
26 Apr 2024
AI Coders Are Among Us: Rethinking Programming Language Grammar Towards Efficient Code Generation
Zhensu Sun
Xiaoning Du
Zhou Yang
Li Li
David Lo
28
10
0
25 Apr 2024
XFT: Unlocking the Power of Code Instruction Tuning by Simply Merging Upcycled Mixture-of-Experts
Yifeng Ding
Jiawei Liu
Yuxiang Wei
Terry Yue Zhuo
Lingming Zhang
ALM
MoE
40
3
0
23 Apr 2024
Automated Multi-Language to English Machine Translation Using Generative Pre-Trained Transformers
Elijah Pelofske
Vincent Urias
L. Liebrock
26
0
0
23 Apr 2024
OpenELM: An Efficient Language Model Family with Open Training and Inference Framework
Sachin Mehta
Mohammad Hossein Sekhavat
Qingqing Cao
Maxwell Horton
Yanzi Jin
...
Iman Mirzadeh
Mahyar Najibi
Dmitry Belenko
Peter Zatloukal
Mohammad Rastegari
OSLM
AIFin
38
49
0
22 Apr 2024
Graphic Design with Large Multimodal Model
Yutao Cheng
Zhao Zhang
Maoke Yang
Hui Nie
Chunyuan Li
Xinglong Wu
Jie Shao
36
10
0
22 Apr 2024
A Survey on Efficient Inference for Large Language Models
Zixuan Zhou
Xuefei Ning
Ke Hong
Tianyu Fu
Jiaming Xu
...
Shengen Yan
Guohao Dai
Xiao-Ping Zhang
Yuhan Dong
Yu-Xiang Wang
46
80
0
22 Apr 2024
When Life gives you LLMs, make LLM-ADE: Large Language Models with Adaptive Data Engineering
Stephen Choi
William Gazeley
KELM
26
2
0
19 Apr 2024
Which questions should I answer? Salience Prediction of Inquisitive Questions
Yating Wu
Ritika Mangla
A. Dimakis
Greg Durrett
Junyi Jessy Li
22
2
0
16 Apr 2024
Resilience of Large Language Models for Noisy Instructions
Bin Wang
Chengwei Wei
Zhengyuan Liu
Geyu Lin
Nancy F. Chen
39
11
0
15 Apr 2024