LLM360: Towards Fully Transparent Open-Source LLMs


11 December 2023
Zhengzhong Liu
Aurick Qiao
W. Neiswanger
Hongyi Wang
Bowen Tan
Tianhua Tao
Junbo Li
Yuqi Wang
Suqi Sun
Omkar Pangarkar
Richard Fan
Yi Gu
Victor Miller
Yonghao Zhuang
Guowei He
Haonan Li
Fajri Koto
Liping Tang
Nikhil Ranjan
Zhiqiang Shen
Xuguang Ren
Roberto Iriondo
Cun Mu
Zhiting Hu
Mark Schulze
Preslav Nakov
Timothy Baldwin
Eric P. Xing

Papers citing "LLM360: Towards Fully Transparent Open-Source LLMs"

50 / 61 papers shown
Whisper-LM: Improving ASR Models with Language Models for Low-Resource Languages
Xabier de Zuazo
Eva Navas
Ibon Saratxaga
Inma Hernáez Rioja
37
0
0
30 Mar 2025
ImF: Implicit Fingerprint for Large Language Models
Wu jiaxuan
Peng Wanli
Fu hang
Xue Yiming
Wen juan
24
0
0
25 Mar 2025
Won: Establishing Best Practices for Korean Financial NLP
Guijin Son
Hyunwoo Ko
Haneral Jung
Chami Hwang
44
0
0
23 Mar 2025
Bridging the LLM Accessibility Divide? Performance, Fairness, and Cost of Closed versus Open LLMs for Automated Essay Scoring
Kezia Oketch
John P. Lalor
Yi Yang
Ahmed Abbasi
ELM
42
0
0
14 Mar 2025
Can Small Language Models Reliably Resist Jailbreak Attacks? A Comprehensive Evaluation
Wenhui Zhang
Huiyu Xu
Zhibo Wang
Zeqing He
Ziqi Zhu
Kui Ren
AAML
PILM
67
0
0
09 Mar 2025
Triple Phase Transitions: Understanding the Learning Dynamics of Large Language Models from a Neuroscience Perspective
Yuko Nakagi
Keigo Tada
Sota Yoshino
Shinji Nishimoto
Yu Takagi
LRM
37
0
0
28 Feb 2025
Sailor2: Sailing in South-East Asia with Inclusive Multilingual LLMs
Longxu Dou
Qian Liu
Fan Zhou
Changyu Chen
Zili Wang
...
Tianyu Pang
Chao Du
Xinyi Wan
Wei Lu
Min Lin
82
1
0
18 Feb 2025
Steel-LLM: From Scratch to Open Source -- A Personal Journey in Building a Chinese-Centric LLM
Qingshui Gu
Shu Li
Tianyu Zheng
Zhaoxiang Zhang
113
0
0
10 Feb 2025
The potential -- and the pitfalls -- of using pre-trained language models as cognitive science theories
Raj Sanjay Shah
Sashank Varma
LRM
80
0
0
22 Jan 2025
Aligning Instruction Tuning with Pre-training
Yiming Liang
Tianyu Zheng
Xinrun Du
Ge Zhang
J. Liu
...
Zhaoxiang Zhang
Wenhao Huang
Jiajun Zhang
Xiang Yue
Jiajun Zhang
78
1
0
16 Jan 2025
ConTrans: Weak-to-Strong Alignment Engineering via Concept Transplantation
Weilong Dong
Xinwei Wu
Renren Jin
Shaoyang Xu
Deyi Xiong
47
6
0
31 Dec 2024
Bi-Mamba: Towards Accurate 1-Bit State Space Models
Shengkun Tang
Liqun Ma
H. Li
Mingjie Sun
Zhiqiang Shen
Mamba
62
3
0
18 Nov 2024
Crystal: Illuminating LLM Abilities on Language and Code
Tianhua Tao
Junbo Li
Bowen Tan
Hongyi Wang
William Marshall
...
Joel Hestness
Natalia Vassilieva
Zhiqiang Shen
Eric P. Xing
Zhengzhong Liu
40
4
0
06 Nov 2024
Reducing Hyperparameter Tuning Costs in ML, Vision and Language Model Training Pipelines via Memoization-Awareness
Abdelmajid Essofi
Ridwan Salahuddeen
Munachiso Nwadike
Elnura Zhalieva
Kun Zhang
Eric P. Xing
W. Neiswanger
Qirong Ho
VLM
29
0
0
06 Nov 2024
Toxicity of the Commons: Curating Open-Source Pre-Training Data
Catherine Arnett
Eliot Jones
Ivan P. Yamshchikov
Pierre-Carl Langlais
28
2
0
29 Oct 2024
UTF: Undertrained Tokens as Fingerprints A Novel Approach to LLM Identification
Jiacheng Cai
Jiahao Yu
Yangguang Shao
Yuhang Wu
Xinyu Xing
WaLM
23
0
0
16 Oct 2024
Scaling Laws for Predicting Downstream Performance in LLMs
Yangyi Chen
Binxuan Huang
Yifan Gao
Zhengyang Wang
Jingfeng Yang
Heng Ji
LRM
41
7
0
11 Oct 2024
Llama SLayer 8B: Shallow Layers Hold the Key to Knowledge Injection
T. Chen
Zhentao Tan
Tao Gong
Yue Wu
Qi Chu
Bin Liu
Jieping Ye
Nenghai Yu
KELM
47
2
0
03 Oct 2024
U-shaped and Inverted-U Scaling behind Emergent Abilities of Large Language Models
Tung-Yu Wu
Pei-Yu Lo
ReLM
LRM
40
2
0
02 Oct 2024
Creating a Gen-AI based Track and Trace Assistant MVP (SuperTracy) for PostNL
Mohammad Reshadati
30
0
0
04 Sep 2024
BaichuanSEED: Sharing the Potential of ExtensivE Data Collection and Deduplication by Introducing a Competitive Large Language Model Baseline
Guosheng Dong
Da Pan
Yiding Sun
Shusen Zhang
Zheng Liang
...
Bingning Wang
Wentao Zhang
Jiaxin Mao
Zenan Zhou
Weipeng Chen
ALM
25
2
0
27 Aug 2024
FuxiTranyu: A Multilingual Large Language Model Trained with Balanced Data
Haoran Sun
Renren Jin
Shaoyang Xu
Leiyu Pan
Supryadi
...
Lei Yang
Ling Shi
Juesi Xiao
Shaolin Zhu
Deyi Xiong
50
1
0
12 Aug 2024
u-$\mu$P: The Unit-Scaled Maximal Update Parametrization
Charlie Blake
C. Eichenberg
Josef Dean
Lukas Balles
Luke Y. Prince
Bjorn Deiseroth
Andres Felipe Cruz Salinas
Carlo Luschi
Samuel Weinbach
Douglas Orr
46
9
0
24 Jul 2024
FuLG: 150B Romanian Corpus for Language Model Pretraining
Vlad-Andrei Bădoiu
Mihai-Valentin Dumitru
Alexandru M. Gherghescu
Alexandru Agache
C. Raiciu
43
0
0
18 Jul 2024
LLM Circuit Analyses Are Consistent Across Training and Scale
Curt Tigges
Michael Hanna
Qinan Yu
Stella Biderman
28
10
0
15 Jul 2024
FBI-LLM: Scaling Up Fully Binarized LLMs from Scratch via Autoregressive Distillation
Liqun Ma
Mingjie Sun
Zhiqiang Shen
21
6
0
09 Jul 2024
ANOLE: An Open, Autoregressive, Native Large Multimodal Models for Interleaved Image-Text Generation
Ethan Chern
Jiadi Su
Yan Ma
Pengfei Liu
MLLM
21
26
0
08 Jul 2024
KV Cache Compression, But What Must We Give in Return? A Comprehensive Benchmark of Long Context Capable Approaches
Jiayi Yuan
Hongyi Liu
Shaochen Zhong
Yu-Neng Chuang
...
Hongye Jin
V. Chaudhary
Zhaozhuo Xu
Zirui Liu
Xia Hu
34
17
0
01 Jul 2024
Development of Cognitive Intelligence in Pre-trained Language Models
Raj Sanjay Shah
Khushi Bhardwaj
Sashank Varma
23
2
0
01 Jul 2024
Towards Robust Speech Representation Learning for Thousands of Languages
William Chen
Wangyou Zhang
Yifan Peng
Xinjian Li
Jinchuan Tian
Jiatong Shi
Xuankai Chang
Soumi Maiti
Karen Livescu
Shinji Watanabe
ELM
23
6
0
30 Jun 2024
IRCAN: Mitigating Knowledge Conflicts in LLM Generation via Identifying and Reweighting Context-Aware Neurons
Dan Shi
Renren Jin
Tianhao Shen
Weilong Dong
Xinwei Wu
Deyi Xiong
23
2
0
26 Jun 2024
Uncovering Latent Memories: Assessing Data Leakage and Memorization Patterns in Frontier AI Models
Sunny Duan
Mikail Khona
Abhiram Iyer
Rylan Schaeffer
Ila R Fiete
35
5
0
20 Jun 2024
Investigating the Pre-Training Dynamics of In-Context Learning: Task Recognition vs. Task Learning
Xiaolei Wang
Xinyu Tang
Wayne Xin Zhao
Ji-Rong Wen
16
2
0
20 Jun 2024
Pandora: Towards General World Model with Natural Language Actions and Video States
Jiannan Xiang
Guangyi Liu
Yi Gu
Qiyue Gao
Yuting Ning
...
Shibo Hao
Yemin Shi
Zhengzhong Liu
Eric P. Xing
Zhiting Hu
VGen
48
35
0
12 Jun 2024
Metaheuristics and Large Language Models Join Forces: Toward an Integrated Optimization Approach
Camilo Chacón Sartori
Christian Blum
Filippo Bistaffa
Guillem Rodríguez Corominas
AIFin
42
3
0
28 May 2024
Zamba: A Compact 7B SSM Hybrid Model
Paolo Glorioso
Quentin G. Anthony
Yury Tokpanov
James Whittington
Jonathan Pilault
Adam Ibrahim
Beren Millidge
17
7
0
26 May 2024
AstroPT: Scaling Large Observation Models for Astronomy
Michael J. Smith
Ryan J. Roberts
E. Angeloudi
M. Huertas-Company
32
1
0
23 May 2024
Risks and Opportunities of Open-Source Generative AI
Francisco Eiras
Aleksandar Petrov
Bertie Vidgen
Christian Schroeder
Fabio Pizzati
...
Matthew Jackson
Philip H. S. Torr
Trevor Darrell
Y. Lee
Jakob N. Foerster
31
18
0
14 May 2024
Near to Mid-term Risks and Opportunities of Open-Source Generative AI
Francisco Eiras
Aleksandar Petrov
Bertie Vidgen
Christian Schroeder de Witt
Fabio Pizzati
...
Paul Röttger
Philip H. S. Torr
Trevor Darrell
Y. Lee
Jakob N. Foerster
28
1
0
25 Apr 2024
SHED: Shapley-Based Automated Dataset Refinement for Instruction Fine-Tuning
Yexiao He
Ziyao Wang
Zheyu Shen
Guoheng Sun
Yucong Dai
Yongkai Wu
Hongyi Wang
Ang Li
26
11
0
23 Apr 2024
OpenELM: An Efficient Language Model Family with Open Training and Inference Framework
Sachin Mehta
Mohammad Hossein Sekhavat
Qingqing Cao
Maxwell Horton
Yanzi Jin
...
Iman Mirzadeh
Mahyar Najibi
Dmitry Belenko
Peter Zatloukal
Mohammad Rastegari
OSLM
AIFin
38
49
0
22 Apr 2024
RAM: Towards an Ever-Improving Memory System by Learning from Communications
Jiaqi Li
Xiaobo Wang
Wentao Ding
Zihao Wang
Yipeng Kang
Zixia Jia
Zilong Zheng
37
1
0
18 Apr 2024
Lossless and Near-Lossless Compression for Foundation Models
Moshik Hershcovitch
Leshem Choshen
Andrew Wood
Ilias Enmouri
Peter Chin
S. Sundararaman
Danny Harnik
34
3
0
05 Apr 2024
IndoCulture: Exploring Geographically-Influenced Cultural Commonsense Reasoning Across Eleven Indonesian Provinces
Fajri Koto
Rahmad Mahendra
Nurul Aisyah
Timothy Baldwin
LRM
59
16
0
02 Apr 2024
Latxa: An Open Language Model and Evaluation Suite for Basque
Julen Etxaniz
Oscar Sainz
Naiara Pérez
Itziar Aldabe
German Rigau
Eneko Agirre
Aitor Ormazabal
Mikel Artetxe
A. Soroa
ELM
31
22
0
29 Mar 2024
Towards Tracing Trustworthiness Dynamics: Revisiting Pre-training Period of Large Language Models
Chao Qian
Jie M. Zhang
Wei Yao
Dongrui Liu
Zhen-fei Yin
Yu Qiao
Yong Liu
Jing Shao
LLMSV
LRM
37
13
0
29 Feb 2024
Do Large Language Models Mirror Cognitive Language Processing?
Yuqi Ren
Renren Jin
Tongxuan Zhang
Deyi Xiong
22
4
0
28 Feb 2024
tinyBenchmarks: evaluating LLMs with fewer examples
Felipe Maia Polo
Lucas Weber
Leshem Choshen
Yuekai Sun
Gongjun Xu
Mikhail Yurochkin
ELM
18
72
0
22 Feb 2024
Coercing LLMs to do and reveal (almost) anything
Jonas Geiping
Alex Stein
Manli Shu
Khalid Saifullah
Yuxin Wen
Tom Goldstein
AAML
26
43
0
21 Feb 2024
Analysing The Impact of Sequence Composition on Language Model Pre-Training
Yu Zhao
Yuanbin Qu
Konrad Staniszewski
Szymon Tworkowski
Wei Liu
Piotr Miłoś
Yuxiang Wu
Pasquale Minervini
21
13
0
21 Feb 2024