ResearchTrend.AI
RedPajama: an Open Dataset for Training Large Language Models

19 November 2024
Maurice Weber
Daniel Y. Fu
Quentin Anthony
Yonatan Oren
S. Adams
Anton Alexandrov
Xiaozhong Lyu
Huu Nguyen
Xiaozhe Yao
Virginia Adams
Ben Athiwaratkun
Rahul Chalamala
Kezhen Chen
Max Ryabinin
Tri Dao
Percy Liang
Christopher Ré
Irina Rish
Ce Zhang

Papers citing "RedPajama: an Open Dataset for Training Large Language Models"

8 / 8 papers shown
Unified Multimodal Understanding and Generation Models: Advances, Challenges, and Opportunities
X. Zhang
Jintao Guo
Shanshan Zhao
Minghao Fu
Lunhao Duan
Guo-Hua Wang
Qing-Guo Chen
Zhao Xu
Weihua Luo
Kaifu Zhang
DiffM
57
0
0
05 May 2025
Forecasting from Clinical Textual Time Series: Adaptations of the Encoder and Decoder Language Model Families
Shahriar Noroozizadeh
Sayantan Kumar
Jeremy C. Weiss
AI4TS
16
0
0
14 Apr 2025
The Lucie-7B LLM and the Lucie Training Dataset: Open resources for multilingual language generation
Olivier Gouvert
Julie Hunter
Jérôme Louradour
Christophe Cerisara
Evan Dufraisse
Yaya Sy
Laura Rivière
Jean-Pierre Lorré
OpenLLM-France community
66
0
0
15 Mar 2025
Citrus: Leveraging Expert Cognitive Pathways in a Medical Language Model for Advanced Medical Decision Support
G. Wang
Minyu Gao
Shuai Yang
Ya Zhang
Lizhi He
...
Yexuan Zhang
Wanyue Li
Lu Chen
Jintao Fei
Xin Li
56
1
0
25 Feb 2025
EfficientLLM: Scalable Pruning-Aware Pretraining for Architecture-Agnostic Edge Language Models
Xingrun Xing
Zheng Liu
Shitao Xiao
Boyan Gao
Yiming Liang
Wanpeng Zhang
Haokun Lin
Guoqi Li
Jiajun Zhang
LRM
49
1
0
10 Feb 2025
Rethinking Evaluation of Sparse Autoencoders through the Representation of Polysemous Words
Gouki Minegishi
Hiroki Furuta
Yusuke Iwasawa
Y. Matsuo
49
1
0
09 Jan 2025
Retro-li: Small-Scale Retrieval Augmented Generation Supporting Noisy Similarity Searches and Domain Shift Generalization
Gentiana Rashiti
G. Karunaratne
Mrinmaya Sachan
Abu Sebastian
Abbas Rahimi
RALM
27
0
0
12 Sep 2024
A Survey on Large Language Models from General Purpose to Medical Applications: Datasets, Methodologies, and Evaluations
Jinqiang Wang
Huansheng Ning
Yi Peng
Qikai Wei
Daniel Tesfai
Wenwei Mao
Tao Zhu
Runhe Huang
LM&MA
AI4MH
ELM
33
4
0
14 Jun 2024