ResearchTrend.AI
GneissWeb: Preparing High Quality Data for LLMs at Scale

19 February 2025
Hajar Emami-Gohari
S. Kadhe
Syed Yousaf Shah
Constantin Adam
Abdulhamid A. Adebayo
Praneet Adusumilli
Farhan Ahmed
Nathalie Baracaldo Angel
Santosh Borse
Yuan Chi Chang
Xuan-Hong Dang
N. Desai
Ravital Eres
Ran Iwamoto
Alexei Karve
Yan Koyfman
Wei-Han Lee
Changchang Liu
Boris Lublinsky
Takuyo Ohko
Pablo Pesce
Maroun Touma
Shiqiang Wang
Shalisha Witherspoon
Herbert Woisetschläger
D. Wood
Kun-Lung Wu
Issei Yoshida
Syed Zawad
Petros Zerfos
Yi Zhou
Bishwaranjan Bhattacharjee

Papers citing "GneissWeb: Preparing High Quality Data for LLMs at Scale"

1 / 1 papers shown
Title
Prioritizing Image-Related Tokens Enhances Vision-Language Pre-Training
Y. Chen
Hao Peng
Tong Zhang
Heng Ji
VLM
13 May 2025