ResearchTrend.AI

Efficient GPT Model Pre-training using Tensor Train Matrix Representation

arXiv:2306.02697 · 5 June 2023
V. Chekalina, Georgii Sergeevich Novikov, Julia Gusak, Ivan V. Oseledets, Alexander Panchenko

Papers citing "Efficient GPT Model Pre-training using Tensor Train Matrix Representation"

8 citing papers:
CoLA: Compute-Efficient Pre-Training of LLMs via Low-Rank Activation
Z. Liu, Ruijie Zhang, Zhilin Wang, Zi Yang, Paul Hovland, Bogdan Nicolae, Franck Cappello, Z. Zhang
16 Feb 2025
Improved Off-policy Reinforcement Learning in Biological Sequence Design
H. Kim, Minsu Kim, Taeyoung Yun, Sanghyeok Choi, Emmanuel Bengio, Alex Hernández-García, Jinkyoo Park
06 Oct 2024
Geometry is All You Need: A Unified Taxonomy of Matrix and Tensor Factorization for Compression of Generative Language Models
Mingxue Xu, Sadia Sharmin, Danilo P. Mandic
03 Oct 2024
Compute Better Spent: Replacing Dense Layers with Structured Matrices
Shikai Qiu, Andres Potapczynski, Marc Finzi, Micah Goldblum, Andrew Gordon Wilson
10 Jun 2024
Tensor Networks Meet Neural Networks: A Survey and Future Perspectives
Maolin Wang, Y. Pan, Zenglin Xu, Xiangli Yang, Guangxi Li, Andrzej Cichocki
22 Jan 2023
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
Leo Gao, Stella Biderman, Sid Black, Laurence Golding, Travis Hoppe, ..., Horace He, Anish Thite, Noa Nabeshima, Shawn Presser, Connor Leahy
31 Dec 2020
Scaling Laws for Neural Language Models
Jared Kaplan, Sam McCandlish, T. Henighan, Tom B. Brown, B. Chess, R. Child, Scott Gray, Alec Radford, Jeff Wu, Dario Amodei
23 Jan 2020
Teaching Machines to Read and Comprehend
Karl Moritz Hermann, Tomás Kociský, Edward Grefenstette, L. Espeholt, W. Kay, Mustafa Suleyman, Phil Blunsom
10 Jun 2015