Pretrained Transformers as Universal Computation Engines

9 March 2021

Kevin Lu

Aditya Grover

Pieter Abbeel

Igor Mordatch

Papers citing "Pretrained Transformers as Universal Computation Engines"

Energy-Efficient Domain-Specific Artificial Intelligence Models and Agents: Pathways and Paradigms

411

24 Oct 2025

203

30 Aug 2025

T3Time: Tri-Modal Time Series Forecasting via Adaptive Multi-Head Alignment and Residual Fusion

Abdul Monaf Chowdhury

Rabeya Akter

S. Arib

AI4TS

126

06 Aug 2025

Transfer of Structural Knowledge from Synthetic Languages

Mikhail Budnikov

Ivan Yamshchikov

216

21 May 2025

Large Language Models Implicitly Learn to See and Hear Just By Reading

Prateek Verma

Mert Pilanci

390

20 May 2025

An empirical study of task and feature correlations in the reuse of pre-trained models

Jama Hussein Mohamud

Willie Brink

172

15 May 2025

HiPerRAG: High-Performance Retrieval Augmented Generation for Scientific Insights

Platform for Advanced Scientific Computing Conference (PASC), 2025

...

231

07 May 2025

Shape Modeling of Longitudinal Medical Images: From Diffeomorphic Metric Mapping to Deep Learning

524

27 Mar 2025

General Intelligence Requires Reward-based Pretraining

824

26 Feb 2025

ECG-Expert-QA: A Benchmark for Evaluating Medical Large Language Models in Heart Disease Diagnosis

Xu Wang

Jiaju Kang

Puyu Han

Yubao Zhao

Qian Liu

Liwenfei He

Lingqiong Zhang

Lingyun Dai

Yongcheng Wang

Jie Tao

LM&MA

493

16 Feb 2025

Better Prompt Compression Without Multi-Layer Perceptrons

941

12 Jan 2025

OneLLM: One Framework to Align All Modalities with Language

Computer Vision and Pattern Recognition (CVPR), 2023

577

198

10 Jan 2025

The Semantic Hub Hypothesis: Language Models Share Semantic Representations Across Languages and Modalities

International Conference on Learning Representations (ICLR), 2024

533

07 Nov 2024

Tri-Level Navigator: LLM-Empowered Tri-Level Learning for Time Series OOD Generalization

Neural Information Processing Systems (NeurIPS), 2024

418

09 Oct 2024

ESQA: Event Sequences Question Answering

227

03 Jul 2024

From CNNs to Transformers in Multimodal Human Action Recognition: A Survey

Muhammad Bilal Shaikh

Syed Mohammed Shamsul Islam

Douglas Chai

Naveed Akhtar

347

22 May 2024

Large Language Model (LLM) for Telecommunications: A Comprehensive Survey on Principles, Key Techniques, and Opportunities

IEEE Communications Surveys and Tutorials (COMST), 2024

Yufei Cui

...

Xue Liu

324

188

17 May 2024

The Platonic Representation Hypothesis

International Conference on Machine Learning (ICML), 2024

883

240

13 May 2024

What explains the success of cross-modal fine-tuning with ORCA?

Paloma García-de-Herreros

243

20 Mar 2024

In-context Exploration-Exploitation for Reinforcement Learning

International Conference on Learning Representations (ICLR), 2024

220

11 Mar 2024

TPLLM: A Traffic Prediction Framework Based on Pretrained Large Language Models

325

04 Mar 2024

LSTPrompt: Large Language Models as Zero-Shot Time Series Forecasters by Long-Short-Term Prompting

Haoxin Liu

Zhiyuan Zhao

Jindong Wang

Harshavardhan Kamarthi

B. A. Prakash

AI4TS LRM VLM

252

25 Feb 2024

MORE-3S: Multimodal-based Offline Reinforcement Learning with Shared Semantic Spaces

Ge Zhang

231

20 Feb 2024

Do Large Language Models Understand Logic or Just Mimick Context?

Junbing Yan

Chengyu Wang

Junyuan Huang

Wei Zhang

ReLM ELM LRM

215

19 Feb 2024

Show Me How It's Done: The Role of Explanations in Fine-Tuning Language Models

Asian Conference on Machine Learning (ACML), 2024

280

12 Feb 2024

Empowering Time Series Analysis with Large Language Models: A Survey

International Joint Conference on Artificial Intelligence (IJCAI), 2024

383

05 Feb 2024

How Can Large Language Models Understand Spatial-Temporal Data?

326

25 Jan 2024

Multimodal Attention Merging for Improved Speech Recognition and Audio Event Classification

Anirudh S. Sundar

Chao-Han Huck Yang

David M. Chan

Shalini Ghosh

Venkatesh Ravichandran

P. S. Nidadavolu

MoMe

295

22 Dec 2023

How to guess a gradient

Utkarsh Singhal

Brian Cheung

Kartik Chandra

Jonathan Ragan-Kelley

Joshua B. Tenenbaum

Tomaso Poggio

Stella X. Yu

ODL

145

07 Dec 2023

Guided Flows for Generative Modeling and Decision Making

329

22 Nov 2023

Unified machine learning tasks and datasets for enhancing renewable energy

208

12 Nov 2023

The Distributional Hypothesis Does Not Fully Explain the Benefits of Masked Language Model Pretraining

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

Ting-Rui Chiang

Dani Yogatama

169

25 Oct 2023

UniTime: A Language-Empowered Unified Model for Cross-Domain Time Series Forecasting

Xu Liu

Bryan Hooi

Roger Zimmermann

AI4TS

297

150

15 Oct 2023

Data-Centric Financial Large Language Models

...

318

07 Oct 2023

One for All: Towards Training One Graph Model for All Classification Tasks

International Conference on Learning Representations (ICLR), 2023

Hao Liu

511

211

29 Sep 2023

Towards Green AI in Fine-tuning Large Language Models via Adaptive Backpropagation

International Conference on Learning Representations (ICLR), 2023

264

22 Sep 2023

The first step is the hardest: Pitfalls of Representing and Tokenizing Temporal Data for Large Language Models

Dimitris Spathis

F. Kawsar

AI4TS

201

12 Sep 2023

Internal Cross-layer Gradients for Extending Homogeneity to Heterogeneity in Federated Learning

International Conference on Learning Representations (ICLR), 2023

248

22 Aug 2023

Can Language Models Learn to Listen?

IEEE International Conference on Computer Vision (ICCV), 2023

275

21 Aug 2023

V2A-Mapper: A Lightweight Solution for Vision-to-Audio Generation by Connecting Foundation Models

AAAI Conference on Artificial Intelligence (AAAI), 2023

Heng Wang

397

18 Aug 2023

FoodSAM: Any Food Segmentation

IEEE Transactions on Multimedia (IEEE TMM), 2023

Xing Lan

Jian Xue

285

11 Aug 2023

OpenProteinSet: Training data for structural biology at scale

Neural Information Processing Systems (NeurIPS), 2023

Mohammed AlQuraishi

232

10 Aug 2023

Multimodal Neurons in Pretrained Text-Only Transformers

Antonio Torralba

272

03 Aug 2023

Transformers are Universal Predictors

Sourya Basu

Moulik Choraria

Lav Varshney

150

15 Jul 2023

Large Language Models as General Pattern Machines

Conference on Robot Learning (CoRL), 2023

Montse Gonzalez Arenas

Kanishka Rao

Dorsa Sadigh

Andy Zeng

LLMAG

330

260

10 Jul 2023

Mx2M: Masked Cross-Modality Modeling in Domain Adaptation for 3D Semantic Segmentation

AAAI Conference on Artificial Intelligence (AAAI), 2023

219

09 Jul 2023

Investigating Pre-trained Language Models on Cross-Domain Datasets, a Step Closer to General AI

151

21 Jun 2023

Opening the Black Box: Analyzing Attention Weights and Hidden States in Pre-trained Language Models for Non-language Tasks

120

21 Jun 2023

Language Versatilists vs. Specialists: An Empirical Revisiting on Multilingual Transfer Ability

Jiacheng Ye

Xijia Tao

Lingpeng Kong

LRM

222

11 Jun 2023

Large Language Models are In-Context Semantic Reasoners rather than Symbolic Reasoners

305

24 May 2023