A-JEPA: Joint-Embedding Predictive Architecture Can Listen
27 November 2023
Zhengcong Fei
Mingyuan Fan
Junshi Huang

Papers citing "A-JEPA: Joint-Embedding Predictive Architecture Can Listen"

24 / 24 papers shown
CrossJEPA: Cross-Modal Joint-Embedding Predictive Architecture for Efficient 3D Representation Learning from 2D Images
Avishka Perera
Kumal Hewagamage
Saeedha Nazar
Kavishka Abeywardana
Hasitha Gallella
Ranga Rodrigo
Mohamed Afham
3DV
121
0
0
23 Nov 2025
Unsupervised Transformer Pre-Training for Images: Self-Distillation, Mean Teachers, and Random Crops
Mattia Scardecchia
ViT
93
0
0
04 Oct 2025
WavJEPA: Semantic learning unlocks robust audio foundation models for raw waveforms
Goksenin Yuksel
Pierre Guetschel
Michael Tangermann
Marcel van Gerven
Kiki van der Heijden
AI4TS
108
0
0
27 Sep 2025
Embodied AI: From LLMs to World Models
Tongtong Feng
Xin Wang
Yu Jiang
Wenwu Zhu
LM&Ro
289
7
0
24 Sep 2025
Discrete JEPA: Learning Discrete Token Representations without Reconstruction
Junyeob Baek
Hosung Lee
Christopher Hoang
Mengye Ren
Sungjin Ahn
187
0
0
17 Jun 2025
SSLAM: Enhancing Self-Supervised Models with Audio Mixtures for Polyphonic Soundscapes
International Conference on Learning Representations (ICLR), 2025
Tony Alex
S. Ahmed
A. Mustafa
Muhammad Awais
Philip J. B. Jackson
129
7
0
13 Jun 2025
A Survey on Cross-Modal Interaction Between Music and Multimodal Data
Sifei Li
Mining Tan
Feier Shen
Minyan Luo
Zijiao Yin
Fan Tang
Weiming Dong
Changsheng Xu
249
1
0
17 Apr 2025
SkyReels-A2: Compose Anything in Video Diffusion Transformers
Zhengcong Fei
Didong Li
Di Qiu
Jiadong Wang
Yikun Dou
...
Jinfeng Xu
Mingyuan Fan
Guibin Chen
Yang Li
Yahui Zhou
DiffM, VGen
280
31
0
03 Apr 2025
Chirp Localization via Fine-Tuned Transformer Model: A Proof-of-Concept Study
N. Bahador
M. Lankarany
241
3
0
24 Mar 2025
Predict, Cluster, Refine: A Joint Embedding Predictive Self-Supervised Framework for Graph Representation Learning
Srinitish Srinivasan
Omkumar CU
SSL, BDL
399
0
0
02 Feb 2025
Video Diffusion Transformers are In-Context Learners
Zhengcong Fei
Di Qiu
Changqian Yu
Debang Li
Mingyuan Fan
VGen, DiffM
763
7
0
14 Dec 2024
Sparsh: Self-supervised touch representations for vision-based tactile sensing
Conference on Robot Learning (CoRL), 2024
Carolina Higuera
Akash Sharma
Chaithanya Krishna Bodduluri
Taosha Fan
Patrick E. Lancaster
...
Michael Kaess
Byron Boots
Mike Lambeta
Tingfan Wu
Mustafa Mukadam
218
45
0
31 Oct 2024
Learning Latent Wireless Dynamics from Channel State Information
IEEE Wireless Communications Letters (WCL), 2024
Charbel Bou Chaaya
Abanoub M. Girgis
Mehdi Bennis
182
6
0
16 Sep 2024
FLUX that Plays Music
Zhengcong Fei
Mingyuan Fan
Changqian Yu
Junshi Huang
252
16
0
01 Sep 2024
Aligning Cyber Space with Physical World: A Comprehensive Survey on Embodied AI
Zehua Wang
Weixing Chen
Yongjie Bai
Xiaodan Liang
Guanbin Li
Wen Gao
Liang Lin
LM&Ro, SyDa, AI4CE
385
167
0
09 Jul 2024
Time-Series JEPA for Predictive Remote Control under Capacity-Limited Networks
Abanoub M. Girgis
Álvaro Valcarce
Mehdi Bennis
AI4TS
167
5
0
07 Jun 2024
LaT-PFN: A Joint Embedding Predictive Architecture for In-context Time-series Forecasting
Stijn Verdenius
Andrea Zerio
Roy L.M. Wang
BDL, AI4TS, AI4CE
251
4
0
16 May 2024
Is Sora a World Simulator? A Comprehensive Survey on General World Models and Beyond
Zheng Zhu
Xiaofeng Wang
Wangbo Zhao
Chen Min
Nianchen Deng
...
Dawei Zhao
Liang Xiao
Jian-jun Zhao
Jiwen Lu
Guan Huang
VGen, LM&Ro
270
74
0
06 May 2024
Music Consistency Models
Zhengcong Fei
Mingyuan Fan
Junshi Huang
DiffM
177
7
0
20 Apr 2024
World Models for Autonomous Driving: An Initial Survey
Yanchen Guan
Haicheng Liao
Zhenning Li
Jia Hu
Runze Yuan
Yunjian Li
Guohui Zhang
Chengzhong Xu
366
74
0
05 Mar 2024
Prospective Role of Foundation Models in Advancing Autonomous Vehicles
Jianhua Wu
B. Gao
Jincheng Gao
Jianhao Yu
Hongqing Chu
...
Xun Gong
Yi Chang
H. E. Tseng
Hong Chen
Jie Chen
269
15
0
08 Dec 2023
Graph-level Representation Learning with Joint-Embedding Predictive Architectures
Geri Skenderi
Hang Li
Shucheng Zhou
Marco Cristani
AI4TS, GNN
444
10
0
27 Sep 2023
Unsupervised Learning of Visual Features by Contrasting Cluster Assignments
Mathilde Caron
Ishan Misra
Julien Mairal
Priya Goyal
Piotr Bojanowski
Armand Joulin
OCL, SSL
1.1K
4,602
0
17 Jun 2020
Mockingjay: Unsupervised Speech Representation Learning with Deep Bidirectional Transformer Encoders
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2019
Andy T. Liu
Shu-Wen Yang
Po-Han Chi
Po-Chun Hsu
Hung-yi Lee
SSL
392
388
0
25 Oct 2019