Action2Sound: Ambient-Aware Generation of Action Sounds from Egocentric Videos

13 June 2024
Changan Chen
Puyuan Peng
Ami Baid
Zihui Xue
Wei-Ning Hsu
David Harwath
Kristen Grauman
    VGen
ArXiv (abs) | PDF | HTML | GitHub (313★)

Papers citing "Action2Sound: Ambient-Aware Generation of Action Sounds from Egocentric Videos"

13 / 13 papers shown
Audio-Visual World Models: Towards Multisensory Imagination in Sight and Sound
Jiahua Wang
Shannan Yan
Leqi Zheng
Jialong Wu
VGen
220
3
0
30 Nov 2025
Segmenting Collision Sound Sources in Egocentric Videos
Kranti Parida
Omar Emara
Hazel Doughty
Dima Damen
VOS
335
0
0
17 Nov 2025
CAVER: Curious Audiovisual Exploring Robot
Luca Macesanu
Boueny Folefack
Samik Singh
Ruchira Ray
Ben Abbatematteo
R. M. Martin
187
0
0
10 Nov 2025
Beyond Grid-Locked Voxels: Neural Response Functions for Continuous Brain Encoding
Haomiao Chen
K. Jamison
M. Sabuncu
Amy Kuceyeski
229
0
0
07 Oct 2025
Clink! Chop! Thud! -- Learning Object Sounds from Real-World Interactions
Mengyu Yang
Yiming Chen
Haozheng Pei
Siddhant Agarwal
Arun Balajee Vasudevan
James Hays
158
0
0
02 Oct 2025
EGOILLUSION: Benchmarking Hallucinations in Egocentric Video Understanding
Ashish Seth
Utkarsh Tyagi
Ramaneswaran Selvakumar
Nishit Anand
Sonal Kumar
Sreyan Ghosh
R. Duraiswami
Chirag Agarwal
Dinesh Manocha
MLLM, HILM, VLM
283
7
0
18 Aug 2025
Sonify Anything: Towards Context-Aware Sonic Interactions in AR
Laura Schütz
Sasan Matinfar
U. Eck
Daniel Roth
Nassir Navab
165
1
0
03 Aug 2025
EgoTrigger: Toward Audio-Driven Image Capture for Human Memory Enhancement in All-Day Energy-Efficient Smart Glasses
IEEE Transactions on Visualization and Computer Graphics (TVCG), 2025
Akshay Paruchuri
Sinan Hersek
Lavisha Aggarwal
Qiao Yang
Xin Liu
Achin Kulshrestha
Andrea Colaco
Henry Fuchs
Ishan Chatterjee
EgoV
279
3
0
03 Aug 2025
Step-by-Step Video-to-Audio Synthesis via Negative Audio Guidance
Akio Hayakawa
Masato Ishii
Takashi Shibuya
Yuki Mitsufuji
DiffM, VGen
419
1
0
26 Jun 2025
Audio Flamingo 2: An Audio-Language Model with Long-Audio Understanding and Expert Reasoning Abilities
Sreyan Ghosh
Zhifeng Kong
Sonal Kumar
S. Sakshi
Jaehyeon Kim
Ming-Yu Liu
Rafael Valle
Dinesh Manocha
Bryan Catanzaro
MLLM, AuLLM, LRM
422
121
0
06 Mar 2025
Generative AI for Cel-Animation: A Survey
Yunlong Tang
Junjia Guo
Pinxin Liu
Zhiyuan Wang
Hang Hua
...
Jing Bi
Mingqian Feng
Xuzhao Li
Zeliang Zhang
Chenliang Xu
VGen
875
23
0
08 Jan 2025
MDSGen: Fast and Efficient Masked Diffusion Temporal-Aware Transformers for Open-Domain Sound Generation
International Conference on Learning Representations (ICLR), 2024
T. Pham
Tri Ton
Chang D. Yoo
389
8
0
03 Oct 2024
Video-Foley: Two-Stage Video-To-Sound Generation via Temporal Event Condition For Foley Sound
IEEE Transactions on Audio, Speech, and Language Processing (IEEE TASLP), 2024
Junwon Lee
Jaekwon Im
Dabin Kim
Juhan Nam
VGen
547
21
0
21 Aug 2024