arXiv:2308.13561
Project Aria: A New Tool for Egocentric Multi-Modal AI Research

24 August 2023
Jakob Engel
Kiran Somasundaram
Michael Goesele
Albert Sun
Alexander Gamino
Andrew Turner
Arjang Talattof
Arnie Yuan
Bilal Souti
Brighid Meredith
Cheng Peng
Chris Sweeney
Cole Wilson
Dan Barnes
Daniel DeTone
David Caruso
Derek Valleroy
Dinesh Ginjupalli
Duncan Frost
Edward Miller
Elias Mueggler
Evgeniy Oleinik
Fan Zhang
Guruprasad Somasundaram
Gustavo Solaira
Harry Lanaras
Henry Howard-Jenkins
Huixuan Tang
Hyo Jin Kim
Jaime Rivera
Ji Luo
Jing Dong
Julian Straub
Kevin Bailey
Kevin Eckenhoff
Lingni Ma
Luis Pesqueira
Mark Schwesinger
Maurizio Monge
Nan Yang
Nick Charron
Nikhil Raina
Omkar M. Parkhi
Peter Borschowa
Pierre Moulon
Prince Gupta
Raul Mur-Artal
Robbie Pennington
Sachin Kulkarni
Sagar Miglani
Santosh Gondi
Saransh Solanki
Sean Diener
Shangyi Cheng
Simon Green
Steve Saarinen
Suvam Patra
Tassos Mourikis
Thomas Whelan
Tripti Singh
Vasileios Balntas
Vijay Baiyya
Wilson Dreewes
Xiaqing Pan
Yang Lou
Yipu Zhao
Yusuf Mansour
Yuyang Zou
Zhaoyang Lv
Zijian Wang
Mingfei Yan
Carl Ren
R. D. Nardi
Richard A. Newcombe
    EgoV

Papers citing "Project Aria: A New Tool for Egocentric Multi-Modal AI Research"

50 / 57 papers shown
FoodTrack: Estimating Handheld Food Portions with Egocentric Video
Ervin Wang
Yuhao Chen
EgoV
46
0
0
07 May 2025
EgoCHARM: Resource-Efficient Hierarchical Activity Recognition using an Egocentric IMU Sensor
Akhil Padmanabha
Saravanan Govindarajan
Hwanmun Kim
Sergio Ortiz
Rahul Rajan
Doruk Senkal
Sneha Kadetotad
30
0
0
24 Apr 2025
Imaging for All-Day Wearable Smart Glasses
Michael Goesele
Daniel Andersen
Yujia Chen
Simon Green
Eddy Ilg
Chao Li
Johnson Liu
Grace Kuo
Logan Wan
Richard A. Newcombe
26
0
0
17 Apr 2025
The Invisible EgoHand: 3D Hand Forecasting through EgoBody Pose Estimation
Masashi Hatano
Zhifan Zhu
Hideo Saito
Dima Damen
EgoV
24
0
0
11 Apr 2025
Digital Twin Catalog: A Large-Scale Photorealistic 3D Object Digital Twin Dataset
Z. Dong
Ka Chen
Zhaoyang Lv
Hong-Xing Yu
Yunzhi Zhang
...
Xiaqing Pan
Mingfei Yan
Jiajun Wu
Carl Ren
Richard A. Newcombe
39
1
0
11 Apr 2025
REWIND: Real-Time Egocentric Whole-Body Motion Diffusion with Exemplar-Based Identity Conditioning
Jihyun Lee
Weipeng Xu
Alexander Richard
S. Wei
Shunsuke Saito
Shaojie Bai
Te-Li Wang
Minhyuk Sung
Tae-Kyun Kim
Jason M. Saragih
DiffM
VGen
33
0
0
07 Apr 2025
Learning Predictive Visuomotor Coordination
Wenqi Jia
Bolin Lai
Miao Liu
Danfei Xu
James M. Rehg
47
0
0
30 Mar 2025
ST-VLM: Kinematic Instruction Tuning for Spatio-Temporal Reasoning in Vision-Language Models
Dohwan Ko
S. Kim
Yumin Suh
Vijay Kumar B.G
Minseo Yoon
Manmohan Chandraker
Hyunwoo J. Kim
LRM
38
0
0
25 Mar 2025
Digitally Prototype Your Eye Tracker: Simulating Hardware Performance using 3D Synthetic Data
Esther Y. H. Lin
Yimin Ding
Jogendra Kundu
Yatong An
Mohamed T. El-Haddad
Alexander Fix
34
0
0
20 Mar 2025
UniK3D: Universal Camera Monocular 3D Estimation
Luigi Piccinelli
Christos Sakaridis
Mattia Segu
Y. Yang
Siyuan Li
Wim Abbeloos
Luc Van Gool
MDE
40
0
0
20 Mar 2025
Human-in-the-Loop Local Corrections of 3D Scene Layouts via Infilling
Christopher Xie
A. Avetisyan
Henry Howard-Jenkins
Yawar Siddiqui
Julian Straub
Richard A. Newcombe
Vasileios Balntas
Jakob Julian Engel
3DH
3DV
59
0
0
14 Mar 2025
Eye on the Target: Eye Tracking Meets Rodent Tracking
Emil Mededovic
Yuli Wu
Henning Konermann
Marcin Kopaczka
Mareike Schulz
René Tolba
Johannes Stegmaier
52
0
0
13 Mar 2025
FAM-HRI: Foundation-Model Assisted Multi-Modal Human-Robot Interaction Combining Gaze and Speech
Yuzhi Lai
Shenghai Yuan
Boya Zhang
Benjamin Kiefer
Peizheng Li
Andreas Zell
36
1
0
11 Mar 2025
DIV-FF: Dynamic Image-Video Feature Fields For Environment Understanding in Egocentric Videos
Lorenzo Mur-Labadia
Josechu Guerrero
Ruben Martinez-Cantin
VGen
56
0
0
11 Mar 2025
VidBot: Learning Generalizable 3D Actions from In-the-Wild 2D Human Videos for Zero-Shot Robotic Manipulation
Hanzhi Chen
Boyang Sun
Anran Zhang
Marc Pollefeys
Stefan Leutenegger
LM&Ro
63
0
0
10 Mar 2025
EgoLife: Towards Egocentric Life Assistant
Jingkang Yang
Shuai Liu
Hongming Guo
Yuhao Dong
X. Zhang
...
Joerg Widmer
Francesco Gringoli
Lei Yang
Bo Li
Z. Liu
EgoV
49
2
0
05 Mar 2025
egoPPG: Heart Rate Estimation from Eye-Tracking Cameras in Egocentric Systems to Benefit Downstream Vision Tasks
Björn Braun
Rayan Armani
Manuel Meier
Max Moebus
Christian Holz
EgoV
33
0
0
28 Feb 2025
LIFT-GS: Cross-Scene Render-Supervised Distillation for 3D Language Grounding
Ang Cao
Sergio Arnaud
Oleksandr Maksymets
Jianing Yang
Ayush Jain
...
Aravind Rajeswaran
Franziska Meier
Justin Johnson
Jeong Joon Park
Alexander Sax
59
0
0
27 Feb 2025
Microphone Array Signal Processing and Deep Learning for Speech Enhancement
Reinhold Haeb-Umbach
Tomohiro Nakatani
Marc Delcroix
Christoph Boeddeker
Tsubasa Ochiai
31
0
0
13 Jan 2025
EgoCast: Forecasting Egocentric Human Pose in the Wild
María Escobar
Juanita Puentes
Cristhian Forigua
Jordi Pont-Tuset
Kevis-Kokitsi Maninis
Pablo Arbeláez
EgoV
68
2
0
03 Dec 2024
Online Episodic Memory Visual Query Localization with Egocentric Streaming Object Memory
Zaira Manigrasso
Matteo Dunnhofer
Antonino Furnari
Moritz Nottebaum
Antonio Finocchiaro
Davide Marana
G. Farinella
C. Micheloni
70
1
0
25 Nov 2024
Human-inspired Perspectives: A Survey on AI Long-term Memory
Zihong He
Weizhe Lin
Hao Zheng
Fan Zhang
Matt Jones
Laurence Aitchison
X. Xu
Miao Liu
Per Ola Kristensson
Junxiao Shen
67
2
0
01 Nov 2024
The USTC-NERCSLIP Systems for the CHiME-8 MMCSG Challenge
Ya Jiang
Hongbo Lan
Jun Du
Qing Wang
Shutong Niu
30
1
0
08 Oct 2024
LiteVLoc: Map-Lite Visual Localization for Image Goal Navigation
Jianhao Jiao
Jinhao He
Changkun Liu
Sebastian Aegidius
Xiangcheng Hu
Tristan Braud
Dimitrios Kanoulas
41
4
0
06 Oct 2024
Estimating Body and Hand Motion in an Ego-sensed World
Brent Yi
Vickie Ye
Maya Zheng
Lea Müller
Georgios Pavlakos
Yi Ma
Jitendra Malik
Angjoo Kanazawa
DiffM
42
6
0
04 Oct 2024
EgoLM: Multi-Modal Language Model of Egocentric Motions
Fangzhou Hong
Vladimir Guzov
Hyo Jin Kim
Yuting Ye
Richard A. Newcombe
Ziwei Liu
Lingni Ma
23
5
0
26 Sep 2024
HMD^2: Environment-aware Motion Generation from Single Egocentric Head-Mounted Device
Vladimir Guzov
Yifeng Jiang
Fangzhou Hong
Gerard Pons-Moll
Richard A. Newcombe
C. Karen Liu
Yuting Ye
Lingni Ma
33
6
0
20 Sep 2024
M-BEST-RQ: A Multi-Channel Speech Foundation Model for Smart Glasses
Yufeng Yang
Desh Raj
Ju Lin
Niko Moritz
J. Jia
...
Egor Lakomkin
Yiteng Huang
Jacob Donley
Jay Mahadeokar
Ozlem Kalinli
13
2
0
17 Sep 2024
Event-based Mosaicing Bundle Adjustment
Shuang Guo
Guillermo Gallego
41
0
0
11 Sep 2024
Audio-Visual Speaker Diarization: Current Databases, Approaches and Challenges
Victoria Mingote
Alfonso Ortega
A. Miguel
Eduardo Lleida
22
0
0
09 Sep 2024
EMHI: A Multimodal Egocentric Human Motion Dataset with HMD and Body-Worn IMUs
Zhen Fan
Peng Dai
Zhuo Su
Xu Gao
Zheng Lv
Jiarui Zhang
Tianyuan Du
Guidong Wang
Yang Zhang
36
3
0
30 Aug 2024
User-in-the-loop Evaluation of Multimodal LLMs for Activity Assistance
Mrinal Verghese
Brian Chen
H. Eghbalzadeh
Tushar Nagarajan
Ruta Desai
LRM
40
1
0
04 Aug 2024
Simultaneous Localization and Affordance Prediction for Tasks in Egocentric Video
Zachary Chavis
Hyun Soo Park
Stephen J. Guy
EgoV
20
0
0
18 Jul 2024
Masked Video and Body-worn IMU Autoencoder for Egocentric Action Recognition
Mingfang Zhang
Yifei Huang
Ruicong Liu
Yoichi Sato
26
4
0
09 Jul 2024
EgoGaussian: Dynamic Scene Understanding from Egocentric Video with 3D Gaussian Splatting
Daiwei Zhang
Gengyan Li
Jiajie Li
Mickael Bressieux
Otmar Hilliges
Marc Pollefeys
Luc Van Gool
Xi Wang
35
9
0
28 Jun 2024
Nymeria: A Massive Collection of Multimodal Egocentric Daily Motion in the Wild
Lingni Ma
Yuting Ye
Fangzhou Hong
Vladimir Guzov
Yifeng Jiang
...
C. Karen Liu
Ziwei Liu
Jakob Engel
R. D. Nardi
Richard A. Newcombe
32
21
0
14 Jun 2024
Introducing HOT3D: An Egocentric Dataset for 3D Hand and Object Tracking
Prithviraj Banerjee
Sindi Shkodrani
Pierre Moulon
Shreyas Hampali
Fan Zhang
...
Selen Basol
Richard A. Newcombe
Robert Y. Wang
Jakob Julian Engel
Tomás Hodan
44
12
0
13 Jun 2024
Multimodal Cross-Domain Few-Shot Learning for Egocentric Action Recognition
Masashi Hatano
Ryo Hachiuma
Ryoske Fujii
Hideo Saito
EgoV
21
4
0
30 May 2024
Implicit gaze research for XR systems
Naveen Sendhilnathan
Ajoy S. Fernandes
Michael J. Proulx
Tanya R. Jonker
21
2
0
22 May 2024
General Place Recognition Survey: Towards Real-World Autonomy
Peng Yin
Jianhao Jiao
Shiqi Zhao
Lingyun Xu
Guoquan Huang
Howie Choset
Sebastian A. Scherer
Jianda Han
42
6
0
08 May 2024
Splat-MOVER: Multi-Stage, Open-Vocabulary Robotic Manipulation via Editable Gaussian Splatting
O. Shorinwa
Johnathan Tucker
Aliyah Smith
Aiden Swann
Timothy Chen
Roya Firoozi
Monroe Kennedy
Mac Schwager
29
20
0
07 May 2024
CrossScore: Towards Multi-View Image Evaluation and Scoring
Zirui Wang
Wenjing Bian
Omkar M. Parkhi
Yuheng Ren
V. Prisacariu
37
1
0
22 Apr 2024
TEXT2TASTE: A Versatile Egocentric Vision System for Intelligent Reading Assistance Using Large Language Model
Wiktor Mucha
Florin Cuconasu
Naome A. Etori
Valia Kalokyri
Giovanni Trappolini
26
4
0
14 Apr 2024
EgoExoLearn: A Dataset for Bridging Asynchronous Ego- and Exo-centric View of Procedural Activities in Real World
Yifei Huang
Guo Chen
Jilan Xu
Mingfang Zhang
Lijin Yang
...
Hongjie Zhang
Lu Dong
Yali Wang
Limin Wang
Yu Qiao
EgoV
52
32
0
24 Mar 2024
SceneScript: Reconstructing Scenes With An Autoregressive Structured Language Model
A. Avetisyan
Christopher Xie
Henry Howard-Jenkins
Tsun-Yi Yang
Samir Aroudj
...
Campbell Orme
Jakob Engel
Edward Miller
Richard A. Newcombe
Vasileios Balntas
LM&Ro
25
17
0
19 Mar 2024
Real-Time Simulated Avatar from Head-Mounted Sensors
Zhengyi Luo
Jinkun Cao
Rawal Khirodkar
Alexander W. Winkler
Jing Huang
Kris M. Kitani
Weipeng Xu
34
7
0
11 Mar 2024
Put Myself in Your Shoes: Lifting the Egocentric Perspective from Exocentric Videos
Mi Luo
Zihui Xue
Alex Dimakis
Kristen Grauman
EgoV
DiffM
32
10
0
11 Mar 2024
Aria Everyday Activities Dataset
Zhaoyang Lv
Nickolas Charron
Pierre Moulon
Alexander Gamino
Cheng Peng
...
Yuyang Zou
Richard A. Newcombe
Jakob Julian Engel
Xiaqing Pan
Carl Ren
21
10
0
20 Feb 2024
GazeGPT: Augmenting Human Capabilities using Gaze-contingent Contextual AI for Smart Eyewear
Robert Konrad
Nitish Padmanaban
J. G. Buckmaster
Kevin C. Boyle
Gordon Wetzstein
25
11
0
30 Jan 2024
AGADIR: Towards Array-Geometry Agnostic Directional Speech Recognition
Ju Lin
Niko Moritz
Yiteng Huang
Ruiming Xie
Ming Sun
Christian Fuegen
Frank Seide
14
4
0
18 Jan 2024