ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2406.13807
  4. Cited By
AlanaVLM: A Multimodal Embodied AI Foundation Model for Egocentric Video
  Understanding

AlanaVLM: A Multimodal Embodied AI Foundation Model for Egocentric Video Understanding

19 June 2024
Alessandro Suglia
Claudio Greco
Katie Baker
Jose L. Part
Ioannis Papaioannou
Arash Eshghi
Ioannis Konstas
Oliver Lemon
ArXivPDFHTML

Papers citing "AlanaVLM: A Multimodal Embodied AI Foundation Model for Egocentric Video Understanding"

7 / 7 papers shown
Title
Ego4o: Egocentric Human Motion Capture and Understanding from Multi-Modal Input
Ego4o: Egocentric Human Motion Capture and Understanding from Multi-Modal Input
Jian Wang
Rishabh Dabral
D. Luvizon
Zhe Cao
Lingjie Liu
Thabo Beeler
Christian Theobalt
EgoV
45
0
0
11 Apr 2025
ProbRes: Probabilistic Jump Diffusion for Open-World Egocentric Activity Recognition
ProbRes: Probabilistic Jump Diffusion for Open-World Egocentric Activity Recognition
Sanjoy Kundu
Shanmukha Vellamchetti
Sathyanarayanan N. Aakur
EgoV
50
0
0
04 Apr 2025
ANNEXE: Unified Analyzing, Answering, and Pixel Grounding for Egocentric Interaction
ANNEXE: Unified Analyzing, Answering, and Pixel Grounding for Egocentric Interaction
Yuejiao Su
Yi Wang
Qiongyang Hu
Chuang Yang
Lap-Pui Chau
45
0
0
02 Apr 2025
UrbanVideo-Bench: Benchmarking Vision-Language Models on Embodied Intelligence with Video Data in Urban Spaces
Baining Zhao
Jianjie Fang
Zichao Dai
Z. Wang
Jirong Zha
...
Chen Gao
Y. Wang
Jinqiang Cui
Xinlei Chen
Y. Li
51
2
0
08 Mar 2025
RoboSpatial: Teaching Spatial Understanding to 2D and 3D Vision-Language Models for Robotics
RoboSpatial: Teaching Spatial Understanding to 2D and 3D Vision-Language Models for Robotics
Chan Hee Song
Valts Blukis
Jonathan Tremblay
Stephen Tyree
Yu-Chuan Su
Stan Birchfield
83
5
0
25 Nov 2024
Ego4D: Around the World in 3,000 Hours of Egocentric Video
Ego4D: Around the World in 3,000 Hours of Egocentric Video
Kristen Grauman
Andrew Westbury
Eugene Byrne
Zachary Chavis
Antonino Furnari
...
Mike Zheng Shou
Antonio Torralba
Lorenzo Torresani
Mingfei Yan
Jitendra Malik
EgoV
224
1,017
0
13 Oct 2021
Visually Grounded Reasoning across Languages and Cultures
Visually Grounded Reasoning across Languages and Cultures
Fangyu Liu
Emanuele Bugliarello
E. Ponti
Siva Reddy
Nigel Collier
Desmond Elliott
VLM
LRM
92
167
0
28 Sep 2021
1