ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1807.00612
16
17

Multi-modal Egocentric Activity Recognition using Audio-Visual Features

2 July 2018
Mehmet Ali Arabaci
Fatih Özkan
Elif Surer
P. Jančovič
A. Temi̇zel
    EgoV
ArXivPDFHTML
Abstract

Egocentric activity recognition in first-person videos has an increasing importance with a variety of applications such as lifelogging, summarization, assisted-living and activity tracking. Existing methods for this task are based on interpretation of various sensor information using pre-determined weights for each feature. In this work, we propose a new framework for egocentric activity recognition problem based on combining audio-visual features with multi-kernel learning (MKL) and multi-kernel boosting (MKBoost). For that purpose, firstly grid optical-flow, virtual-inertia feature, log-covariance, cuboid are extracted from the video. The audio signal is characterized using a "supervector", obtained based on Gaussian mixture modelling of frame-level features, followed by a maximum a-posteriori adaptation. Then, the extracted multi-modal features are adaptively fused by MKL classifiers in which both the feature and kernel selection/weighing and recognition tasks are performed together. The proposed framework was evaluated on a number of egocentric datasets. The results showed that using multi-modal features with MKL outperforms the existing methods.

View on arXiv
Comments on this paper