DML-RAM: Deep Multimodal Learning Framework for Robotic Arm Manipulation using Pre-trained Models

4 April 2025
Sathish Kumar
Swaroop Damodaran
Naveen Kumar Kuruba
Sumit Jha
Arvind Ramanathan
Abstract

This paper presents a novel deep learning framework for robotic arm manipulation that integrates multimodal inputs using a late-fusion strategy. Unlike traditional end-to-end or reinforcement learning approaches, our method processes image sequences with pre-trained models and robot state data with machine learning algorithms, fusing their outputs to predict continuous action values for control. Evaluated on BridgeData V2 and Kuka datasets, the best configuration (VGG16 + Random Forest) achieved MSEs of 0.0021 and 0.0028, respectively, demonstrating strong predictive performance and robustness. The framework supports modularity, interpretability, and real-time decision-making, aligning with the goals of adaptive, human-in-the-loop cyber-physical systems.

@article{kumar2025_2504.03423,
  title={DML-RAM: Deep Multimodal Learning Framework for Robotic Arm Manipulation using Pre-trained Models},
  author={Sathish Kumar and Swaroop Damodaran and Naveen Kumar Kuruba and Sumit Jha and Arvind Ramanathan},
  journal={arXiv preprint arXiv:2504.03423},
  year={2025}
}