Preliminary Investigation of Real-Time Object Detection for Safe Robotic Navigation in Rehabilitation Scenarios

Guerra B. M. V.; Sozzi S.; Soldi R.; Russo L.; Schmid M.; Ramat S.
2025-01-01

Abstract

Object detection, a cornerstone of computer vision powered by advancements in Convolutional Neural Networks (CNNs), plays a crucial role in enabling robots to perceive and interact with their surroundings, particularly in complex applications such as rehabilitation robotics. This paper investigates the viability of integrating the real-time-capable YOLOv11 architecture into the TIAGo robot for object detection tasks relevant to rehabilitation settings. Given the limitations of TIAGo's LiDAR, especially its fixed mounting height, which hinders obstacle detection, this study explores whether YOLOv11 applied to RGB data from the robot's onboard camera can compensate for such perceptual gaps. We apply transfer learning to fine-tune the YOLOv11 variants (n, s, m, l, x) on a publicly available dataset of indoor scenes acquired with an RGB-D camera, a sensor frequently found on board assistive robots. Each model is evaluated in terms of detection accuracy (mAP50), inference time, and memory usage to assess its suitability for deployment under the real-time constraint imposed by TIAGo's 30 Hz RGB-D camera, i.e., a budget of roughly 33.3 ms per frame. Considering TIAGo's technical specifications, our results show that YOLOv11-s achieves the highest mAP50 (96.4%) but exceeds this frame budget with an inference time of 34.6 ms, suggesting that optimization would be necessary before robotic deployment. The analysis highlights the trade-offs between accuracy and computational efficiency and supports the feasibility of a future integration of object detection within TIAGo's navigation framework for safe and effective rehabilitation assistance.
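
The workflow summarized in the abstract maps onto the Ultralytics Python API. The sketch below is a minimal illustration under stated assumptions, not the authors' actual training script: the dataset configuration file "indoor_scenes.yaml", the epoch count, image size, and sample frame path are hypothetical placeholders, while the 33.3 ms budget simply follows from the camera's 30 Hz frame rate.

    from ultralytics import YOLO

    # Fine-tune one pretrained YOLOv11 variant via transfer learning.
    # The paper compares all five variants: yolo11n / s / m / l / x.
    model = YOLO("yolo11s.pt")
    model.train(data="indoor_scenes.yaml", epochs=100, imgsz=640)  # hypothetical settings

    # Validation reports mAP50, the accuracy metric used in the paper.
    metrics = model.val()
    print(f"mAP50: {metrics.box.map50:.3f}")

    # Real-time check against TIAGo's 30 Hz RGB-D camera.
    FRAME_BUDGET_MS = 1000.0 / 30.0  # ~33.3 ms available per frame
    results = model("sample_frame.jpg")  # placeholder input image
    inference_ms = results[0].speed["inference"]
    verdict = "within" if inference_ms <= FRAME_BUDGET_MS else "exceeds"
    print(f"inference: {inference_ms:.1f} ms ({verdict} the {FRAME_BUDGET_MS:.1f} ms budget)")

A per-frame inference time above this budget, as reported for YOLOv11-s, means the detector cannot keep pace with the camera stream without optimization, e.g., choosing a smaller variant or applying model compression.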

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11571/1530419