Preliminary Investigation of Real-Time Object Detection for Safe Robotic Navigation in Rehabilitation Scenarios
Guerra B. M. V.; Sozzi S.; Soldi R.; Russo L.; Schmid M.; Ramat S.
2025-01-01
Abstract
Object detection, a cornerstone of computer vision powered by advancements in Convolutional Neural Networks (CNNs), plays a crucial role in enabling robots to perceive and interact with their surroundings, particularly in complex applications such as rehabilitation robotics. This paper investigates the viability of integrating the real-time-capable YOLOv11 architecture into the TIAGo robot for object detection tasks relevant to rehabilitation settings. Given the limitations of TIAGo's LiDAR, especially its fixed mounting height, which hinders the detection of many obstacles, this study explores whether YOLOv11 applied to RGB data from the robot's onboard camera can compensate for such perceptual gaps. We apply transfer learning to fine-tune the YOLOv11 variants (n, s, m, l, x) using a publicly available dataset of indoor scenes acquired with an RGB-D camera, a sensor frequently found on board assistive robots. Each model is evaluated in terms of detection accuracy (mAP50), inference time, and memory usage to assess its suitability for deployment under the real-time constraints imposed by TIAGo's 30 Hz RGB-D camera. Considering the technical specifications of TIAGo, our results show that YOLOv11-s achieves the highest mAP50 (96.4%) but, with an inference time of 34.6 ms, exceeds the 33.3 ms frame period of the 30 Hz camera, suggesting that optimization would be necessary before on-robot deployment. The analysis highlights the trade-offs between accuracy and computational efficiency and supports the feasibility of future integration of object detection within TIAGo's navigation framework for safe and effective rehabilitation assistance.
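
As an illustration of the fine-tuning and benchmarking workflow the abstract describes, the following is a minimal sketch using the Ultralytics Python API (the reference implementation of YOLOv11). The dataset configuration file, epoch count, image size, and sample frame are illustrative assumptions, not values reported in the paper.

from ultralytics import YOLO

# Fine-tune each YOLOv11 variant via transfer learning from pretrained
# weights. "indoor_scenes.yaml" is a hypothetical dataset config pointing
# at the indoor RGB-D image dataset; epochs and imgsz are assumed values.
for variant in ("n", "s", "m", "l", "x"):
    model = YOLO(f"yolo11{variant}.pt")  # load pretrained weights
    model.train(data="indoor_scenes.yaml", epochs=100, imgsz=640)

    # Detection accuracy on the validation split (mAP at IoU 0.50).
    metrics = model.val()
    map50 = metrics.box.map50

    # Per-frame inference latency in milliseconds; for real-time use it
    # must stay below the 33.3 ms period of TIAGo's 30 Hz RGB-D camera.
    results = model.predict("sample_frame.jpg")
    latency_ms = results[0].speed["inference"]

    print(f"yolo11{variant}: mAP50={map50:.3f}, inference={latency_ms:.1f} ms")

Memory usage, the third criterion in the evaluation, would be measured separately (e.g., from the size of the saved weights and the GPU memory footprint during inference), since the Ultralytics validation metrics do not report it directly.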


