Monitoring and Quantification of Activity in Frail Individuals: A Proposed Approach
Soldi R.; Guerra B. M. V.; Sozzi S.; Russo L.; Schmid M.; Ramat S.
2025-01-01
Abstract
Ambient Assisted Living is crucial for improving independence, safety and quality of life in frail individuals. Human Activity Recognition (HAR) can be a game-changing tool for monitoring daily life habits and the progress of the rehabilitation that frail individuals undergo. Here we investigated whether deep learning (DL) networks are suitable for identifying and quantifying the activities that a person performs. To acquire a dedicated dataset, we used two Azure Kinect cameras, which enable 3D body tracking while preserving the privacy of the person recorded. Twenty healthy subjects took part in the acquisition, performing a mixed set of selected tasks that included daily living activities and rehabilitation-related tasks. The final dataset comprised 299 acquisitions of skeletal data sampled at 30 Hz. A second version of the dataset was obtained by subsampling the data to 10 Hz, allowing more time to process each frame. We then implemented three different DL networks for HAR: a Bidirectional Long Short-Term Memory (BiLSTM) network, a Temporal Convolutional Network (TCN) and an attention-based network. Finally, we proposed an algorithm that processes the networks' output to quantify the identified activities in terms of the number of repetitions and the time spent performing them. The best network achieved 94% accuracy in HAR and 81% in repetition counting.
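
The abstract does not describe the quantification algorithm itself, so the following is only an illustrative sketch of one way per-frame activity labels could be turned into repetition counts and durations. It is not the authors' method. The function names `subsample` and `summarize_activities`, the `min_frames` noise threshold, and the rule that one contiguous run of a label counts as one repetition are all assumptions made for this example.

```python
from itertools import groupby


def subsample(frames, factor=3):
    """Keep every `factor`-th frame (e.g. 30 Hz -> 10 Hz when factor is 3)."""
    return frames[::factor]


def summarize_activities(frame_labels, fps=10.0, min_frames=3):
    """Return {label: (repetitions, seconds)} from a sequence of per-frame labels.

    Assumption: a contiguous run of the same label is one repetition, and
    runs shorter than `min_frames` are treated as misclassifications.
    """
    # Pass 1: drop very short runs so they do not split a longer repetition.
    kept = []
    for _, run in groupby(frame_labels):
        run = list(run)
        if len(run) >= min_frames:
            kept.extend(run)

    # Pass 2: count each remaining contiguous run and accumulate its duration.
    summary = {}
    for label, run in groupby(kept):
        n_frames = len(list(run))
        reps, seconds = summary.get(label, (0, 0.0))
        summary[label] = (reps + 1, seconds + n_frames / fps)
    return summary
```

With labels predicted at 10 Hz, `summarize_activities(labels, fps=10.0)` would report, for each activity, how many separate bouts were detected and the total time spent in them. A real system would likely need an activity-specific definition of a repetition rather than this simple run-based rule.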


