Computational and Statistical Methods for Biomedical Data Analysis
NAZ, FARAH
2026-05-26
Abstract
This doctoral dissertation investigates computational methodologies for healthcare data analysis spanning clinical, imaging, and academic domains. The research develops machine learning and statistical approaches, including ensemble methods for survival analysis, quantitative MRI analysis for neuromuscular diseases, graph-based clustering algorithms for complex data structures, and analytical frameworks for assessing pandemic impacts on educational systems. The dissertation encompasses three primary research domains: clinical prediction modeling, medical image analysis, and healthcare impact assessment.

The clinical prediction modeling domain focuses on ensemble methods for survival analysis, particularly the challenges posed by highly correlated covariates in healthcare datasets. This research introduces the "CovBootTree" method, which extends the Proper Bayesian Bootstrap approach to survival data by incorporating multivariate distribution structures through Cholesky decomposition. The proposed framework generates synthetic observations that preserve the covariance structure of clinical variables, enabling more robust survival predictions while accounting for the inter-dependencies commonly found in medical data. In simulation studies across varying sample sizes, the method shows greater predictive stability than traditional survival models such as Cox regression and Random Survival Forests. The technique proved particularly effective with small datasets, handling complex relationships between correlated health variables while providing more reliable predictions for clinical decision-making.

The second project focuses on unsupervised machine learning for identifying complex data structures and recognizing patterns in diverse analytical contexts. This research introduces the "Cli-DSP" algorithm, a novel graph-based clustering methodology that integrates min-max clique detection with density peak assignment through optimized shortest path analysis. The proposed approach addresses fundamental limitations of traditional clustering methods by capturing local connectivity patterns in datasets characterized by irregular cluster shapes, overlapping regions, and varying density distributions. The methodology demonstrates superior performance on both synthetic benchmarks and real-world applications, including biomedical data, establishing its effectiveness for complex clustering problems across healthcare research domains.

The medical image analysis domain explores the application of advanced neuroimaging techniques. The study investigates magnetization transfer ratio (MTR) as an early biomarker for muscle involvement in patients with late-onset Pompe disease (LOPD) at various disease stages, compared to healthy controls. We employed quantitative analysis with Multi-echo Spin-echo (MESE) T2-weighted imaging, Multi-echo Gradient echo sequences for fat fraction (FF) assessment, and Multi-Parametric Mapping for MTR quantification. We found significant differences in MTR and FF between LOPD patients (mild and moderate/severe) and healthy controls. Magnetization transfer imaging (MTI) demonstrated high sensitivity in detecting mild muscle fiber damage before fat replacement occurs, making it a promising biomarker for monitoring early disease signs, progression, and treatment efficacy.

Under the healthcare impact assessment domain, we examine the broader implications of healthcare-related events on academic and educational systems through comprehensive data analysis methodologies. This research specifically investigates the effects of the COVID-19 pandemic on university education, using mixed-effects modeling approaches to understand learning patterns and variations in academic performance.
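
As an illustration of the covariance-preserving idea mentioned for the clinical prediction domain, the following is a minimal sketch of how a Cholesky factor can be used to draw synthetic observations with a prescribed covariance. It is not the CovBootTree method itself: it assumes Gaussian draws, and the function names (`cholesky`, `sample_correlated`, `sample_covariance`) and the example mean and covariance values are hypothetical, chosen only for this sketch.

```python
import math
import random


def cholesky(a):
    """Return lower-triangular L with L * L^T == a.

    `a` must be a symmetric positive-definite matrix (list of lists).
    """
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                # math.sqrt raises ValueError if the matrix is not positive definite
                L[i][j] = math.sqrt(a[i][i] - s)
            else:
                L[i][j] = (a[i][j] - s) / L[j][j]
    return L


def sample_correlated(mean, cov, n_samples, seed=0):
    """Draw synthetic rows x = mean + L z, where z is standard normal.

    Because Cov(L z) = L L^T = cov, the rows reproduce the target covariance.
    """
    rng = random.Random(seed)
    L = cholesky(cov)
    d = len(mean)
    rows = []
    for _ in range(n_samples):
        z = [rng.gauss(0.0, 1.0) for _ in range(d)]
        rows.append([mean[i] + sum(L[i][k] * z[k] for k in range(i + 1))
                     for i in range(d)])
    return rows


def sample_covariance(rows):
    """Unbiased sample covariance matrix of a list of rows."""
    n, d = len(rows), len(rows[0])
    m = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[sum((r[i] - m[i]) * (r[j] - m[j]) for r in rows) / (n - 1)
             for j in range(d)] for i in range(d)]


# Hypothetical example: two correlated "clinical" variables.
target_cov = [[4.0, 2.0], [2.0, 3.0]]
synthetic = sample_correlated([120.0, 80.0], target_cov, n_samples=20000)
estimated_cov = sample_covariance(synthetic)
```

Comparing `estimated_cov` with `target_cov` gives a quick check that the synthetic rows carry the intended correlation structure; a bootstrap-style procedure would draw from a distribution estimated from the observed data rather than from fixed values like these.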
The work employs longitudinal data analysis techniques to assess how major healthcare crises influence educational outcomes.

| File | Size | Format |
|---|---|---|
| THESIS.pdf (open access). Description: Computational and Statistical Methods for Biomedical Data Analysis. Type: Doctoral thesis | 6.14 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.


