Data fusion approach for learning transcriptional Bayesian networks
Sauta E.; Demartini A.; Vitali F.; Bellazzi R.
2017-01-01
Abstract
The complexity of gene expression regulation relies on the synergistic molecular interplay among its principal actors, the transcription factors (TFs). By exerting spatiotemporal control over their target genes, TFs define transcriptional programs across the genome, and these programs are strongly perturbed in disease. To gain a more comprehensive picture of these complex dynamics, a data fusion approach that integrates heterogeneous -omics data is fundamental. Bayesian networks provide a natural framework for integrating different sources of data and knowledge through the use of priors. In this work, we developed a hybrid structure-learning algorithm that exploits TF ChIP-seq and gene expression (GE) data to investigate disease-specific transcriptional regulation from a genome-wide perspective. TF ChIP-seq profiles were first used for structure learning and then integrated into the model as a prior probability. GE panels were employed to learn the model parameters, heuristically searching for the best transcriptional network. We applied our approach to a specific pathological case, chronic myeloid leukemia (CML), a myeloproliferative disorder whose transcriptional mechanisms have not yet been fully elucidated. The proposed data-driven method makes it possible to investigate transcriptional signatures, and the resulting probabilistic network highlights a three-layered hierarchy that reflects the different influence of TFs on cellular gene expression programs.