Kimimila: A new model to classify NGS short reads by their allele origin

MARINONI, ANDREA;RIZZO, ETTORE;GAMBA, PAOLO ETTORE;BELLAZZI, RICCARDO;LIMONGELLI, IVAN
2014-01-01

Abstract

Next-generation sequencing (NGS) technologies, often referred to as massively parallel sequencing, are having a huge impact on genomics and clinical applications. These technologies generate billions of short sequences (reads) that are subsequently mapped to the corresponding reference genome to identify known and/or novel genomic variants potentially correlated with the patient's phenotype. The DNA fragment library is usually derived from a diploid genome: we refer to genotyping on NGS data as the analytical process of assigning the zygosity of identified variants. Current algorithms typically rely on data from the single genomic locus where a variant has been called and are based on the assumption of independence between variant locus and reads. These strong assumptions may lead to inaccuracies in the genotyping process. We have therefore developed an efficient assumption-free algorithm, Kimimila, based on a kinetic model approach and distance geometry, that assigns each read to its allele of origin using the inference provided by the measured differences (i.e. variants) among overlapping reads.
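The general idea of using differences among overlapping reads to separate them by allele can be illustrated with a small sketch. This is not the Kimimila algorithm: the abstract does not specify its kinetic model or its distance-geometry step, so the code below substitutes a much simpler technique (propagating agreement labels across an overlap graph). The read representation (a dict from variant position to observed base), the function names `read_distance` and `split_by_allele`, and the 0.5 threshold are all assumptions made for illustration.

```python
from collections import deque


def read_distance(a, b):
    """Fraction of shared variant sites where two reads disagree.

    Returns None when the reads share no variant site (no overlap).
    Hypothetical helper, not taken from the paper.
    """
    shared = a.keys() & b.keys()
    if not shared:
        return None
    return sum(a[p] != b[p] for p in shared) / len(shared)


def split_by_allele(reads):
    """Label each read 0 or 1 by propagating agreement over overlapping reads.

    A simple breadth-first stand-in for the paper's method: a read gets the
    same label as an overlapping, already-labelled read if they mostly agree
    (distance < 0.5, an assumed threshold), and the opposite label otherwise.
    Labels are arbitrary within each connected group of overlapping reads.
    """
    labels = [None] * len(reads)
    for start in range(len(reads)):
        if labels[start] is not None:
            continue
        labels[start] = 0
        queue = deque([start])
        while queue:
            i = queue.popleft()
            for j in range(len(reads)):
                if labels[j] is not None:
                    continue
                d = read_distance(reads[i], reads[j])
                if d is None:
                    continue
                labels[j] = labels[i] if d < 0.5 else 1 - labels[i]
                queue.append(j)
    return labels


# Toy data: four reads covering heterozygous sites at positions 100, 150, 210.
reads = [
    {100: "A", 150: "G"},
    {150: "G", 210: "T"},
    {100: "C", 150: "T"},
    {150: "T", 210: "C"},
]
labels = split_by_allele(reads)
print(labels)
```

In this toy input the first two reads agree at position 150 and the last two disagree with them there, so the reads fall into two groups. Real data would also need handling for sequencing errors and conflicting evidence, which this sketch ignores.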
2014
9781479957019

Use this identifier to cite or link to this document: https://hdl.handle.net/11571/1127098