Molecular organization of satellite-less centromeres in the genus Equus

GOZZO, FRANCESCO
2018-01-16

Abstract

The centromere is a fundamental structure required for faithful chromosome segregation during cell division. While its protein component is highly conserved, the underlying DNA component is extremely variable between closely related species and even between different chromosomes of the same organism. This paradox was resolved by a large body of evidence demonstrating that centromeres are epigenetically defined and do not depend on the underlying DNA sequence. Mammalian centromeres are usually embedded in large arrays of satellite DNA, which impedes their detailed molecular and functional analysis. We have demonstrated that the rapid evolution of the species of the genus Equus (horses, asses, zebras) was marked by an exceptionally high frequency of centromere repositioning events. In our laboratory, using this genus as a model system, the complete lack of satellite sequences at the centromere of horse chromosome 11 was demonstrated [Wade et al. 2009], and 16 satellite-less centromeres were recently identified in the donkey [Nergadze et al. 2017, attached at the end of the thesis]. We recently reported that the DNA regions where the satellite-less centromeres of horse and donkey lie are enriched in the H3K9me3 mark [Gamba et al. unpublished]. In this work, using ChIP-seq with an antibody against H3K9me3, we identified many non-centromeric H3K9me3-enriched regions that may act as centromere-seeding points during evolution. The analysis of RNA-seq data from horse and donkey fibroblasts allowed us to conclude that satellite-less centromeres lie in gene desert regions, suggesting that centromeres are functionally and/or structurally incompatible with the presence of coding genes. Taking advantage of the ChIP-seq approach, we identified 11 satellite-less centromeres in the Burchell’s zebra, 9 in the Grevy’s zebra, 10 in the Hartmann’s zebra and 14 in the kiang.
The karyotype of these species is the result of many fusions of ancestral chromosomes [Musilova et al. 2013], and we are particularly interested in the satellite-less centromeres that formed at the site of a centromere-to-centromere fusion of ancestral chromosomes, with loss of satellite sequences at the centromeric locus. Examining the fusion point at the DNA sequence level may provide hints on what happens during the formation of an evolutionarily new centromere. The enrichment profiles of the ChIP-seq reads mapped on the horse reference genome showed different shapes: some had a regular, Gaussian-like shape, while others were irregular, contained gaps or exhibited a narrow, spike-like distribution. Since the irregular shapes suggested the presence of rearrangements relative to the orthologous horse regions, we decided to assemble these regions de novo in order to define the DNA sequence composition and organization of the satellite-less centromeres of these four species. The de novo assembly was accomplished by combining the genome assembler SPAdes, to obtain large contigs, with a chromosome walking approach that used the ChIP-seq reads to join the contigs. We carried out a detailed analysis of the forty-four satellite-less centromere sequences in comparison with the corresponding orthologous horse regions. The percentages of SINEs, LINEs, LTR-derived sequences and DNA transposable elements at the centromeric domains of the Burchell’s zebra, Grevy’s zebra, Hartmann’s zebra and kiang did not differ from those of the orthologous horse sequences. The GC content at these loci was also similar to that of the orthologous horse sequences. Although sequence rearrangements relative to the orthologous horse sequences were observed at several satellite-less centromeres, a precise role of rearrangements in centromere formation and evolution could not be inferred.
Files in this item:
File: TESI DEFINITIVA con allegato.pdf
Access: open access
Description: doctoral thesis
Size: 46.91 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11571/1214834