ABOUT THE GENE FAMILIES SIZE DISTRIBUTION IN A RECENT MODEL OF GENOME EVOLUTION
GABETTA, ESTER;REGAZZINI, EUGENIO
2010-01-01
Abstract
We analyze a class of probability distributions for family sizes in a duplication, loss and change (DLC) model of genome evolution, recently introduced by Tiuryn, Wójtowicz and Rudnicki. After providing expressions for the generating functions of the density p and of the right-tail Q of the above distributions, we obtain closed forms for p and Q in terms of Gauss hypergeometric functions. Then, by resorting to the literature about special functions and their approximations, we provide an asymptotic expression for Q, which depends on parameters connected with the strengths of duplication and change. This shows that the DLC model yields a rich statistical model for the size distribution, whose elements are characterized by a composition of a power component with a negative exponential one. We also study the limiting distributions, as the parameters are made arbitrarily close to points of the boundary of their natural domain. In addition to the geometric distribution and to the unit mass at 1, the limiting class contains the distributions with the "longest" tails in the DLC model. A characterization of these probability laws is given.