### Data validity and statistical conformity with Benford's Law

#### Abstract

Benford’s Law is a statistical regularity observed in a large number of datasets. Assessing whether a large dataset complies with Benford’s Law is a highly relevant problem, mainly because of its practical consequences. The task can be addressed by introducing a statistical distance between the empirical distribution of the data and the random variable associated with Benford’s Law. This paper deals with measuring the compliance with Benford’s Law of a random variable, which can be seen as describing the empirical distribution of a collection of data. It proposes a statistical methodology for identifying the critical values associated with conformity or nonconformity with Benford’s Law for some well-established statistical distances. The approach rests on the selection of a family of parametric random variables (the lognormal distribution, in our case) and of a reference statistical distance (the mean absolute deviation). The results are discussed in light of the existing literature, and some open problems are presented.
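
To make the distance concrete, the following is a minimal sketch of a mean absolute deviation between the empirical first-digit frequencies of a sample and the Benford probabilities log10(1 + 1/d), evaluated on simulated lognormal data. It is not the authors' procedure: the restriction to first digits, the function names (`first_digit`, `first_digit_mad`, `lognormal_mad`), the sample size, and the values of `sigma` are all illustrative assumptions.

```python
import math
import random

# Benford's Law for the first significant digit: P(d) = log10(1 + 1/d), d = 1..9.
BENFORD = [math.log10(1 + 1 / d) for d in range(1, 10)]


def first_digit(x):
    """Return the first significant digit of a positive float."""
    # 17 decimals in scientific notation is enough to avoid rounding a
    # double such as 9.9999999 up to the next power of ten.
    return int(f"{x:.17e}"[0])


def first_digit_mad(data):
    """Mean absolute deviation between the empirical first-digit
    frequencies of `data` and the Benford probabilities (assumed
    definition: average of the nine absolute differences)."""
    counts = [0] * 9
    n = 0
    for x in data:
        if x > 0:  # only positive values have a first digit here
            counts[first_digit(x) - 1] += 1
            n += 1
    if n == 0:
        raise ValueError("no positive observations")
    return sum(abs(c / n - p) for c, p in zip(counts, BENFORD)) / 9


def lognormal_mad(sigma, n=20000, seed=0):
    """MAD of a simulated lognormal sample with log-mean 0 and
    log-standard-deviation `sigma` (hypothetical helper)."""
    rng = random.Random(seed)
    sample = (rng.lognormvariate(0.0, sigma) for _ in range(n))
    return first_digit_mad(sample)


if __name__ == "__main__":
    for sigma in (0.1, 0.5, 1.0, 2.0):
        print(f"sigma={sigma}: MAD={lognormal_mad(sigma):.4f}")
```

Under these assumptions, a narrow lognormal (small `sigma`) concentrates its mass on a few first digits and gives a large distance, while a wide one spreads over several orders of magnitude and gives a small distance. Varying the parameter in this way is one means of exploring where a conformity threshold might lie.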
2021


Use this identifier to cite or link to this document: `https://hdl.handle.net/11571/1394596`