Freedman–Diaconis rule


In statistics, the Freedman–Diaconis rule can be used to select the size of the bins to be used in a histogram.[1] It is named after David A. Freedman and Persi Diaconis.

For a set of empirical measurements sampled from some probability distribution, the Freedman–Diaconis rule is designed to approximately minimize the integral of the squared difference between the histogram (i.e., the empirical density estimate) and the density of the theoretical probability distribution.

The general equation for the rule is:

$$\text{Bin width} = 2\,\frac{\operatorname{IQR}(x)}{\sqrt[3]{n}},$$

where $\operatorname{IQR}(x)$ is the interquartile range of the data and $n$ is the number of observations in the sample $x$.
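
As an illustration, the following Python sketch (assuming NumPy is available; the helper name fd_bin_width is ours, not a library function) computes the Freedman–Diaconis bin width and the implied number of bins for a sample:

    import numpy as np

    def fd_bin_width(x):
        """Freedman-Diaconis bin width: 2 * IQR(x) / n**(1/3)."""
        x = np.asarray(x)
        iqr = np.subtract(*np.percentile(x, [75, 25]))  # interquartile range
        return 2.0 * iqr / len(x) ** (1.0 / 3.0)

    # Example: draw a sample and derive a bin count from the bin width.
    rng = np.random.default_rng(0)
    x = rng.normal(size=1000)
    h = fd_bin_width(x)
    n_bins = int(np.ceil((x.max() - x.min()) / h))
    print(h, n_bins)

NumPy also exposes the rule directly via np.histogram(x, bins="fd") and np.histogram_bin_edges(x, bins="fd").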

Other approaches

Another approach is to use Sturges' rule: choose the bin width so that there are about $1 + \log_2 n$ non-empty bins (Scott, 2009).[2] This works well for n under 200, but was found to be inaccurate for large n.[3] For a discussion and an alternative approach, see Birgé and Rozenholc.[4]
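
For comparison, a minimal sketch of Sturges' bin count, $1 + \log_2 n$ rounded up, next to the Freedman–Diaconis count for the same samples (this reuses the hypothetical fd_bin_width helper from the sketch above):

    import numpy as np

    def sturges_bins(n):
        """Sturges' rule: about 1 + log2(n) bins."""
        return int(np.ceil(np.log2(n)) + 1)

    rng = np.random.default_rng(0)
    for n in (50, 200, 10_000):
        x = rng.normal(size=n)
        fd_bins = int(np.ceil((x.max() - x.min()) / fd_bin_width(x)))
        # Sturges' count grows only logarithmically in n, so it falls
        # behind the Freedman-Diaconis count as the sample gets large.
        print(n, sturges_bins(n), fd_bins)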

References

  1. ^ Freedman, David; Diaconis, Persi (December 1981). "On the histogram as a density estimator: L2 theory" (PDF). Probability Theory and Related Fields. Heidelberg: Springer Berlin. 57 (4): 453–476. doi:10.1007/BF01025868. ISSN 0178-8051. Retrieved 2009-01-06. 
  2. ^ Scott, D.W. (2009). "Sturges' rule". WIREs Computational Statistics. 1: 303–306. doi:10.1002/wics.35. 
  3. ^ Hyndman, R.J. (1995). "The problem with Sturges' rule for constructing histograms" (PDF). 
  4. ^ Birgé, L.; Rozenholc, Y. (2006). "How many bins should be put in a regular histogram". ESAIM: Probability and Statistics. 10: 24–45. doi:10.1051/ps:2006001.