most bell curves have thick tails
Jul. 8th, 2006 01:47 pm
"Most bell curves have thick tails", by Bart Kosko (via MR) is a must-read. Main points:
* a great deal of science and engineering assumes a normal (Gaussian) distribution far too quickly.
* there are many bell curves. The Gaussian is rather thin-tailed, when compared with real-world distributions. It is the thinnest-tailed in the family of stable distributions.
* I quote:
the classical central limit theorem [link mine] result rests on a critical assumption that need not hold and that often does not hold in practice. The theorem assumes that the random dispersion about the mean is so comparatively slight that a particular measure of this dispersion — the variance or the standard deviation — is finite or does not blow up to infinity in a mathematical sense. Most bell curves have infinite or undefined variance even though they have a finite dispersion about their center point. The error is not in the bell curves but in the two-hundred-year-old assumption that variance equals dispersion. It does not in general.
* Standard deviation as a measure of dispersion is a dogma. Squaring means you weigh outliers too heavily.
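To see the variance-versus-dispersion point in action, here's a quick simulation of my own (not from Kosko's article; it assumes numpy). The standard Cauchy is a stable distribution with undefined variance but a perfectly finite spread about its center:

    import numpy as np

    # Standard Cauchy: a stable law whose variance is undefined but whose
    # spread about the center (e.g. the interquartile range) is finite.
    rng = np.random.default_rng(0)
    for n in (10**3, 10**4, 10**5, 10**6):
        x = rng.standard_cauchy(n)
        iqr = np.percentile(x, 75) - np.percentile(x, 25)
        print(f"n={n:>8}  sample variance={x.var():>14.1f}  IQR={iqr:.3f}")

The variance column jumps around by orders of magnitude as n grows, while the IQR column settles near 2, the true interquartile range of the standard Cauchy. That is "variance does not equal dispersion" in a few lines.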
(no subject)
Date: 2006-07-08 07:53 pm (UTC)
Yes, great topic!
NN Taleb (link) is fantastic on this topic. It's maybe the strongest theme in his work.
(no subject)
Date: 2006-07-08 08:19 pm (UTC)
Why am I supposed to believe that "most" unimodal distributions have thick tails, and why is the assumption of finite variance for the iid variables going into the central-limit-theorem sum so unreasonable for the zillions of real-world applications where the variance really is finite?
(no subject)
Date: 2006-07-08 09:21 pm (UTC)
What kind of evidence could possibly convince you?
(no subject)
Date: 2006-07-08 09:29 pm (UTC)
I think my objection is that the use of the word "most" implicitly involves Kosko's notion of which distributions count for more. So I would claim that the burden is really on the other side of the argument from me, that Kosko should make his assertion more clear before I'm expected to believe it.
Though actually, is there even one nice example you know of where people used to approximate things with a Gaussian where it's (a) clearly a pretty bad approximation for infinite-variance reasons like he describes and (b) there's a better approximation that is still feasible to work with?
It sounds like Kosko has a bunch of such examples in mind, and thinks that they are widespread, but for space reasons hasn't listed any in particular; I might still quibble after they were given about how centrally important they are, but really I am curious what these cases look like.
As a converse example where Gaussian distributions are totally appropriate, I could point to something basic like the number of heads in N fair coin flips.
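For what it's worth, that case checks out numerically; here's a sketch (assuming numpy and scipy are handy) comparing the standardized head count against the standard normal, per de Moivre-Laplace:

    import numpy as np
    from scipy import stats

    # Standardized number of heads in N fair flips vs. the standard normal.
    rng = np.random.default_rng(1)
    N, trials = 1000, 200_000
    heads = rng.binomial(N, 0.5, size=trials)
    z = (heads - N * 0.5) / np.sqrt(N * 0.25)  # mean N/2, variance N/4
    for q in (-2.0, -1.0, 0.0, 1.0, 2.0):
        print(f"P(Z <= {q:+.0f}): empirical {np.mean(z <= q):.4f}, "
              f"normal {stats.norm.cdf(q):.4f}")

The empirical and normal columns agree to a couple of decimal places, which is all the classical CLT promises: finite variance holds here, so the Gaussian approximation is fine.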
(no subject)
Date: 2006-07-08 09:49 pm (UTC)
I personally don't mind numerical methods, but there are people out there who think that everything should be nice and analytic.
(no subject)
Date: 2006-07-08 09:35 pm (UTC)
One reason why the (special) central limit theorem is nice is that it easily explains why we might find Gaussian distributions all around us; if at the bottom we have a lot of extremely simple processes (let's say they have not only finite variance, but finitely many different outcomes) and we sum them up, we'll get a Gaussian.
Stability means exactly that: if we sum up a bunch of little processes that each have a stable distribution, the sum has that same distribution too (up to rescaling and recentering). But I wonder what kind of basic processes (or what ways of aggregating other than summation) produce these Lévy distributions in the first place?
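Here is what stability looks like in a simulation (a sketch of my own, assuming numpy, with the Cauchy standing in for the general Lévy case): the average of n Cauchy draws is distributed exactly like a single draw, whereas the average of n finite-variance draws concentrates like 1/sqrt(n).

    import numpy as np

    rng = np.random.default_rng(2)
    n, trials = 1000, 10_000

    def iqr(a):
        return np.percentile(a, 75) - np.percentile(a, 25)

    # Spread of the mean of n draws, measured over many independent trials.
    cauchy_means = rng.standard_cauchy((trials, n)).mean(axis=1)
    uniform_means = rng.uniform(-1, 1, (trials, n)).mean(axis=1)
    print("IQR of Cauchy sample means: ", iqr(cauchy_means))   # stays ~2
    print("IQR of uniform sample means:", iqr(uniform_means))  # ~0.025

No matter how many Cauchy draws you average, the result is as spread out as a single draw; that is the stability property, and it is exactly where the ordinary averaging intuition breaks down.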
(no subject)
Date: 2006-07-10 12:06 am (UTC)
What is this "dispersion" of which the author speaks? Is it rigorously defined somewhere? He doesn't write as though it is.
I am kind of ignorant of statistics. Naively, one might use the mean absolute-value deviation instead of the root-mean-square. Why, uh, doesn't one?
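For what it's worth, here's a numerical stab at that question (a sketch assuming scipy's levy_stable; the alpha = 1.5 stable law is my choice of example, since it has a finite mean absolute deviation but infinite variance):

    import numpy as np
    from scipy.stats import levy_stable

    # Symmetric stable law with alpha = 1.5: finite mean absolute
    # deviation, infinite variance.
    rng = np.random.default_rng(3)
    for n in (10**3, 10**4, 10**5):
        x = levy_stable.rvs(alpha=1.5, beta=0.0, size=n, random_state=rng)
        mad = np.mean(np.abs(x - np.median(x)))  # mean absolute deviation
        rms = np.std(x)                          # root-mean-square deviation
        print(f"n={n:>7}  mean abs dev={mad:8.3f}  rms dev={rms:10.3f}")

The absolute-value estimator settles down as n grows while the root-mean-square one keeps drifting upward, so the naive idea does behave better on this kind of heavy-tailed data. (For alpha <= 1, e.g. the Cauchy itself, even the absolute deviation diverges, and quantile-based measures are the safer fallback.)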
(no subject)
Date: 2006-07-11 01:12 am (UTC)
See
http://en.wikipedia.org/wiki/Statistical_dispersion
http://www.quickmba.com/stats/dispersion/
(no subject)
Date: 2006-07-11 02:21 am (UTC)
Measures of dispersion would seem to correspond to families of metrics: you have n samples, so you subtract the mean from each one, treat that as a point in R^n, and consider its distance from the origin. Each family of metrics is scaled so that adding an identical sample doesn't change the "distance", so the uncorrected standard deviation is just the scaled Euclidean metric.
Each metric then presumably yields a central limit theorem. So to me, the interesting question to take from this is: is there a single "natural" metric that is better than the Euclidean (which, of course, was the first that came to mind), or is the metric different for each problem, and if so, how do you find it? Which I suppose is more or less what the author of the original paper was asking, but I wish they'd asked it better.
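Concretely, the family I have in mind would look something like this (a sketch assuming numpy; the n**(1/p) scaling is my reading of the "adding an identical sample" condition):

    import numpy as np

    def dispersion(x, p):
        # L^p norm of the deviations from the mean, scaled by n**(1/p) so
        # that duplicating the whole sample leaves the "distance" unchanged.
        d = np.abs(x - x.mean())
        return np.mean(d ** p) ** (1.0 / p)

    rng = np.random.default_rng(4)
    x = rng.normal(size=100_000)
    for p in (1.0, 2.0, 4.0):
        print(f"p={p}: {dispersion(x, p):.4f}")
    print(f"numpy std: {x.std():.4f}")  # matches the p = 2 row

Here p = 2 reproduces the uncorrected standard deviation exactly, and p = 1 gives the mean absolute deviation asked about above.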