block by tophtucker 33da030200d475fd5af97b44f205da3e

Collapsing high-dimensional normal distributions

My friend Colin has been trying to give me a better intuition for higher dimensions. His first fun fact was that the volume of an n-dimensional hypersphere of radius 1 peaks at n=5 and then approaches 0 as n approaches infinity. Damn! Meanwhile, the volume of the hypercube that just encloses it (side length 2) is 2^n, which diverges to infinity, as you’d expect. So if you inscribe a hypersphere in a hypercube, as dimension increases, more and more of the volume is in “the corners”: ultimately, almost all of it.
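To put numbers on that, here is a small sketch using the standard closed form for the volume of a radius-1 ball in n dimensions, pi^(n/2) / Gamma(n/2 + 1). The function names `unit_ball_volume` and `corner_fraction` are mine, made up for illustration; the cube is assumed to be the side-2 one that just encloses the ball.

```python
import math

def unit_ball_volume(n: int) -> float:
    """Volume of the n-dimensional ball of radius 1: pi^(n/2) / Gamma(n/2 + 1)."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1)

def corner_fraction(n: int) -> float:
    """Share of the side-2 hypercube's volume that lies outside the inscribed unit ball."""
    return 1 - unit_ball_volume(n) / 2 ** n

# Tabulate the first 20 dimensions and find where the ball's volume is largest.
volumes = {n: unit_ball_volume(n) for n in range(1, 21)}
peak_dimension = max(volumes, key=volumes.get)

for n in (1, 2, 3, 5, 10, 20):
    print(f"n={n:2d}  ball volume={volumes[n]:.4f}  corner share={corner_fraction(n):.6f}")
print("ball volume peaks at n =", peak_dimension)
```

For a radius-1 ball the peak lands at n = 5. The location of the peak depends on the radius you pick, so it is a fact about the unit ball specifically.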

That’s not what I’m trying to show here, it’s just cool. This is somewhat different.

Colin also pointed out that, in a high-dimensional multivariate normal distribution (with identity covariance matrix), almost all the mass ends up in a sort of donut (really a thin shell) at a distance of roughly √n from the middle. There’s very little mass in the middle. Of course the origin is still the mean and the mode. The problem is that the middle is just so dang small, and there’s SO MUCH SPACE as you go a little further out. So if you’re just looking at the distribution of distances from the middle, most are a ways out, and more so the more dimensions you’re in.
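A quick way to see the donut is to sample points and look only at their distances from the origin. This is a minimal sketch, not the code behind the visualization: `sample_radii` is a hypothetical helper, and the sample size and seed are arbitrary choices.

```python
import math
import random

def sample_radii(n_dims, n_points=2000, seed=0):
    """Draw points from a standard normal in n_dims dimensions; return their distances from the origin."""
    rng = random.Random(seed)
    return [
        math.sqrt(sum(rng.gauss(0, 1) ** 2 for _ in range(n_dims)))
        for _ in range(n_points)
    ]

for n_dims in (1, 2, 3, 10, 100):
    radii = sample_radii(n_dims)
    mean_radius = sum(radii) / len(radii)
    # Fraction of points that land within distance 1 of the origin.
    near_origin = sum(r < 1 for r in radii) / len(radii)
    print(f"{n_dims:3d}D  mean radius={mean_radius:6.3f}  sqrt(n)={math.sqrt(n_dims):6.3f}  "
          f"share within 1 of origin={near_origin:.3f}")
```

The mean radius should track √n more and more closely as the dimension grows, while the share of points near the origin drops toward zero.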

This is a lot like the old joke that the average family has 2.2 kids, but no single family has 2.2 kids, God willing. Now consider lots of other attributes (dimensions) too: their ages, genders, heights, locations, professions, pets, hobbies, politics, faiths, vices, material possessions, favorite books, dreams, crimes, secrets, loves, etc. Of course people are mostly normal. But almost no person is normal.

Unlike the incredible shrinking hypersphere, this is evident even in low human-scale dimensions, like 2 and 3.


  1. Mouse around to admire the lil parallax that shows you that the third blob of points is 3D.

  2. Drag “3D” to “2D” to flatten the third dimension, while keeping every point the same distance from the origin, i.e. the same radius. Imagine every point is on a fixed arm that can rotate around the origin, but can’t extend or shrink. (That’d be a good thing to visualize if I had more time!)

  3. Drag “2D” to “1D” to flatten the second dimension, again keeping every point the same distance from the origin. Like sweeping up every point on its little invisible fixed arm and collecting them so they’re all facing the same way. (This also sweeps up the 1D points with a negative radius around to the positive side.)

When you let go on 1D, the points will “relax” a bit to show you their distribution of distances from the origin.
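The flattening in steps 2 and 3 can be sketched as follows. This is one possible way to do it, assuming the last coordinate is shrunk linearly and the remaining coordinates are stretched to compensate; I am not claiming it matches the interpolation the visualization actually uses. `collapse_last_axis` and its parameter `t` are hypothetical names.

```python
import math

def radius(point):
    """Euclidean distance of a point from the origin."""
    return math.sqrt(sum(c * c for c in point))

def collapse_last_axis(point, t=1.0):
    """Shrink the last coordinate by a factor (1 - t) and stretch the others so the radius is unchanged.

    t=0 returns the point as it was; t=1 leaves the last coordinate at zero.
    """
    if len(point) < 2:
        raise ValueError("need at least two coordinates to collapse one")
    *rest, last = point
    r = radius(point)
    new_last = (1.0 - t) * last
    # Length the remaining coordinates must have to keep the total radius at r.
    target = math.sqrt(max(r * r - new_last * new_last, 0.0))
    rest_norm = radius(rest)
    if rest_norm == 0.0:
        # The point sits exactly on the collapsing axis, so there is no direction to keep;
        # arbitrarily send it along the first axis.
        new_rest = [target] + [0.0] * (len(rest) - 1)
    else:
        new_rest = [c * target / rest_norm for c in rest]
    return new_rest + [new_last]

point_3d = [0.3, -1.2, 0.8]
flat_2d = collapse_last_axis(point_3d)       # "3D" dragged to "2D"
flat_1d = collapse_last_axis(flat_2d[:2])    # "2D" dragged to "1D"
print(radius(point_3d), radius(flat_2d), abs(flat_1d[0]))
```

All three printed numbers should agree up to floating-point error. Taking the absolute value at the end corresponds to sweeping the negative side around to the positive side.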

Notice that on the left (1D), the mode (fattest part of the distribution) is right at the origin. But in the middle (2D), it’s a bit removed. And on the right (3D) it’s farther still.
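Where the fattest part sits can be checked numerically. For a standard normal in n dimensions, the density of the radius is proportional to r^(n-1) · exp(-r²/2), which is largest at r = √(n-1). The sketch below finds the peak by brute force on a grid; the function names, grid step and cutoff are arbitrary choices of mine.

```python
import math

def radius_density_unnormalized(r, n_dims):
    """Density of the distance from the origin, up to a constant factor."""
    return r ** (n_dims - 1) * math.exp(-r * r / 2)

def radius_mode(n_dims, step=0.001, r_max=20.0):
    """Grid-search the radius at which the density is largest."""
    grid = [i * step for i in range(int(r_max / step) + 1)]
    return max(grid, key=lambda r: radius_density_unnormalized(r, n_dims))

for n_dims in (1, 2, 3, 10, 100):
    print(f"{n_dims:3d}D  mode of radius ~ {radius_mode(n_dims):.3f}  "
          f"sqrt(n - 1) = {math.sqrt(n_dims - 1):.3f}")
```

So the mode is 0 in 1D, 1 in 2D, and about 1.41 in 3D, and it keeps drifting outward from there.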

Hopefully you can kinda imagine how, every time you add a dimension, it contributes some non-negative amount to the squared radius, so the radius can only stay the same or grow. Like, when you add a third dimension perpendicular to the screen, you can’t initially see the point’s displacement along that axis, but it is probably displaced somewhat, and when you rotate that onto the visible plane, it’ll appear to get a little further away from the origin. Like you add a perpendicular leg to a line and take the hypotenuse and that’s gotta be at least as long as the original line.
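The hypotenuse argument is easy to watch for a single point: keep adding independent normal coordinates and track the radius. A minimal sketch, where `radius_as_dimensions_grow` is a made-up name and the seed is arbitrary:

```python
import math
import random

def radius_as_dimensions_grow(max_dims, seed=1):
    """Follow one random point as coordinates are added one at a time.

    Returns the list of radii after 1, 2, ..., max_dims coordinates.
    """
    rng = random.Random(seed)
    squared_radius = 0.0
    radii = []
    for _ in range(max_dims):
        coordinate = rng.gauss(0, 1)
        # Each new perpendicular leg adds its square, which is never negative.
        squared_radius += coordinate ** 2
        radii.append(math.sqrt(squared_radius))
    return radii

for n_dims, r in enumerate(radius_as_dimensions_grow(10), start=1):
    print(f"{n_dims:2d}D  radius = {r:.3f}")
```

The printed radii never decrease from one row to the next, whatever the seed.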

Where does it end??

colin’s python & chart etc