# Antarctica’s Centroid explained

Full Screen

The orange dot is the correct centroid (or center of mass) of the spherical shape of Antarctica, as computed by D3.

For comparison, the black dot is the naïve planar centroid of the shape’s coordinates. It is obviously too much to the North. Alas, many GIS software seem to use a flawed formula such as this one.

Here’s how D3’s algorithm works.

D3 computes the centroid of geographic features with a three-dimensional model, in which every bit of land is given a weight proportional to its area.

The algorithm was established by Jason Davies, after this discussion on stackexchange. It is based on a formula published by J. E. Brock in 1974. (See also Brock’s research paper The centroid and inertia tensor for a spherical triangle.)

Brock established that the centre of mass of a spherical triangle on the surface of the unit sphere can be computed as the sum on all edges (`A->B`) of

`OA x OB / norm(OA x OB) * angle(AB) / 2e`

where:

• `OA` is the cartesian vector from `O` (centre of the sphere) to `A`

• `x` denotes cross product

• `angle(AB)` is the length of the great circle route from `A` to `B`.

• `e` is the excess angle of the triangle ABC : sum(angles) - 180°.

Expressed in radians, the excess angle `e` is equal to the triangle’s area. To compute a centroid, we are seeking the barycenter of several such triangles’ centres of mass, weighted by their surfaces. So, the formula above must be multiplied by `e`, which yields a very nice simplification.

The weighted barycenter of a collection of such triangles amounts to the sum of `OA x OB / norm(OA x OB) * angle(AB)` for all edges, divided by the sum of the triangles’ areas. But, as we will in the end reproject the barycenter towards the sphere’s surface, we don’t even need to keep track of the total areas. (For the same reason, we can also drop the `/2` factor.)

Lastly, when two triangles share a common edge, triangle 1 running it clockwise and triangle 2 running it counter-clockwise, this edge’s opposite contributions cancel each other. As a consequence, we only need to compute the sum of this formula for the outer edges of our multipolygon.

This happens in d3-geo/centroid.js: it is called once for each edge of our multipolygon, following an edge from the previous point `A=(x0,y0,z0)` to the new point `B=(x,y,z)`, computed from the spherical coordinates.

`(cx,cy,cz)` is the cross product `OA x OB`, and `(X2,Y2,Z2)` is the 3D vector that accumulates the sum. `m` is the norm of the cross product, and the angle is computed as the arcsine of this norm.

In the same loop we also add to the barycentre of line weights proportional to their lengths `(X1,Y1,Z1)`, so that in the degenerate case where the centroid falls on `O`, we can fall back to the multiline centroid. We also keep track with `centroidPointCartesian` of the point barycentre, for the case where the total line length is 0.