Raincloud Plots, a concise and powerful new way to visualize distributions of quantitative data, were introduced in March 2018 by Micah Allen, Davide Poggiali, Kirstie Whitaker, Tom Rhys Marshall and Rogier Kievit. They combine elements of violin plots and box plots and then aggressively maximize Tufte’s “data-ink ratio” to provide multiple ways of viewing the same distribution in a minimal amount of space.
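The data is drawn in three different ways simultaneously: as a smoothed density curve, as a box plot summarizing its quantiles, and as the raw data points themselves.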
This project is written in a heavily annotated style called literate programming. The code blocks from this Markdown document are being executed as JavaScript by lit-web.
First, an anonymous function to guard the rest of the code in this script within a closure.
```javascript
(() => {
```
Configuration variables, mostly used for positioning.
```javascript
const height = 480
const total_width = 960
const grid = 10
const margin = {
  top: grid * 2,
  right: grid,
  bottom: grid,
  left: grid
}
const width = total_width - margin.left - margin.right
const segment = height * 0.25
const size = segment * 0.5
```
Each raincloud plot consists of three distinct sections, and each section can be encapsulated in a separate function for the sake of clarity. Wrapping these into a single parent function makes it easier to render the whole thing in one shot, and it means the data only needs to be provided as a parameter once.
```javascript
const raincloud = (selection, data) => {
  selection
    .call(curve, data)
    .call(dots, data)
    .call(boxplot, data)
}
```
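To illustrate why the data only has to be passed once, here is a minimal sketch of the calling convention behind `selection.call`: the function receives the selection as its first argument, followed by any extra arguments, and the selection itself is returned so that calls can be chained. `fakeSelection` and `record` are hypothetical stand-ins written for this example, not part of D3.

```javascript
// Stand-in for a D3 selection, only to illustrate the calling convention.
const fakeSelection = {
  log: [],
  call (fn, ...args) {
    fn(this, ...args) // the selection is passed as the first argument
    return this       // returning the selection allows chaining
  }
}

// Hypothetical section renderer that just records what it was given.
const record = label => (selection, data) => {
  selection.log.push(`${label}:${data.length}`)
}

fakeSelection
  .call(record('curve'), [1, 2, 3])
  .call(record('dots'), [1, 2, 3])
// fakeSelection.log is now ['curve:3', 'dots:3']
```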
Hold on! For the purposes of this demonstration, we’ll need to first generate some data points to plot. This step is a detour which probably won’t actually be necessary in a real-world scenario where the chart is plotting something useful.
```javascript
const generate = (count = 800) => {
  const spread = d3.randomUniform(10, 50)()
  const center = d3.randomNormal(500, spread)()
  const jitter = d3.randomUniform(10, 100)
  const direction = () => Math.random() > 0.5 ? 1 : -1
  const base = d3.randomNormal(center, spread)
  const random = () => Math.round(base() + jitter() * direction())
  // Sorted ascending, since older versions of d3.quantile expect sorted input
  const data = Array.from({length: count}, random)
    .sort(d3.ascending)
  return data
}
```
To plot the curve section, first group the values into a histogram, then bind the bin counts to a path drawn with `d3.area()` or `d3.line()`.
```javascript
const curve = (selection, data) => {
  // Reduce the data to a list of counts, one per bin
  const histogram = d3.histogram()
    .thresholds(20)
    (data)
    .map(bin => bin.length)
  const x = d3.scaleLinear()
    .domain([0, histogram.length])
    .range([0, width])
  const y = d3.scaleLinear()
    .domain([0, d3.max(histogram)])
    .range([size, 0])
  // y0 is the flat baseline, y1 follows the bin counts
  const area = d3.area()
    .y0(size)
    .y1(y)
    .x((d, i) => x(i))
    .curve(d3.curveBasis)
  selection.append('g')
    .classed('curve', true)
    .datum(histogram)
    .append('path')
    .attr('d', area)
}
```
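To make the binning step concrete, here is a minimal sketch in plain JavaScript of what the histogram is reduced to: a list of counts, one per bin. It assumes equal-width bins across the data extent, whereas `d3.histogram` chooses its own threshold values, so the counts will not match D3's output exactly. `binCounts` is a hypothetical helper, not part of D3.

```javascript
// Hypothetical helper: count how many values fall into each of `bins`
// equal-width intervals spanning the data extent.
const binCounts = (values, bins = 20) => {
  const min = Math.min(...values)
  const max = Math.max(...values)
  // Fall back to a step of 1 when every value is identical
  const step = (max - min) / bins || 1
  const counts = new Array(bins).fill(0)
  for (const value of values) {
    // The maximum value would land one past the end, so clamp it into the last bin
    const index = Math.min(bins - 1, Math.floor((value - min) / step))
    counts[index] += 1
  }
  return counts
}

binCounts([1, 2, 2, 3, 9], 4) // → [3, 1, 0, 1]
```

Each count then becomes one point on the curve, with the bin index on the horizontal axis and the count on the vertical axis.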
The box plot section is graphically simple but can be tedious to draw because you need to carefully juggle a bunch of line segments. For this step `d3.quantile` does most of the heavy lifting.
```javascript
const boxplot = (selection, data) => {
  const bar = grid
  const x = d3.scaleLinear()
    .domain(d3.extent(data))
    .range([0, width])
  const plot = selection
    .append('g')
    .classed('boxplot', true)
    .attr('transform', `translate(0,${segment * 0.75 - grid})`)
  // Median: a vertical line across the full height of the box
  plot
    .append('line')
    .attr('x1', x(d3.quantile(data, 0.5)))
    .attr('x2', x(d3.quantile(data, 0.5)))
    .attr('y1', 0)
    .attr('y2', bar)
  // Lower whisker: from the 10th percentile to the first quartile
  plot
    .append('line')
    .attr('x1', x(d3.quantile(data, 0.1)))
    .attr('x2', x(d3.quantile(data, 0.25)))
    .attr('y1', bar * 0.5)
    .attr('y2', bar * 0.5)
  // Upper whisker: from the 90th percentile back to the third quartile
  plot
    .append('line')
    .attr('x1', x(d3.quantile(data, 0.9)))
    .attr('x2', x(d3.quantile(data, 0.75)))
    .attr('y1', bar * 0.5)
    .attr('y2', bar * 0.5)
  // Box: spans the interquartile range
  plot.append('rect')
    .attr('x', x(d3.quantile(data, 0.25)))
    .attr('y', 0)
    .attr('height', bar)
    .attr('width', x(d3.quantile(data, 0.75)) - x(d3.quantile(data, 0.25)))
}
```
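For readers curious about what the quantile lookups return, the following is a minimal sketch of a linearly interpolated quantile over an ascending-sorted array, which is the general approach `d3.quantile` documents (the R-7 method). `quantileSketch` and `boxSummary` are hypothetical names used only for this illustration, and the sketch assumes the input is already sorted and non-empty.

```javascript
// Hypothetical helper: linearly interpolated quantile of an ascending-sorted array.
const quantileSketch = (sorted, p) => {
  if (sorted.length === 0) return undefined
  const position = (sorted.length - 1) * p
  const lower = Math.floor(position)
  const upper = Math.min(sorted.length - 1, lower + 1)
  const fraction = position - lower
  // Blend the two neighbouring values according to the fractional position
  return sorted[lower] + (sorted[upper] - sorted[lower]) * fraction
}

// The five positions drawn by the box plot: whiskers at the 10th and 90th
// percentiles, the box between the quartiles, and the median line.
const boxSummary = sorted => ({
  whiskerLow: quantileSketch(sorted, 0.1),
  q1: quantileSketch(sorted, 0.25),
  median: quantileSketch(sorted, 0.5),
  q3: quantileSketch(sorted, 0.75),
  whiskerHigh: quantileSketch(sorted, 0.9)
})
```

For the sorted input `[0, 10, 20, 30, 40]`, the quartiles and median fall exactly on the values 10, 20 and 30, while the whiskers are interpolated between neighbouring values.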
The raw data section plots the data according to the primary quantitative axis; the second value can be randomized, or else the points can be stacked ordinally. The DOM API can be a performance bottleneck if there’s a very large number of data points, in which case it may be worthwhile to try rendering this part using the `canvas` element.
```javascript
const dots = (selection, data) => {
  const x = d3.scaleLinear()
    .domain(d3.extent(data))
    .range([0, width])
  selection
    .append('g')
    .classed('dots', true)
    .attr('transform', `translate(0,${segment * 0.5 + grid})`)
    .selectAll('circle')
    .data(data)
    .enter()
    .append('circle')
    .attr('r', 2)
    .attr('cx', x)
    .attr('cy', () => Math.random() * size * 0.5)
}
```
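The code above takes the randomized route for the second coordinate. As a rough sketch of the stacked alternative, each point can instead be given a vertical offset based on how many equal values came before it. `stackOffsets` and its `spacing` parameter are hypothetical, and the sketch only stacks exact duplicates; stacking points that merely sit close together would require bucketing by pixel position first.

```javascript
// Hypothetical helper: for each value, count how many equal values came
// before it, so duplicates stack instead of overlapping.
const stackOffsets = (values, spacing = 4) => {
  const seen = new Map()
  return values.map(value => {
    const level = seen.get(value) || 0
    seen.set(value, level + 1)
    return { value, offset: level * spacing }
  })
}

stackOffsets([1, 1, 2, 1]).map(d => d.offset) // → [0, 4, 0, 8]
```

The resulting objects could then be bound in place of the raw values, reading `cx` from `d.value` through the scale and `cy` from `d.offset`.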
A small helper function to add the SVG and complete the initial DOM setup.
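```javascript
const dom = selection => {
  selection
    .append('svg')
    .attr('height', height)
    // Full width including margins, so the right edge of each plot is not clipped
    .attr('width', total_width)
}
```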
Set up the SVG and run the raincloud plot function several times.
```javascript
const svg = d3.select('main')
  .call(dom)
  .select('svg')
for (let i = 0; i < 4; i++) {
  svg
    .append('g')
    .classed('raincloud', true)
    .attr('transform', `translate(${margin.left},${i * segment + margin.top})`)
    .call(raincloud, generate())
}
```
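The vertical positions used throughout follow directly from the configuration values, and working them out once can help when adjusting the layout. The sketch below recomputes them in plain JavaScript; `layoutSketch` and its parameter names are hypothetical and exist only for this illustration.

```javascript
// Hypothetical helper: vertical offsets implied by the configuration values.
const layoutSketch = (heightPx = 480, gridPx = 10, rows = 4) => {
  const segmentPx = heightPx * 0.25 // vertical band per raincloud plot
  const sizePx = segmentPx * 0.5    // height of the density curve
  return {
    // Top of each raincloud group: i * segment + margin.top
    rowTops: Array.from({ length: rows }, (_, i) => i * segmentPx + gridPx * 2),
    // Bands inside one group, relative to its top edge
    curve: [0, sizePx],
    dots: [segmentPx * 0.5 + gridPx, segmentPx * 0.5 + gridPx + sizePx * 0.5],
    boxplot: [segmentPx * 0.75 - gridPx, segmentPx * 0.75]
  }
}
```

With the defaults, the rows start at 20, 140, 260 and 380. Inside each row the curve spans 0 to 60, the dots 70 to 100, and the box plot 80 to 90, so the box is drawn over the band of dots.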
Close the anonymous function opened at the very beginning.
```javascript
})()
```
That’s it! Enjoy your raincloud plots!