block by vijithassar c60dafea4431f292660d6f5e0487e470

Raincloud Plots

Summary

Raincloud plots, a concise and powerful new way to visualize distributions of quantitative data, were introduced in March 2018 by Micah Allen, Davide Poggiali, Kirstie Whitaker, Tom Rhys Marshall and Rogier Kievit. They combine elements of violin plots and box plots, then aggressively maximize Tufte’s “data-ink ratio” to show the same distribution in several ways within a minimal amount of space.

The data is drawn in three different ways simultaneously: as a smoothed curve of the distribution, as the individual raw data points, and as a box plot.

This project is written in a heavily annotated style called literate programming. The code blocks from this Markdown document are being executed as JavaScript by lit-web.

Implementation

First, open an anonymous function that wraps the rest of the code in this script, so that every variable stays scoped within a closure.

(() => {

Configuration

Configuration variables, mostly used for positioning.

  const height = 480
  const total_width = 960
  const grid = 10
  const margin = {
    top: grid * 2,
    right: grid,
    bottom: grid,
    left: grid
  }
  const width = total_width - margin.left - margin.right
  const segment = height * 0.25
  const size = segment * 0.5

Wrapper

Each raincloud plot consists of three distinct sections, and each section can be encapsulated in a separate function for the sake of clarity. Wrapping these into a single parent function makes it easier to render the whole thing in one shot, and it means the data only needs to be provided as a parameter once.

  const raincloud = (selection, data) => {
    selection
      .call(curve, data)
      .call(dots, data)
      .call(boxplot, data)
  }
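
The chaining inside raincloud relies on selection.call, which invokes the given function once with the selection plus any extra arguments and then returns the same selection. A minimal stand-in, with no D3 or DOM involved, sketches that behavior. The names sketchSelection, sketchSection and sketchRaincloud are hypothetical and exist only for this illustration.

```javascript
// Hypothetical stand-in for a D3 selection: it only records what was drawn.
const sketchSelection = {
  drawn: [],
  // Mimics selection.call: run fn with this object and the extra
  // arguments, then return this object so that calls can be chained.
  call(fn, ...args) {
    fn(this, ...args)
    return this
  }
}

// Hypothetical section renderers that note their name and the data size.
const sketchSection = name => (selection, data) => {
  selection.drawn.push(`${name}:${data.length}`)
}

const sketchRaincloud = (selection, data) => {
  selection
    .call(sketchSection('curve'), data)
    .call(sketchSection('dots'), data)
    .call(sketchSection('boxplot'), data)
}

sketchSelection.call(sketchRaincloud, [1, 2, 3])
// sketchSelection.drawn is now ['curve:3', 'dots:3', 'boxplot:3']
```

Because every section receives the same data argument, the caller only has to hand the data over once.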

Data

Hold on! For the purposes of this demonstration, we’ll need to first generate some data points to plot. This step is a detour which probably won’t actually be necessary in a real-world scenario where the chart is plotting something useful.

  const generate = (count = 800) => {
    const spread = d3.randomUniform(10, 50)()
    const center = d3.randomNormal(500, spread)()
    const jitter = d3.randomUniform(10, 100)
    const direction = () => Math.random() > 0.5 ? 1 : -1
    const base = d3.randomNormal(center, spread)
    const random = () => Math.round(base() + jitter() * direction())
    const data = Array.from({length: count}, random)
      .sort(d3.ascending)
    return data
  }
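
The same recipe can be approximated in plain JavaScript when D3 is not at hand. The sketch below swaps d3.randomNormal for a Box-Muller transform and d3.randomUniform for a scaled Math.random. The names sketchUniform, sketchNormal and sketchGenerate are hypothetical, and the result is an approximation of generate rather than a drop-in replacement.

```javascript
// Uniform sampler on [min, max), standing in for d3.randomUniform.
const sketchUniform = (min, max) => () => min + Math.random() * (max - min)

// Normal sampler via the Box-Muller transform, standing in for d3.randomNormal.
const sketchNormal = (mu, sigma) => () => {
  const u = 1 - Math.random() // in (0, 1], so Math.log(u) stays finite
  const v = Math.random()
  return mu + sigma * Math.sqrt(-2 * Math.log(u)) * Math.cos(2 * Math.PI * v)
}

const sketchGenerate = (count = 800) => {
  const spread = sketchUniform(10, 50)()
  const center = sketchNormal(500, spread)()
  const jitter = sketchUniform(10, 100)
  const direction = () => (Math.random() > 0.5 ? 1 : -1)
  const base = sketchNormal(center, spread)
  // Each point is a normal draw pushed left or right by a uniform offset.
  return Array.from({length: count}, () => Math.round(base() + jitter() * direction()))
    .sort((a, b) => a - b)
}
```

The explicit numeric comparator matters here, since the default Array.prototype.sort compares values as strings.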

Curve

To plot the curve section, first group the values into histogram bins, then bind the bin counts to a path drawn with d3.area(). A d3.line() would also work if the shape does not need a fill.

  const curve = (selection, data) => {
    const histogram = d3.histogram()
      .thresholds(20)
      (data)
      .map(bin => bin.length)
    const x = d3.scaleLinear()
      .domain([0, histogram.length])
      .range([0, width])
    const y = d3.scaleLinear()
      .domain([0, d3.max(histogram)])
      .range([size, 0])
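    const area = d3.area()
      // baseline at the bottom of the section, topline at the bin count
      .y0(size)
      .y1(y)
      // place each count at the horizontal center of its bin
      .x((d, i) => x(i + 0.5))
      .curve(d3.curveBasis)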
    selection.append('g')
      .classed('curve', true)
      .datum(histogram)
      .append('path')
      .attr('d', area)
  }
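
The binning step can be sketched without D3 as counting values into equal-width intervals. This is a simplification: d3.histogram picks rounded threshold values, so its bins will not match these exactly. The helper sketchBinCounts is a hypothetical name and assumes a non-empty array of numbers.

```javascript
// Count how many values fall into each of `bins` equal-width intervals
// spanning [min, max]. The maximum value is placed in the last bin.
const sketchBinCounts = (values, bins) => {
  const min = Math.min(...values)
  const max = Math.max(...values)
  // Fall back to a step of 1 when all values are equal, to avoid dividing by zero.
  const step = (max - min) / bins || 1
  const counts = new Array(bins).fill(0)
  for (const value of values) {
    const index = Math.min(bins - 1, Math.floor((value - min) / step))
    counts[index] += 1
  }
  return counts
}

const sketchCounts = sketchBinCounts([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 5)
// sketchCounts is [2, 2, 2, 2, 3]
```

An array of counts like this one is what the area generator receives as its datum.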

Box Plot

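The box plot section is graphically simple but can be tedious to draw because you need to carefully juggle a bunch of line segments. For this step d3.quantile does most of the heavy lifting: the box spans the first to the third quartile, a vertical line marks the median, and the whiskers reach out to the 10th and 90th percentiles. Older versions of d3.quantile expect the input array to be sorted in ascending order, which generate already guarantees.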

  const boxplot = (selection, data) => {
    const bar = grid
    const x = d3.scaleLinear()
      .domain(d3.extent(data))
      .range([0, width])
    const plot = selection
      .append('g')
      .classed('boxplot', true)
      .attr('transform', `translate(0,${segment * 0.75 - grid})`)
    plot
      .append('line')
      .attr('x1', x(d3.quantile(data, 0.5)))
      .attr('x2', x(d3.quantile(data, 0.5)))
      .attr('y1', 0)
      .attr('y2', bar)
    plot
      .append('line')
      .attr('x1', x(d3.quantile(data, 0.1)))
      .attr('x2', x(d3.quantile(data, 0.25)))
      .attr('y1', bar * 0.5)
      .attr('y2', bar * 0.5)
    plot
      .append('line')
      .attr('x1', x(d3.quantile(data, 0.9)))
      .attr('x2', x(d3.quantile(data, 0.75)))
      .attr('y1', bar * 0.5)
      .attr('y2', bar * 0.5)
    plot.append('rect')
      .attr('x', x(d3.quantile(data, 0.25)))
      .attr('y', 0)
      .attr('height', bar)
      .attr('width', x(d3.quantile(data, 0.75)) - x(d3.quantile(data, 0.25)))
  }
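
To see what d3.quantile computes for the box plot, the sketch below implements the linear interpolation between neighboring sorted values that the D3 documentation describes as the R-7 method. The name sketchQuantile is hypothetical, and the function assumes a non-empty array that is already sorted in ascending order.

```javascript
const sketchQuantile = (sorted, p) => {
  if (p <= 0) return sorted[0]
  if (p >= 1) return sorted[sorted.length - 1]
  // Fractional index of the requested quantile within the sorted array.
  const position = (sorted.length - 1) * p
  const lower = Math.floor(position)
  const fraction = position - lower
  // Interpolate between the two values that surround the fractional index.
  return sorted[lower] + (sorted[lower + 1] - sorted[lower]) * fraction
}

const sketchValues = [0, 10, 30]
const sketchBox = {
  whiskerLow: sketchQuantile(sketchValues, 0.1),
  q1: sketchQuantile(sketchValues, 0.25),     // 5
  median: sketchQuantile(sketchValues, 0.5),  // 10
  q3: sketchQuantile(sketchValues, 0.75),     // 20
  whiskerHigh: sketchQuantile(sketchValues, 0.9)
}
```

Computing the five values once, as in sketchBox, is one way to avoid repeating the same quantile call for several attributes.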

Raw Data

The raw data section positions each point along the primary quantitative axis. The position along the second axis carries no meaning, so it can be randomized, or else the points can be stacked ordinally. The DOM API can become a performance bottleneck if there is a very large number of data points, in which case it may be worthwhile to try rendering this part using the canvas element.

  const dots = (selection, data) => {
    const x = d3.scaleLinear()
      .domain(d3.extent(data))
      .range([0, width])
    selection
      .append('g')
      .classed('dots', true)
      .attr('transform', `translate(0,${segment * 0.5 + grid})`)
      .selectAll('circle')
      .data(data)
      .enter()
      .append('circle')
      .attr('r', 2)
      .attr('cx', x)
      .attr('cy', () => Math.random() * size * 0.5)
  }
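
The stacking alternative mentioned above can be sketched as counting how many earlier points share each value, then using that count as the row in which the dot is drawn. The helper sketchStackRows is hypothetical, and multiplying each row by a dot diameter to get a cy value is an assumption about how it would be wired into dots.

```javascript
// For each value, return how many identical values came before it,
// which becomes the row in which the dot is stacked.
const sketchStackRows = values => {
  const seen = new Map()
  return values.map(value => {
    const row = seen.get(value) || 0
    seen.set(value, row + 1)
    return row
  })
}

const sketchRows = sketchStackRows([498, 498, 499, 501, 501, 501])
// sketchRows is [0, 1, 0, 0, 1, 2]
```

Stacking works best when values repeat, as the rounded values from generate often do. With continuous data the values would first need to be grouped into small bins.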

DOM

A small helper function to add the SVG and complete the initial DOM setup.

  const dom = selection => {
    selection
      .append('svg')
      .attr('height', height)
      .attr('width', total_width)
  }

Loop

Set up the SVG and run the raincloud plot function several times.

  const svg = d3.select('main')
    .call(dom)
    .select('svg')
  for (let i = 0; i < 4; i++) {
    svg
      .append('g')
      .classed('raincloud', true)
      .attr('transform', `translate(${margin.left},${i * segment + margin.top})`)
      .call(raincloud, generate())
  }
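
As a quick check of the layout arithmetic, the vertical offsets passed to translate can be computed on their own. The constants are repeated from the configuration section under hypothetical sketch names so that the snippet runs by itself.

```javascript
const sketchHeight = 480
const sketchGrid = 10
const sketchMarginTop = sketchGrid * 2
const sketchSegment = sketchHeight * 0.25 // 120 pixels per plot

// One vertical offset per plot, matching i * segment + margin.top.
const sketchOffsets = Array.from({length: 4}, (_, i) => i * sketchSegment + sketchMarginTop)
// sketchOffsets is [20, 140, 260, 380]
```

Each plot therefore gets its own 120 pixel band, with the first band shifted down by the top margin.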

Close the anonymous function opened at the very beginning.

})()

That’s it! Enjoy your raincloud plots!
