block by seemantk a75c6cf01cd341b748d4bd2f9692435a

Bar Chart I: Causes of Death in Zambia

Visualizing causes of death in Zambia: I

From the 2015 Living Conditions Survey

Introduction

In August, I presented a workshop called d3.fundamentals() at BongoHive in Lusaka, Zambia. The Zambian Government’s Central Statistical Office ran a survey in 2015 to determine the living conditions of people throughout the country. They graciously provided anonymized datasets for us to use during the workshop.

Chinyanta Mwenya, a Zambian hacker, collaborated with me to prepare the datasets for the workshop. Chinyanta cleaned the dataset and ran statistical analysis using both Excel and R. We took some time to find a smallish dataset that was easy to understand. We settled on the unfortunately morbid Causes of Death dataset.

For the workshop, I assumed minimal knowledge of HTML and CSS, and no knowledge of JavaScript (let alone d3). So the workshop turned out to be, essentially, From Zero to Bar Chart with d3.

Rather than a bunch of pre-done, canned examples, I opted to live-code my way through the agenda. The journey went as follows:

This bl.ock is the result of that demo. It is presented here with minimal cleanup (spaces/indentations and moving variable declarations near the top).

The goal with this first visualization was to introduce d3 concepts while live-coding. So we built a bar chart showing the number of people who died from each cause listed.

Fundamentals

  1. The <style> section contains styling for the axis:
    • It cleans up the lines, making them fine and sharp.
  2. We use d3.csv() to read in the data file:
    • the data file has one record per row
    • the data is read in as an array of objects (one object per record or row) that look something like:
      • {CAUSE:"FEVER/MALARIA", AGE:"38", OTHER:"", SEX:"1"}
  3. We use d3.nest() to group the array of objects by CAUSE
    • We use nest.key() to do the actual grouping. It takes a callback function that tells it how to get an object’s CAUSE field.
    • We use nest.entries() to get an array of objects that look something like:
      • { key: "FEVER/MALARIA", values: [] }
        • values is an array of objects with the CAUSE specified by key
    • We use nest.rollup() (configured before entries() is called) to replace the values array in each object with its length.
      • The length of the values array is the number of people who died of that CAUSE, or key.
  4. We use scales to map from data space to screen space.
    • The causes are placed along the x-axis, so we use an ordinal scale there.
    • The bar heights correspond to the number of deaths, so we use a linear scale there.
  5. We draw an SVG <rect> for each bar in the bar chart.
    • Remember: the origin of the SVG coordinate system is at the top left corner of the SVG.
      • Positive x-axis points to the right.
      • Positive y-axis points down.
    • We use scale.range() to flip the y-axis (scale.range([height,0])). This allows us to present the data in a more intuitive way (with origin at bottom left).
    • We draw a <rect> for each bar, so we need to specify:
      • the coordinates of the top left corner of the bar:
        • the x-coordinate represents the cause of death
        • the y-coordinate represents the number of deaths
      • the width of the bar
      • the height of the bar:
        • The base of each rectangle sits at the bottom of the SVG (at y = height in SVG coordinates). So the height of the bar is the difference between the y-coordinate of the base and the y-coordinate of the top left corner.
  6. We label the bars using the tick labels of a d3.svg.axis().
    • The tick labels are the causes.
    • We use rotate() in the SVG transform attribute to rotate the labels so they don’t overlap each other.
    • The labels are rotated counterclockwise by 90 degrees (rotate(-90)) so that they read from bottom to top.
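
To make steps 2 and 3 concrete, here is a minimal sketch of the grouping and counting in plain JavaScript. It is a stand-in for what the d3.nest() chain produces, not d3 itself: the helper name countByKey and the three sample rows are made up for illustration (they are not survey records), and only the field names come from the example record above.

```javascript
// Hypothetical sample rows, shaped like the objects d3.csv() hands back
// (every field arrives as a string). These are NOT real survey records.
const rows = [
  { CAUSE: "FEVER/MALARIA", AGE: "38", OTHER: "", SEX: "1" },
  { CAUSE: "OTHER",         AGE: "61", OTHER: "", SEX: "2" },
  { CAUSE: "FEVER/MALARIA", AGE: "5",  OTHER: "", SEX: "2" },
];

// Plain-JavaScript stand-in (hypothetical helper) for grouping records by a
// key and rolling each group up to its size. The result mimics the
// { key, values } objects described above, with values already rolled up
// to a count.
function countByKey(records, keyOf) {
  const counts = new Map(); // a Map keeps keys in first-seen order
  for (const record of records) {
    const key = keyOf(record); // the callback says how to read the key
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return Array.from(counts, ([key, values]) => ({ key, values }));
}

const deathsByCause = countByKey(rows, d => d.CAUSE);
console.log(deathsByCause);
// [ { key: 'FEVER/MALARIA', values: 2 }, { key: 'OTHER', values: 1 } ]
```

The callback passed as keyOf plays the same role as the one we hand to nest.key(): it tells the grouping code how to get an object’s CAUSE field.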

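The scale and <rect> arithmetic from steps 4 and 5 can be sketched in plain JavaScript as well. This is only an illustration under stated assumptions: linearScale is a hypothetical helper that imitates a linear scale, the bands along the x-axis are evenly divided with no padding (a simplification), and the width, height, and counts are made-up numbers.

```javascript
// Hypothetical drawing-area size in pixels, and made-up counts.
const width = 600;
const height = 400;
const counts = [
  { key: "FEVER/MALARIA", values: 40 },
  { key: "OTHER",         values: 20 },
];

// Hypothetical helper: maps the domain [d0, d1] linearly onto the range [r0, r1].
function linearScale([d0, d1], [r0, r1]) {
  return value => r0 + ((value - d0) / (d1 - d0)) * (r1 - r0);
}

const maxCount = Math.max(...counts.map(d => d.values));

// The range is flipped ([height, 0]) so that larger counts land nearer the
// top of the SVG, where y is smaller.
const y = linearScale([0, maxCount], [height, 0]);

// One equal-width band per cause along the x-axis (no padding, for simplicity).
const band = width / counts.length;

const bars = counts.map((d, i) => ({
  cause: d.key,
  x: i * band,                  // left edge of the bar
  y: y(d.values),               // top edge of the bar
  width: band,
  height: height - y(d.values), // from the top edge down to the base
}));

console.log(bars);
// FEVER/MALARIA: x 0,   y 0,   width 300, height 400
// OTHER:         x 300, y 200, width 300, height 200
```

Because the range is flipped, the tallest bar gets the smallest y-coordinate, and every bar’s height is measured from its top edge down to the base at y = height.
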
Analysis

This is a great start to visualizing this dataset. We can easily tell the relative seriousness of each cause of death. We can see that FEVER/MALARIA kills about twice as many people as the next cause of death (OTHER, or miscellaneous causes).

Improvements

It’s a basic chart that gives us a basic insight into the dataset. By making some small changes, we can vastly improve our insight.

Next, let’s make the chart easier to read.

index.html