We left off with a basic bar chart in d3v4 showing the number of people who died in Zambia last year, broken down by cause. Since our dataset is richer than that, we’ll add gender to the visualization. So, we’ll show:
As before, the total number of people who died by each cause.
The total number of females who died by each cause.
The total number of males who died by each cause.
The first thing to note in this chart is that the radio buttons above the chart enable the user to choose which subset of the data they want to see. The heights of the bars change, or flash, to a new height representing the user’s choice. So, if you choose “Male”, the bars will change to show the number of male deaths for each cause.
For this example, we only need to know this: at any given time, the user can choose “Male”, “Female”, or “All” and the chart must change to correspond. We’ll discover how that happens in the next example.
Let’s dive deeper into d3.nest() and d3.csv(). Please take the time to review the linked tutorial.
Previously, we partitioned the array by CAUSE, and simply had to get the length of the resulting sub-array belonging to each CAUSE.
Now we need to partition twice before counting. As before, we partition the array by CAUSE. Then, for each CAUSE, we partition its resulting array by SEX.
Now, we simply get the length of the resulting sub-array belonging to each SEX in each CAUSE.
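To make the two-level partition concrete, here is a minimal sketch in plain JavaScript, without d3. The sample records and the name countByCauseAndSex are made up for illustration, and SEX is written as a readable label rather than the raw code in the dataset.

```javascript
// Hypothetical sample records, not taken from the real survey data.
var records = [
  { CAUSE: "FEVER/MALARIA", SEX: "male" },
  { CAUSE: "FEVER/MALARIA", SEX: "female" },
  { CAUSE: "FEVER/MALARIA", SEX: "male" },
  { CAUSE: "ACCIDENT", SEX: "female" }
];

// Partition by CAUSE, then by SEX, and count the records in each group.
function countByCauseAndSex(rows) {
  var counts = {};
  rows.forEach(function(d) {
    // First level: one bucket per CAUSE.
    if (!counts[d.CAUSE]) { counts[d.CAUSE] = {}; }
    // Second level: one counter per SEX inside that CAUSE.
    counts[d.CAUSE][d.SEX] = (counts[d.CAUSE][d.SEX] || 0) + 1;
  });
  return counts;
}

var counts = countByCauseAndSex(records);
console.log(counts["FEVER/MALARIA"].male);   // 2
console.log(counts["FEVER/MALARIA"].female); // 1
```

Note that a group with no records simply never appears: in this sample, counts["ACCIDENT"].male is undefined rather than 0.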
Something to note: the dataset has SEX encoded as either “1” or “2”. The questionnaire used by the survey takers provides the following interpretation: 1 = Male, 2 = Female.
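The codebook above can be written as a small lookup plus a function that decodes one record. This is only a sketch: SEX_LABELS and decodeRow are hypothetical names, and the lowercase labels are an assumption chosen to match the keys that appear in the nested output later in this post.

```javascript
// Codebook from the questionnaire: 1 = Male, 2 = Female.
// Lowercase labels are an assumption, chosen to match the later output.
var SEX_LABELS = { "1": "male", "2": "female" };

// Hypothetical row function: decodes the SEX field of a single record.
function decodeRow(d) {
  // Leave unexpected codes unchanged instead of guessing.
  d.SEX = SEX_LABELS[d.SEX] || d.SEX;
  return d;
}

// In d3 v4, a function like this can be passed as the row argument:
//   d3.csv("deaths.csv", decodeRow, function(error, data) { ... });
// ("deaths.csv" is a placeholder file name.)

var decoded = decodeRow({ CAUSE: "FEVER/MALARIA", SEX: "1" });
console.log(decoded.SEX); // "male"
```

Decoding once, as each row is parsed, means every later step sees readable labels and never has to know about the raw codes.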
We need to change “1” to “male” and “2” to “female” in the SEX column for each record. We can use the d3.csv() row function to decode the gender field.
We need to partition the data twice. So, we use nest.key() twice:
d3.nest()
.key(function(d) { return d.CAUSE; })
.key(function(d) { return d.SEX; })
.rollup(function(leaves) { return leaves.length; })
.entries(data)
This looks just like the previous examples, but with one extra nest.key() step. We would wind up with an array of objects that look something like:
{
key: "FEVER/MALARIA"
, values: [{key: "male", value: 113}, {key: "female", value: 95}]
}
This is a great start. We show all the causes (keys) on the x-axis, so it makes sense to have CAUSE as the top-level key. Let’s think through how we’ll use the values part of the data object.
In the above example, values is an array of two objects: one represents male deaths, and the other represents female deaths. Since we want the user to be able to choose which subset of the data they want to see, it is easier to flatten each entry into a single object with one property per subset, plus the total:
d3.nest()
.key(function(d) { return d.CAUSE; })
.key(function(d) { return d.SEX; })
.rollup(function(leaves) { return leaves.length; })
.entries(data)
.map(function(d) {
var obj = {
cause: d.key
, all: d3.sum(d.values, function(v) { return v.value; })
};
d.values.forEach(function(v) { obj[v.key] = v.value; });
return obj;
})
And this results in an array of objects that look like:
{
cause: "FEVER/MALARIA"
, all: 208
, female: 95
, male: 113
}
This is a nice, straightforward data format for us to use to power all three bar charts.
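As a rough illustration of why this flat format is convenient, the sketch below picks the bar value for each cause from the user’s choice with a single property lookup. The helper name barValues and the second sample record are made up for illustration; the actual chart-update code is not shown in this post.

```javascript
// Records in the flattened shape shown above.
// The second record is made-up sample data.
var flat = [
  { cause: "FEVER/MALARIA", all: 208, female: 95, male: 113 },
  { cause: "ACCIDENT", all: 30, female: 10, male: 20 }
];

// Hypothetical helper: choice is "all", "male", or "female".
function barValues(data, choice) {
  return data.map(function(d) {
    // A cause with no records for a subset has no such property,
    // so fall back to 0 in that case.
    return { cause: d.cause, value: d[choice] || 0 };
  });
}

var maleBars = barValues(flat, "male");
console.log(maleBars[0].value); // 113
```

The fallback to 0 matters because the nested counts only contain keys that actually occur in the data.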
Next time, we’ll explore how to make the HTML radio buttons trigger changes in the chart, so that it can show our newly enriched dataset.