A clone of Henrik Lindberg’s chart, which he created in R (see his tweet). I was curious how to implement this type of chart in D3.
His description of the chart:
Peak time of day for sports and leisure
Number of participants throughout the day compared to peak popularity. Note the morning-and-evening everyday workouts, the midday hobbies, and the evenings/late nights out.
The data and script to process the data came from Henrik’s GitHub repo. The data source is the American Time Use Survey at the Bureau of Labor Statistics.
forked from armollica’s block: Joyplot tests
forked from fogonwater’s block: Joyplot tests
<!DOCTYPE html>
<html>
<head>
<style>
svg {
  display: block;
  margin: 0 auto;
}
.axis .domain {
  display: none;
}
.axis--x text {
  fill: #999;
}
.axis--x line {
  stroke: #aaa;
}
.axis--activity .tick line {
  display: none;
}
.axis--activity text {
  font-size: 12px;
  fill: #777;
}
.axis--activity .tick:nth-child(odd) text {
  fill: #222;
}
.line {
  fill: none;
  stroke: #fff;
}
.area {
  fill: #448cab;
}
</style>
</head>
<body>
<script src="https://d3js.org/d3.v4.min.js"></script>
<script>
var margin = { top: 30, right: 10, bottom: 30, left: 300 },
    width = 700 - margin.left - margin.right,
    height = 760 - margin.top - margin.bottom;

// Fraction of a row's height by which each area chart may overlap its neighbor
var overlap = 0.6;

var formatTime = d3.timeFormat('%I %p');

var svg = d3.select('body').append('svg')
    .attr('width', width + margin.left + margin.right)
    .attr('height', height + margin.top + margin.bottom)
  .append('g')
    .attr('transform', 'translate(' + margin.left + ',' + margin.top + ')');

var x = function(d) { return d.time; },
    xScale = d3.scaleTime().range([0, width]),
    xValue = function(d) { return xScale(x(d)); },
    xAxis = d3.axisBottom(xScale).tickFormat(formatTime);

var y = function(d) { return d.value; },
    yScale = d3.scaleLinear(),
    yValue = function(d) { return yScale(y(d)); };

var activity = function(d) { return d.key; },
    activityScale = d3.scaleBand().range([0, height]),
    activityValue = function(d) { return activityScale(activity(d)); },
    activityAxis = d3.axisLeft(activityScale);

var area = d3.area()
    .x(xValue)
    .y1(yValue);

var line = area.lineY1();

// Convert an offset in minutes since midnight into a Date
function parseTime(offset) {
  var date = new Date(2017, 0, 1); // chose an arbitrary day
  return d3.timeMinute.offset(date, offset);
}

// Parse one row of data.tsv
function row(d) {
  return {
    activity: d.activity,
    time: parseTime(d.time),
    value: +d.p_smooth
  };
}

d3.tsv('data.tsv', row, function(error, dataFlat) {
  if (error) throw error;

  // Sort by time
  dataFlat.sort(function(a, b) { return a.time - b.time; });

  var data = d3.nest()
      .key(function(d) { return d.activity; })
      .entries(dataFlat);

  // Sort activities by peak activity time, latest peak first
  function peakTime(d) {
    // With a descending comparator, d3.scan returns the index of the largest value
    var i = d3.scan(d.values, function(a, b) { return y(b) - y(a); });
    return d.values[i].time;
  }
  data.sort(function(a, b) { return peakTime(b) - peakTime(a); });

  xScale.domain(d3.extent(dataFlat, x));

  activityScale.domain(data.map(function(d) { return d.key; }));

  var areaChartHeight = (1 + overlap) * (height / activityScale.domain().length);

  yScale
      .domain(d3.extent(dataFlat, y))
      .range([areaChartHeight, 0]);

  area.y0(yScale(0));

  svg.append('g').attr('class', 'axis axis--x')
      .attr('transform', 'translate(0,' + height + ')')
      .call(xAxis);

  svg.append('g').attr('class', 'axis axis--activity')
      .call(activityAxis);

  var gActivity = svg.append('g').attr('class', 'activities')
    .selectAll('.activity').data(data)
    .enter().append('g')
      .attr('class', function(d) { return 'activity activity--' + d.key; })
      .attr('transform', function(d) {
        var ty = activityValue(d) - activityScale.bandwidth() + 5;
        return 'translate(0,' + ty + ')';
      });

  gActivity.append('path').attr('class', 'area')
      .datum(function(d) { return d.values; })
      .attr('d', area);

  gActivity.append('path').attr('class', 'line')
      .datum(function(d) { return d.values; })
      .attr('d', line);
});
</script>
</body>
</html>
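
The vertical layout used by the script above can be sketched without D3. This is only an illustration under stated assumptions: the helper name ridgelineLayout and its return fields are hypothetical, and it assumes a band scale with no padding, so that each band is height / n tall, as with the default d3.scaleBand settings used above. It reproduces the arithmetic behind areaChartHeight and the per-row translate offset.

```javascript
// Hypothetical helper (not part of the original block): computes where each
// row's group is translated and how tall its area chart is.
function ridgelineLayout(keys, height, overlap) {
  var step = height / keys.length;       // band height, assuming no padding
  var rowHeight = (1 + overlap) * step;  // same formula as areaChartHeight
  return keys.map(function (key, i) {
    var bandTop = i * step;              // what the band scale returns for this key
    var translateY = bandTop - step + 5; // same offset as the chart, including the 5px nudge
    return {
      key: key,
      translateY: translateY,              // y of the row's peak value
      baselineY: translateY + rowHeight,   // y of the row's zero line
      rowHeight: rowHeight
    };
  });
}

// Example with made-up keys: three rows in a 300px tall chart
var layout = ridgelineLayout(['a', 'b', 'c'], 300, 0.6);
layout.forEach(function (r) {
  console.log(r.key, r.translateY, r.baselineY);
});
```

Because rowHeight is larger than the band height, each row's peak reaches above the baseline of the row drawn before it, which is what produces the overlapping look. Rows later in the list are drawn on top of earlier ones, since SVG paints in document order.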
data.tsv: activity.tsv
	Rscript process-activity.R $< > $@
#!/usr/bin/env Rscript
# Adapted from https://github.com/halhen/viz-pub/blob/master/sports-time-of-day/2_gen_chart.R
library(tidyverse)
filename <- commandArgs(trailingOnly = TRUE)[1]
df <- read_tsv(filename)
df %>%
  group_by(activity) %>%
  filter(max(p) > 3e-04, # Keep the most popular ones
         !grepl('n\\.e\\.c', activity)) %>% # Remove n.e.c. (not elsewhere classified)
  arrange(time) %>%
  mutate(p_peak = p / max(p), # Normalize as a fraction of peak popularity
         p_smooth = (lag(p_peak) + p_peak + lead(p_peak)) / 3, # Moving average
         p_smooth = coalesce(p_smooth, p_peak)) %>% # When there's no lag or lead, we get NA. Use the pointwise data
  ungroup() %>%
  do({ # 24:00:00 is missing from the source data; add for a complete cycle
    rbind(.,
          filter(., time == 0) %>%
            mutate(time = 24*60))
  }) %>%
  mutate(time = ifelse(time < 3 * 60, time + 24 * 60, time)) %>% # Set start of chart to 03:00; few things overlap this hour
  mutate(activity = reorder(activity, p_peak, FUN=which.max)) %>% # Order by peak time
  format_tsv() %>%
  cat()
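
For readers who want to follow the per-activity steps of the R script in JavaScript, here is a rough sketch of the normalization, the three-point moving average, and the shift of the day's start to 03:00. It is not a full port: the function names smoothActivity and wrapTime are hypothetical, the input is assumed to be an array of { time, p } objects for a single activity with time in minutes since midnight, and the popularity filter and the extra 24:00 row are left out.

```javascript
// Hypothetical sketch of the mutate() steps for one activity.
function smoothActivity(rows) {
  // Sort a copy by time, like arrange(time)
  var sorted = rows.slice().sort(function (a, b) { return a.time - b.time; });

  // Normalize to a fraction of the activity's peak popularity
  var peak = Math.max.apply(null, sorted.map(function (r) { return r.p; }));
  var pPeak = sorted.map(function (r) { return r.p / peak; });

  return sorted.map(function (r, i) {
    var hasBothNeighbors = i > 0 && i < sorted.length - 1;
    // Three-point moving average; fall back to the pointwise value at the
    // ends, where the R code gets NA from lag() or lead()
    var pSmooth = hasBothNeighbors
      ? (pPeak[i - 1] + pPeak[i] + pPeak[i + 1]) / 3
      : pPeak[i];
    return { time: r.time, p_peak: pPeak[i], p_smooth: pSmooth };
  });
}

// Move times before startMinutes to the next day, so the chart starts there
function wrapTime(minutes, startMinutes) {
  return minutes < startMinutes ? minutes + 24 * 60 : minutes;
}

// Example with made-up values
var smoothed = smoothActivity([
  { time: 0, p: 1 }, { time: 10, p: 2 }, { time: 20, p: 4 }, { time: 30, p: 2 }
]);
console.log(smoothed.map(function (r) { return r.p_smooth; }));
console.log(wrapTime(60, 3 * 60), wrapTime(200, 3 * 60));
```

Smoothing the normalized values means a row's smoothed peak can end up slightly below 1.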