block by timelyportfolio 58f8fe3e167ef47145d0

R + d3.js Parallel Coordinates of partykit ver 2 with interactive querying

This example builds on the first example by adding the ability to explore the partykit / rpart splits by clicking on the node information. When a node is clicked, the parallel coordinates plot is brushed to match that node's split condition.
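
As a rough illustration of the click-to-query step, the sketch below parses one line of the printed tree into a variable, an operator, and a split value, then derives the brush extent for that axis. The helper names `parseNodeQuery` and `queryExtent` are hypothetical and are not part of this gist or of d3.parcoords; the gist's own `getQuery` and `drawBrush` do similar work with string splits.

```javascript
// Hypothetical helpers for illustration; not part of the gist or d3.parcoords.

// Parse one line of the printed tree, e.g. "|   |   [3] mpg >= 21.45",
// into { variable, operator, value }.
// Returns null for lines without a split condition (such as "[1] root").
function parseNodeQuery(line) {
  // drop the tree-drawing pipes, then keep the text between "]" and ":"
  var condition = line.replace(/\|/g, "").split(/[\]:]/)[1];
  if (condition === undefined) return null;
  var match = condition.match(/^\s*(\S+?)\s*(<=|>=|<|>)\s*(-?[\d.]+)\s*$/);
  if (!match) return null;
  return {
    variable: match[1],
    operator: match[2],
    value: parseFloat(match[3])
  };
}

// Brush extent [low, high] for a query on an axis whose domain is [min, max]:
// "<" conditions brush from the axis minimum up to the split value,
// ">" conditions brush from the split value up to the axis maximum.
function queryExtent(query, domain) {
  return query.operator.charAt(0) === "<"
    ? [domain[0], query.value]
    : [query.value, domain[1]];
}

// Example: node [4] of the tree, brushed on the disp axis.
// [71.1, 472] is the range of disp in the data used by this example.
var query = parseNodeQuery("|   |   |   [4] disp < 87.05: 62.250 (n = 4, err = 140.8)");
console.log(query);
console.log(queryExtent(query, [71.1, 472]));
```

Because a brush selects a closed interval, a strict `<` and a `<=` query produce the same extent in this sketch; that is a simplification rather than an exact translation of the split.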

Introduction

This d3.js parallel coordinates plot is another experiment in how we might use interactive plots in JavaScript to represent a partykit / rpart object from R. The example builds on this d3.js collapsible tree plot. Eventually, it would be nice to combine the tree and parallel coordinates into a layout with synced interactivity.


### Almost No [`rCharts`](http://rcharts.io)

Also, this is fairly different from most of my interactive plots from R in that it almost completely avoids [`rCharts`](http://rcharts.io) ("almost" because I did use its `publish` to make this gist). I chose to exclude `rCharts` for two reasons:
  1. demo how we can use htmltools to build html/js in R
  2. help users understand some of the things rCharts does for us, such as dependency management, rendering, sharing, multi-format publishing, etc.

### Reproduce

I would love for others to reproduce, fork, and improve this.

code

live example


### Thanks

It is impossible to make this list complete, but I would like to thank

index.html

<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8"/>
<script src="//d3js.org/d3.v3.js"></script>
<link href="d3.parcoords.css" rel="stylesheet" />
<script src="d3.parcoords.js"></script>
<link href="//cdnjs.cloudflare.com/ajax/libs/intro.js/0.5.0/introjs.css" rel="stylesheet" />
<script src="//cdnjs.cloudflare.com/ajax/libs/intro.js/0.5.0/intro.min.js"></script>

</head>
<body>
<div style="width:100%;">
  <pre id="partykit_info" style="width:100%;" data-step="1" data-intro="click on node info to query the chart below">
Model formula:
hp ~ cyl + disp + mpg + drat + wt + qsec + vs + am + gear + carb

Fitted party:
<span class = 'querynode'>[1] root</span>
<span class = 'querynode'>|   [2] cyl < 7</span>
<span class = 'querynode'>|   |   [3] mpg >= 21.45</span>
<span class = 'querynode'>|   |   |   [4] disp < 87.05: 62.250 (n = 4, err = 140.8)</span>
<span class = 'querynode'>|   |   |   [5] disp >= 87.05: 91.833 (n = 6, err = 1376.8)</span>
<span class = 'querynode'>|   |   [6] mpg < 21.45</span>
<span class = 'querynode'>|   |   |   [7] qsec >= 15.98: 112.857 (n = 7, err = 306.9)</span>
<span class = 'querynode'>|   |   |   [8] qsec < 15.98: 175.000 (n = 1, err = 0.0)</span>
<span class = 'querynode'>|   [9] cyl >= 7</span>
<span class = 'querynode'>|   |   [10] drat < 3.18</span>
<span class = 'querynode'>|   |   |   [11] mpg >= 12.8: 170.000 (n = 7, err = 1150.0)</span>
<span class = 'querynode'>|   |   |   [12] mpg < 12.8: 210.000 (n = 2, err = 50.0)</span>
<span class = 'querynode'>|   |   [13] drat >= 3.18</span>
<span class = 'querynode'>|   |   |   [14] carb < 6: 246.000 (n = 4, err = 582.0)</span>
<span class = 'querynode'>|   |   |   [15] carb >= 6: 335.000 (n = 1, err = 0.0)</span>

Number of inner nodes:    7
Number of terminal nodes: 8</pre>
  <div id="par_container" class="parcoords" style="height:400px;width:100%;"></div>
  <script>
  var data = [
    {"$row":"Mazda RX4","(fitted)":7,"(response)":110,"hp":110,"cyl":6,"mpg":21,"disp":160,"drat":3.9,"wt":2.62,"qsec":16.46,"am":1,"carb":4,"gear":4,"vs":0},
    {"$row":"Mazda RX4 Wag","(fitted)":7,"(response)":110,"hp":110,"cyl":6,"mpg":21,"disp":160,"drat":3.9,"wt":2.875,"qsec":17.02,"am":1,"carb":4,"gear":4,"vs":0},
    {"$row":"Datsun 710","(fitted)":5,"(response)":93,"hp":93,"cyl":4,"mpg":22.8,"disp":108,"drat":3.85,"wt":2.32,"qsec":18.61,"am":1,"carb":1,"gear":4,"vs":1},
    {"$row":"Hornet 4 Drive","(fitted)":7,"(response)":110,"hp":110,"cyl":6,"mpg":21.4,"disp":258,"drat":3.08,"wt":3.215,"qsec":19.44,"am":0,"carb":1,"gear":3,"vs":1},
    {"$row":"Hornet Sportabout","(fitted)":11,"(response)":175,"hp":175,"cyl":8,"mpg":18.7,"disp":360,"drat":3.15,"wt":3.44,"qsec":17.02,"am":0,"carb":2,"gear":3,"vs":0},
    {"$row":"Valiant","(fitted)":7,"(response)":105,"hp":105,"cyl":6,"mpg":18.1,"disp":225,"drat":2.76,"wt":3.46,"qsec":20.22,"am":0,"carb":1,"gear":3,"vs":1},
    {"$row":"Duster 360","(fitted)":14,"(response)":245,"hp":245,"cyl":8,"mpg":14.3,"disp":360,"drat":3.21,"wt":3.57,"qsec":15.84,"am":0,"carb":4,"gear":3,"vs":0},
    {"$row":"Merc 240D","(fitted)":5,"(response)":62,"hp":62,"cyl":4,"mpg":24.4,"disp":146.7,"drat":3.69,"wt":3.19,"qsec":20,"am":0,"carb":2,"gear":4,"vs":1},
    {"$row":"Merc 230","(fitted)":5,"(response)":95,"hp":95,"cyl":4,"mpg":22.8,"disp":140.8,"drat":3.92,"wt":3.15,"qsec":22.9,"am":0,"carb":2,"gear":4,"vs":1},
    {"$row":"Merc 280","(fitted)":7,"(response)":123,"hp":123,"cyl":6,"mpg":19.2,"disp":167.6,"drat":3.92,"wt":3.44,"qsec":18.3,"am":0,"carb":4,"gear":4,"vs":1},
    {"$row":"Merc 280C","(fitted)":7,"(response)":123,"hp":123,"cyl":6,"mpg":17.8,"disp":167.6,"drat":3.92,"wt":3.44,"qsec":18.9,"am":0,"carb":4,"gear":4,"vs":1},
    {"$row":"Merc 450SE","(fitted)":11,"(response)":180,"hp":180,"cyl":8,"mpg":16.4,"disp":275.8,"drat":3.07,"wt":4.07,"qsec":17.4,"am":0,"carb":3,"gear":3,"vs":0},
    {"$row":"Merc 450SL","(fitted)":11,"(response)":180,"hp":180,"cyl":8,"mpg":17.3,"disp":275.8,"drat":3.07,"wt":3.73,"qsec":17.6,"am":0,"carb":3,"gear":3,"vs":0},
    {"$row":"Merc 450SLC","(fitted)":11,"(response)":180,"hp":180,"cyl":8,"mpg":15.2,"disp":275.8,"drat":3.07,"wt":3.78,"qsec":18,"am":0,"carb":3,"gear":3,"vs":0},
    {"$row":"Cadillac Fleetwood","(fitted)":12,"(response)":205,"hp":205,"cyl":8,"mpg":10.4,"disp":472,"drat":2.93,"wt":5.25,"qsec":17.98,"am":0,"carb":4,"gear":3,"vs":0},
    {"$row":"Lincoln Continental","(fitted)":12,"(response)":215,"hp":215,"cyl":8,"mpg":10.4,"disp":460,"drat":3,"wt":5.424,"qsec":17.82,"am":0,"carb":4,"gear":3,"vs":0},
    {"$row":"Chrysler Imperial","(fitted)":14,"(response)":230,"hp":230,"cyl":8,"mpg":14.7,"disp":440,"drat":3.23,"wt":5.345,"qsec":17.42,"am":0,"carb":4,"gear":3,"vs":0},
    {"$row":"Fiat 128","(fitted)":4,"(response)":66,"hp":66,"cyl":4,"mpg":32.4,"disp":78.7,"drat":4.08,"wt":2.2,"qsec":19.47,"am":1,"carb":1,"gear":4,"vs":1},
    {"$row":"Honda Civic","(fitted)":4,"(response)":52,"hp":52,"cyl":4,"mpg":30.4,"disp":75.7,"drat":4.93,"wt":1.615,"qsec":18.52,"am":1,"carb":2,"gear":4,"vs":1},
    {"$row":"Toyota Corolla","(fitted)":4,"(response)":65,"hp":65,"cyl":4,"mpg":33.9,"disp":71.1,"drat":4.22,"wt":1.835,"qsec":19.9,"am":1,"carb":1,"gear":4,"vs":1},
    {"$row":"Toyota Corona","(fitted)":5,"(response)":97,"hp":97,"cyl":4,"mpg":21.5,"disp":120.1,"drat":3.7,"wt":2.465,"qsec":20.01,"am":0,"carb":1,"gear":3,"vs":1},
    {"$row":"Dodge Challenger","(fitted)":11,"(response)":150,"hp":150,"cyl":8,"mpg":15.5,"disp":318,"drat":2.76,"wt":3.52,"qsec":16.87,"am":0,"carb":2,"gear":3,"vs":0},
    {"$row":"AMC Javelin","(fitted)":11,"(response)":150,"hp":150,"cyl":8,"mpg":15.2,"disp":304,"drat":3.15,"wt":3.435,"qsec":17.3,"am":0,"carb":2,"gear":3,"vs":0},
    {"$row":"Camaro Z28","(fitted)":14,"(response)":245,"hp":245,"cyl":8,"mpg":13.3,"disp":350,"drat":3.73,"wt":3.84,"qsec":15.41,"am":0,"carb":4,"gear":3,"vs":0},
    {"$row":"Pontiac Firebird","(fitted)":11,"(response)":175,"hp":175,"cyl":8,"mpg":19.2,"disp":400,"drat":3.08,"wt":3.845,"qsec":17.05,"am":0,"carb":2,"gear":3,"vs":0},
    {"$row":"Fiat X1-9","(fitted)":4,"(response)":66,"hp":66,"cyl":4,"mpg":27.3,"disp":79,"drat":4.08,"wt":1.935,"qsec":18.9,"am":1,"carb":1,"gear":4,"vs":1},
    {"$row":"Porsche 914-2","(fitted)":5,"(response)":91,"hp":91,"cyl":4,"mpg":26,"disp":120.3,"drat":4.43,"wt":2.14,"qsec":16.7,"am":1,"carb":2,"gear":5,"vs":0},
    {"$row":"Lotus Europa","(fitted)":5,"(response)":113,"hp":113,"cyl":4,"mpg":30.4,"disp":95.1,"drat":3.77,"wt":1.513,"qsec":16.9,"am":1,"carb":2,"gear":5,"vs":1},
    {"$row":"Ford Pantera L","(fitted)":14,"(response)":264,"hp":264,"cyl":8,"mpg":15.8,"disp":351,"drat":4.22,"wt":3.17,"qsec":14.5,"am":1,"carb":4,"gear":5,"vs":0},
    {"$row":"Ferrari Dino","(fitted)":8,"(response)":175,"hp":175,"cyl":6,"mpg":19.7,"disp":145,"drat":3.62,"wt":2.77,"qsec":15.5,"am":1,"carb":6,"gear":5,"vs":0},
    {"$row":"Maserati Bora","(fitted)":15,"(response)":335,"hp":335,"cyl":8,"mpg":15,"disp":301,"drat":3.54,"wt":3.57,"qsec":14.6,"am":1,"carb":8,"gear":5,"vs":0},
    {"$row":"Volvo 142E","(fitted)":7,"(response)":109,"hp":109,"cyl":4,"mpg":21.4,"disp":121,"drat":4.11,"wt":2.78,"qsec":18.6,"am":1,"carb":2,"gear":4,"vs":1}
  ];

  
  //sort our data by fitted or the assigned group
  data = data.sort(function(a,b){
    return d3.ascending(a["(fitted)"],b["(fitted)"])
  });
  

  var colorgen = d3.scale.category10();
  var colors = {};
  // assign one color per fitted (terminal) node
  data.forEach(function(d){
    colors[d["(fitted)"]] = colorgen(d["(fitted)"]);
  });
  var color = function(d) { return colors[d["(fitted)"]]; };
    

  var parcoords = d3.parcoords()("#par_container")
      .color(color)
      .alpha(0.4)
      .data(data)
      //.bundlingStrength(0.8) // set bundling strength
      //.smoothness(0.15)
      //.bundleDimension("rtn_rank")
      .showControlPoints(false)
      .margin({ top: 100, left: 150, bottom: 12, right: 20 })
      .render()
      .brushable()  // enable brushing
      .reorderable()
      .interactive();  // command line mode
      
  //remove rownames (first) label for axis
  d3.select(".dimension .axis > text").remove();

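  //highlight paths on hover of rownames / label
  //and color each row label by its fitted node
  d3.selectAll("#par_container > svg > g > g:nth-child(1) > g.axis > g > text")
    .on("mouseover", highlight )
    .on("mouseout", unhighlight )
    .style("fill",function(d){
      // d is the row name, so look up its row to find the fitted node color
      var row = data.filter(function(r){ return r["$row"] == d; })[0];
      return row ? color(row) : null;
    });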

  // d3 passes the bound datum (here the row name) as the first argument
  function highlight(d){
    parcoords.highlight(
      data.filter(function(row){
        return row["$row"] == d;
      })
    );
  }
  
  function unhighlight(d){
    parcoords.unhighlight(
      data.filter(function(row){
        return row["$row"] == d;
      })
    );
  }
  
  
  introJs().start();
  
  
  // add interactivity for the node information to query / brush the parcoords
  // we classed these as querynode
  // however ignore root by doing nth-child(n+2)
  d3.selectAll("#partykit_info > .querynode:nth-child(n+2) ")
    .style("cursor","pointer")
    .on("click",queryNode)
    .each(function(d){
      var that = d3.select(this);
      that.datum(getQuery( that ).split(/[<,>,=]/)[0].replace(/\s/g,""));
      return that;
    })
    
  function queryNode(){
  
    var node = d3.select(this)
    
    var queried = !node.classed("queried")
    
    // get the query
    var q = getQuery(node);
    
    if(queried){
      // to eliminate extra css do the bolding here
      // eventually though move to css style file
  
      drawBrush( q );

      node
          .style("font-size","125%")
          .style("font-weight","bold")        
          .classed("queried",queried)
      
    } else {
      // clear the query
      
      node
          .style("font-size","")
          .style("font-weight","")
          .classed("queried",queried)          
          
      clearBrush( q )
    }
  }
  
  // function to strip the query out of the text
  function getQuery( s ){

    // for now text will be the text contained in the span
    // we'll use some regex to strip out the query
    var q = s.text().replace(/\|/g,"").split(/[\],:]/)[1]
    
    return q;
  }
  
  function drawBrush( q ){
  
    // our variable will be before <,>,=
    var queryVar = q.split(/[<,>,=]/)[0].replace(/\s/g,"");
    
    // if brush already defined on this variable then remove it
    //  actually just remove the queried class and style
    //  new brush will supersede old brushed points
    // not ideal behavior but joint brushes will get very complex
    d3.selectAll("#partykit_info > .querynode:nth-child(n+2) ").filter(function(d){
      return d == queryVar
    }).style("font-size","")
      .style("font-weight","")
      .classed("queried",false)
      
    
    var queryBrush = parcoords.yscale[queryVar].brush
        .on("brushstart", function() {});

    // set the brush extent from the split value to one end of the axis:
    // for "<" brush from the bottom of the axis (domain min) up to the split value,
    // otherwise brush from the split value up to the top of the axis (domain max)
    var domain = parcoords.yscale[queryVar].domain();
    // the split value is the last token after the comparison operator
    var splitValue = +q.split(/[<,>=:]/).pop().replace(/\s/g,"");
    if(q.match(/</)){
      queryBrush.extent([ domain[0], splitValue ]);
    } else {
      queryBrush.extent([ splitValue, domain[1] ]);
    }
    

    // redraw the brush to match the new extent;
    // the transition slows the change so it is visible
    // (pass the plain selection without .transition() to redraw immediately)
    queryBrush(d3.selectAll(".brush").filter(function(b){return b == queryVar}).transition());

    // now fire the brushstart, brush, and brushend events
    queryBrush.event(d3.selectAll(".brush").filter(function(b){return b == queryVar}).transition());
  }
  
  function clearBrush( q ){
  
    // our variable will be before <,>,=
    var queryVar = q.split(/[<,>,=]/)[0].replace(/\s/g,"");
    
    var queryBrush = parcoords.yscale[queryVar].brush
        
    queryBrush.extent([parcoords.yscale[queryVar].domain()[1],parcoords.yscale[queryVar].domain()[1]])
    
    // redraw the brush to match the (now empty) extent;
    // the transition slows the change so it is visible
    // (pass the plain selection without .transition() to redraw immediately)
    queryBrush(d3.selectAll(".brush").filter(function(b){return b == queryVar}).transition());

    // now fire the brushstart, brush, and brushend events
    queryBrush.event(d3.selectAll(".brush").filter(function(b){return b == queryVar}).transition());
    
  }</script>
</div>
</body>
</html>

code.R

library(htmltools)
library(partykit)
library(rpart)
library(pipeR)
library(rlist)
library(whisker)


# the key step is deciding how to shape the data
# for the parallel coordinates
# expects a partykit object (pk) and the original data (data)
rpart_Parcoords <- function( pk = NULL, data = NULL  ){
  # transform the data into the shape we want
  # before sending it to JSON and plugging it into our template
  # since this will be parallel coordinates,
  # the data should be in records form (jsonlite default)
  # 1) combine the fitted data from partykit with the original data,
  #    which lets us see which group each row belongs to
  #    we'll try to be smart and sort column order by the order of splits
  colorder = rapply(pk,unclass,how="unlist") %>>%  #unclass and unlist the partykit
    list.match("varid") %>>%                        #get all varids
    unique %>>% unlist  %>>%                        #squash into a vector
    ( names(attr(pk$terms,"dataClasses"))[c(1,.)] ) %>>%  #match them with column names
    ( unique( c(.,colnames(data) ) ) )              #get other column names from data
  # bind the fitted values to the reordered columns and convert to JSON
  data = jsonlite::toJSON( cbind( pk$fitted, data[colorder] ) )
  
  t <- tagList(
    tags$div( style = "width:100%;"
      # try to get the split / node information to interact with the parcoords brush
      ,tags$pre( id = "partykit_info", style = "width:100%;"
         # add intro.js so people know nodes are clickable
        ,'data-step' = "1", 'data-intro' = "click on node info to query the chart below"
        ,capture.output( pk %>>% print ) %>>%
          (
            gsub(
              x = .
              , pattern = "(.*)(\\[)([0-9]*)(\\])(.*)"
              , replacement = "<span class = 'querynode'>\\1\\2\\3\\4\\5</span>"
            )
          ) %>>%
          paste0(collapse="\n") %>>% HTML
      )

      ,tags$div( id = "par_container", class = "parcoords", style = "height:400px;width:100%;" )
      
      ,tags$script(
        whisker.render( readLines("./layouts/chart_parcoords.html") ) %>>% HTML
        #whisker.render( readLines("http://timelyportfolio.github.io/rCharts_rpart/layouts/chart_parcoords.html") ) %>>% HTML
      )
    )
    
  ) %>>%
  attachDependencies(list(
    htmlDependency(
      name="d3"
      ,version="3.0"
      ,src=c("href"="http://d3js.org/")
      ,script="d3.v3.js"
    )
    ,htmlDependency(
      name="pc"
      ,version="0.4.0"
      ,src=c("href"="http://syntagmatic.github.com/parallel-coordinates/")
      ,script="d3.parcoords.js"
      ,stylesheet="d3.parcoords.css"
    )
    ,htmlDependency(
      name="intro"
      ,version="0.5.0"
      ,src=c("href"="http://cdnjs.cloudflare.com/ajax/libs/intro.js/0.5.0/")
      ,script="intro.min.js"
      ,stylesheet="introjs.css"
    )
  ))
  
  return(t)
}

#set up a little rpart as an example
rp <- rpart(
  hp ~ cyl + disp + mpg + drat + wt + qsec + vs + am + gear + carb,
  method = "anova",
  data = mtcars,
  control = rpart.control(minsplit = 4)
)

str(rp)

rpk <- as.party(rp)

#now make it a parallel coordinates
#with our rpart_Parcoords function

rpart_Parcoords( rpk, mtcars ) %>>% html_print() -> fpath
#rCharts:::publish_.gist(fpath,description="R + d3.js Parallel Coordinates of partykit ver 2 with interactive querying",id=NULL)

d3.parcoords.css

.parcoords > svg, .parcoords > canvas { 
  font: 14px sans-serif;
  position: absolute;
}
.parcoords > canvas {
  pointer-events: none;
}

.parcoords text.label {
  cursor: default;
}

.parcoords rect.background {
  fill: transparent;
}
.parcoords rect.background:hover {
  fill: rgba(120,120,120,0.2);
}
.parcoords .resize rect {
  fill: rgba(0,0,0,0.1);
}
.parcoords rect.extent {
  fill: rgba(255,255,255,0.25);
  stroke: rgba(0,0,0,0.6);
}
.parcoords .axis line, .parcoords .axis path {
  fill: none;
  stroke: #222;
  shape-rendering: crispEdges;
}
.parcoords canvas {
  opacity: 1;
  -moz-transition: opacity 0.3s;
  -webkit-transition: opacity 0.3s;
  -o-transition: opacity 0.3s;
  transition: opacity 0.3s;
}
.parcoords canvas.faded {
  opacity: 0.25;
}
.parcoords {
	-webkit-touch-callout: none;
	-webkit-user-select: none;
	-khtml-user-select: none;
	-moz-user-select: none;
	-ms-user-select: none;
	user-select: none;
}