Data mapping and iterating in D3: The 3 week Stacked Bar Graph Experiment

One phrase that comes up in most of the Stack Overflow threads I’ve seen (which is a lot; I think Stack Overflow has become my most visited website since starting this course) is “the d3 way of doing things”.

The name of the game in D3.js is array manipulation. Most, if not all, d3 functions follow a similar format: tell d3 what HTML element to look at, tell d3 to use some array, then iterate over that array and create a new HTML element for each entry.

This is the double-edged sword of using d3. On one hand, it eliminates most of the need to write iterative JavaScript functions to use your dataset. A typical iterative function may look something like

function getValues(d) {
    var values = [];
    for (var i = 0; i < d.length; i++) {
        values.push(d[i].somevalue); // Collect the value from each row
    }
    return values;
}

Whereas in d3, the heavy lifting is done within the library, so telling d3 to iterate over some data is as simple as calling .data(yourData) on a selection. This also means that creating groups of elements with the same or similar properties is incredibly simple. For example, for a simple bar chart, setting each bar’s height and width is as simple as

myGraph.selectAll("rect") // Tell d3 to look at all "rect" elements
    .data(myData) // Tell d3 what dataset to iterate over
    .enter().append("rect") // Add a rectangle for each row that doesn't have one yet
    .attr("height", function(d) { return d.y; })
    .attr("width", function(d) { return d.x; });

Here d3 does the work of iterating over the array and generating a bar for each row of your data.
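To make the "one element per row" idea concrete, here is a rough plain-JavaScript sketch of what that chain does conceptually, with no DOM and no d3 involved. The sample data and the describeBars helper are my own inventions for illustration; this is not how d3 is implemented internally.

```javascript
// Hypothetical sample data; the field names x and y mirror the snippet above.
var myData = [
    { x: 20, y: 12 },
    { x: 20, y: 20 },
    { x: 20, y: 3 }
];

// Stand-in for .data(myData).enter().append("rect"):
// build one plain object per row instead of one <rect> element.
function describeBars(data) {
    return data.map(function (d, i) {
        return {
            tag: "rect",
            index: i,     // d3 passes this index as the second callback argument
            height: d.y,  // same accessor as .attr("height", ...)
            width: d.x    // same accessor as .attr("width", ...)
        };
    });
}

var bars = describeBars(myData);
console.log(bars.length); // one bar description per row of data
```

The point is just that the accessor functions run once per row, which is the loop d3 saves you from writing.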

Where this gets hairy, however, is that d3 itself likes creating arrays. And it likes creating arrays from your arrays. And then arrays of those arrays from your array.

Enter my tumultuous relationship with the d3.stack function. d3.stack is a function that slices an input array based on keys that the user provides and assigns y values so that values can be stacked and compared vertically across comparison groups. So, for example, if your data has a categorical variable that you would like to use for analysis, you could instruct d3 to create a subarray of values for each other variable of interest, separated by those categories. So something like

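var ageBands = ["Control", "Experiment"];
var stack = d3.stack() // Create the stack generator
    .keys(ageBands); // One output array per key
var series = stack(myData); // Apply the generator to the data

would return two arrays (stored here in series), one for the control group and one for the experimental group, with subarrays for each other variable in the data.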

d3.stack underwent some interesting changes in the migration from version 3 to version 4. The version 3 stack function (d3.layout.stack) would output an individual object for each other variable, with y0 and y values representing the baseline and height of each bar in the “stack”.

0: [
    {x: control, y0: 0, y: 12}
    {x: control, y0: 0, y: 20}
    {x: control, y0: 0, y: 3}
]
1: [
    {x: experiment, y0:12, y:

In version 4, however, new variables are not defined. Instead, the start and end values of each bar are represented by further subarrays of two values each, like this:

0: [
    [0, 12]
    [0, 20]
    [0, 3]
    key: control
]
1: [
    [12, 24]
    [20, 40]
    [3, 6]
    key: experiment
]
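If it helps to see where those pairs come from, here is a hand-rolled sketch that reproduces the structure above in plain JavaScript. To be clear, simpleStack is my own stand-in and not d3’s implementation, and the input rows are an assumption: I worked backwards from the numbers in the listing.

```javascript
// Hypothetical rows: one object per variable of interest, one field per group.
// The numbers are chosen to reproduce the listing above.
var myData = [
    { Control: 12, Experiment: 12 },
    { Control: 20, Experiment: 20 },
    { Control: 3, Experiment: 3 }
];
var keys = ["Control", "Experiment"];

// Hand-rolled stand-in for a stack generator (not d3's actual code).
function simpleStack(data, keys) {
    // Running total per row: where the next segment in that row starts.
    var baselines = data.map(function () { return 0; });

    return keys.map(function (key, seriesIndex) {
        var series = data.map(function (row, i) {
            var y0 = baselines[i];
            var y1 = y0 + row[key];
            baselines[i] = y1;   // the next key stacks on top of this one
            var point = [y0, y1];
            point.data = row;    // keep a reference to the original row
            return point;
        });
        series.key = key;
        series.index = seriesIndex;
        return series;
    });
}

var series = simpleStack(myData, keys);
// JSON.stringify only shows the indexed entries, not key/index/data.
console.log(JSON.stringify(series));
// [[[0,12],[0,20],[0,3]],[[12,24],[20,40],[3,6]]]
```

Each key gets its own array, and every entry in that array is a start/end pair sitting on top of whatever the previous keys contributed to the same row.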

Additionally, each two-value subarray carries a data property that contains the original data for that row. The nested array certainly makes for a more parsimonious and readable solution; however, calling this information can become more problematic. Instead of accessing some value in a given array directly, the array has to be accessed inside a function, and then entries pulled from the subarrays, using code such as

.attr("height", function(d){ return d[0][0] - d[0][1]})

The nesting ultimately saves a lot of processing time, but takes a little getting used to, especially when the majority of resources on the internet, including d3’s API, cite examples using the old version 3 stack format.
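For what it’s worth, here is a small sketch of how one start/end pair can be turned into a bar position and height. It assumes a setup where each bar’s callback receives a single [start, end] pair, and it uses a hand-written linear scale (makeLinearScale is hypothetical, standing in for something like d3.scaleLinear) with a made-up 300 pixel chart height.

```javascript
// Hypothetical linear scale: maps [0, domainMax] onto [rangeBottom, rangeTop].
function makeLinearScale(domainMax, rangeBottom, rangeTop) {
    return function (value) {
        return rangeBottom + value * (rangeTop - rangeBottom) / domainMax;
    };
}

// SVG y grows downward, so 0 maps to the bottom (300) and 40 to the top (0).
var yScale = makeLinearScale(40, 300, 0);

// Given one [start, end] pair, compute the rect's top edge and height.
function barGeometry(point, scale) {
    return {
        y: scale(point[1]),                        // top edge of the bar
        height: scale(point[0]) - scale(point[1])  // positive pixel height
    };
}

var geometry = barGeometry([12, 24], yScale);
console.log(geometry.y, geometry.height); // 120 90
```

Because the pixel scale is flipped, subtracting the scaled end from the scaled start is what keeps the height positive.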

So after three weeks of tinkering, this is what I’ve ended up with. No bells and whistles (yet), but everything displays where it should! In the JavaScript you can see the calls for the bar structures, as well as some broken tooltip code (which I am hoping to address today). The data comes from this study conducted by Faunalytics investigating the motivations, dietary habits, and recidivism among current and former vegans and vegetarians. Mean values were calculated using descriptive statistics functions in SPSS 24 and then written into the array manually.

New Goals for the Stacked Bar Chart Project

Tooltips are obviously the biggest missing piece here. The stacks themselves mean very little without the actual percentages. Another option would be to append text in the center of each bar with the percentage displayed; however, I also don’t want too much visual information cluttering the visualization.
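As a starting point for the tooltip text itself, something like the following might work. This is only a sketch: tooltipText is a hypothetical helper, and it assumes each bar is described by a [start, end] pair whose difference is already a percentage.

```javascript
// Build the tooltip text for one stacked segment.
// point is a [start, end] pair; seriesKey is the group name.
// Both inputs are assumptions about how the chart data is shaped.
function tooltipText(point, seriesKey) {
    var value = point[1] - point[0]; // size of this segment alone
    return seriesKey + ": " + value.toFixed(1) + "%";
}

console.log(tooltipText([12, 24], "Experiment")); // Experiment: 12.0%
```

The subtraction matters because the end value on its own includes everything stacked underneath the segment.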

Axis labels are another concern, in the same vein. As of right now, the x-axis labels are pulled from the variable names in the original dataset and give very little contextual meaning about what those variables are supposed to represent. This is something I also considered addressing using tooltips, because of the limited space to write full descriptions along the axis.
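One low-effort way to get friendlier labels would be a simple lookup table from variable names to descriptions. The variable names and descriptions below are invented placeholders, not the actual columns from the dataset, and labelFor is a hypothetical helper.

```javascript
// Hypothetical mapping from raw variable names to readable labels.
// These names are placeholders, NOT the real variables from the study.
var axisLabels = {
    var_a: "Readable label for variable A",
    var_b: "Readable label for variable B"
};

// Fall back to the raw variable name when no label has been written yet.
function labelFor(variableName) {
    if (Object.prototype.hasOwnProperty.call(axisLabels, variableName)) {
        return axisLabels[variableName];
    }
    return variableName;
}

console.log(labelFor("var_a")); // Readable label for variable A
console.log(labelFor("var_z")); // var_z
```

A function like this could presumably be handed to an axis’s tickFormat or reused for the tooltip text, so the descriptions only have to be written once.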

There are other quality-of-life updates I would like to make to this code; however, these are the big ones for right now.

 

Austin Round

 

2 thoughts on “Data mapping and iterating in D3: The 3 week Stacked Bar Graph Experiment”

    1. Thanks Tom! And thanks for taking a crack at the tooltips. I got something functional on my choropleth project; it looks like you and I took a similar approach, too.

      I’m going to revisit this bar chart project this weekend and see if I can make the legend/axis labeling more intuitive (rather than just variable names) and maybe add some animation. I’ll keep you posted.
