Recreational Data Visualization

As is probably painfully obvious to anyone who has met me, I have a lot of “Rubik’s” cubes (hardly any of my ‘speedcubes’ and twisty puzzles are actually Rubik’s brand). In my spare time, I practice solving a basic 3x3x3 speedcube like this one. My general goal is to get faster. My mean solve time these days is around 26 or 27 seconds. Given that the current world record is 4.59s and the world record average* is 5.80s, I still have plenty of room for improvement.

In an attempt to track my progress, I have been keeping a spreadsheet of all of my solves and the date of each one. The spreadsheet has some very basic charts tracking the overall distribution of times and the trend over time. I thought I’d mention this here because, although it is not social science in any way, shape, or form, I may still play around with this data as I learn data visualization tools. I’m always happy for feedback on my personal projects, so if you have any interesting ideas for analyses or visualizations of this data, I’d love to hear them.


*In speedcubing competitions, a competitor’s “average” is a trimmed mean: the fastest and slowest of 5 solves are ignored, and the middle 3 are averaged.
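
To make the footnote concrete, here is a minimal Python sketch of that trimmed mean. The function name and the sample times are made up for illustration, and the sketch assumes all five solves have valid times (it does not handle penalties or unfinished solves).

```python
def average_of_five(times):
    """Trimmed mean of exactly five solve times, in seconds.

    Drops the single fastest and single slowest solve, then takes the
    plain mean of the remaining three.
    """
    if len(times) != 5:
        raise ValueError("expected exactly 5 solve times")
    middle = sorted(times)[1:-1]  # discard the fastest and the slowest
    return sum(middle) / len(middle)


# Hypothetical solve times, not taken from my spreadsheet.
solves = [27.1, 25.4, 31.0, 26.2, 24.9]
print(round(average_of_five(solves), 2))  # 26.23
```

Note how the 31.0 outlier has no effect on the result, which is the point of trimming.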

Visualizations and Data Scraping Update

My data is coming along well, albeit a bit slowly. I have finished what I can do in Excel to organize and parse out the data from the NOT A FLUKE site itself. The finished output (which I have yet to load into my coding tool) can be found here in the left half of the ‘processed 2’ sheet. The one remaining step is to incorporate the Carnegie Classification Listings so that I have geographic data as well as private/public, profit/nonprofit, and general school classification. It is unlikely that I will be able to incorporate variables that require hand-coding (specifics of misconduct, faculty relationship to victim, victim characteristics) this semester.

Now, what do I want to do with the data I have? How can it be visualized? Here are some research questions that I would like to explore through data visualizations:


How does faculty outcome vary geographically?

How does faculty outcome vary by faculty position in the university (president, dean, department chair, professor, etc.)?

How does faculty position vary by public/private or by profit/nonprofit (or both)? – This could be getting at overall transparency and accountability of different kinds of institutions.

How does faculty outcome vary by year? Is there a change after the Obama-era Title IX reform?


For faculty outcome, it would be easiest to code it into a binary variable “retention” denoting whether they stayed with the university or not. For my full thesis, I intend to do a somewhat more complicated analysis of faculty outcomes.
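
As a rough sketch of what that binary coding could look like in Python: the outcome labels below are placeholders I made up for illustration, not the actual categories in the dataset, and anything unrecognized is left uncoded rather than guessed.

```python
# Hypothetical outcome labels; the real categories in the data may differ.
LEFT = {"fired", "resigned", "retired"}
STAYED = {"suspended", "reprimanded", "no action"}


def code_retention(outcome):
    """Return 1 if the person stayed, 0 if they left, None if unclear."""
    if outcome is None:
        return None
    label = outcome.strip().lower()
    if label in LEFT:
        return 0
    if label in STAYED:
        return 1
    return None  # unknown or blank: leave for hand review


# Made-up example values, only to show the mapping.
outcomes = ["Fired", "suspended", "", "Resigned"]
print([code_retention(o) for o in outcomes])  # [0, 1, None, 0]
```

Returning None for unrecognized labels keeps ambiguous cases out of the retention counts until they can be checked by hand.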


These are some of my thoughts at the moment. I will have an update on the data and tests with visualizing it later this week.