GitHub Data

Previously we discussed how data from Facebook, Twitter, web scraping, Wikipedia, and LinkedIn is used for social research. This week’s readings and videos gave us a closer look at how GitHub is transforming the open source programming world. GitHub is a web-based hosting platform for Git repositories. In their paper Connecting Theory to Social Technology Platforms: A Framework for Measuring Influence in Context, Goggins and Petakovic describe GitHub as a “social site based on distributed software configuration management in Git”. They further explain that GitHub has its roots in socially connected computing, where people collaborate to build professional credibility and to contribute to projects they rely on for their work. On GitHub, people participate through issue discussions, code contributions, comments on pull requests, and documentation.

In his TED Talk, Clay Shirky discusses the optimistic implications of open-source programming, and of GitHub in particular, for democracy. He claims that GitHub isn’t just any collaborative platform: it creates new social norms through its unique collaborative nature and core operation. Shirky describes GitHub’s benefit as “co-operation without co-ordination”, and that is what makes GitHub stand out from traditional version control systems. He claims that this technology will one day transform how government works.

In 2012, in a blog post titled “GitLaw: GitHub for Laws and Legal Documents – a Tourniquet for American Liberty”, a blogger named Abe Voelker discussed how a platform like GitHub could be used for laws and legal documents. He writes, “Imagine a public system like GitHub but instead of source code being tracked, legal documents such as bills/laws are tracked (and just like GitHub, versioned in git). Imagine if, before any bill is introduced to Congress, its contents were posted on this publicly available medium with adequate time before a vote?” That is indeed an interesting proposal. I find it hard to imagine a world where citizens write legislation and bills, or contribute changes to existing laws, through a platform like GitHub. How would we ever reach an agreement? Voelker also describes his version of a legal GitHub, ‘GitLaw’, in more detail. He notes that with such a system in place, we would be able to track every line in a bill and who wrote it, since GitHub assigns a unique identifying number to each user. It may be hard to visualize, but what if we did have such a system? What if there were a platform like GitHub where all citizens could contribute, co-operate, and collaborate on new legislation and bills?

In 2014, the study Exploring the Patterns of Social Behavior in GitHub examined how developers behave socially on the platform. The researchers compared the growth curves of projects and users on GitHub with those of three traditional open source software communities to explore differences in their growth modes. They found rapid growth in GitHub users and, to describe this phenomenon, drew on the Diffusion of Innovation Theory, which explores how, why, and at what rate new ideas and technology spread. The researchers also constructed follow-networks based on how developers follow one another on GitHub. Their findings illustrate four social behavior patterns among developers who created GitHub accounts during the rapid growth period. The patterns are listed below:


Source: Exploring the Patterns of Social Behavior in GitHub

Independence Pattern: The independence pattern describes a developer who uses GitHub in the traditional way and links up only with acquaintances. They simply host their code or watch an interesting project but rarely contribute to it.

Group Pattern: The group pattern is often formed by a group of developers who collaborate with each other on the same project.

Star Pattern: The star pattern describes a well-known developer (or team) who is followed by a large number of users but almost never pays attention to others.

Hub Pattern: The hub pattern describes a user who follows many unrelated developers but is almost never followed by others.
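
To make the four patterns more concrete, here is a minimal Python sketch that labels users in a small, made-up follow-network. It is not the method used in the study: the function name `classify_follow_patterns`, the `many` threshold, the rule that two mutual follows count as a group, and the sample usernames are all assumptions chosen for illustration. It also treats "almost never" as "never" to keep the rules simple.

```python
from collections import defaultdict


def classify_follow_patterns(edges, many=3):
    """Label each user with one of the four patterns.

    edges: iterable of (follower, followee) pairs.
    many:  hypothetical cutoff for "a large number" of follows.
    """
    following = defaultdict(set)  # user -> accounts the user follows
    followers = defaultdict(set)  # user -> accounts that follow the user
    for follower, followee in edges:
        following[follower].add(followee)
        followers[followee].add(follower)

    labels = {}
    for user in set(following) | set(followers):
        n_out = len(following[user])
        n_in = len(followers[user])
        # Mutual follows: people the user follows who follow back.
        n_mutual = len(following[user] & followers[user])

        if n_in >= many and n_out == 0:
            labels[user] = "star"          # widely followed, follows no one
        elif n_out >= many and n_in == 0:
            labels[user] = "hub"           # follows many, followed by no one
        elif n_mutual >= 2:
            labels[user] = "group"         # assumed rule: 2+ mutual follows
        else:
            labels[user] = "independence"  # few links, mostly acquaintances
    return labels


# Made-up follow-network for illustration only.
sample_edges = [
    ("fan1", "celebrity"), ("fan2", "celebrity"), ("fan3", "celebrity"),
    ("collector", "dev1"), ("collector", "dev2"), ("collector", "dev3"),
    ("ana", "ben"), ("ana", "cho"), ("ben", "ana"),
    ("ben", "cho"), ("cho", "ana"), ("cho", "ben"),
    ("loner", "friend"), ("friend", "loner"),
]
labels = classify_follow_patterns(sample_edges)
```

With this sample data, `celebrity` comes out as a star, `collector` as a hub, the three teammates as a group, and `loner` falls under the independence pattern. On real GitHub data the cutoffs would need to be chosen much more carefully.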

Being new to GitHub, I am impressed by its collaborative capabilities, and I am still learning how I can use it for social science research. As Russell says, “think of GitHub as an enabler of open source software”. I am looking forward to getting more familiar with social coding, social web mining, graphical modeling, and interest graphs.

