Scale-free networks

A scale-free network has a degree distribution that follows a power law, which produces a predictable imbalance. The 80/20 Rule is an example of a power law distribution: 20% of the population holds 80% of the wealth (except it doesn’t seem to be these numbers anymore).

In a scale-free social network, a few nodes have a high number of connections and most have a low number of connections. The few highly connected nodes become “hubs” in the network (as seen in the degree distribution in the graph on the right). In a random social network, most nodes have close to the average number of connections, so the degree distribution is bell-shaped and centered on that average (as seen in the graph on the left).
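To make the contrast concrete, here is a minimal Python sketch (not taken from any of the sources in this post) that builds both kinds of network and compares their degree distributions. The function names (`random_network`, `preferential_network`, `top_share`), the network size, and the seed are all illustrative assumptions. The growth rule is the standard preferential-attachment idea, where newcomers favor nodes that are already well connected, used here as one plausible way a scale-free network might arise.

```python
import random
from collections import Counter


def random_network(num_nodes, num_edges, rng):
    """Place edges between uniformly random pairs of nodes; return each node's degree."""
    degree = Counter({node: 0 for node in range(num_nodes)})
    edges = set()
    while len(edges) < num_edges:
        a, b = rng.sample(range(num_nodes), 2)
        edge = (min(a, b), max(a, b))
        if edge not in edges:
            edges.add(edge)
            degree[a] += 1
            degree[b] += 1
    return degree


def preferential_network(num_nodes, links_per_node, rng):
    """Grow a network in which newcomers prefer nodes that already have many links."""
    core = links_per_node + 1
    degree = Counter()
    endpoints = []  # each node appears once per link it has, so picks are degree-weighted
    for a in range(core):  # start from a small, fully connected core
        for b in range(a + 1, core):
            degree[a] += 1
            degree[b] += 1
            endpoints += [a, b]
    for new_node in range(core, num_nodes):
        targets = set()
        while len(targets) < links_per_node:
            targets.add(rng.choice(endpoints))
        for target in targets:
            degree[new_node] += 1
            degree[target] += 1
            endpoints += [new_node, target]
    return degree


def top_share(degree, fraction=0.2):
    """Share of all link endpoints held by the best-connected `fraction` of nodes."""
    ranked = sorted(degree.values(), reverse=True)
    top = ranked[: max(1, int(len(ranked) * fraction))]
    return sum(top) / sum(ranked)


rng = random.Random(42)  # fixed seed (an arbitrary choice) so the run is repeatable
scale_free = preferential_network(2000, 2, rng)
total_edges = sum(scale_free.values()) // 2
rand = random_network(2000, total_edges, rng)  # same number of nodes and edges

for name, degree in (("random", rand), ("scale-free", scale_free)):
    print(name, "max degree:", max(degree.values()),
          "top 20% share:", round(top_share(degree), 2))
```

Both networks have the same number of nodes and links, so any difference in the printout comes from how the links are distributed. The grown network should show a few hubs with far more links than any node in the random one.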

Internet evangelists, such as Clay Shirky, have heralded the potential for every consumer to also act as a producer of online content, resulting in a newfound democratic medium. However, power law distributions in online networks present a significant obstacle to this democratic ideal. Shirky would argue that power laws affect large networks, such as the most popular news sites in the public sphere, but may not affect small networks, such as family and friends reading your blog. If you’re hoping to follow Arianna Huffington’s example with your blog, the likelihood of becoming the next Huffington Post is slim because the existing top news blogs have the significant advantage of widespread recognition and reputation. (For example, when you read the previous sentence, you had probably already heard of the Huffington Post.)

Next, think about how many blogs we might read on a regular basis. This number is finite, although the exact number might be higher or lower depending on our responsibilities.  Meanwhile, the number of blogs has grown dramatically. For example, Tumblr reported 357.7 million blogs in July 2017.

Prior posts have discussed the conscious and unconscious influence of our networks on our choices. Therefore, we’re more likely to read blogs that our friends read. Multiply this line of reasoning by millions of individuals and millions of existing blogs. The result is the power law distribution in blogs. A few blogs, such as the Huffington Post and others in the chart below, end up with more power than others because of their high readership.
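The “multiply this line of reasoning” step can be sketched as a small simulation. This is a toy model, not a description of real blog traffic. Each new reader either copies the choice of a randomly picked earlier reader (a stand-in for a friend’s influence) or picks a blog at random. The function names `simulate_readership` and `top_blogs_share`, the `copy_prob` parameter, and all of the numbers are assumptions made for the example.

```python
import random
from collections import Counter


def simulate_readership(num_blogs, num_readers, copy_prob, seed=0):
    """Return how many readers each blog ends up with."""
    rng = random.Random(seed)
    choices = []  # the blog picked by each reader so far
    for _ in range(num_readers):
        if choices and rng.random() < copy_prob:
            blog = rng.choice(choices)  # read what an earlier reader ("friend") reads
        else:
            blog = rng.randrange(num_blogs)  # discover a blog independently
        choices.append(blog)
    return Counter(choices)


def top_blogs_share(readership, how_many=10):
    """Fraction of all readers captured by the `how_many` most-read blogs."""
    top = sum(count for _, count in readership.most_common(how_many))
    return top / sum(readership.values())


independent = simulate_readership(1000, 20000, copy_prob=0.0)
influenced = simulate_readership(1000, 20000, copy_prob=0.9)

print("top 10 share, no friend influence:", round(top_blogs_share(independent), 3))
print("top 10 share, strong friend influence:", round(top_blogs_share(influenced), 3))
```

When readers choose independently, the ten most-read blogs out of a thousand hold only a small slice of the audience. When most readers follow someone else’s choice, early favorites keep attracting new readers and the top ten should capture a much larger share.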

So the Internet reflects inequalities in social media networks similar to those in offline networks. Melvin Kranzberg’s first law of technology is a good reminder here: “Technology is neither good nor bad; nor is it neutral.” The technology (the power of networks, in this case) does not have an inherent ethical value; it is the users of the technology who determine the ethical tone of the application, which can certainly be labeled as good or bad.

Social network analysis for family

While social network analysis has been useful in examining complex relationships between large institutions (such as universities) or ideas (such as predicting disruptive technologies), this week’s discussion focuses on more personal matters. In Connected, Christakis argued that our social networks influence outcomes such as happiness and weight. Lois (2016) identified four types of egocentric networks through cluster analysis to predict when couples become parents. Prior research points to four mechanisms through which social networks influence fertility: 1) social pressure, 2) social support, 3) emotional contagion, and 4) social learning. His research question is whether the four network types identified through cluster analysis have predictive validity.

Lois examined six years of data (in waves) from the German Family Panel Study dataset, using a sample spanning three birth cohorts that was narrowed down to 3,104 respondents aged 20 to 42 who initially had no children. Out of this sample, 332 respondents (the egocentric nodes in the networks) became pregnant or had children. Each of these individuals answered questions that generated the names of people in their social network (the other nodes) and then described their relationships with them (forming the edges in the network).

He found four types of clusters within the network: family-remote (small, homogeneous friend network), polarized (large, heterogeneous friends and family network), disintegrated (small network without many connections), and family-centered (strong connections with family). The results show the most significant differences in social mechanisms (the four mechanisms mentioned earlier) between the family-remote and family-centered types. However, the author notes a significant limitation: the effect (having children) may be caused by factors other than networks, such as age. Social network analysis helped reinforce the idea that our network can influence life choices, particularly in the area of starting a family.

I would also argue that this study applies mostly to Germans. For example, the following chart shows the percentages of household types:

In comparison:

The differences may be statistically significant and could produce different kinds of clusters in the networks, which would likely affect an analysis that uses data from another country.

In contrast, I chose to look at a second article on the other end of the spectrum. Burholt and Dobbs (2014) studied the support networks for elderly individuals in multigenerational/extended households to identify network typologies. The authors believed that the presence of other family members might skew researchers’ ability to use network typologies to estimate levels of wellbeing in elderly individuals, so their primary research question was whether these multigenerational/extended households resulted in different types of networks. Their secondary question asked whether these typologies could predict outcomes (wellbeing or loneliness/isolation).

The authors collected a total sample of 590 older individuals (half male/half female) from the Families and Migration: Older People from South Asia project. This project had collected data using eight questions to support classification by the Wenger Support Network Typology: local family-dependent network, locally integrated network (local family and community involvement), local self-contained network (household centered), wider community-focused network (no local family and more community involvement), and private restricted network (no local family and few other connections). The elderly individual’s family and friends became the nodes with the various supportive relationships (from selected questions about chatting or helping with laundry) as the links.

Using cluster analysis, the authors selected a four-cluster solution as the clearest and most interpretable: 28% multigenerational households: older integrated networks, 27% multigenerational households: younger family networks (largest type of household and more family-focused), ~27% family and friends integrated networks, and 18% restricted non-kin networks. When the authors compared the Wenger types with their new types, they found that significantly more individuals fell within their “restricted non-kin networks” (18%) than within the private restricted network (4%). As a result, they concluded that their new typologies better identified individuals who might receive formal services because they lack other forms of support. Social network analysis helped identify a larger vulnerable population than expected because those in restricted non-kin networks might decline family assistance and be more willing to pay for outside services.



Technologists look for “disruptive technologies,” or innovations that cause a dramatic paradigm shift. For example, smartphones have largely replaced “traditional” cell phones (chart below is older and stops in 2011). Smartphones can be considered “disruptive” because they enabled widespread mobile internet use, social media, and video calls (see second chart below), which were difficult to use or nonexistent on traditional cell phones.

A. Momeni and K. Rost (2015) examined trends to predict potential disruptive technologies in the photovoltaic industry. The research question: can patent-development paths, k-core analysis, and topic modeling be used to better predict which technologies might become the next disruptive technologies? Previous methodologies had significant limitations for forecasting technological change.

The authors collected patent data from the European Patent Office (EPO) Worldwide Patent Statistical Database for the years 1978 to 2012. They used a keyword search for “photovoltaic” and “solar cell,” then cleaned up the data by selecting all hits from a certain patent classification. They also collected the extended patent family to consider all possible innovations and their citations. The final dataset had 9,328 patents.

From this data, the authors constructed a network of patents (nodes) and citations (edges). They selected the largest connected subnetwork (5,029 nodes) and then traced a patent development path based on the citation directionality, which resulted in 735 highly cited patents. Next, the authors performed k-core analysis to identify three subnetworks in the remaining nodes that corresponded with three different technological developments: thin-film, organic, and crystalline silicon (see below). They also analyzed the networks over subsets of years and found trends in the convergence of technologies.
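Two of the steps in that pipeline, selecting the largest connected subnetwork and running a k-core analysis, can be illustrated with a short Python sketch. This is not the authors’ code. The patent labels and citation pairs are made up, the function names are my own, and treating citations as undirected links for the k-core step is an assumption for simplicity. The sketch also leaves out the development-path tracing and topic modeling.

```python
from collections import defaultdict, deque


def build_undirected(citations):
    """Map each patent to the set of patents it cites or is cited by."""
    neighbors = defaultdict(set)
    for citing, cited in citations:
        neighbors[citing].add(cited)
        neighbors[cited].add(citing)
    return neighbors


def largest_component(neighbors):
    """Return the largest set of patents connected to one another by citations."""
    seen, best = set(), set()
    for start in neighbors:
        if start in seen:
            continue
        component, queue = {start}, deque([start])
        while queue:  # breadth-first search from the starting patent
            node = queue.popleft()
            for nxt in neighbors[node]:
                if nxt not in component:
                    component.add(nxt)
                    queue.append(nxt)
        seen |= component
        if len(component) > len(best):
            best = component
    return best


def k_core(neighbors, nodes, k):
    """Repeatedly drop nodes with fewer than k neighbors left in the set."""
    core = set(nodes)
    changed = True
    while changed:
        changed = False
        for node in list(core):
            if len(neighbors[node] & core) < k:
                core.discard(node)
                changed = True
    return core


# Hypothetical (citing, cited) pairs; not data from the paper.
citations = [
    ("P2", "P1"), ("P3", "P1"), ("P3", "P2"),
    ("P4", "P1"), ("P4", "P2"), ("P4", "P3"),
    ("P5", "P4"), ("P6", "P5"),
    ("P8", "P7"),  # a separate, smaller cluster
]

neighbors = build_undirected(citations)
component = largest_component(neighbors)
core = k_core(neighbors, component, k=3)

print("largest component:", sorted(component))
print("3-core:", sorted(core))
```

In this toy data the small P7/P8 cluster is discarded first, and the k-core step then strips away the loosely attached patents (P5 and P6), leaving the tightly interlinked group. That pruning is the general idea behind using k-cores to find dense technology clusters.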

The authors used the results to predict change in the industry based on the most rapidly growing technology, as indicated by the most highly cited patents. They also identified “hidden” technologies within each subnetwork that might become the disruptive technology.

While the paper presented a positive outcome, the authors did warn of several limitations. For example, their analysis only captures innovations for which inventors sought patents. They also suggested that their method needs to be tested in other industries.



Habermas and Castells

Jürgen Habermas defined the public sphere as a separate space where informed people debate social and political issues, form public opinion, and influence the state and society. In a democratic society, the public sphere ideally allows everyone to have access to information and to participate equally in discussions. His vision allows for an open public sphere, although in reality participation may be constrained for segments of society that lack the ability or resources. In the past, this became evident in the dominance of the bourgeoisie, who came to salons and coffeehouses to discuss societal issues in settings that largely excluded the working class and sometimes women.

Manuel Castells declared that society has moved from the Industrial Revolution (production of material goods) to the Information Age (knowledge economy). The network society has been enabled by current technologies (such as smartphones and internet). Communication is based on an open structure network, which breaks down some of the traditional social hierarchies and national borders because information flows almost anywhere (China’s state censorship might be a notable exception). Different participants might have different value within the network, such as highly connected individuals.

Castells’s theory operates within the idea of the public sphere by somewhat eliminating time and space. Electronic communication is instantaneous and possible with anyone across the world. It also could be used to communicate with individuals or communities, which could be known or unknown. His notion of “timeless time” suggests that digital networks disrupt the flow of linear time, but I would argue that multi-tasking is not new or unique to the digital age. Time may even gain linear importance in terms of “keeping up” with the latest news and trends. For Twitter, a single tweet might get lost among 6,000 tweets a second if a user doesn’t have many followers (network connectedness) or a particular hashtag isn’t trending. In addition, network theory is compatible with traditional local and in-person networks.

The major effect of the network society has been increased participation and access to information. Want a graduate degree? Take online classes. The federal government has piloted public participation in coding through GitHub. Politicians get fewer letters and phone calls from constituents, but more emails and contacts from social media. The flow of digital information has increased from a river to a flood, which may be the greatest downside to the digital revolution. Now someone might search online for health symptoms and get hundreds of possibilities from various websites. Dr. Google will present possibilities ranging from the common cold to deadly illnesses, along with suggestions for folk remedies. Which source do you trust: the Mayo Clinic (based on their brick-and-mortar reputation) or the Wellness Mama blogger?

Castells is onto something that others have suggested: the form of the media matters. Habermas appears mostly concerned about the ability of the mass media to inform the public sphere and act as a good intermediary. Marshall McLuhan (infographic below) also argues that the medium fundamentally affects our ability to communicate. For example, the “tribal era” is characterized by an oral tradition of memorization and listening to storytelling, which is limited to a local community. The print era allows for the dissemination of more materials, but actually limits communication to a one-way exchange of ideas (from print to reader). Television expands on the print era by reaching an even larger audience with only slightly more interaction than print (such as telephone interviews or arranged live appearances). The digital age finally expands the ability for participation: one to one, one to many, or many to many. This is the root cause of why the digital age seems so remarkable to Castells.