Ch 9
End-of-Chapter Problems
Q1 For each of the following centrality measures discuss the appropriateness of analyzing both directed and undirected binary networks.
a. Degree centrality
For undirected networks, degree centrality is simply the number of ties a node has. It is easy to interpret: a high-degree node is directly connected to many others.
For directed networks we can measure outdegree (the number of ties from a node to others) and indegree (the number of ties from others to the node). Both are useful and easily interpreted. If the ties mean 'seeks advice from', then outdegree is the number of different nodes a node seeks advice from, and indegree is the number of people who seek out the node for advice. Often, indegree is more interesting.
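As a minimal sketch, outdegree and indegree can be read off an adjacency matrix with base R row and column sums. The matrix `advice_toy` below is hypothetical data made up for illustration; a 1 in row i, column j means that i seeks advice from j.

```r
# Toy directed network (hypothetical data): a tie i -> j means "i seeks advice from j".
nodes <- c("A", "B", "C")
advice_toy <- matrix(c(0, 1, 1,
                       0, 0, 1,
                       0, 0, 0),
                     nrow = 3, byrow = TRUE,
                     dimnames = list(nodes, nodes))

out_deg <- rowSums(advice_toy)  # number of people each node seeks advice from
in_deg  <- colSums(advice_toy)  # number of people who seek advice from each node

print(cbind(out_deg, in_deg))
```

In this toy network C is sought out by both others but seeks advice from no one, so C has the highest indegree and the lowest outdegree.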
b. Betweenness centrality
Betweenness applies to both directed and undirected networks. For undirected networks, betweenness is (loosely speaking) the number of times a node lies on the shortest path between two other nodes, counted over all pairs of nodes. For directed networks, the meaning is the same; we need only take care to respect the direction of ties. For example, if A-->B-->C, then B gets a point for being between A and C. But if A-->B<--C, B does not get a point, because there is no directed path from A to C.
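The two three-node examples can be checked directly in base R. The helper `two_step_via` is a hypothetical name introduced here; it only tests for a directed two-step path through a given node, so it is a sketch of the direction rule and not a full betweenness routine.

```r
# Count directed two-step walks from -> via -> to in a binary adjacency matrix.
# Only length-two paths are checked; this is not a general betweenness calculation.
two_step_via <- function(adj, from, via, to) {
  adj[from, via] * adj[via, to]
}

nodes <- c("A", "B", "C")

# A --> B --> C
chain <- matrix(0, 3, 3, dimnames = list(nodes, nodes))
chain["A", "B"] <- 1
chain["B", "C"] <- 1

# A --> B <-- C
collider <- matrix(0, 3, 3, dimnames = list(nodes, nodes))
collider["A", "B"] <- 1
collider["C", "B"] <- 1

two_step_via(chain, "A", "B", "C")     # 1: B is between A and C
two_step_via(collider, "A", "B", "C")  # 0: no directed path from A to C through B
```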
c. Eigenvector centrality
In undirected networks, eigenvector centrality can be thought of as a measure of popularity, where popularity means being connected to many people who are themselves popular. For directed data, we can define left eigenvector centrality, in which a node has a high score if they receive many ties from people who themselves have high scores, and right eigenvector centrality, in which a node has a high score if they send ties to many people who are themselves high scorers. In practice, left and right eigenvectors are often uninterpretable or consist of complex numbers. For directed data, it is better to use beta centrality.
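A small base R sketch may help show why the undirected case is well behaved and the directed case can break down. Both toy networks are hypothetical. The first is an undirected triangle with a pendant node, where the principal eigenvector gives sensible scores. The second is a directed chain with no cycles, where every eigenvalue is zero, so there is no useful eigenvector to interpret.

```r
nodes <- c("A", "B", "C", "D")

# Undirected toy network: triangle A-B-C plus a pendant node D attached to C.
und <- matrix(0, 4, 4, dimnames = list(nodes, nodes))
edges <- rbind(c("A", "B"), c("A", "C"), c("B", "C"), c("C", "D"))
und[edges] <- 1
und[edges[, 2:1]] <- 1   # symmetrize

# Principal eigenvector; its sign is arbitrary, so take absolute values.
ev <- eigen(und, symmetric = TRUE)
eig_cent <- setNames(abs(ev$vectors[, 1]), nodes)
print(round(eig_cent, 3))

# Directed acyclic chain A -> B -> C -> D: every eigenvalue is zero,
# so there is no meaningful eigenvector centrality to report.
dag <- matrix(0, 4, 4, dimnames = list(nodes, nodes))
dag[cbind(nodes[1:3], nodes[2:4])] <- 1
print(eigen(dag)$values)
```

In the undirected network C scores highest (it is tied to everyone) and D scores lowest (its only tie is to C).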
d. Closeness centrality
For undirected networks, closeness centrality (specifically, Freeman's version) is the sum of distances from a node to all others. A high score can be interpreted in terms of being dependent on intermediaries, or in terms of how long it takes for something flowing from a node to reach all others (and vice versa). For directed networks we can interpret out-closeness in terms of how long things take to get from a node to all others, and in-closeness as how long it takes things to reach a node from all others. In practice, there are problems measuring closeness in directed networks because often there is no directed path from a given node to another. This then requires implementing repair strategies such as assigning an artificially large value to the distance to an unreachable node, or calculating reciprocal distances and assigning zeros to unreachable pairs.
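The two repair strategies can be sketched in base R on a hypothetical directed chain. The helper `geodist` is an illustrative Floyd-Warshall routine written for this example, not a function from any package, and using n as the artificial large distance is an assumption; other choices are possible.

```r
# All-pairs shortest-path distances (Floyd-Warshall); Inf marks unreachable pairs.
geodist <- function(adj) {
  n <- nrow(adj)
  d <- ifelse(adj > 0, 1, Inf)
  diag(d) <- 0
  for (k in seq_len(n)) {
    for (i in seq_len(n)) {
      for (j in seq_len(n)) {
        d[i, j] <- min(d[i, j], d[i, k] + d[k, j])
      }
    }
  }
  d
}

nodes <- c("A", "B", "C")
chain <- matrix(0, 3, 3, dimnames = list(nodes, nodes))
chain["A", "B"] <- 1
chain["B", "C"] <- 1      # A -> B -> C, with no ties back

d <- geodist(chain)

# Unrepaired out-closeness (sum of distances): Inf for B and C.
out_raw <- rowSums(d)

# Repair 1: replace unreachable distances with an artificial large value (here n).
d_capped <- d
d_capped[is.infinite(d_capped)] <- nrow(chain)
out_capped <- rowSums(d_capped)

# Repair 2: sum reciprocal distances, so unreachable pairs contribute 1/Inf = 0.
recip <- 1 / d
diag(recip) <- 0          # drop the 1/0 self-distances
out_recip <- rowSums(recip)

print(cbind(out_raw, out_capped, out_recip))
```

Note that the two repairs run in opposite directions: with capped distances a low score means a central node, whereas with reciprocal distances a high score does. In-closeness would use `colSums` in place of `rowSums`.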
e. Beta centrality
Out-beta centrality can be interpreted as the total number of walks from a node to all others, weighted inversely by their length. In-beta centrality can be interpreted as the total number of walks from all nodes to a given node, again weighted inversely by their length. In undirected networks, as beta approaches 1/lambda (where lambda is the largest eigenvalue of the adjacency matrix) from below, beta centrality approaches eigenvector centrality. Thus, beta centrality is nicely interpretable for both directed and undirected networks.
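One common way to write this measure, assumed here, weights each walk of length k by beta^(k-1), which gives the closed form (I - beta*A)^(-1) A 1 whenever beta < 1/lambda. The function `beta_centrality` below is a hypothetical base R sketch of that formula on a toy undirected network; it is not claimed to match the internals of any package function.

```r
nodes <- c("A", "B", "C", "D")

# Undirected toy network: triangle A-B-C plus a pendant node D attached to C.
und <- matrix(0, 4, 4, dimnames = list(nodes, nodes))
edges <- rbind(c("A", "B"), c("A", "C"), c("B", "C"), c("C", "D"))
und[edges] <- 1
und[edges[, 2:1]] <- 1

# Out-beta centrality as the sum over k >= 1 of beta^(k-1) * A^k * 1,
# which equals (I - beta*A)^(-1) A 1 when beta < 1/lambda.
beta_centrality <- function(adj, beta) {
  n <- nrow(adj)
  scores <- solve(diag(n) - beta * adj, adj %*% rep(1, n))
  setNames(drop(scores), rownames(adj))
}

ev <- eigen(und, symmetric = TRUE)
lambda <- max(ev$values)             # largest eigenvalue of the adjacency matrix
eig_cent <- abs(ev$vectors[, 1])     # eigenvector centrality (sign removed)

print(beta_centrality(und, 0))       # beta = 0 reproduces degree
print(cor(beta_centrality(und, 0.99 / lambda), eig_cent))  # close to 1 near 1/lambda
```

For in-beta centrality on directed data, the same function can be applied to the transpose, `t(adj)`.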
Q2 Consider the Hawthorne bank wiring room games network, which is a binary, undirected graph of who plays games with whom among a set of workers. Calculate each of the following centrality measures.
a. Degree centrality
b. Closeness centrality
c. Eigenvector centrality
d. Beta centrality, with b parameters = 0, 0.1 and 0.18.
e. Betweenness centrality
#load the package and get the data
library(xUCINET)
games = Hawthorne_BankWiring$Games
#calc measures
degree = xDegreeCentrality(games)[,1]
closeness = xClosenessCentrality(games)[,1]
eigenvec = xEigenvectorCentrality(games)[,1]
beta = xBetaCentrality(games,Beta=c(0,0.1,0.18))
between = xBetweennessCentrality(games)[,1]
#display side-by-side
meas = cbind(degree,closeness,eigenvec,beta,between)
meas
   degree closeness   eigenvec Beta.0  Beta.0.1 Beta.0.18    between
I1      4        37 0.30665931      4  8.879110  87.42515  0.0000000
I3      0        65 0.00000000      0  0.000000   0.00000  0.0000000
W1      6        30 0.41691182      6 12.669063 119.70150  7.5000000
W2      5        36 0.36535214      5 10.783908 104.36858  0.5000000
W3      6        30 0.41691182      6 12.669063 119.70150  7.5000000
W4      6        30 0.41691182      6 12.669063 119.70150  7.5000000
W5      5        27 0.32327806      5 10.736704  94.96027 60.0000000
W6      3        37 0.02878007      3  5.145091  16.46479  0.0000000
W7      5        29 0.08492990      5  8.407072  35.15675 56.6666667
W8      4        36 0.03337021      4  6.521917  19.82383  0.6666667
W9      4        36 0.03337021      4  6.521917  19.82383  0.6666667
S1      5        31 0.36800266      5 10.952780 105.51800  3.0000000
S2      0        65 0.00000000      0  0.000000   0.00000  0.0000000
S4      3        37 0.02878007      3  5.145091  16.46479  0.0000000
cor(meas)
              degree  closeness   eigenvec     Beta.0   Beta.0.1  Beta.0.18    between
degree     1.0000000 -0.9408099  0.7726720  1.0000000  0.9753798  0.8167369  0.3147072
closeness -0.9408099  1.0000000 -0.5894508 -0.9408099 -0.8781057 -0.6454806 -0.4044009
eigenvec   0.7726720 -0.5894508  1.0000000  0.7726720  0.8922781  0.9973232  0.1207582
Beta.0     1.0000000 -0.9408099  0.7726720  1.0000000  0.9753798  0.8167369  0.3147072
Beta.0.1   0.9753798 -0.8781057  0.8922781  0.9753798  1.0000000  0.9228719  0.2700887
Beta.0.18  0.8167369 -0.6454806  0.9973232  0.8167369  0.9228719  1.0000000  0.1473599
between    0.3147072 -0.4044009  0.1207582  0.3147072  0.2700887  0.1473599  1.0000000
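Notes:
Beta.0 is identical to degree (correlation of 1).
Beta.0.18 is nearly perfectly correlated with eigenvector centrality (r = 0.997), but it is not identical to it.
Betweenness centrality is the measure least similar to all the others (none of its correlations exceeds 0.41 in absolute value).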
Q3 Create four visualizations of the Hawthorne bank wiring room games network. In each of the visualizations make the size of the nodes proportional to the value of each of the four centrality measures (degree, closeness, eigenvector, betweenness). Compare and contrast the differences and similarities of the measures across the four visualizations. From a social and behavioral perspective, how might you interpret these comparisons?
#a function for rescaling attributes when sizing nodes
#maps vec linearly onto the interval [dmin, dmax]
rescale <- function(vec, dmin=0, dmax=1)
{
  amax <- max(vec, na.rm=TRUE)
  amin <- min(vec, na.rm=TRUE)
  rng <- amax - amin
  normed <- dmin + (dmax-dmin)*(vec-amin)/rng
  return(normed)
}
#a function to package up the drawing commands
#returns the layout coordinates so that later plots can reuse them
draw <- function(amat, sizeby=NULL, gm="graph", crd=NULL)
{
  sizeby <- rescale(sizeby, .5, 3)
  par(mar=c(0,0,0,0))
  crd <- sna::gplot(amat, jitter=FALSE, mode="kamadakawai", displaylabels=TRUE,
                    label.cex=.9, gmode=gm,
                    vertex.cex=sizeby, coord=crd)
  return(crd)
}
#draw the four plots in a 2-by-2 grid
par(mfrow=c(2,2))
#draw the networks!
draw(games,degree)
#closeness is a sum of distances, so negate it to draw closer nodes larger
draw(games,-closeness)
draw(games,eigenvec)
draw(games,between)
The pictures show the network has two groups (plus two isolates).
Top left: Degree. Nodes in the bottom-left group have higher degree than those in the other group.
Top right: Closeness. Results are similar to degree, but the two bridging nodes get a little extra weight.
Bottom left: Eigenvector. As if in an echo chamber, the measure exaggerates the difference in centrality between the two groups.
Bottom right: Betweenness. The two bridging nodes in the middle (W5 and W7) get nearly all the points.
Q4 Using the advice network in the Krackhardt high-tech managers dataset, provide an analysis of node centrality for each of the measures below. The network is directed and dichotomous.
a. Indegree and outdegree centrality. Create visualizations of the advice network, with one having node size proportional to indegree centrality and one with node size proportional to outdegree centrality. Compare the results.
b. Incoming and outgoing k-reach centrality, with k = 1, 2, 3.
c. Beta reach centrality for outgoing ties, with b parameters = 0, 0.2, 0.4 and 0.6. Interpret the results.
d. Beta centrality, both 'in' and 'out'.
#get data
advice = Krackhardt_HighTech$Advice
#get degree measures
outdegree = rowSums(advice)
indegree = colSums(advice)
#format plots
par(mfrow=c(2,1))
#draw sized by indegree, and save coordinates
crd = draw(advice,indegree,"digraph")
#draw sized by outdegree, using saved coordinates
draw(advice,outdegree,"digraph",crd)
Top: Indegree. One node in the center has a very large indegree.
Bottom: Outdegree. That same node has shrunk to almost nothing: it is sought out by many but seeks advice from few people.
#calc outgoing and incoming k-reach centrality
outreach = xReachCentrality(advice,kReach = c(1,2,3))
inreach = xReachCentrality(t(advice),kReach = c(1,2,3))
#First three are outgoing, last three are incoming
cbind(outreach,inreach)
    kReach.1 kReach.2 kReach.3 kReach.1 kReach.2 kReach.3
A01        6       20       20       13       17       20
A02        3       12       20       18       20       20
A03       15       20       20        5       19       20
A04       12       20       20        8       20       20
A05       15       20       20        5       15       20
A06        1       11       20       10       20       20
A07        8       20       20       13       20       20
A08        8       20       20       10       20       20
A09       13       20       20        4       15       20
A10       14       20       20        9       16       20
A11        3       12       20       11       19       20
A12        2       12       20        7       19       20
A13        6       20       20        4       15       20
A14        4       20       20       10       20       20
A15       20       20       20        4       15       20
A16        4       19       20        8       17       20
A17        5       15       20        9       20       20
A18       17       20       20       15       20       20
A19       11       20       20        4       15       20
A20       12       20       20        8       19       20
A21       11       20       20       15       20       20
Every node can reach every other within three steps, and every node can be reached by every other within three steps. A15 can reach everyone in one step, which means A15 seeks advice from everyone.
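For readers who want to see what a k-reach count involves, here is a minimal base R sketch. The function `k_reach` and the four-node chain are hypothetical and written for illustration; the sketch counts, for each node, how many other nodes can be reached in k or fewer directed steps, and the incoming version is obtained by transposing the matrix, as in the code above.

```r
# Number of other nodes each node can reach in k or fewer directed steps.
k_reach <- function(adj, k) {
  n <- nrow(adj)
  step <- (adj > 0) + diag(n)          # adding the identity lets a walk "wait" at a node
  reach <- diag(n)
  for (i in seq_len(k)) {
    reach <- (reach %*% step > 0) * 1  # 1 if reachable within i steps
  }
  diag(reach) <- 0                     # do not count the node itself
  setNames(rowSums(reach), rownames(adj))
}

# Hypothetical directed chain A -> B -> C -> D
nodes <- c("A", "B", "C", "D")
chain4 <- matrix(0, 4, 4, dimnames = list(nodes, nodes))
chain4[cbind(nodes[1:3], nodes[2:4])] <- 1

out_reach <- sapply(1:3, function(k) k_reach(chain4, k))     # outgoing, k = 1, 2, 3
in_reach  <- sapply(1:3, function(k) k_reach(t(chain4), k))  # incoming, k = 1, 2, 3
print(cbind(out_reach, in_reach))
```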
xBetaReachCentrality(advice,Beta=c(0,.2,.4,.6))
    Beta.0 Beta.0.2 Beta.0.4 Beta.0.6
A01      6     8.80    11.60    14.40
A02      3     5.12     7.88    11.28
A03     15    16.00    17.00    18.00
A04     12    13.60    15.20    16.80
A05     15    16.00    17.00    18.00
A06      1     3.36     6.44    10.24
A07      8    10.40    12.80    15.20
A08      8    10.40    12.80    15.20
A09     13    14.40    15.80    17.20
A10     14    15.20    16.40    17.60
A11      3     5.12     7.88    11.28
A12      2     4.32     7.28    10.88
A13      6     8.80    11.60    14.40
A14      4     7.20    10.40    13.60
A15     20    20.00    20.00    20.00
A16      4     7.04    10.16    13.36
A17      5     7.20     9.80    12.80
A18     17    17.60    18.20    18.80
A19     11    12.80    14.60    16.40
A20     12    13.60    15.20    16.80
A21     11    12.80    14.60    16.40
When beta = 0, beta reach centrality gives you (out)degree. Nodes reached with paths longer than 1 link are not counted at all. When beta = 0.2, the measure counts all of the nodes a given node can reach, but only gives full weight to those it is immediately connected to. The others are severely discounted. As beta gets larger, the measure is more liberal and gives even distant nodes some weight.
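The discounting can be made concrete. Judging from the printed output, a node first reached at distance d appears to contribute beta^(d-1) to the score; this weighting rule is inferred from the numbers above, not taken from the package documentation. The helper `beta_reach_from_counts` is a hypothetical name. It takes a node's cumulative k-reach counts and reproduces A02's row of the beta reach table.

```r
# Beta reach score from cumulative k-reach counts (k = 1, 2, 3, ...),
# assuming a node first reached at distance d is weighted by beta^(d - 1).
beta_reach_from_counts <- function(cum_reach, beta) {
  new_at_dist <- diff(c(0, cum_reach))   # nodes first reached at distance 1, 2, 3, ...
  dist <- seq_along(new_at_dist)
  sum(new_at_dist * beta^(dist - 1))
}

# A02's outgoing k-reach counts from the earlier table: 3, 12, 20
a02 <- c(3, 12, 20)
a02_scores <- sapply(c(0, .2, .4, .6), function(b) beta_reach_from_counts(a02, b))
print(a02_scores)   # 3.00 5.12 7.88 11.28, matching A02's row above
```

With beta = 0.2, for instance, A02's 3 direct contacts count fully, the 9 nodes two steps away count 0.2 each, and the 8 nodes three steps away count 0.04 each, giving 3 + 1.8 + 0.32 = 5.12.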
Q5 Sampson (1969) collected the top three choices of liking and disliking among a set of monks in a monastery. The liking network was collected (retrospectively) for three points in time. We will use T3.
a. Ignoring the values of choices for the liking network (i.e., dichotomizing at greater than zero), provide the k-reach centrality for outgoing ties, with k = 1, 2, 3 and 4. Interpret the results.
b. Ignoring the values of choices for the disliking network, provide the k-reach centrality for outgoing ties, with k = 1, 2, 3 and 4. Interpret the results and compare to the results for the liking network.
c. Finally, ignoring the values for both networks, apply the PN centrality measure and interpret the results.
#get the data
like = xDichotomize(Sampson_Monastery$LikeT3)
dislike = xDichotomize(Sampson_Monastery$Dislike)
#calc outgoing k-reach centrality for each network
krlike = xReachCentrality(like,kReach=c(1,2,3,4))
krdislike = xReachCentrality(dislike,kReach=c(1,2,3,4))
#first four columns are liking, last four are disliking
meas = cbind(krlike,krdislike)
meas
            kReach.1 kReach.2 kReach.3 kReach.4 kReach.1 kReach.2 kReach.3 kReach.4
ROMUALD            4       10       16       17        0        0        0        0
BONAVENTURE        3        7       11       13        0        0        0        0
AMBROSE            3        9       12       16        3        9       13       14
BERTHOLD           3        6       10       13        4        8       11       12
PETER              3        6        9       12        3        8       11       12
LOUIS              3        9       12       16        3       10       13       13
VICTOR             3        6       10       13        3        8       11       12
WINFRID            3        6       10       11        0        0        0        0
JOHN_BOSCO         3        9       11       14        3        7       11       12
GREGORY            3        6       10       11        3        8       12       12
HUGH               3        6       10       11        3        7       11       12
BONIFACE           3        5        7       10        3        9       13       13
MARK               3        5        7       10        3        8       11       12
ALBERT             3        5        7       10        3        8       11       12
AMAND              3       11       16       16        3        8       11       12
BASIL              4        9       14       16        4       10       12       12
ELIAS              3        7       10       14        3        8       11       12
SIMPLICIUS         3        7       10       14        3        7       12       13
Somewhat improbably, the monks tend to have more 'enemies of enemies' than 'friends of friends'. For example, Louis can reach 10 people in two steps or fewer in the disliking network, even though he has only 3 direct dislikes; the remaining 7 are people who are disliked by the people he dislikes. Romuald, Bonaventure and Winfrid have zeros throughout the disliking columns because they named no one as disliked.
#calc pn centrality, combining positive and negative ties
xPNCentrality(like,dislike)
PNcentrality
ROMUALD 1.1149650
BONAVENTURE 1.0838575
AMBROSE 0.9360470
BERTHOLD 0.8861251
PETER 0.9287938
LOUIS 0.9350133
VICTOR 0.9307870
WINFRID 1.0833276
JOHN_BOSCO 0.9227497
GREGORY 0.9516707
HUGH 0.9589060
BONIFACE 0.9467080
MARK 0.9502032
ALBERT 0.9517389
AMAND 0.9340754
BASIL 0.9196967
ELIAS 0.9412390
SIMPLICIUS 0.9322612
The results show that Romuald, Bonaventure and Winfrid, the only monks with PN scores above 1, have, on balance, more positive ties than negative, and that their positive ties tend not to be to those with lots of negative ties, nor are their negative ties to monks with lots of positive ties.