For this lab, we’ll use a few different packages for data and analysis. We’re going to use the ison_algebra
dataset again, directly from the package.
suppressPackageStartupMessages(library(migraph)) # note that you may need a special version for what follows...
data("ison_algebra", package = "migraph")
The network is anonymous, but I think it would be nice to add some names, even if it’s just pretend. Luckily, I’ve added a function for this. This makes plotting the network just a wee bit more accessible:
<- to_named(ison_algebra)
ison_algebra autographr(ison_algebra)
There are actually three different types of tie here. Let’s separate them out into separate networks.
<- to_uniplex(ison_algebra, "friend_tie"))
(m182_friend #> # A tbl_graph: 16 nodes and 62 edges
#> #
#> # A directed simple graph with 3 components
#> #
#> # Node Data: 16 × 1 (active)
#> name
#> <chr>
#> 1 Elizabeth
#> 2 Maria
#> 3 Jenny
#> 4 Ollie
#> 5 Luna
#> 6 Barbara
#> # … with 10 more rows
#> #
#> # Edge Data: 62 × 3
#> from to weight
#> <int> <int> <dbl>
#> 1 2 1 1
#> 2 2 7 1
#> 3 2 8 1
#> # … with 59 more rows
<- autographr(m182_friend) + ggtitle("Friendship")
gfriend <- to_uniplex(ison_algebra, "social_tie"))
(m182_social #> # A tbl_graph: 16 nodes and 129 edges
#> #
#> # A directed simple graph with 1 component
#> #
#> # Node Data: 16 × 1 (active)
#> name
#> <chr>
#> 1 Elizabeth
#> 2 Maria
#> 3 Jenny
#> 4 Ollie
#> 5 Luna
#> 6 Barbara
#> # … with 10 more rows
#> #
#> # Edge Data: 129 × 3
#> from to weight
#> <int> <int> <dbl>
#> 1 1 5 1.2
#> 2 1 8 0.15
#> 3 1 9 2.85
#> # … with 126 more rows
<- autographr(m182_social) + ggtitle("Social")
gsocial <- to_uniplex(ison_algebra, "task_tie"))
(m182_task #> # A tbl_graph: 16 nodes and 88 edges
#> #
#> # A directed simple graph with 1 component
#> #
#> # Node Data: 16 × 1 (active)
#> name
#> <chr>
#> 1 Elizabeth
#> 2 Maria
#> 3 Jenny
#> 4 Ollie
#> 5 Luna
#> 6 Barbara
#> # … with 10 more rows
#> #
#> # Edge Data: 88 × 3
#> from to weight
#> <int> <int> <dbl>
#> 1 1 5 0.3
#> 2 1 9 0.3
#> 3 1 10 0.3
#> # … with 85 more rows
<- autographr(m182_task) + ggtitle("Task")
gtask library(patchwork)
+ gsocial + gtask gfriend
Where might innovation be most likely to occur in these networks? Let’s take a look at which actors are least constrained by their position in the task network to begin with. {migraph}
makes this easy enough with the node_constraint()
function.
node_constraint(m182_task)
#> Elizabeth Maria Jenny Ollie Luna Barbara
#> 0.930 0.877 0.664 0.987 0.611 0.609
#>
#> + 10 others
We see that this function returns a vector of constraint scores that can range between 0 and 1. Let’s size the nodes according to this score, and identify the node with the minimum constraint score.
ggidentify(m182_task, node_constraint, min)
#> Warning: This function has been included in the `autographr()` function. Please
#> run `autographr(object, node_measure, identify_function)` instead.
#> Using `stress` as default layout
Why minimum? And what can we learn from this plot about where innovation might occur within this network?
Now we are going to identify and interpret the roles or relations between a set of structurally equivalent positions. We’re going to identify structurally equivalent positions across all the data that we have, including ‘task’, ‘social’ and ‘friend’ ties, but the unit test this week will ask you to run this on a uniplex version of this network.
Ok, so to begin with we need to obtain the profiles that we are going to correlate to identify same/similar positions. For structural equivalence, we can start with a census of all the outgoing and incoming ties to reveal their tie partners.
dim(node_tie_census(ison_algebra))
#> [1] 16 96
head(structural_combo <- node_tie_census(ison_algebra))[,c(1,17,33,49,65,81)]
#> fromElizabeth toElizabeth fromElizabeth toElizabeth fromElizabeth
#> Elizabeth 0 0 0.0 0.0 0.0
#> Maria 0 1 0.0 0.0 0.0
#> Jenny 0 0 0.0 0.0 0.0
#> Ollie 0 0 0.0 0.0 0.0
#> Luna 0 1 1.2 1.2 0.3
#> Barbara 0 1 0.0 0.0 0.0
#> toElizabeth
#> Elizabeth 0.0
#> Maria 0.0
#> Jenny 0.0
#> Ollie 0.0
#> Luna 0.3
#> Barbara 0.0
We can see that the result is a matrix of 16 rows and 96 columns, because we want to catalogue or take a census of all the different incoming/outgoing partners our 16 nodes might have across these three networks. Note also that the result is a weighted matrix; what would you do if you wanted it to be binary?
The next step, once we have our data (the tie census), is to cluster nodes by their equivalence. In summary, we’re going to hierarchically cluster the nodes based on the distances in dissimilarity in their outgoing and incoming ties. Or you can just run the following line:
<- cluster_structural_equivalence(ison_algebra))
(str_res #>
#> Call:
#> stats::hclust(d = distances)
#>
#> Cluster method : complete
#> Number of objects: 16
This object doesn’t tell us much, but we can investigate it more using {migraph}
’s ggtree()
. This is a dendrogram of the hierarchical clustering object. Basically, as we move to the right, we’re allowing for more and more dissimilarity among those we cluster together. A fork or branching point indicates the level of dissimilarity at which those two or more nodes would be said to be equivalent.
ggtree(str_res)
ggtree(str_res, 2) # for example let's say there are just two main clusters
ggtree(str_res, 4) # or four? what are we seeing here?
Ok, so we can draw a line and this establishes how many clusters we have (or vice versa), but also which nodes belong to which cluster. But how many clusters should we pick?
To establish that, we need to iterate through all of our options, calculating for each how correlated this pattern is with the observed network. We then plot this and, using the “elbow method”, decide how many clusters.
ggidentify_clusters(str_res, structural_combo)
When there is one cluster for each vertex in the network, cell values will be identical to the observed correlation matrix, and when there is one cluster for the whole network, the values will all be equal to the average correlation across the observed matrix. So the correlations in each by-cluster matrix are correlated with the observed correlation matrix to see how well each by-cluster matrix fits the data.
Ok, so it looks here as if there is a clear bend in the elbow/knee at four clusters. This is reasonably parsimonious and well-fitting. More clusters than this only distinguishes nodes that are less dissimilar.
We can use cutree()
to cut the tree at our desired point and return the resulting vector of cluster assignments.
<- cutree(str_res, 4))
(str_clu #> Elizabeth Maria Jenny Ollie Luna Barbara Kay Christie
#> 1 2 3 1 3 3 2 2
#> Annabelle Aidan Cooper Adrienne Jack Donovan Aubrey Goldie
#> 1 1 3 1 2 2 1 4
This we can use for various things. Most immediately, we may wish to see these cluster assignments mapped onto our networks. All we need to do is add the variable to existing networks and plot them:
<- m182_task %>% as_tidygraph() %>% mutate(clu = str_clu)
m182_task autographr(m182_task, node_color = "clu") + ggtitle("Task")
<- m182_social %>% as_tidygraph() %>% mutate(clu = str_clu)
m182_social autographr(m182_social, node_color = "clu") + ggtitle("Social")
<- m182_friend %>% as_tidygraph() %>% mutate(clu = str_clu)
m182_friend autographr(m182_friend, node_color = "clu") + ggtitle("Friend")
Now we can use the 4-cluster solution to generate blockmodels. We’ll do this on the valued network, but binary is possible too.
<- blockmodel(m182_task, str_clu))
(task_blockmodel #>
#> Network Blockmodel:
#>
#> Block membership:
#>
#> Elizabeth Kay Adrienne Maria Jack Donovan Christie Annabelle
#> 1 2 3 1 3 3 2 2
#> Jenny Ollie Aubrey Luna Aidan Cooper Barbara Goldie
#> 1 1 3 1 2 2 1 4
#>
#> Reduced form blockmodel:
#>
#> Elizabeth Maria Jenny Ollie Luna Barbara Kay Christie Annabelle Aidan Cooper Adrienne Jack Donovan Aubrey Goldie
#> Block 1 Block 2 Block 3 Block 4
#> Block 1 0.36000 0.0000 0.0250 4.1500
#> Block 2 0.00500 0.4050 0.0150 3.5400
#> Block 3 0.01875 0.0075 0.6625 1.0125
#> Block 4 4.12500 3.3900 1.3875 NaN
plot(task_blockmodel)
<- blockmodel(m182_social, str_clu))
(social_blockmodel #>
#> Network Blockmodel:
#>
#> Block membership:
#>
#> Elizabeth Kay Adrienne Maria Jack Donovan Christie Annabelle
#> 1 2 3 1 3 3 2 2
#> Jenny Ollie Aubrey Luna Aidan Cooper Barbara Goldie
#> 1 1 3 1 2 2 1 4
#>
#> Reduced form blockmodel:
#>
#> Elizabeth Maria Jenny Ollie Luna Barbara Kay Christie Annabelle Aidan Cooper Adrienne Jack Donovan Aubrey Goldie
#> Block 1 Block 2 Block 3 Block 4
#> Block 1 2.83500 0.1000 0.2250 1.250
#> Block 2 0.09500 5.2875 0.0825 1.920
#> Block 3 0.24375 0.0750 4.5750 0.075
#> Block 4 0.92500 1.2000 0.3375 NaN
plot(social_blockmodel)
<- blockmodel(m182_friend, str_clu))
(friend_blockmodel #>
#> Network Blockmodel:
#>
#> Block membership:
#>
#> Elizabeth Kay Adrienne Maria Jack Donovan Christie Annabelle
#> 1 2 3 1 3 3 2 2
#> Jenny Ollie Aubrey Luna Aidan Cooper Barbara Goldie
#> 1 1 3 1 2 2 1 4
#>
#> Reduced form blockmodel:
#>
#> Elizabeth Maria Jenny Ollie Luna Barbara Kay Christie Annabelle Aidan Cooper Adrienne Jack Donovan Aubrey Goldie
#> Block 1 Block 2 Block 3 Block 4
#> Block 1 0.6000000 0.03333333 0.04166667 0
#> Block 2 0.1000000 1.00000000 0.10000000 0
#> Block 3 0.3333333 0.15000000 1.16666667 0
#> Block 4 0.0000000 0.00000000 0.00000000 NaN
plot(friend_blockmodel)
What do these plots show? Plotting the blockmodel like this is particularly useful for characterising what the profile of ties (partners) is for each position/equivalence class. We might characterise them like so:
Finally, we can reduce the graph to just interactions between roles. Let’s start off by graphing the valued/weighted blockmodel.
<- c("Freaks","Squares","Nerds","Geek")
group_labels <- reduce_graph(social_blockmodel, group_labels))
(social_reduced #> IGRAPH a037297 DNW- 4 15 --
#> + attr: name (v/c), weight (e/n)
#> + edges from a037297 (vertex names):
#> [1] Freaks ->Freaks Freaks ->Squares Freaks ->Nerds Freaks ->Geek
#> [5] Squares->Freaks Squares->Squares Squares->Nerds Squares->Geek
#> [9] Nerds ->Freaks Nerds ->Squares Nerds ->Nerds Nerds ->Geek
#> [13] Geek ->Freaks Geek ->Squares Geek ->Nerds
autographr(social_reduced)
<- reduce_graph(task_blockmodel, group_labels))
(task_reduced #> IGRAPH dda2ca7 DNW- 4 14 --
#> + attr: name (v/c), weight (e/n)
#> + edges from dda2ca7 (vertex names):
#> [1] Freaks ->Freaks Freaks ->Nerds Freaks ->Geek Squares->Freaks
#> [5] Squares->Squares Squares->Nerds Squares->Geek Nerds ->Freaks
#> [9] Nerds ->Squares Nerds ->Nerds Nerds ->Geek Geek ->Freaks
#> [13] Geek ->Squares Geek ->Nerds
autographr(task_reduced)
<- reduce_graph(friend_blockmodel, group_labels))
(friend_reduced #> IGRAPH 7d9feef DNW- 4 9 --
#> + attr: name (v/c), weight (e/n)
#> + edges from 7d9feef (vertex names):
#> [1] Freaks ->Freaks Freaks ->Squares Freaks ->Nerds Squares->Freaks
#> [5] Squares->Squares Squares->Nerds Nerds ->Freaks Nerds ->Squares
#> [9] Nerds ->Nerds
autographr(friend_reduced)
What can help interpreting these profiles is getting the summaries of average weight ties by group.
group_tie_census(m182_task, str_clu)
#> fromElizabeth fromMaria fromJenny fromOllie fromLuna fromBarbara
#> Block 1 0.15 0.02 0.00 0.02 0.05 0.02
#> Block 2 0.00 0.42 0.00 0.00 0.03 0.00
#> Block 3 0.07 0.00 0.26 0.00 0.75 0.38
#> Block 4 5.10 4.80 0.30 0.90 1.80 0.90
#> fromKay fromChristie fromAnnabelle fromAidan fromCooper fromAdrienne
#> Block 1 0.00 0.00 0.55 0.30 0.00 0.6
#> Block 2 0.33 0.33 0.00 0.00 0.00 0.0
#> Block 3 0.00 0.00 0.00 0.07 0.60 0.0
#> Block 4 2.85 2.40 2.40 5.40 1.05 4.5
#> fromJack fromDonovan fromAubrey fromGoldie toElizabeth toMaria toJenny
#> Block 1 0.00 0.00 0.17 4.12 0.07 0.0 0.00
#> Block 2 0.39 0.15 0.00 3.39 0.00 0.6 0.00
#> Block 3 0.00 0.07 0.00 1.39 0.07 0.0 0.26
#> Block 4 4.50 3.15 6.60 0.00 4.65 4.8 0.75
#> toOllie toLuna toBarbara toKay toChristie toAnnabelle toAidan toCooper
#> Block 1 0.02 0.05 0.05 0.0 0.00 0.55 0.25 0.00
#> Block 2 0.00 0.03 0.03 0.3 0.27 0.00 0.00 0.00
#> Block 3 0.00 0.60 0.45 0.0 0.00 0.00 0.04 0.68
#> Block 4 1.05 2.40 1.20 2.4 1.95 2.85 5.40 1.20
#> toAdrienne toJack toDonovan toAubrey toGoldie
#> Block 1 0.68 0.00 0.00 0.22 4.15
#> Block 2 0.00 0.39 0.06 0.03 3.54
#> Block 3 0.00 0.04 0.00 0.00 1.01
#> Block 4 5.25 3.90 3.90 5.55 0.00
# ADVANCED: Note on deductive clustering:
# It's pretty straightforward to alter the code above to test hypotheses.
# Simply supply your own cluster vector, where the elements in the vector are in
# the same order as the vertices in the matrix, and the values represent the
# cluster to which each vertex belongs.
task_social_cors <- cor(task_social)
# For example, if you believed that actors 2, 7, and 8 formed one group,
# actor 16 former another group, and everyone else formed a third group,
# you could represent this as follows:
dedclust = c(1, 2, 1, 1, 1, 1, 2, 2, 1, 1, 1, 1, 1, 1, 1, 3)
# Then examine the fitness of this cluster configuration as follows:
dedclust_mat <- NetCluster::generate_cluster_cor_mat(task_social_cors, dedclust)
dedclust_mat
gcor(dedclust_mat, task_social_cors)
We’re going to use the same pair of networks as with structural equivalence. But this time we’re not going to get the correlation of ties, but rather the correlation of profiles/patterns of local configurations. How can we identify patterns of local configurations? We’ll measure these profiles in terms of triad counts.
Now, there is a function for calculating triad censuses:
graph_triad_census(m182_task))
(#> 003 012 102 021D 021U 021C 111D 111U 030T 030C 201 120D 120U 120C 210 300
#> 133 38 212 2 0 1 14 13 0 0 93 1 0 0 11 42
But as you can see, it just gives a aggregated tally for the whole network and not one differentiated by actor (which is what we need). Fortunately, {migraph}
offers a node-level triad census too.
# (By putting parentheses around this command, it'll assign AND print!)
<- node_triad_census(m182_task))
(task_triads #> 003 012 102 021D 021U 021C 111D 111U 030T 030C 201 120D 120U 120C 210
#> Elizabeth 36 9 45 0 0 0 0 1 0 0 3 0 0 0 4
#> Maria 45 10 38 0 0 0 1 1 0 0 1 0 0 0 3
#> Jenny 55 0 44 0 0 0 0 0 0 0 0 0 0 0 0
#> Ollie 102 0 3 0 0 0 0 0 0 0 0 0 0 0 0
#> Luna 55 0 37 0 0 0 2 1 0 0 4 0 0 0 0
#> Barbara 65 19 10 0 0 0 5 0 0 0 3 1 0 0 2
#> Kay 91 1 13 0 0 0 0 0 0 0 0 0 0 0 0
#> Christie 91 1 13 0 0 0 0 0 0 0 0 0 0 0 0
#> Annabelle 90 1 14 0 0 0 0 0 0 0 0 0 0 0 0
#> Aidan 86 1 18 0 0 0 0 0 0 0 0 0 0 0 0
#> Cooper 95 0 10 0 0 0 0 0 0 0 0 0 0 0 0
#> Adrienne 90 1 14 0 0 0 0 0 0 0 0 0 0 0 0
#> Jack 93 1 11 0 0 0 0 0 0 0 0 0 0 0 0
#> Donovan 90 3 12 0 0 0 0 0 0 0 0 0 0 0 0
#> Aubrey 88 2 15 0 0 0 0 0 0 0 0 0 0 0 0
#> Goldie 0 0 0 0 0 0 0 0 0 0 73 0 0 0 6
#> 300
#> Elizabeth 7
#> Maria 6
#> Jenny 6
#> Ollie 0
#> Luna 6
#> Barbara 0
#> Kay 0
#> Christie 0
#> Annabelle 0
#> Aidan 0
#> Cooper 0
#> Adrienne 0
#> Jack 0
#> Donovan 0
#> Aubrey 0
#> Goldie 26
Can you recall what these MAD codes mean? MAN might be easier to remember, for NULL dyads is the last, but MAD is probably more appropriate. ?igraph::triad.census
can be used to check what each of the MAD codes means.
As with the structural equivalence, we can simply run our function and return an object that has hierarchically clustered our nodes, but this time it will be based on their (dis)similarity from each others patterns of ties.
<- cluster_regular_equivalence(m182_task)
reg_res ggtree(reg_res,4)
Ok, so it looks like these nodes are much more similar in terms of their patterns of ties than their actual ties.
Like before, we’ll loop through each possible cluster solution and see how well they match the observed matrix of triad type correlations.
ggidentify_clusters(reg_res, task_triads)
The cluster correlation plot seems a bit ambiguous here, at least visually. But the elbow method has highlighted 2 clusters as a pretty good solution.
ggtree(reg_res, 2)
<- cutree(reg_res, 2))
(reg_clu #> Elizabeth Maria Jenny Ollie Luna Barbara Kay Christie
#> 1 1 1 1 1 1 1 1
#> Annabelle Aidan Cooper Adrienne Jack Donovan Aubrey Goldie
#> 1 1 1 1 1 1 1 2
<- m182_task %>% as_tidygraph() %>% mutate(regclu = reg_clu)
m182_task autographr(m182_task, node_color = "regclu") + ggtitle("Task")
As before, we can use these clusters to blockmodel the task network.
<- blockmodel(m182_task, reg_clu))
(task_blockmodel #>
#> Network Blockmodel:
#>
#> Block membership:
#>
#> Elizabeth Maria Jenny Ollie Luna Barbara Kay Christie
#> 1 1 1 1 1 1 1 1
#> Annabelle Aidan Cooper Adrienne Jack Donovan Aubrey Goldie
#> 1 1 1 1 1 1 1 2
#>
#> Reduced form blockmodel:
#>
#> Elizabeth Maria Jenny Ollie Luna Barbara Kay Christie Annabelle Aidan Cooper Adrienne Jack Donovan Aubrey Goldie
#> Block 1 Block 2
#> Block 1 0.1357143 3.11
#> Block 2 3.1500000 NaN
plot(task_blockmodel)
Finally, we can reduce the graph to just interactions between roles. Obviously this is not particularly informative with only two clusters though…
<- reduce_graph(task_blockmodel, c("Regulars","Geek")))
(task_reduced #> IGRAPH 141cead DNW- 2 3 --
#> + attr: name (v/c), weight (e/n)
#> + edges from 141cead (vertex names):
#> [1] Regulars->Regulars Regulars->Geek Geek ->Regulars
autographr(task_reduced)
Finally, we can try to get a sense of what our different clusters represent by generating a cluster-by-triad-type matrix. This is an m x n matrix, where m is the number of clusters and n is the 16 possible triad types. Each cell is the average number of the given triad type for each individual in the cluster:
group_triad_census(m182_task, reg_clu)
#> 003 012 102 021D 021U 021C 111D 111U 030T 030C 201 120D 120U 120C
#> Block 1 78.13 3.27 19.8 0 0 0 0.53 0.2 0 0 0.73 0.07 0 0
#> Block 2 0.00 0.00 0.0 0 0 0 0.00 0.0 0 0 73.00 0.00 0 0
#> 210 300
#> Block 1 0.6 1.67
#> Block 2 6.0 26.00
# ADVANCED: Note that we can also blockmodel our communities from last week.
# walktrap_blockmodel <- blockmodel(get.adjacency(m182_main, sparse = F),
# friend_wt$membership)
# plot(walktrap_blockmodel)
# walktrap_blockmodel
# # And graphs that from the reduced form blockmodels...
# walktrap_blockmodel_red <- graph.adjacency(walktrap_blockmodel$block.model, weighted = T)
# plot(walktrap_blockmodel_red, edge.width = E(walktrap_blockmodel_red)$weight,
# vertex.color = rainbow(2) )
# # Admittedly, not terribly interesting...
#
# edgebet_blockmodel <- blockmodel(get.adjacency(m182_main, sparse = F),
# friend_eb$membership)
# plot(edgebet_blockmodel) # blockmodel
# edgebet_blockmodel_red <- graph.adjacency(edgebet_blockmodel$block.model, weighted = T)
# plot(edgebet_blockmodel_red, edge.width=E(edgebet_blockmodel_red)$weight,
# vertex.color=rainbow(3) ) # reduced graph
# # Cool
#
# fastgreed_blockmodel <- blockmodel(get.adjacency(m182_main, sparse = F),
# friend_fg$membership)
# plot(fastgreed_blockmodel) # blockmodel
# fastgreed_blockmodel_red <- graph.adjacency(fastgreed_blockmodel$block.model, weighted = T)
# plot(fastgreed_blockmodel_red, edge.width=E(fastgreed_blockmodel_red)$weight,
# vertex.color=rainbow(3) ) # reduced graph