class: center, middle, inverse, title-slide .title[ # Advanced Network Analysis ] .subtitle[ ## Temporal ERGMs ] .author[ ### Shahryar Minhas [s7minhas.com] ] --- exclude: true ## Readings - Garry Robins and Philippa Pattison. Random Graph Models for Temporal Processes in Social Networks. *Journal of Mathematical Sociology*, 25(1):5--41, 2001. - Bruce A. Desmarais and Skyler J. Cranmer. Statistical mechanics of networks: Estimation and uncertainty. *Physica A: Statistical Mechanics and its Applications*, 391(4):1865--1876, 2012. --- ## Longitudinal networks <img src="images/bitMapTrans.png" width="180px" style="display: block; margin: auto auto auto 0;" /> <img src="bitNet10.gif" width="800px" /> --- ## Longitudinal networks - Networks that change over time. - Examples: networks of friends, conflict networks, trade networks. - Want to model network dynamics within and across time periods. --- ## Outline - Setting up longitudinal network data - Visualizing longitudinal data - Descriptive statistics - Inferential analysis --- ## Getting the data ready We are used to data in this format: ``` r #install_github("ochyzh/networkdata") library(networkdata) data(allyData) head(dyadData)[,1:8] ``` ``` ## cname1 cname2 year ally war contiguity id1_cinc id2_cinc ## 2 CAN USA 1991 1 0 1 0.0119571 0.1364806 ## 3 MEX USA 1991 1 0 1 0.0125758 0.1364806 ## 4 COL USA 1991 1 0 0 0.0046681 0.1364806 ## 5 VEN USA 1991 1 0 0 0.0052502 0.1364806 ## 6 PER USA 1991 1 0 0 0.0033841 0.1364806 ## 7 BRA USA 1991 1 0 0 0.0240151 0.1364806 ``` --- ## Getting the data ready We need to convert this information such that: - the dependent variable must be a `list` of `network` objects - nodal covariates are vertex attributes in the `list` of `network` objects - dyadic covariates are included separately in a `list` of `matrices` --- ## Start with setting up war Output should look like this: .pull-left[ ``` r class(war) ``` ``` ## [1] "list" ``` ``` r length(war) ``` ``` ## [1] 10 ``` ``` r class(war[[1]]) ``` ``` ## [1] "matrix" "array" ``` ``` r dim(war[[1]]) ``` ``` ## [1] 50 50 ``` ] .pull-right[ ``` r war[[1]][1:3,1:3] ``` ``` ## USA CHN IND ## USA 0 0 0 ## CHN 0 0 1 ## IND 0 0 0 ``` ] --- ## contiguity should be easier Output should look like this: ``` r class(contiguity) ``` ``` ## [1] "matrix" "array" ``` ``` r dim(contiguity) ``` ``` ## [1] 50 50 ``` ``` r contiguity[1:3,1:3] ``` ``` ## USA CHN IND ## USA 0 0 0 ## CHN 0 0 1 ## IND 0 1 0 ``` --- ## Now set up DV with vertex attributes Output should look like this: ``` r class(ally) ``` ``` ## [1] "list" ``` ``` r length(ally) ``` ``` ## [1] 10 ``` ``` r class(ally[[1]]) ``` ``` ## [1] "network" ``` ``` r list.vertex.attributes(ally[[1]]) ``` ``` ## [1] "cinc" "cname" "polity" "vertex.names" "year" ``` --- ## TERGM: Discrete time model - Developed by [Robins & Pattison (2001)](https://www.tandfonline.com/doi/abs/10.1080/0022250X.2001.9990243?casa_token=6IzkEyfRhVAAAAAA:f5jksd3D1DV9G9lEHcDO1ZoZGaeE5pQc2zdVidN4SlhIelQpLgqV5idhEcEJCOrema_0XuTEow5G7Q) and further developed by [Hanneke et al. (2010)](https://projecteuclid.org/euclid.ejs/1276694116) - Scholars in political science most notably [Cranmer and Desmarais (2011)](https://people.cs.umass.edu/~wallach/courses/s11/cmpsci791ss/readings/cranmer11inferential.pdf) have eased the use and highlighted the utility of these types of models for political science - Extension of ERGM to the temporal setting is based on the idea of panel regression - In a sequence of observations, lagged earlier observations or derived information thereof can be used as predictors for later observations. + In other words, some of the statistics are direct functions of an earlier realization of the network + In its most basic form, the TERGM is a conditional ERGM with an earlier observation of the network occurring among the predictors. --- ## TERGM: Discrete time model - To extend ERGM to a longitudinal context, [Hanneke et al. (2010)](https://projecteuclid.org/euclid.ejs/1276694116) make a Markov assumption on the network from one time step to the next - Specifically, given an observed network `\(Y^{t}\)`, make the assumption that `\(Y^{t}\)` is independent of `\(Y^{1}, \ldots, Y^{t-2}\)` - Thus a sequence of network observations has the property that: `\begin{align} Pr(Y^{2}, Y^{3}, \ldots, Y^{t} | Y^{1}) = Pr(Y^{t} | Y^{t-1}) Pr(Y^{t-1} | Y^{t-2}) \ldots Pr(Y^{2} | Y^{1}) \end{align}` --- ## TERGM: Discrete time model - With this assumption in mind we just need to choose a form for the conditional PDF of `\(P(Y^{t} | Y^{t-1})\)` - `\(Y^{t} | Y^{t-1}\)` can be expressed through an ERGM distribution, which then gives us what is referred to as a TERGM: `\begin{align} \Pr(Y^{t} | \theta, Y^{t-1}) &= \frac{ \exp( \theta^{T} g(Y^{t}, Y^{t-1}) ) }{ \mathcal{k} } \end{align}` --- ## TERGM: Block-diag visualization - TERGM is essentially estimated through an ERGM with the dependent variable modeled as a block-diagonal matrix (such as below) - Constraints are put on the model such that cross-network edges in the off-diagonal blocks are prohibited <img src="images/tergm_block_diag.png" width="500px" style="display: block; margin: auto;" /> --- ## btergm Package - The `btergm` package has been developed by [Leifeld, Cranmer, & Desmarais (2018)](https://www.jstatsoft.org/article/view/v083i06) to estimate longitudinal networks using TERGM <img src="images/btergm.png" width="500px" style="display: block; margin: auto;" /> - Package provides two functions to estimate a TERGM, one using a pseudolikelihood (`btergm`) and the other using MCMC-MLE (`mtergm`) --- ## Running a TERGM - We are going to run a TERGM on the longitudinal alliance network, and will employ the following specification: + `edges`: density term + `edgecov(war)`: list of matrices where cross-sections denote war + `edgecov(contiguity)`: matrix of distances between countries + `absdiff(polity)`: Absolute difference between polity of `\(i\)` and `\(j\)` + `absdiff(cinc)`: Absolute difference between cinc of `\(i\)` and `\(j\)` + `gwesp(.5, fixed = TRUE)`: Geometric weighted triangle term --- ## Running a TERGM ``` r library(btergm) tergmFit <- btergm( ally ~ edges + edgecov(war) + edgecov(contiguity) + nodecov('polity') + absdiff("polity") + nodecov('cinc') + absdiff("cinc") + gwesp(.5, fixed = TRUE) ) ``` ``` ## t=1 t=2 t=3 t=4 t=5 t=6 t=7 t=8 t=9 t=10 ## ally (row) 50 50 50 50 50 50 50 50 50 50 ## ally (col) 50 50 50 50 50 50 50 50 50 50 ## war (row) 50 50 50 50 50 50 50 50 50 50 ## war (col) 50 50 50 50 50 50 50 50 50 50 ## contiguity (row) 50 50 50 50 50 50 50 50 50 50 ## contiguity (col) 50 50 50 50 50 50 50 50 50 50 ## t=1 t=2 t=3 t=4 t=5 t=6 t=7 t=8 t=9 t=10 ## maximum deleted nodes (row) 0 0 0 0 0 0 0 0 0 0 ## maximum deleted nodes (col) 0 0 0 0 0 0 0 0 0 0 ## remaining rows 50 50 50 50 50 50 50 50 50 50 ## remaining columns 50 50 50 50 50 50 50 50 50 50 ## t=1 t=2 t=3 t=4 t=5 t=6 t=7 t=8 t=9 t=10 ## ally (row) 50 50 50 50 50 50 50 50 50 50 ## ally (col) 50 50 50 50 50 50 50 50 50 50 ## war (row) 50 50 50 50 50 50 50 50 50 50 ## war (col) 50 50 50 50 50 50 50 50 50 50 ## contiguity (row) 50 50 50 50 50 50 50 50 50 50 ## contiguity (col) 50 50 50 50 50 50 50 50 50 50 ``` ``` r #check model fit: #GOF1<-gof(tergmFit) ``` --- ## Results ``` r summary(tergmFit) ``` ``` ## Estimate Boot mean 2.5% 97.5% ## edges -5.69586158 -5.70326749 -5.8470 -5.4967 ## edgecov.war[[i]] -0.07515258 -0.06963466 -0.3202 0.1405 ## edgecov.contiguity[[i]] 2.54540447 2.54655506 2.4927 2.6226 ## nodecov.polity -0.00027339 -0.00011494 -0.0062 0.0056 ## absdiff.polity 0.01865488 0.01880566 0.0142 0.0232 ## nodecov.cinc -1.42023487 -1.44550741 -2.0620 -0.8232 ## absdiff.cinc 26.96999913 27.03209595 26.2319 27.8021 ## gwesp.fixed.0.5 2.24553096 2.24844146 2.1292 2.3447 ``` --- ## Temporal Terms - `delrecip(mutuality = FALSE, lag = 1)`: checks for delayed reciprocity. For example, if node `\(j\)` is tied to node `\(i\)` at `\(t = 1\)`, does this lead to a reciprocation of that tie back from `\(i\)` to `\(j\)` at `\(t = 2\)`? If mutuality = TRUE is set, this extends not only to ties, but also non-ties. The lag argument controls the size of the temporal lag: with `\(lag = 1\)`, reciprocity over one consecutive time period is checked. Note that as lag increases, the number of time steps on the dependent variable decreases. - `memory(type = "stability", lag = 1)`: controls for the impact of a previous network on the current network. Four different types of memory terms are available: positive autoregression (type = "autoregression") checks whether previous ties are carried over to the current network; dyadic stability (type = "stability") checks whether both edges and non-edges are stable between the previous and the current network; edge loss (type = "loss") checks whether ties in the previous network have been dissolved and no longer exist in the current network; and edge innovation (type = "innovation") checks whether previously unconnected nodes have the tendency to become tied in the current network. --- ## Temporal Terms Cont'd - `timecov(x = NULL, minimum = 1, maximum = NULL, transform = function(t) t)`: checks for linear or non-linear time trends with regard to edge formation. Optionally, this can be combined with a covariate to create an interaction effect between a dyadic covariate and time in order to test whether the importance of a covariate increases or decreases over time. In the default case, edges modeled as being linearly increasingly important over time. By tweaking the transform function, arbitrary functional forms of time can be tested. For example, transform = sqrt (for a geometrically decreasing time effect), transform = function(x) x^2 (for a geometrically increasing time effect), transform = function(t) t (for a linear time trend) or polynomial functional forms (e.g., 0 + (1 * t) + (1 * t^2)) can be used. --- ``` r tergmFit1 <- btergm(ally ~ edges + edgecov(war) + edgecov(contiguity) + nodecov('polity') + absdiff("polity") + nodecov('cinc') + absdiff("cinc") + gwesp(.5, fixed = TRUE)+ timecov()) summary(tergmFit1) ``` --- ## Duque (2018) The DV is diplomatic ties, `dipl_ties`: ``` r #Clear your memory and unload `btergm` as it clashes with `network`: detach("package:btergm", unload=TRUE) data("duqueData") class(dipl_ties) ``` ``` ## [1] "list" ``` ``` r length(dipl_ties) ``` ``` ## [1] 8 ``` ``` r class(dipl_ties[[1]]) ``` ``` ## [1] "data.frame" ``` --- ## The Dependent Variable we can see that `dipl_ties` is currently a list of `data.frames`. Let's convert it into a list of networks. ``` r library(statnet) for (i in 1:8) { dipl_ties[[i]]<-as.network(as.matrix(dipl_ties[[i]])) } class(dipl_ties[[1]]) ``` ``` ## [1] "network" ``` --- ## Duque (2018) Popularity hypothesis: High-status states should receive more recognition simply because of their position in the social structure, rather than because of the possession of status attributes (`2-instars`). Reciprocity and transitivity: A state’s existing relations should influence the state’s ability to achieve status (`mutual` and `triangle`). Homophily: States should recognize states that have similar values and resources as them (`absdiff`). --- ## Dyadic Covariates Contiguity (`contig`) and alliances (`allies`) are time-varying edge-level covariates. We must make sure that they are stored as lists of matrices. .pull-left[ ``` r #Contiguity: class(contig) ``` ``` ## [1] "list" ``` ``` r length(contig) ``` ``` ## [1] 8 ``` ``` r class(contig[[1]]) ``` ``` ## [1] "data.frame" ``` ``` r dim(contig[[1]]) ``` ``` ## [1] 134 134 ``` ] .pull-right[ ``` r contig[[1]][1:3,1:3] ``` ``` ## 2 20 40 ## 2 0 1 0 ## 20 1 0 0 ## 40 0 0 0 ``` ] --- ## Allies .pull-left[ ``` r #Allies: class(allies) ``` ``` ## [1] "list" ``` ``` r length(allies) ``` ``` ## [1] 8 ``` ``` r class(allies[[1]]) ``` ``` ## [1] "data.frame" ``` ``` r dim(allies[[1]]) ``` ``` ## [1] 134 134 ``` ] .pull-right[ ``` r allies[[1]][1:3,1:3] ``` ``` ## 2 20 40 ## 2 0 1 0 ## 20 1 0 0 ## 40 0 0 0 ``` ] --- ## Dyadic Covariates It looks like `allies` and `contig` are currently stored as lists of `data.frames`. We must convert them to lists of matrices. ``` r for (i in 1:8) { contig[[i]]<-as.matrix(contig[[i]]) allies[[i]]<-as.matrix(allies[[i]]) } ``` --- ## Now set up DV with vertex and dyadic attributes Our node-level covariate, `polity$dem_dum` must be defined as a vertex attribute in each of the `dipl_ties` networks. ``` r #Define Dem as a vertex attribute for each year of dipl_ties (didn't work as a loop): set.vertex.attribute(dipl_ties[[1]],"dem",polity$dem_dum[polity$year==1970]) set.vertex.attribute(dipl_ties[[2]],"dem",polity$dem_dum[polity$year==1975]) set.vertex.attribute(dipl_ties[[3]],"dem",polity$dem_dum[polity$year==1980]) set.vertex.attribute(dipl_ties[[4]],"dem",polity$dem_dum[polity$year==1985]) set.vertex.attribute(dipl_ties[[5]],"dem",polity$dem_dum[polity$year==1990]) set.vertex.attribute(dipl_ties[[6]],"dem",polity$dem_dum[polity$year==1995]) set.vertex.attribute(dipl_ties[[7]],"dem",polity$dem_dum[polity$year==2000]) set.vertex.attribute(dipl_ties[[8]],"dem",polity$dem_dum[polity$year==2005]) dipl_ties[[1]] %v% "dem" ``` ``` ## [1] 1 1 0 0 0 1 0 0 0 0 0 0 0 1 0 1 1 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 0 0 1 0 ## [38] 0 1 0 0 1 0 0 0 0 1 0 0 0 1 1 1 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ## [75] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 ## [112] 0 0 0 1 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 1 0 ``` ``` r dipl_ties[[2]] %v% "dem" ``` ``` ## [1] 1 1 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 1 1 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 0 0 ## [38] 1 0 0 1 0 0 1 0 0 0 1 1 0 0 0 1 1 1 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ## [75] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 1 ## [112] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 ``` --- ## Specify the Model: ``` r library(btergm) #This runs for 5 min: tergm_Duque<-btergm(dipl_ties ~ edges + istar(2) + ostar(2) + mutual + triangles+ absdiff("dem")+ +nodeicov("dem")+ +edgecov(allies) +edgecov(contig), R=1000) summary(tergm_Duque) ``` --- ## Run the Model ``` ## Estimate Boot mean 2.5% 97.5% ## edges -5.377882 -5.405644 -5.6133 -5.1253 ## istar2 0.028279 0.028712 0.0267 0.0313 ## ostar2 0.029487 0.030022 0.0266 0.0340 ## mutual 2.485567 2.482407 2.3500 2.5956 ## triangle 0.011750 0.011628 0.0103 0.0127 ## absdiff.dem -0.287938 -0.289257 -0.3656 -0.2207 ## nodeicov.dem -0.204423 -0.195365 -0.2515 -0.1246 ## edgecov.allies[[i]] 1.205762 1.220685 1.1238 1.3205 ## edgecov.contig[[i]] 1.765076 1.765720 1.5732 2.0623 ``` --- ## Results <img src="images/Duque_res.png" width="400px" style="display: block; margin: auto;" /> --- #Example with Time Vars ``` r #This runs for 5 min: tergm_Duque1<-btergm(dipl_ties ~ edges + istar(2) + ostar(2) + mutual + triangles+ absdiff("dem")+ +nodeicov("dem")+ +edgecov(allies)+ +edgecov(contig)+ timecov(), R=1000) summary(tergm_Duque1) ``` --- ## Example with Time Vars ``` ## Estimate Boot mean 2.5% 97.5% ## edges -4.915580 -4.944023 -5.2673 -4.5906 ## istar2 0.029973 0.030279 0.0283 0.0333 ## ostar2 0.031485 0.031775 0.0289 0.0353 ## mutual 2.428539 2.424032 2.2767 2.5464 ## triangle 0.011518 0.011492 0.0102 0.0126 ## absdiff.dem -0.298055 -0.300189 -0.3669 -0.2392 ## nodeicov.dem -0.122282 -0.130947 -0.2013 -0.0506 ## edgecov.allies[[i]] 1.199983 1.217104 1.0998 1.3312 ## edgecov.contig[[i]] 1.837357 1.828172 1.6100 2.1092 ## edgecov.timecov1[[i]] -0.126189 -0.127274 -0.2050 -0.0802 ``` --- ## Bayesing ERGM (BERGM) - [Paper: Caimo & Friel (2011)](https://arxiv.org/pdf/1007.5192.pdf) - [`Bergm` on CRAN](https://cran.r-project.org/web/packages/Bergm/index.html) - [Vignette](https://www.jstatsoft.org/article/view/v061i02) --- ## ego-ERGM - [Paper: Salter-Townshend & Murphy (2016)](https://www.tandfonline.com/doi/full/10.1080/10618600.2014.923777?casa_token=OW--9OnTYVQAAAAA:rZoEcB6rP28SRir4MK3xK8l-URnKY18PyRsXy09nxPgemy3Oh_-vm9A8ynAchlWdXy5Z9ERt1sYs) --- ## Separable temporal ERGM (STERGM) - [Paper: Krivitsky & Handcock (2012)](https://arxiv.org/pdf/1011.1937.pdf) - [`tergm` on CRAN](https://cran.r-project.org/web/packages/tergm/) - [Vignette](https://cran.r-project.org/web/packages/tergm/vignettes/STERGM.pdf) <img src="images/stergm_pic.png" width="500px" style="display: block; margin: auto;" /> --- ## Hierarchical Exponential-Family Graph Model (HERGM) - [Paper: Schweinberger & Handcock (2015)](https://rss.onlinelibrary.wiley.com/doi/abs/10.1111/rssb.12081) - [`hergm` on CRAN](https://cran.r-project.org/web/packages/hergm/index.html) - [Vignette](https://www.jstatsoft.org/article/view/v085i01) --- ## Multilevel ERGM - [Paper: Wang et al. (2016)](https://www.researchgate.net/profile/Emmanuel_Lazega2/publication/301266001_General_Conclusion/links/5a88026ca6fdcc6b1a3b606d/General-Conclusion.pdf#page=130) <img src="images/multilevel_ergm.png" width="500px" style="display: block; margin: auto;" />