\documentclass{article}
\usepackage{natbib} %for easy bibliography
\usepackage{hyperref} %for url links
\usepackage{comment}
\usepackage{color}
%\VignetteIndexEntry{networkDynamic Example}
\begin{document}
\definecolor{Sinput}{rgb}{0.19,0.19,0.75}
\definecolor{Soutput}{rgb}{0.2,0.3,0.2}
\definecolor{Scode}{rgb}{0.75,0.19,0.19}
\DefineVerbatimEnvironment{Sinput}{Verbatim}{formatcom = {\color{Sinput}}}
\DefineVerbatimEnvironment{Soutput}{Verbatim}{formatcom = {\color{Soutput}}}
\DefineVerbatimEnvironment{Scode}{Verbatim}{formatcom = {\color{Scode}}}
\renewenvironment{Schunk}{}{}
\SweaveOpts{concordance=TRUE}

<<>>=
foo <- packageDescription("networkDynamic")
@

\title{Package examples for \Sexpr{foo$Package}: \Sexpr{foo$Title} \\ \small{Version \Sexpr{foo$Version}}}
\author{Carter T. Butts, Skye Bender-deMoll, Ayn Leslie-Cook, \\Pavel N. Krivitsky, Zack Almquist, David R. Hunter,\\ Martina Morris }
\maketitle
\tableofcontents

\section{Introduction}
The \verb@networkDynamic@ package provides support for a simple family of dynamic extensions to the \verb@network@ \citep*{network} class; these are currently accomplished via the standard \verb@network@ attribute functionality (and hence the resulting objects are still compatible with all conventional routines), but greatly facilitate the practical storage and utilization of dynamic network data. The dynamic extensions are motivated in part by the need to have a consistent data format for exchanging data, storing the inputs and outputs to relational event models, statistical estimation and simulation tools such as \verb@ergm@ \citep*{ergm} and \verb@tergm@ \citep*{tergm}, and dynamic visualizations.
The key features of the package provide basic utilities for working with networks in which:
\begin{itemize}
\item Vertices have `activity' or `existence' status that changes over time (they enter or leave the network)
\item Edges appear and disappear over time
\item Arbitrary attribute values attached to vertices and edges change over time
\item Meta-level attributes of the network change over time
\end{itemize}
Both continuous and discrete time models are supported, and it is possible to effectively blend multiple temporal representations in the same object.

In addition, the package is primarily oriented towards handling the dynamic network data inputs and outputs to network statistical estimation and simulation tools like \verb@statnet@ and \verb@tergm@. This document will provide a quick overview and use demonstrations for some of the key features. We assume that the reader is already familiar with the use and features of the \verb@network@ package.

Note: Although \verb@networkDynamic@ shares some of the goals (and authors) of the experimental and easily confused \verb@dynamicNetwork@ package \citep*{dynamicNetwork}, they are incompatible.

\section{How to start and end relationships easily}
This section gives a very quick, condensed example of starting and ending edges, to show why this is useful and to introduce some of the alternate syntax options.

\subsection{Activating edges}
The standard assumption in the \verb@network@ package and most sociomatrix representations of networks is that an edge between two vertices is either present or absent. However, many of the phenomena that we wish to describe with networks are dynamic rather than static processes, having a set of edges which change over time. In some situations the edge connecting a dyad may break and reform multiple times as a relationship is ended and re-established. The \verb@networkDynamic@ package adds the concept of `activation spells' for each element of a \verb@network@ object.
Edges are considered to be present in a network when they are active, and treated as absent during periods of inactivity. After a relationship has been defined using the normal syntax or network conversion utilities, it can be explicitly activated for a specific time period using the \verb@activate.edges@ methods. Alternatively, edges can be added and activated simultaneously with the \verb@add.edges.active@ helper function.

<<>>=
library(networkDynamic)            # load the dynamic extensions
triangle <- network.initialize(3)  # create a toy network
add.edge(triangle,1,2)             # add an edge between vertices 1 and 2
add.edge(triangle,2,3)             # add one more edge
activate.edges(triangle,at=1)      # turn on all edges at time 1 only
activate.edges(triangle,onset=2, terminus=3,
               e=get.edgeIDs(triangle,v=1,alter=2))
add.edges.active(triangle,onset=4, length=2,tail=3,head=1)
@

Notice that the \verb@activate.edges@ method refers to the relationship using the \verb@e@ argument to specify the ids of the edges to activate. To be safe, we look up the ids using the \verb@get.edgeIDs@ method, with the \verb@v@ and \verb@alter@ arguments indicating the ids of the vertices involved in the edge. The \verb@onset@ and \verb@terminus@ parameters give the starting and ending point of the activation period (more on this and the \verb@at@ syntax later).

When a network object has dynamic elements added, it also gains the \verb@networkDynamic@ class, so it is both a \verb@network@ and a \verb@networkDynamic@ object.

<<>>=
class(triangle)
print(triangle)
@

\subsection{Peeking back in time}
After the activity spells have been defined for a network, it is possible to extract views of the network at arbitrary points in time using the \verb@network.extract@ function in order to calculate traditional graph statistics.
<<>>=
degree<-function(x){as.vector(rowSums(as.matrix(x)) +
                              colSums(as.matrix(x)))} # handmade degree function
degree(triangle)  # degree of each vertex, ignoring time
degree(network.extract(triangle,at=0))
degree(network.extract(triangle,at=1)) # just look at t=1
degree(network.extract(triangle,at=2))
degree(network.extract(triangle,at=5))
degree(network.extract(triangle,at=10))
@

The vertex degrees at each extracted time point are different from what would be expected for the ``timeless'' network. When the network was sampled outside of the defined time range (at 0 and 10) it returned degrees of 0, indicating that no edges are present at all. It may be helpful to plot the networks to understand what is going on. The plots below show the result of the standard plot command (\verb@plot.network.default@) for the triangle, as well as plots of the network at specific time points.

<<>>=
par(mfrow=c(2,2)) # show multiple plots
plot(triangle,main='ignoring dynamics',displaylabels=TRUE)
plot(network.extract(
  triangle,onset=1,terminus=2),main='at time 1',displaylabels=TRUE)
plot(network.extract(
  triangle,onset=2,terminus=3),main='at time 2',displaylabels=TRUE)
plot(network.extract(
  triangle,onset=5,terminus=6),main='at time 5',displaylabels=TRUE)
@

%\begin{figure}
%\begin{center}
%<>=
%<>
@
%\end{center}
%\caption{Network plot of our trivial triangle network}
%\label{fig:fig1}
%\end{figure}

\section{Birth, Death, Reincarnation and other ways for vertices to enter and leave networks}

\subsection{Activating vertices}
Many network models need the ability to specify activity spells for vertices in order to account for changes in the population due to `vital dynamics' (births and deaths) or other types of entrances and exits from the sample population. In \verb@networkDynamic@, activity spells for a vertex can be specified using the \verb@activate.vertices@ methods. Like edges, vertices can have multiple spells of activity.
If we build on the triangle example:

<<>>=
activate.vertices(triangle,onset=1,terminus=5,v=1)
activate.vertices(triangle,onset=1,terminus=10,v=2)
activate.vertices(triangle,onset=4,terminus=10,v=3)
network.size.active(triangle,at=1) # how big is it?
network.size.active(triangle,at=4)
network.size.active(triangle,at=5)
@

Using the \verb@network.size.active@ function shows us that specifying the activity ranges has effectively changed the size of the network (and the corresponding vertex indices--more on that later) at different times. Notice also that we've created contradictions in the definition of this hand-made network, for example stating that vertex 3 isn't active until time 4 when earlier we said that there were ties between all nodes at time 1. The package does not prohibit these kinds of paradoxes, but it does provide a utility to check for them.

<<>>=
network.dynamic.check(triangle)
@

\subsection{Deactivating elements}
In this case, we can resolve the contradictions by explicitly deactivating the edges involving vertex 3:

<<>>=
deactivate.edges(triangle,onset=1,terminus=4,
                 e=get.edgeIDs(triangle,v=3,neighborhood="combined"))
network.dynamic.check(triangle)
@

The deactivation method for vertices, \verb@deactivate.vertices@, works the same way, but accepts a \verb@v=@ parameter to indicate which vertices should be modified instead of the \verb@e=@ parameter.

\section{``Spells'': the magic under the hood}
In which we provide a brief glimpse into the underlying data structures.

\subsection{How we save time}
There are many possible ways of representing change in an edge set over time.
Several of the most commonly used are:
\begin{itemize}
\item A series of networks or network matrices representing the state of the network at sequential time points
\item An initial network and a list of edge toggles representing changes to the network at specific time points
\item A collection of `spell' intervals giving the onset and termination times of activity for the network element they are attached to
\item A set of multiplex edges with time values attached.
\end{itemize}

This package uses the spell representation, and stores the spells as perfectly normal but specially named \verb@active@ attributes on the network elements. Each of these attributes is a 2-column spell matrix in which the first column gives the onset, the second the terminus, and each row defines an additional activity spell for the network element. For more information, see \verb@?activity.attribute@. As an example, to peek at the spells defined for the vertices and edges:

<<>>=
get.vertex.activity(triangle) # vertex spells
get.edge.activity(triangle)   # edge spells
@

Notice that the first edge has a 2-spell matrix where the first spell extends from time 1 to time 1 (a zero-duration or instantaneous spell), and the second from time 2 to time 3 (a ``unit length'' spell; more on this below). The third edge has the interesting special ``null'' spell \verb@c(Inf,Inf)@, defined to mean `never active', which was produced when we deleted the activity associated with the 3rd edge.

Within this package, spells are assumed to be `right-open' intervals, meaning that the spell includes its lower bound but not its upper bound. For example, the spell [2,3) covers the range between $t \geq 2$ and $t < 3$. Another way of thinking of it is that terminus means ``until''. So the spell ranges from 2 \emph{until} 3, but does not include 3.
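
The right-open rule can be sketched in a few lines of base R. This is only an illustration of the interval logic described above, not the package's implementation: the helper name \verb@spell.contains@ is hypothetical, and the special handling of zero-duration spells (a point query matches a spell whose onset equals its terminus) is an assumption based on how the \verb@at@ examples earlier behave.

<<>>=
# hypothetical helper, not part of networkDynamic:
# is time t inside the right-open spell [onset, terminus)?
spell.contains <- function(onset, terminus, t) {
  point <- onset == terminus   # zero-duration (instantaneous) spell
  (onset <= t & t < terminus) | (point & t == onset)
}
spell.contains(2, 3, 2)      # the lower bound is included
spell.contains(2, 3, 3)      # the upper bound is excluded
spell.contains(1, 1, 1)      # an instantaneous spell matches its own time
spell.contains(Inf, Inf, 5)  # the null spell never matches a finite time
@

For real data the package's own query functions such as \verb@is.active@ should be used; the sketch is only meant to make the ``until'' reading of the terminus concrete.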
Although it would certainly be possible to directly modify the spells stored in the \verb@active@ attributes, it is much safer to use the various \verb@activate@ and \verb@deactivate@ methods to ensure that the spell matrix remains in a correctly defined state.

The goal of this package is to make it rarely necessary to work with spells directly, or even to worry very much about the underlying data structures. It should be possible to use the provided utilities to convert between the various representations of dynamic networks. However, even if the details of the data structure can be ignored, it is still important to be very clear about the underlying temporal model of the network you are working with.

\subsection{Multiple spells != multiplex}
One of the features that makes the \verb@network@ package so flexible is that it allows \emph{multiplex} edges. This means that a pair (or set ...) of vertices can be linked by multiple ``parallel'' edges. Often this is used as a way to store several different kinds of relations within the same network object. It is important to be clear that, as we have defined it, having multiplex edges between vertices is not the same thing as an edge with multiple activity spells. It is entirely possible to activate multiple edges between a vertex pair with different spell values in order to attach relationship-specific timing information, for situations where this is an appropriate and useful representation.

\section{Differences between Discrete and Continuous data}
It's 2 AM on Tuesday. Do you know what your temporal model is? Does 2 AM mean 2:00 AM, or from 2:00 to 2:59:59? We discuss this below, as well as other existential questions such as the differences between ``at'' and ``onset, terminus'' syntax. There are two key approaches to representing time when measuring something.

\subsection{You might be discrete if...}
The \emph{discrete} model thinks of time as equal chunks, ticks, discrete steps, or panels.
To measure something we count up the value of interest for that chunk. Discrete time is expressed as a series of integers. We can refer to the 1st step or the 365th step, but there is no concept of ordering of events within steps and we can't have fractional steps. A discrete time simulation can never move its clock forward by half-a-tick. As long as the steps can be assumed to be the same duration, there is no need to worry about what the duration actually is. This model is very common in the traditional social networks world. Sociometric survey data may be aggregated into a set of weekly network ``panels'', each of which is thought of as a discrete time step in the evolution of the network. We ignore the exact minute at which each survey was completed, so that we can compare the week-to-week dynamics.

\subsection{You might be continuous if...}
In a \emph{continuous} model, measurements are thought of as taking place at an instantaneous point in time (as precisely as can be reasonably measured). Events may have specific durations, but they will almost never be integers. Instead of being present in week 1 and absent in week 2, a relationship starts on Tuesday at 7:45 PM and ends on Friday at 10:01 AM. Continuous time models are useful when the ordering of events is important. It still may be useful to represent observations in panels or measure time in integer units, but we must assume that the state of the network could have changed between our observation at noon on Friday of week 1 and noon on Friday of week 2.

\subsection{Comparing models}
Although the underlying data model for the \verb@networkDynamic@ package is continuous time, discrete time models can easily be represented. But it is important to be clear about what model you are using when interpreting measurements.
For example, the \verb@activate.vertices@ and \verb@activate.edges@ methods can be called using an \verb@onset=t@ and \verb@terminus=t+1@ style, or an \verb@at=t@ style (which converts internally to \verb@onset=t, terminus=t@). Here are several ways of representing similar time information for an edge lasting two time steps:

<<>>=
disc <- network.initialize(2)
disc[1,2]<-1
activate.edges(disc,onset=4,terminus=6) # terminus = t+1
is.active(disc,at=4,e=1)
is.active(disc,at=5,e=1)
is.active(disc,at=6,e=1)
@

Remember that the edge is not active at time 6, because we specified that it is only active \emph{until} time 6. And since we are thinking of this as a discrete network, we shouldn't ask if the edge is active at \verb@t=5.5@ (but it is).

<<>>=
is.active(disc,at=5.5,e=1)
@

If we really wanted it to be active at time 6, we'd have to think of it as a continuous network and add on a tiny smidgen of time\footnote{Sometimes a tiny bit of time can get added on due to floating point rounding errors. In rare cases this causes problems in spell comparisons where spells don't match even though it seems they should. This happens because many decimal numbers do not have exact binary equivalents. For example, 1.0-0.9-0.1 = -2.775558e-17, not 0 as we might expect. Similarly, according to the rules of floating point math, 3.6125 != (289*0.0125).}.

<<>>=
cont <- network.initialize(2)
cont[1,2]<-1
activate.edges(cont,onset=3.0,terminus=6.0001)
is.active(cont,at=4,e=1)
is.active(cont,at=6,e=1)
is.active(cont,at=6.5,e=1)
@

We could also choose to represent each measurement as the point in time at which the edge was observed.

<<>>=
point <- network.initialize(2) # continuous waves
point[1,2]<-1
activate.edges(point,at=4)
activate.edges(point,at=5)
is.active(point,at=4,e=1)
is.active(point,at=4.5,e=1) # this doesn't make sense
is.active(point,at=4,e=1)
@

In short, \verb@networkDynamic@ provides some great tools, but you need to think carefully about how time is measured in your data to get correct results.
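
Because of the rounding issue mentioned in the footnote, exact comparisons of non-integer spell boundaries can be fragile. The following base R sketch shows the problem and one possible tolerance-based comparison. The helper \verb@same.time@ and its tolerance value are illustrative assumptions, not something the package provides.

<<>>=
x <- 1.0 - 0.9 - 0.1
x == 0                 # FALSE: binary rounding leaves a tiny residue
# hypothetical helper: treat two times as equal within a small tolerance
same.time <- function(a, b, tol = 1e-9) abs(a - b) < tol
same.time(x, 0)        # TRUE: the residue is far below the tolerance
same.time(6, 6.0001)   # FALSE: a deliberate offset is not absorbed
@

The tolerance has to be chosen well below the smallest meaningful time difference in the data, otherwise deliberately distinct times would be merged.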

\section{Show me how it was: extracting static views of dynamic networks}
Because working with spells correctly can be complicated, the package provides utility methods for dynamic versions of common network operations. View the help page at \verb@?network.extensions@ for full details and arguments.

\subsection{Testing for activity}
As is probably already apparent, the activity range of a vertex, set of vertices, edge, or set of edges can be tested using the \verb@is.active@ method by including a time range and a list of vertex ids or edge ids to check.

<<>>=
is.active(triangle, onset=1, length=1,v=2:3)
is.active(triangle, onset=1, length=1,e=get.edgeIDs(triangle,v=1))
@

\subsection{Listing active elements}
Depending on the end use, a more convenient way to express these queries might be to use utility functions to retrieve the ids of the network elements of interest that are active for that time range.

<<>>=
get.edgeIDs.active(triangle, onset=2, length=1,v=1)
get.neighborhood.active(triangle, onset=2, length=1,v=1)
is.adjacent.active(triangle,vi=1,vj=2,onset=2,length=1)
@

These methods of course accept the same additional arguments as their \verb@network@ counterparts.

In some situations it may be helpful to be able to extract a list of all the dyads that are active at a time point. The \verb@get.dyads.active@ function returns a two-column matrix (essentially an `edgelist') of the pairs of vertices connected by edges active within the query spell. This assumes that the network is not hypergraphic or multiplex.

<<>>=
get.dyads.active(triangle, at=1)
@

\subsection{Are regular network objects active?}
What happens when we ask about the activity of a regular \verb@network@ object? Or what if only some vertices or edges in a \verb@networkDynamic@ object have activity attributes defined? Many functions include the \verb@active.default@ parameter for controlling how elements without spells should be treated.
If the parameter is not explicitly given, it defaults to \verb@active.default=TRUE@ and elements without spells behave as if they are active from \verb@-Inf@ to \verb@Inf@.

<<>>=
static<-network.initialize(3)
is.active(static,at=100,v=1:3)
is.active(static,at=100,v=1:3,active.default=FALSE)
dynamic<-activate.vertices(static,onset=0,terminus=200,v=2)
is.active(dynamic,at=100,v=1:3)
is.active(dynamic,at=100,v=1:3,active.default=FALSE)
@

The \verb@active.default@ parameter doesn't alter the activity of elements that have been explicitly deactivated and are represented by the ``null spell'' \verb@c(Inf,Inf)@.

<<>>=
inactive<-network.initialize(2)
deactivate.vertices(inactive,onset=-Inf,terminus=Inf,v=2)
is.active(inactive,onset=Inf,terminus=Inf,v=1:2,active.default=TRUE)
@

\subsection{Basic descriptives}
The primary focus of the \verb@networkDynamic@ package is providing the foundational data structures and manipulation tools, so implementations of temporal SNA measures are provided by other packages such as \verb@tsna@. However, there are some crude metrics available here.

In some contexts, especially when writing simulations on a network that can work in both discrete and continuous time, it may be important to know all the time points at which the structure of the network changes. The package includes a function \verb@get.change.times@ that can return a list of times for the entire network, or for edges and vertices independently:

<<>>=
get.change.times(triangle)
get.change.times(triangle,vertex.activity=FALSE)
get.change.times(triangle,edge.activity=FALSE)
@

We have also implemented dynamic versions of the basic network functions \verb@network.size@ and \verb@network.edgecount@ which accept the standard activity parameters:

<<>>=
network.size.active(triangle,onset=2,terminus=3)
network.edgecount.active(triangle,at=5)
@

\subsection{Collapsing a network vs.
extracting it}
We've already introduced the \verb@network.extract@ function, which can extract a sub-range of time from a \verb@networkDynamic@ and return it as a \verb@networkDynamic@.

<<>>=
get.change.times(triangle)
network.edgecount(triangle)
notflat <- network.extract(triangle,onset=1,terminus=3,trim.spells=TRUE)
is.networkDynamic(notflat)
network.edgecount(notflat) # did we lose edge2?
get.change.times(notflat)
@

By default, the \verb@network.extract@ function returns a \verb@networkDynamic@ object with the subset of edges in the original network that are active during the query period. The \verb@trim.spells@ parameter tells it to take the more computationally expensive step of actually modifying the activity spells in all of the network elements to trim them to the specified range.

There is also a \verb@network.collapse@ function which extracts the appropriate range and returns a static \verb@network@ object with the timing information removed.

<<>>=
flat <-network.collapse(triangle,onset=1,terminus=3)
is.networkDynamic(flat)
get.change.times(flat)
network.edgecount(flat)
list.edge.attributes(flat)
@

If the argument \verb@rm.time.info=FALSE@ is given, the \verb@network.collapse@ function also adds \verb@activity.count@ and \verb@activity.duration@ attributes to the vertices and edges to give a crude summary of the timing information that has been removed. However, the duration information does not take into account possible censoring of ties at the beginning and end of the network observation time period.

<<>>=
flat <-network.collapse(triangle,onset=1,terminus=3,rm.time.info=FALSE)
flat%v%'activity.duration'
flat%e%'activity.count'
flat%e%'activity.duration'
@

\subsection{Wiping the slate: removing activity information}
Most \verb@network@ methods will ignore the timing information on a \verb@networkDynamic@ object. However, there may be situations where it is desirable to remove all of the timing information attached to a \verb@networkDynamic@ object.
(Note: this is not the same thing as deactivating elements of the network.) This can be done using the \verb@delete.edge.activity@ and \verb@delete.vertex.activity@ functions, which accept arguments to specify which elements should have the timing information deleted.

<<>>=
delete.edge.activity(triangle)
delete.vertex.activity(triangle)
get.change.times(triangle)
get.vertex.activity(triangle)
@

Once the timing information of the edges and/or vertices has been removed, other \verb@networkDynamic@ methods will assume activity or inactivity across all time points, based on the argument \verb@active.default@.

\subsection{Differences between ``any'' and ``all'' aggregation rules}
In addition to the point-based (\verb@at@ syntax) or unit interval (\verb@length=1@) activity tests and extraction operations used in most examples so far, the methods also support the idea of a ``query spell'' specified using the same onset and terminus syntax. So it is also possible (assuming it makes sense for the network being studied) to use \verb@length=27.52@ or \verb@onset=0, terminus=256@. Querying with a time range does raise an issue: how should we handle situations where edges or vertices have spells that begin or end part way through the query spell? Although other potential rules have been proposed, the methods currently include a \verb@rule@ argument that can take the values of \verb@any@ (the default) or \verb@all@. The former returns elements if they are active for any part of the query spell, and the latter only returns elements if they are active for the entire range of the query spell.
<<>>=
query <- network.initialize(2)
query[1,2] <-1
activate.edges(query, onset=1, terminus=2)
is.active(query,onset=1,terminus=2,e=1)
is.active(query,onset=1,terminus=3,rule='all',e=1)
is.active(query,onset=1,terminus=3,rule='any',e=1)
@

\section{Squooshing data into networkDynamic objects}
Obviously for most non-trivial data-sets it doesn't make sense to write out long lists of each edge and vertex to be added and removed. The package includes some handy conversion tools for moving from some common representations of network dynamics to a \verb@networkDynamic@ object. Currently, the \verb@networkDynamic()@ conversion function supports importing edge and vertex timing information from the following formats:
\begin{description}
\item[lists of networks]{A list of network objects assumed to describe sequential panels of network observations. Network sizes may vary if some vertices are only active in certain panels.}
\item[spells]{A matrix or data.frame of spells specifying vertex or edge timing. Columns are assumed to be \verb@[onset,terminus,vertex.id]@ for vertices and \verb@[onset,terminus,tail vertex.id, head vertex.id]@ for edges. Dynamic attributes can be included as extra columns.}
\item[toggles]{A matrix or data.frame of toggles giving a sequence of activation and deactivation times for network elements. Columns are assumed to be \verb@[toggle time,vertex.id]@ for vertices and \verb@[toggle time, tail vertex id of the edge, head vertex id of the edge]@ for edges.}
\item[changes]{Like toggles, but with an additional \verb@direction@ column indicating \verb@1@ if the toggle should change the element state to active and \verb@0@ if it should be deactivated.}
\end{description}
Please see \verb@?networkDynamic@ for full parameter explanations. The sections below give some more in-depth examples.

\subsection{But my data are panels of network matrices...}
Researchers frequently have network data in the form of network panels or ``stacks'' of network matrices.
The \verb@networkDynamic@ package includes one such classic dynamic network data-set in this format: Newcomb's Fraternity Networks. The data are 14 panel observations of friendship preference rankings among fraternity members in a 1956 sociology study. (For more details, see \verb@?newcomb@.) This network is a useful example because it has edge weights that change over time and the \verb@newcomb.rank@ version has asymmetric rank choice ties\footnote{The attributes of individual panels will be converted to dynamic attributes, see the section on TEAs}.

<<>>=
require(networkDynamic)
data(newcomb)                 # load the data
length(newcomb)               # how many networks?
is.network(newcomb[[1]])      # is it really a network?
as.sociomatrix(newcomb[[1]])  # peek at sociomatrix
newcombDyn <- networkDynamic(network.list=newcomb) # make dynamic
get.change.times(newcombDyn)
@

When converting panel data in this form, \verb@networkDynamic()@ assumes that the panels should be assigned times of unit intervals starting at t=0, so the first panel is given the spell [0,1), the second [1,2), etc. This is important because if you use ``at'' query syntax the time does not correspond to the panel index.

<<>>=
all(as.sociomatrix(newcomb[[5]]) ==
      as.sociomatrix(network.extract(newcombDyn,at=5)))
all(as.sociomatrix(newcomb[[5]]) ==
      as.sociomatrix(network.extract(newcombDyn,at=4)))
@

If this isn't consistent with how you would like to model your data, you can use the \verb@onsets@ and \verb@termini@ parameters to provide timings for each of the panels. This is also useful if we want to be explicit about the gap in observations due to the missing week 9.

<<>>=
newcombGaps <- networkDynamic(network.list=newcomb,
                              onsets=c(1:8,10:15),termini=c(2:9,11:16))
get.vertex.activity(newcombGaps)[[1]] # peek at spells for v1
@

Another option would be to use the \verb@adjust.activity@ function to modify the network timing after it was loaded.
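
The correspondence between panel indices and spell times discussed above can be sketched in base R without any network objects. This is only an illustration of the bookkeeping, under the right-open spell convention; the names \verb@default.spells@, \verb@gap.spells@ and \verb@panel.at@ are hypothetical and not part of the package.

<<>>=
n.panels <- 14
# default timing: panel i occupies the spell [i-1, i)
default.spells <- cbind(onset = 0:(n.panels - 1), terminus = 1:n.panels)
default.spells[5, ]   # panel 5 covers [4,5), so it is queried with at=4
# timing that starts at 1 and leaves week 9 unobserved
gap.spells <- cbind(onset = c(1:8, 10:15), terminus = c(2:9, 11:16))
# hypothetical helper: which panel (if any) covers time point t?
panel.at <- function(spells, t) {
  which(spells[, "onset"] <= t & t < spells[, "terminus"])
}
panel.at(default.spells, 4)
panel.at(gap.spells, 9)    # integer(0): nothing was observed in week 9
panel.at(gap.spells, 10)
@

Writing the onsets and termini out in this way before passing them to the conversion function makes it easier to spot off-by-one mistakes between panel numbers and query times.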

It is often a good idea to store some descriptive meta-data for the network to give hints to functions later on about how they should bin or display the data:

<<>>=
nobs <-get.network.attribute(newcombGaps,'net.obs.period')
names(nobs)
nobs$'time.unit'<-'week'
nobs$'mode'<-'discrete'
nobs$'time.increment'<-1
set.network.attribute(newcombGaps,'net.obs.period',nobs)
@

\subsection{Converting from toggles}
Sometimes dynamic network data from a simulation process arrives in an efficient ``toggle'' format. The edge dynamics of the network can be expressed as a three-column matrix giving simply the time at which an edge changes, and the vertices at either end of the edge. Because it doesn't say if the edge is turned on or off (see the ``changes'' format for that) we also need an initial network to give the starting state for each of the edges.

Usually this kind of input would come from the low-level output of a simulation, but we can create a crude synthetic data-set to demonstrate the conversion. Let's say we have a network of size 10, and at each time step we want a single randomly chosen edge to turn on or off, and we will do this 1000 times.

<<>>=
toggles <-cbind(time=1:1000,
                tail=sample(1:10,1000,replace=TRUE),
                head=sample(1:10,1000,replace=TRUE))
head(toggles) # peek at beginning
empty<-network.initialize(10,loops=TRUE) # to define initial states
randomNet <-networkDynamic(base.net=empty,edge.toggles=toggles)
@

We converted the toggles using the \verb@edge.toggles@ argument to \verb@networkDynamic()@. If we wanted the vertices to flip on and off as well, the function also accepts a \verb@vertex.toggles@ argument.

Once the toggles have been translated into a network format, we can do things like look at the distribution of edge durations created by our crude model.
<<>>=
edgeDurations<-get.edge.activity(randomNet,as.spellList=TRUE)$duration
hist(edgeDurations)
summary(edgeDurations)
@

Our ``simulation'' was long enough that we should see a nice long-tailed distribution of edge activity durations. However, when we check the terminus censoring,

<<>>=
sum(get.edge.activity(randomNet,as.spellList=TRUE)$terminus.censored)
@

we can see that there are around 50 edges that had not ended at the termination of our simulation period, so we should interpret the mean durations with caution.

It is also interesting to consider how we might have skewed the edge durations because we started with an empty network. To examine this we can construct a time-series of the number of active edges in the network by repeatedly applying the dynamic edge-counting function at each time point.

<<>>=
nEdgesActive<-sapply(0:1000,
    function(t){network.edgecount.active(randomNet,at=t)})
plot(nEdgesActive,xlab='timestep',ylab='number of active edges')
@

This shows us that it took a few hundred time steps of ``burn in'' for the network to move from its initial extreme (the zero-edges condition) to a sort of equilibrium state. After that, there were enough active edges that some of them started getting toggled off again, and the count continues to wobble around a value of about 50 edges until the end.

\subsection{Converting a stream of spells: McFarland's classroom interactions}
Not surprisingly, the \verb@networkDynamic()@ function can create \verb@networkDynamic@ objects from a matrix of activity spells stored in a \verb@data.frame@. It assumes that the first two columns give the onset and terminus of a spell, and the third and fourth columns correspond to the network indices of the ego and alter vertices for that dyad. Multiple spells per dyad are expressed by multiple rows. In the following example, we read some tabular data describing arc relationships out of example text files.
For more information about the data-set (which also exists as a \verb@networkDynamic@ object) see \verb@?cls33_10_16_96@.

<<>>=
vertexData <-read.table(system.file('extdata/cls33_10_16_96_vertices.tsv',
    package='networkDynamic'),header=TRUE,stringsAsFactors=FALSE)
vertexData[1:5,] # peek
edgeData <-read.table(system.file('extdata/cls33_10_16_96_edges.tsv',
    package='networkDynamic'),header=TRUE,stringsAsFactors=FALSE)
edgeData[1:5,] # peek
@

Now that the spell data is loaded in, we need to form it into a network. We want to use the \verb@vertex_id@, \verb@start_minute@ and \verb@end_minute@ from the vertex data, and the \verb@from_vertex_id@, \verb@to_vertex_id@, \verb@start_minute@ and \verb@end_minute@ from the edge data. Since the columns are not in the order that we want, we reorder the column indices when passing to the \verb@edge.spells@ and the \verb@vertex.spells@ arguments of \verb@networkDynamic@.

<<>>=
classDyn <- networkDynamic(vertex.spells=vertexData[,c(3,4,1)],
                           edge.spells=edgeData[,c(3,4,1,2)])
@

The conversion printed out a summary of the (optional) network observation period attribute (\verb@net.obs.period@), which tells us that it made a guess that this was a continuous time network. And if we peek at the change times of the network, it appears that this is probably accurate.

<<>>=
get.change.times(classDyn)[1:10]
@

If we wanted to include the \verb@weight@ and \verb@interaction_type@ variables in the edge data as dynamic edge attributes, we can include them as extra columns and set some additional parameters.
<>==
classDyn <- networkDynamic(vertex.spells=vertexData[,c(3,4,1)],
  edge.spells=edgeData[,c(3,4,1,2,5,6)],
  create.TEAs=TRUE,edge.TEA.names=c('weight','type'))
@
We can also store the time units for the network, just in case we (or someone else) need to know them later.
<>==
nobs <-get.network.attribute(classDyn,'net.obs.period')
names(nobs)
nobs$'time.unit'<-'minutes'
set.network.attribute(classDyn,'net.obs.period',nobs)
@
The original data include some attribute information for the vertices which we'd like to add, but first we need to check if they are dynamic or not. We will assume that if each \verb@vertex_id@ has only one row, the attributes must have only one spell associated with them and can be treated as static. We also must make sure the \verb@vertex_ids@ are in order. Since \verb@read.table@ creates a \verb@data.frame@ object, we explicitly convert factors to character values.
<>=
nrow(vertexData)==length(unique(vertexData$vertex_id))
@
Looks good! Let's load 'em up...
<>=
set.vertex.attribute(classDyn,"data_id",vertexData$data_id)
set.vertex.attribute(classDyn,"sex",as.character(vertexData$sex))
set.vertex.attribute(classDyn,"role",as.character(vertexData$role))
@
To run standard network measures we will need to first ``bin'' or ``slice'' the network up into static networks. Using the \verb@get.networks@ function we will collapse the classroom data into a series of networks, each of which aggregates 5 minutes of streaming interactions.
<>=
classNets <- get.networks(classDyn,start=0,end=50,time.increment=5,rule='latest')
classDensity <- sapply(classNets, network.density)
plot(classDensity,type='l',xlab='network slice #',ylab='density')
@
Since this data-set consists of continuous time streams of relational information, the choice of 5 minutes is fairly arbitrary. Other durations will reveal dynamics at various timescales.
<>=
par(mfrow=c(2,2)) # show multiple plots
plot(network.extract(
  classDyn,onset=0,length=40,rule="any"),
  main='entire 40 min class period',displaylabels=TRUE)
plot(network.extract(
  classDyn,onset=0,length=5,rule="any"),
  main='a 5 min chunk',displaylabels=TRUE)
plot(network.extract(
  classDyn,onset=0,length=2.5,rule="any"),
  main='a 2.5 min chunk',displaylabels=TRUE)
plot(network.extract(
  classDyn,onset=0,length=.1,rule="any"),
  main='a single conversation turn',displaylabels=TRUE)
@

\subsection{Reconciling data and adjusting timing}

Often network data does not arrive in the ideal form for analysis and may need some cleaning and adjustment. For example, the data describing edge activity and vertex activity may have come from different sources and may not fully align. As mentioned earlier, the \verb@network.dynamic.check@ function can be used to highlight some of these cases, but how can we resolve them? The \verb@reconcile.edge.activity@ and \verb@reconcile.vertex.activity@ functions can be used to modify either the edge or vertex activity spells to force them to be consistent. If we make a trivial example where the first vertex is only active part of the time, even though its edge is always active, we can then force the edge to match.
<>=
# make a network where the first vertex is not always active
dirtyData<-networkDynamic(vertex.spells=matrix(c(0,1,1,
                                                 3,5,1,
                                                 0,5,2),ncol=3,byrow=TRUE),
  edge.spells=matrix(c(0,5,1,2),ncol=4,byrow=TRUE))
network.dynamic.check(dirtyData)$dyad.checks
# print out the edge spell before ..
as.data.frame(dirtyData)
@
We can then ask to have the spells of the edges truncated to match the vertices.
<>=
reconcile.edge.activity(dirtyData,mode="reduce.to.vertices")
as.data.frame(dirtyData)
@
Notice that edge.id 1 now has two spells, one from 0 until 1, and the second from 3 until 5.
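A quick way to confirm that a reconciliation did what we expected is to run the consistency check before and after. The following is only a minimal sketch: it rebuilds the same small network as above so that it can run on its own, and it assumes that \verb@network.dynamic.check@ accepts \verb@verbose=FALSE@ to silence its printed report.

```r
library(networkDynamic)

# rebuild the small inconsistent network from the example above
dirtyData <- networkDynamic(
  vertex.spells = matrix(c(0,1,1,
                           3,5,1,
                           0,5,2), ncol = 3, byrow = TRUE),
  edge.spells = matrix(c(0,5,1,2), ncol = 4, byrow = TRUE))

# dyad.checks has one logical entry per edge
before <- network.dynamic.check(dirtyData, verbose = FALSE)$dyad.checks

# trim the edge spells so they never extend past their vertices
reconcile.edge.activity(dirtyData, mode = "reduce.to.vertices")

after <- network.dynamic.check(dirtyData, verbose = FALSE)$dyad.checks

# compare the two checks side by side
c(before = all(before), after = all(after))
```

Running the check again after each cleaning step makes it harder for a leftover inconsistency to slip into later analysis.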
Another common use case for this is when we have only edge activity data, and want to make it so that vertices only become active when they are involved in an edge (they don't appear as isolates earlier).
<>=
# before..
head(get.vertex.activity(classDyn,as.spellList = TRUE))
# modify vertex spells to encompass all of their incident edges
reconcile.vertex.activity(classDyn,mode='encompass.edges')
# after..
head(get.vertex.activity(classDyn,as.spellList = TRUE))
@
A related utility is the function for applying a transformation to all the time units in a network. This is useful if you want to shift all of the timing to start at a certain value, or if you want to change the time units that the network is measured in. For example, if we wanted to change the McFarland classroom network to be measured in units of hours instead of minutes:
<>=
adjust.activity(classDyn,factor = 1/60)
head(get.vertex.activity(classDyn,as.spellList = TRUE))
@

\subsection{Importing Pajek's timed network format}

Version 1.13 of the \verb@network@ package expanded support for reading the .net and .paj formats used by the Pajek network analysis software (\url{http://mrvar.fdv.uni-lj.si/pajek/}), including Pajek's most commonly used temporal network formats in the \verb@read.paj()@ function. We can download the Sampson monastery data from a Pajek example data set and extract it from a compressed zip archive. Note that to get \verb@read.paj@ to give us back a \verb@networkDynamic@ object we have to include the \verb@time.format='networkDynamic'@ argument.
<>=
sampFile<-tempfile('days',fileext='.zip')
download.file('https://github.com/bavla/Nets/raw/refs/heads/master/data/Pajek/esna/Sampson.zip',sampFile)
sampData<-read.paj(unz(sampFile,'Sampson.paj'),
  time.format='networkDynamic',
  edge.name = 'liked')
names(sampData)
@
Because this was a .paj file, it can include multiple networks and some vertex-level data as separate elements in the list returned.
In this case, we want to grab the first, more complete network (for background info, see \url{http://vlado.fmf.uni-lj.si/pub/networks/data/esna/sampson.htm}). We can attach the partition info onto the network as an attribute, and filter out the edges to include only the positive ``like'' ties.
<>=
sampData$partitions
sampData$networks[[1]]
sampDyn<-get.inducedSubgraph(sampData$networks[[1]],
  eid=which(sampData$networks[[1]]%e%'liked' > 0))
sampDyn%v%'cloisterville'<-sampData$partitions$Sampson_cloisterville
@
We can then extract and plot the network at several time points. Notice that some of the novices had left the monastery by time 5.
<>=
par(mfcol=c(2,2))
plot(network.extract(sampDyn,at=1),vertex.col='cloisterville',
  edge.col='gray', label.cex=0.6, displaylabels=TRUE,
  main='Sampson "like" net at time 1')
plot(network.extract(sampDyn,at=2),vertex.col='cloisterville',
  edge.col='gray',label.cex=0.6, displaylabels=TRUE,
  main='Sampson "like" net at time 2')
plot(network.extract(sampDyn,at=3),vertex.col='cloisterville',
  edge.col='gray',label.cex=0.6, displaylabels=TRUE,
  main='Sampson "like" net at time 3')
plot(network.extract(sampDyn,at=5),vertex.col='cloisterville',
  edge.col='gray',label.cex=0.6, displaylabels=TRUE,
  main='Sampson "like" net at time 5')
par(mfcol=c(1,1))
@

\subsection{Batteries and tergm example not included}

Unfortunately we can't include a real live \verb@tergm@ model example here, because the \verb@tergm@ package depends on \verb@networkDynamic@, and we don't want to create a circular package dependency. But there are nice examples in the \verb@tergm@ and \verb@ndtv@ package vignettes. The \verb@EpiModel@ package also includes tools for saving longitudinal network epidemic simulations as \verb@networkDynamic@ objects.
\section{Persistent IDs}

As has already been mentioned, the standard vertex and edge ids used in the \verb@network@ package are indices, so they must change when the network size changes during an extraction operation\footnote{In the case of vertex ids, they may also change during vertex deletions, or additions to the first mode of a bipartite network.}. So how can we follow a specific network element through a series of slicing and dicing operations? Since v0.5, the \verb@networkDynamic@ package supports defining an (optional) ``persistent id'' (pid) for edges and vertices.

\subsection{Translating between ids and persistent ids}

Once a persistent id has been defined, the functions \verb@get.vertex.id@ and \verb@get.vertex.pid@ can be used to translate between the normal ids and the pids. For edges, the functions are named \verb@get.edge.id@ and \verb@get.edge.pid@. Let's look at an example where we find the original vertices corresponding to vertices in a smaller extracted net.
<>==
haystack<-network.initialize(30)
activate.vertices(haystack,v=10:20)
@
Now hide some needles in the haystack...
<>==
set.vertex.attribute(haystack,'needle',TRUE,v=sample(10:20,2))
@
... make up an id for the vertices, and define that it will be our persistent id.
<>=
set.vertex.attribute(haystack,'hayId',paste('straw',1:30,sep=''))
set.network.attribute(haystack,'vertex.pid','hayId')
@
Let's find the needles in the new stack after some hay has been removed over time.
<>=
newstack<-network.extract(haystack,at=100,active.default=FALSE)
network.size(newstack)
needleIds <-which(get.vertex.attribute(newstack,'needle'))
needleIds
@
What are the pids of vertices with needles? Which vertices are the corresponding ones in the original haystack?
<>==
get.vertex.pid(newstack,needleIds)
get.vertex.id(haystack,get.vertex.pid(newstack,needleIds))
@

\subsection{Defining pids}

In the example above, we made up a new id, but if the data set already had some type of unique identifier for vertices, we could have used it instead. For the previous McFarland example we could use \verb@data_id@:
<>==
set.network.attribute(classDyn,'vertex.pid','data_id')
@
In some cases it might be tempting to use the \verb@vertex.names@ attribute of a network as a persistent id without checking that it is unique. This can cause problems if vertices are added or deleted.
<>=
net<-network.initialize(3)
add.vertices(net,1)
delete.vertices(net,2)
# notice the NA value
as.matrix(net)
@
To make life easier, we can just indicate that a unique set of \verb@vertex.names@ can safely be used as a vertex.pid by setting \verb@vertex.pid@ to \verb@'vertex.names'@. This has the advantage of not adding an extra attribute that needs to be carried around, and the pids will appear as the labels.
<>=
net<-network.initialize(3)
set.network.attribute(net,'vertex.pid','vertex.names')
add.vertices(net,1,vertex.pid='4')
add.vertices(net,1)
delete.vertices(net,2)
as.matrix(net)
@
Notice that when we added vertices in the first case we explicitly included a vertex pid for the new vertex. In the second case, we didn't specify a pid, so the function made up a messy one to make sure the pids stayed unique.

The function \verb@initialize.pids()@ can also be used to create a set of pids on all existing vertices (named \verb@vertex.pid@) and edges (named \verb@edge.pid@). The pids are currently initialized with meaningless but unique pseudo-random hex strings using the \verb@tempfile@ function (something like \verb@'4ad912252bc2'@). These are also the types of new pids that will be created if \verb@add.vertices@ is called in a network with a vertex.pid defined, as in the example above. It is a good idea to define pids after a network object has been constructed and before any extractions are performed.
<>==
net<-network.initialize(3)
add.edges(net,tail=1:2,head=2:3)
initialize.pids(net)
net%v%'vertex.pid'
net%e%'edge.pid'
@
The edge pids can be useful in looking up edges if vertex deletions cause the ids of the edge's vertices to be permuted.

\section{Transforming networkDynamic objects to other representations}

Great, I got all my data into your magic format, now how do I get it out again?

\subsection{Converting to lists of spells}

As we've already demonstrated, for a number of types of analysis it is useful to be able to dump the edge timing information into a ``flat'' tabular representation.
<>=
newcombEdgeSpells<-get.edge.activity(newcombDyn,as.spellList=TRUE)
newcombEdgeSpells[1:5,] # peek at the beginning
@
The first two columns of the spell matrix give the onset and terminus for the spell, and the next two give the network indices of the tail and head vertices of the edge. The \verb@terminus.censored@ column indicates whether a statistical estimation process using this spell list should assume that the entire duration of the edge's activity is included or that it was partially censored by the observation window. Note that the \verb@duration@ column gives the duration of the \emph{specific spell} of the edge (not the entire edge duration), and an edge may appear in multiple rows.

Because this may be the most common type of conversion people need to do, we also created an \verb@as.data.frame@ method as an alias for the \verb@get.edge.activity@ function.
<>=
newcombEdgeSpells<-as.data.frame(newcombDyn)
newcombEdgeSpells[1:5,] # peek at the beginning
@
Of course, these methods only return information about the edge dynamics, so there is a corresponding \verb@get.vertex.activity@ function.
<>=
vertSpells <- get.vertex.activity(newcombDyn,as.spellList=TRUE)
vertSpells[1:5,]
@
This is not so exciting in this example, since we don't have any vertex dynamics.
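Because the \verb@duration@ column of the edge spell list is per spell rather than per edge, getting a total active time for each edge takes one extra aggregation step. The sketch below illustrates one way to do this on a small made-up network (the object names \verb@toySpells@, \verb@toyDyn@ and \verb@totalDuration@ are hypothetical and not part of the package); it assumes the spell list contains the \verb@duration@ and \verb@edge.id@ columns.

```r
library(networkDynamic)

# hypothetical toy data: columns are onset, terminus, tail, head
# the dyad 1->2 is active twice, the dyad 2->3 once
toySpells <- matrix(c(0, 2, 1, 2,
                      5, 6, 1, 2,
                      1, 4, 2, 3), ncol = 4, byrow = TRUE)
toyDyn <- networkDynamic(edge.spells = toySpells)

# one row per spell, so the edge between 1 and 2 appears in two rows
toyEdgeSpells <- as.data.frame(toyDyn)

# add up the per-spell durations within each edge id
totalDuration <- tapply(toyEdgeSpells$duration, toyEdgeSpells$edge.id, sum)
totalDuration
```

The same pattern works with other summaries (for example \verb@length@ to count the number of spells per edge) by swapping the function passed to \verb@tapply@.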
\subsection{Converting to a list of networks or matrices}

The \verb@get.networks@ function gives us a quick way to extract a series of static \verb@network@ objects from a \verb@networkDynamic@ object. We can extract a list of several non-overlapping unit slices from the random network we created a while back, and then use \verb@lapply@ to print them out as matrices.
<>=
lapply(get.networks(randomNet,start=0,end=2,time.increment=1),as.matrix)
@
If we have a network that already has ``slicing'' information in its \verb@net.obs.period@ attribute, \verb@get.networks@ can use those values as defaults. So if we apply it to our previous \verb@newcombGaps@ example we should get back our list of 15 networks, with the ninth one missing.
<>=
newSlices<-get.networks(newcombGaps)
sapply(newSlices,network.size)
@

\section{Dynamic attributes}

An important tool for working with dynamic networks is the ability to represent time-varying attributes of networks, vertices (changing properties) and edges (changing weights). In the \verb@networkDynamic@ package we refer to these as dynamic attributes or ``TEAs'' (Temporally Extended Attributes). A TEA is a standard edge, vertex, or network attribute that has a name ending in \verb@.active@ and carries meta-data detailing changes in its state over time. We store the TEAs as a two-part list, where the first part is a list of values, and the second is a spell matrix where each row gives the onset and terminus of activity for the corresponding element in the value list. See \verb@?activate.vertex.attribute@ for the full specification of Temporally Extended Attributes. Of course we try to hide most of this as much as possible by providing a set of accessor functions.

\subsection{Activating TEA attributes}

The functions for creating TEA attributes are named similarly to the regular functions for manipulating network, vertex, and edge attributes, but they also accept the spell-related arguments (onset, terminus, at, length).
<>=
net <-network.initialize(5)
activate.vertex.attribute(net,"happiness", -1, onset=0,terminus=1)
activate.vertex.attribute(net,"happiness", 5, onset=1,terminus=3)
activate.vertex.attribute(net,"happiness", 2, onset=4,terminus=7)
list.vertex.attributes(net) # what are they actually named?
get.vertex.attribute.active(net,"happiness",at=2)
get.vertex.attribute(net,"happiness.active",unlist=FALSE)[[1]]
@
Notice that when using the \verb@activate.vertex.attribute@ and \verb@get.vertex.attribute.active@ functions we don't have to include the ``.active'' part of the attribute name; the functions handle that on their own. When we used the regular \verb@get.vertex.attribute@ function to peek at the attribute of the first vertex we can see the list of values (-1,5,2) and the spell matrix. We also had to include the \verb@unlist=FALSE@ argument so that it didn't mangle the list object by smooshing it into a vector when it was returned.

There are similar activation functions for edge and network-level attributes.
<>=
activate.network.attribute(net,'colors',"red",
  onset=0,terminus=1)
activate.network.attribute(net,'colors',"green",
  onset=1,terminus=5)
add.edges(net,tail=c(1,2,3),head=c(2,3,4)) # need edges to activate
activate.edge.attribute(net,'weight',c(5,12,7),onset=1,terminus=3)
activate.edge.attribute(net,'weight',c(1,2,1),onset=3,terminus=7)
@
Since we didn't give the edges themselves timing info, they will be assumed to be always active. But we've specified that the ``weight'' of the edges should vary over time.

\subsection{Querying TEA attributes}

What happens when there are no values defined? When we activated the vertex attributes, we left a gap in the spell coverage. What if we ask for values in that time period?
<>=
get.vertex.attribute.active(net,"happiness",at=3.5)
get.vertex.attribute.active(net,"happiness",
  onset=2.5,terminus=3.5)
get.vertex.attribute.active(net,"happiness",
  onset=2.5,terminus=3.5,rule="all")
@
In the first case, no values are defined so \verb@NA@ is returned. In the second case, a value is returned because the inclusion rule defaults to \verb@rule='any'@ and the query spell intersected part of the spell associated with the value 5. We can ask it to only return values if they match the entire query spell by setting \verb@rule='all'@, which is what happened in the third case.

The functions also permit queries that will intersect with multiple attribute values. In this case the earliest value is returned, but it also gives a warning that the value returned may not be the appropriate value for the time range.
<>=
get.vertex.attribute.active(net,"happiness",onset=2.5,terminus=4.5)
@
<>=
cat('Warning message:
In get.vertex.attribute.active(net, "happiness", onset = 2.5, terminus = 4.5) :
 Multiple attribute values matched query spell for some vertices, only earliest value used')
@
If we know that this behavior (returning the earliest attribute value that intersects with the query spell) is what is desired, we can suppress the warnings by specifying \verb@rule='earliest'@.
<>=
get.vertex.attribute.active(net,"happiness",onset=2.5,terminus=4.5,rule='earliest')
@
As might be expected, \verb@rule='latest'@ also works, but it returns the latest (most recent, largest time value) attribute intersecting with the query spell.
<>=
get.vertex.attribute.active(net,"happiness",onset=2.5,terminus=4.5,rule='latest')
@
In many cases the user might want to aggregate the values together in some way, but there is no way for the query function to know what the correct aggregation method would be, especially if the attributes have categorical rather than numeric values. Should the results be a sum? An average? A time-weighted average?
A value sampled at random? In order to handle these cases correctly, code must be designed to explicitly handle the multiple values. To facilitate this, the query functions have an argument \verb@return.tea=TRUE@ which can be set so that they will return the (appropriately trimmed) TEA structure to be evaluated.
<>=
get.vertex.attribute.active(net,"happiness",onset=2.5,terminus=4.5,
  return.tea=TRUE)[[1]]
@
If we wanted to calculate the sum value for an attribute over a particular time range:
<>=
sapply(get.vertex.attribute.active(net,"happiness",onset=0,terminus=7,
  return.tea=TRUE),function(splist){
    sum(unlist(splist[[1]]))
  })
@
The query syntax for network- and edge-level TEAs is similar to the vertex case: \verb@get.network.attribute.active@ and \verb@get.edge.attribute.active@. In keeping with the pattern established by the \verb@network@ package, \verb@get.edge.value.active@ also works as an alternate for edges.
<>=
get.edge.attribute.active(net,'weight',at=2)
get.edge.attribute.active(net,'weight',at=5)
@
There are also functions for checking which attributes are present at any point in time (optionally excluding non-TEA attributes).
<>=
list.vertex.attributes.active(net,at=2)
list.edge.attributes.active(net,at=2)
list.network.attributes.active(net,at=2,dynamic.only=TRUE)
@
There may be situations where we want to know the time at which a TEA attribute takes a certain value or matches a specific criterion. For example, when does a value of happiness equal 2? Or when is the edge weight greater than 10?
<<>=
when.vertex.attrs.match(net,"happiness",2)
when.edge.attrs.match(net,'weight',10, match.op = '>')
@
By default, the functions will return a vector with an appropriate value for each vertex (or edge), and they accept the standard \verb@e@ and \verb@v@ arguments to specify a subset of edges or vertices to query. Notice that the second example returned \verb@Inf@ for the edges with weights that never met the match criteria.
The value returned for non-matching elements can be set with the \verb@no.match@ argument.

\subsection{Modifying TEAs}

The TEA functions are designed to maintain the appropriate sorted representation of attributes and spells even if attributes are not added in temporal order. So it's possible to overwrite the attribute values.
<>=
activate.vertex.attribute(net, "happiness",100, onset=0,terminus=10,v=1)
get.vertex.attribute.active(net,"happiness",at=2)
@
Or set attributes to be inactive for specific time ranges and vertices.
<>=
deactivate.vertex.attribute(net, "happiness",onset=1,terminus=10,v=2)
get.vertex.attribute.active(net,"happiness",at=2)
@

\section{Making Lin Freeman's windsurfers gossip}

For a more advanced and realistic demonstration of TEAs, we will construct a toy rumor diffusion model. Our intention is to release packages in the near future which provide built-in functions for much of the simulation code below.

In 1988, Lin Freeman collected a month-long data-set of daily social interactions between windsurfers on California beaches \citep{almquist,freeman}. The data-set is included in \verb@networkDynamic@ and has some challenging features, including vertex dynamics (different people are present on the beach on different days) and a missing day of observation. (Run \verb@?windsurfers@ for more details.)
<>=
data(windsurfers) # let's go to the beach!
range(get.change.times(windsurfers))
sapply(0:31,function(t){ # how many people in net each day?
  network.size.active(windsurfers,at=t)})
@
Although not directly relevant for the trivial simulation we are about to build, the \verb@windsurfers@ network object also includes some network-level dynamic attributes that give information about the weather, etc. We can extract the information as a time-series for plotting.
<>=
list.network.attributes.active(windsurfers,-Inf,Inf,dynamic.only=TRUE)
par(mfcol=c(2,1)) # show multiple plots
plot(sapply(0:31,function(t){ # how many people in net each day?
  network.size.active(windsurfers,at=t)}),
  type='l',xlab="day",ylab="number on beach"
)
plot(sapply(0:31,function(t){ # air temperature each day
  get.network.attribute.active(windsurfers,'atmp',at=t)}),
  type='l',xlab="day",ylab="air temp"
)
par(mfcol=c(1,1))
@
The appropriate values will also appear in the network returned when we collapse to a specific day.
<>=
day3 <-network.collapse(windsurfers,at=2)
day3%n%'day' # what day of the week is day 3?
day3%n%'atmp' # air temp?
@

\subsection{A toy diffusion model}

We will create a very crude model of information transmission as an example of simulation employing dynamic attributes on a network with changing edges and vertices. We will assume that there is a ``rumor'' spreading among the windsurfers. At each time step, they have some probability of passing the rumor to the people they are interacting with on the beach that day. First we define a function to run the simulation:
<>=
runSim<-function(net,timeStep,transProb){
  # loop through time, updating states
  times<-seq(from=0,to=max(get.change.times(net)),by=timeStep)
  for(t in times){
    # find all the people who know and are active
    knowers <- which(!is.na(get.vertex.attribute.active(
      net,'knowsRumor',at=t,require.active=TRUE)))
    # get the edge ids of active friendships of people who knew
    for (knower in knowers){
      conversations<-get.edgeIDs.active(net,v=knower,at=t)
      for (conversation in conversations){
        # select conversation for transmission with appropriate prob
        if (runif(1)<=transProb){
          # update state of people at other end of conversations
          # but we don't know which way the edge points so..
          v<-c(net$mel[[conversation]]$inl,
               net$mel[[conversation]]$outl)
          # ignore the v we already know
          v<-v[v!=knower]
          activate.vertex.attribute(net,"knowsRumor",TRUE,
            v=v,onset=t,terminus=Inf)
          # record who spread the rumor
          activate.vertex.attribute(net,"heardRumorFrom",knower,
            v=v,onset=t,length=timeStep)
          # record which friendships the rumor spread across
          activate.edge.attribute(net,'passedRumor',
            value=TRUE,e=conversation,onset=t,terminus=Inf)
        }
      }
    }
  }
  return(net)
}
@

\subsection{Go!}

<>=
set.seed(123) # so we will get the same results each time the document is built
@
Then we set the parameters and the initial state of the network and run the simulation.
<>=
timeStep <- 1 # units are in days
transProb <- 0.2 # how likely to tell in each conversation/day
# start the rumor out on vertex 1
activate.vertex.attribute(windsurfers,"knowsRumor",TRUE,v=1,
  onset=0-timeStep,terminus=Inf)
activate.vertex.attribute(windsurfers,"heardRumorFrom",1,v=1,
  onset=0-timeStep,length=timeStep)
windsurfers<-runSim(windsurfers,timeStep,transProb) # run it!
@

\subsection{OK, what happened?}

We'll make some network plots so we can get an idea of what happened.
<>=
par(mfcol=c(1,2)) # show two plots side by side
wind7<-network.extract(windsurfers,at=7)
plot(wind7,
  edge.col=sapply(!is.na(get.edge.value.active(wind7,
    "passedRumor",at=7)),
    function(e){ switch(e+1,"darkgray","red")}),
  vertex.col=sapply(!is.na(get.vertex.attribute.active(wind7,
    "knowsRumor",at=7)),
    function(v){switch(v+1,"gray","red")}),
  label.cex=0.5,displaylabels=TRUE,main="gossip at time 7")
wind30<-network.extract(windsurfers,at=30)
plot(wind30,
  edge.col=sapply(!is.na(get.edge.value.active(wind30,
    "passedRumor",at=30)),function(e){switch(e+1,"darkgray","red")}),
  vertex.col=sapply(!is.na(get.vertex.attribute.active(wind30,
    "knowsRumor",at=30)),function(v){switch(v+1,"gray","red")}),
  label.cex=0.5,displaylabels=TRUE,main="gossip at time 30")
par(mfcol=c(1,1))
@
Which people heard the rumor halfway through the month?
How many heard each day?
<>=
get.vertex.attribute.active(windsurfers,'knowsRumor',at=15)
plot(sapply(0:31,function(t){
  sum(get.vertex.attribute.active(windsurfers,'knowsRumor',at=t),
    na.rm=TRUE)}),
  main='windsurfers who know',ylab="# people",xlab='time'
)
@
In addition to extracting values, we can do operations using the TEA attributes directly. Our simulation function recorded each time a person was told the rumor. What are the ids of the people who told person 3? On which days did person 3 hear the rumor?
<>=
# pull TEA from v3, extract values from 1st part and unlist
unlist(get.vertex.attribute.active(windsurfers,'heardRumorFrom',
  onset=0,terminus=31,return.tea=TRUE)[[3]][[1]])
# pull TEA from v3, extract times from 2nd part and pull col 1
get.vertex.attribute.active(windsurfers,'heardRumorFrom',
  onset=0,terminus=31,return.tea=TRUE)[[3]][[2]][,1]
@

\subsection{Picturing the rumor tree}

We can also write a function to create a rumor transmission tree using the \verb@heardRumorFrom@ attribute in order to plot out the sequence of conversation steps that spread the gossip.
<>=
transTree<-function(net){
  # for each vertex in net who knows
  knowers <- which(!is.na(get.vertex.attribute.active(net,
    'knowsRumor',at=Inf)))
  # find out who the first transmission was from
  transTimes<-get.vertex.attribute.active(net,"heardRumorFrom",
    onset=-Inf,terminus=Inf,return.tea=TRUE)
  # subset to only ones that know
  transTimes<-transTimes[knowers]
  # get the first value of the TEA for each knower
  tellers<-sapply(transTimes,function(tea){tea[[1]][[1]]})
  # create a new net of appropriate size
  treeIds <-union(knowers,tellers)
  tree<-network.initialize(length(treeIds),loops=TRUE)
  # copy labels from original net
  set.vertex.attribute(tree,'vertex.names',treeIds)
  # translate the knower and teller ids to new network ids
  # and add edges for each transmission
  add.edges(tree,tail=match(tellers,treeIds),
    head=match(knowers,treeIds) )
  return(tree)
}
plot(transTree(windsurfers),displaylabels=TRUE,
  label.cex=0.5,label.col='blue',loop.cex=3)
@
We can see that the rumor started at v1, our seed vertex, which has a little loop because it infected itself.

\section{Related packages}

The statnet team is releasing several packages that work closely with the \verb@networkDynamic@ package to provide additional features.
\begin{itemize}
\item \verb@networkDynamicData@ : A collection of dynamic network datasets from various sources and multiple authors, represented in the \verb@networkDynamic@ format. \url{http://cran.r-project.org/web/packages/networkDynamicData}
\item \verb@ndtv@ : Network Dynamic Temporal Visualization package: like TV for your networks. The \verb@ndtv@ package creates network movies as videos or interactive HTML5 animations, timelines and other visualizations of dynamic networks stored in the \verb@networkDynamic@ format. \url{http://cran.r-project.org/web/packages/ndtv}.
\item A longer tutorial demonstrating features of \verb@ndtv@ and \verb@networkDynamic@ data import is located at \url{http://statnet.csde.washington.edu/workshops/SUNBELT/current/ndtv/ndtv_workshop.html} \item \verb@tsna@ : Temporal SNA tools for measuring and doing descriptive statistics on dynamic networks stored in the \verb@networkDynamic@ format \url{http://cran.r-project.org/web/packages/tsna}. \end{itemize} \section{Citing networkDynamic} You can use R's built in citation function to give the citation for the package. <>= citation(package='networkDynamic') @ \section{Vocabulary definitions} This is a list of terms and common function arguments giving their special meanings within the context of the \verb@networkDynamic@ package. \begin{description} \item [spell] bounded interval of time describing activity period of a network element \item [onset] beginning of spell \item [terminus] end of a spell \item [length] the duration of a spell \item [at] a single time point, a spell with zero length where onset=terminus \item [start] beginning (least time) of observation period (or series of spells) \item [end] end (greatest time) of observation period (or series of spells) \item [spell list or spell matrix] a means of describing the activity of a network or network element using a matrix in which one column contains the onsets and another the termini of each spell \item [toggle list] a means of describing the activity of a network or network element using a list of times at which an element changed state (`toggled') \item [onset-censored] when elements of a dynamic network are known to be active before start of the defined observation period, even if the onset of the spell is not known. \item [terminus-censored] when elements of a dynamic network are known to be active after the end of the defined observation period, even if the terminus of the spell is not known. 
\item [TEA] Temporally Extended Attribute: structure for storing dynamic attribute data on vertices, edges, and networks.
\item [pid] A ``Persistent ID'' for a vertex or edge that will remain the same despite extraction and deletion operations.
\end{description}

\section{Complete package function listing}

Below is a reference list of all the public functions included in the package \verb@networkDynamic@.
<>=
cat(ls("package:networkDynamic"),sep="\n")
@

\begin{thebibliography}{}

\bibitem[Almquist, et al (2011)]{almquist} Almquist, Zack W. and Butts, Carter T. (2011). \newblock Logistic Network Regression for Scalable Analysis of Networks with Joint Edge/Vertex Dynamics. \newblock \emph{IMBS Technical Report MBS 11-03}, University of California, Irvine.

\bibitem[Bender-deMoll et al.(2008)]{dynamicNetwork} Bender-deMoll, S., Morris, M. and Moody, J. (2008). \newblock Prototype Packages for Managing and Animating Longitudinal Network Data: dynamicnetwork and rSoNIA. \newblock \emph{Journal of Statistical Software} 24:7.

\bibitem[Butts(2008)]{network} Butts CT (2008). \newblock network: A Package for Managing Relational Data in R. \newblock \emph{Journal of Statistical Software}, 24(2). \url{http://www.jstatsoft.org/v24/i02/}.

\bibitem[Krivitsky P and Handcock M (2013)]{tergm} Krivitsky P, Handcock M (2013). \newblock Fit, Simulate and Diagnose Models for Network Evolution based on Exponential-Family Random Graph Models. \newblock The Statnet Project. Version 3.2-11879-11880.1-2013.02.22-17.20.10.

\bibitem[Hunter et al.(2008b)]{ergm} Hunter DR, Handcock MS, Butts CT, Goodreau SM, Morris M (2008b). \newblock ergm: A Package to Fit, Simulate and Diagnose Exponential-Family Models for Networks. \newblock \emph{Journal of Statistical Software}, 24(3). \url{http://www.jstatsoft.org/v24/i03/}.

\bibitem[Newcomb(1961)]{newcomb} Newcomb T. (1961). \emph{The acquaintance process}. New York: Holt, Rinehart and Winston.

\bibitem[Freeman et al (1988)]{freeman} Freeman, L. C., Freeman, S. C., Michaelson, A.
G. (1988). \newblock On human social intelligence. \newblock \emph{Journal of Social and Biological Structures} 11, 415-425.

\end{thebibliography}
\end{document}