Playlist: 30 videos

Graph Limits, Nonparametric Models, and Estimation

Remote video URL
0:43:20
Sumit Mukherjee (Columbia University)
https://simons.berkeley.edu/node/22617
Graph Limits, Nonparametric Models, and Estimation

Consider the subgraph sampling model, where we observe a random subgraph of a given (possibly non random) large graph $G_n$, by choosing vertices of $G_n$ independently at random with probability $p_n$. In this setting, we study the question of estimating the number of copies $N(H,G_n)$ of a fixed motif/small graph (think of $H$ as edges, two stars, triangles) in the big graph $G_n$. We derive necessary and sufficient conditions for the consistency and the asymptotic normality of a natural Horvitz-Thompson (HT) type estimator.

As it turns out, the asymptotic normality of the HT estimator exhibits an interesting fourth-moment phenomenon, which asserts that the HT estimator (appropriately centered and rescaled) converges in distribution to the standard normal whenever its fourth-moment converges to 3. We apply our results to several natural graph ensembles, such as sparse graphs with bounded degree, Erdős-Renyi random graphs, random regular graphs, and dense graphons.

This talk is based on joint work with Bhaswar B. Bhattacharya and Sayan Das.
Visit talk page
Remote video URL
0:53:21
Peter Caines (McGill University)
https://simons.berkeley.edu/node/22609
Graph Limits, Nonparametric Models, and Estimation

An embedded vertexon sequence is the sequence of vertex sets of a sequence of graphs in a connected compact set $M$ in $R^{m}$ together with an asymptotically dense partition hierarchy of $M$. Vertexon sequences and the associated graph sequences have subsequential limit measures in $M$ and $M^2$ respectively, independent of the partition hierarchy within a large class. The differentiation of functions on vertexon limits with open support in $M,$ and certain directional derivatives in the support of the associated graphon measure in $M^2,$ may be defined. Consequently, subject to conditions, one may seek stationary node dependent Nash values for Embedded Graphon Mean Field Games (EGMFGs). An example will be given in the case of a class of Linear Quadratic Gaussian EGMFGs.\

Work with Rinel Foguen-Tschuendom, Shuang Gao, Minyi Huang
Visit talk page
Remote video URL
0:50:11
Francesca Parise (Cornell University)
https://simons.berkeley.edu/node/22610
Graph Limits, Nonparametric Models, and Estimation

Many of today’s most promising technological systems involve very large numbers of autonomous agents that influence each other and make strategic decisions within a network structure. Examples include opinion dynamics, targeted marketing in social networks, economic exchange and international trade, product adoption and social contagion. While traditional tools for the analysis of these systems assumed that a social planner has full knowledge of the underlying game, when we turn to very large networks two issues emerge. First, collecting data about the exact network of interactions becomes very expensive or not at all possible because of privacy concerns. Second, methods for designing optimal interventions that rely on the exact network structure typically do not scale well with the population size.

To obviate these issues, in this talk I will consider a framework in which the social planner designs interventions based on probabilistic instead of exact information about agent’s interactions. I will introduce the tool of “graphon games” as a way to formally describe strategic interactions in this setting and I will illustrate how this tool can be exploited to design asymptotically optimal interventions for general classes of network games, beyond the linear quadratic setting.
Visit talk page
Remote video URL
0:47:41
Sonja Petrović (Illinois Institute of Technology)
https://simons.berkeley.edu/node/22607
Graph Limits, Nonparametric Models, and Estimation

Consider longitudinal networks whose edges turn on and off according to a discrete-time Markov chain with exponential-family transition probabilities. We characterize when their joint distributions are also exponential families with the same parameter, and show that the permutation-uniform subclass of these chains permit interpretation as an independent, identically distributed sequence on the same state space. We apply these ideas to temporal exponential random graph models, for which permutation uniformity is well suited, and discuss mean-parameter convergence, dyadic independence, and exchangeability. The framework facilitates applying standard tools to longitudinal-network Markov chains from either asymptotics or single-observation exponential random graph models.
The latter are often in log-linear form, allowing us to frame the problem of testing model fit through an exact conditional test whose p-value can be approximated efficiently in networks of both small and moderately large sizes. An important extension of this theory is to latent-variable blockmodels, an application which will be briefly discussed.

This talk is based on joint work with William K. Schwartz, Hemanshu Kaul, Despina Stasi, Elizabeth Gross, Debdeep Pati, and Vishesh Karwa.
Visit talk page
Remote video URL
0:47:11
Frank den Hollander (Leiden University)
https://simons.berkeley.edu/node/22612
Graph Limits, Nonparametric Models, and Estimation

We consider an inhomogeneous Erd\H{o}s-R\'enyi random graph $G_N$ with vertex set $[N] = \{1,\dots,N\}$ for which the pair of vertices $i,j \in [N]$, $i\neq j$, is connected by an edge with probability $r(\tfrac{i}{N},\tfrac{j}{N})$, independently of other pairs of vertices. Here, $r\colon\,[0,1]^2 \to (0,1)$ is a symmetric function that plays the role of a reference graphon.

(1) Let $\lambda_N$ be the maximal eigenvalue of the \emph{adjacency matrix} of $G_N$. It is known that $\lambda_N/N$ satisfies an LDP with rate $N$ as $N \to \infty$. The associated rate function $\psi_r$ is given by a variational formula that involves the rate function $I_r$ of a large deviation principle on graphon space. We analyse this variational formula in order to identify the basic properties of $\psi_r$, especially when the reference graphon $r$ is of rank 1.

(2) Let $\hat\lambda_N$ be the maximal eigenvalue of the \emph{Laplacian matrix} of $G_N$. We show that $\hat\lambda_N/N$ satisfies a downward LDP with rate $\binom{N}{2}$ and an upward LDP with rate $N$. The associated rate functions $\hat\psi_r$ and $\hat\psi^*_r$ are those of the downward LDP and upward LDP for the maximal degree in the graph. We identify the basic properties of both $hat\psi_r$ and $\hat\psi^*_r$.
Visit talk page
Remote video URL
0:51:5
Kavita Ramanan (Brown University)
https://simons.berkeley.edu/node/22613
Graph Limits, Nonparametric Models, and Estimation

We consider interacting particle systems on suitable convergent sequences of sparse (or heterogeneous graphs) and show that the limiting dynamics of the associated neighborhood empirical measure process
(the so-called hydrodynamic limit) can be autonomously characterized in terms of a non-Markovian process. We then describe Markovian approximations to the latter and provide examples where they are exact. This includes joint work with G. Cocomello and A. Ganguly.
Visit talk page
Remote video URL
0:47:5
Shankar Bhamidi (University of North Carolina, Chapel Hill)
https://simons.berkeley.edu/node/22614
Graph Limits, Nonparametric Models, and Estimation

The goal of this talk is to describe probabilistic approaches to three major problems for dynamic networks, both of which are intricately connected to long range dependence in the evolution of such models:

1.Nonparametric change point detection: Consider models of growing networks which evolve via new vertices attaching to the pre-existing network according to one attachment function $f$ till the system grows to size $τ(n) n$, when new vertices switch their behavior to a different function g till the system reaches size n. The goal is to estimate the change point given the observation of the networks over time with no knowledge of the functions driving the dynamics. We will describe non-parametric estimators for such problems.

2. Detecting the initial seed which resulted in the current state of the network: Imagine observing a static time slice of the network after some large time $n$ started with an initial seed. Suppose all one gets to see is the current topology of the network (without any label or age information). Developing probably efficient algorithms for estimating the initial seed has inspired intense activity over the last few years in the probability community. We will describe recent developments in addressing such questions including robustness results such as the fixation of so called hub vertices as time evolves.

3. Co-evolving networks: models of networks where dynamics on the network (directed random walks towards the root) modifies the structure of the network which then impacts behavior of subsequent dynamics. We will describe non-trivial phase transitions of condensation around the root and its connection to quasi-stationary distributions of 1-d random walks.
Visit talk page
Remote video URL
0:45:51
Mei Yin (University of Denver)
https://simons.berkeley.edu/node/22616
Graph Limits, Nonparametric Models, and Estimation

The theory of graph limits is an important tool in understanding properties of large networks. We begin the talk with a survey of this theory, concentrating in particular on the sparse setting. We then investigate a power-law random graph model and cast it in the sparse graph limit theory framework. The distinctively different structures of the limit graph are explored in detail in the sub-critical and super-critical regimes. In the sub-critical regime, the graph is empty with high probability, and in the rare event that it is non-empty, it consists of a single edge. Contrarily, in the super-critical regime, a non-trivial random graph exists in the limit, and it serves as an uncovered boundary case between different types of graph convergence.
Visit talk page
Remote video URL
0:48:50
Luana Ruiz (University of Pennsylvania)
https://simons.berkeley.edu/node/22611
Graph Limits, Nonparametric Models, and Estimation

Graph neural networks (GNNs) are successful at learning representations from most types of network data but suffer from limitations in large graphs, which do not have the Euclidean structure that time and image signals have in the limit. Yet, large graphs can often be identified as being similar to each other in the sense that they share structural properties. Indeed, graphs can be grouped in families converging to a common graph limit -- the graphon. A graphon is a bounded symmetric kernel which can be interpreted as both a random graph model and a limit object of a convergent sequence of graphs. Graphs sampled from a graphon almost surely share structural properties in the limit, which implies that graphons describe families of similar graphs. We can thus expect that processing data supported on graphs associated with the same graphon should yield similar results. In this talk, I formalize this intuition by showing that the error made when transferring a GNN across two graphs in a graphon family is small when the graphs are sufficiently large. This enables large-scale graph machine learning by transference: training GNNs on moderate-scale graphs and executing them on large-scale graphs.
Visit talk page
Remote video URL
0:54:40
Olga Klopp (ESSEC Business School)
https://simons.berkeley.edu/node/22618
Graph Limits, Nonparametric Models, and Estimation

Variational methods are extremely popular in the analysis of network data. Statistical guarantees obtained for these methods typically provide asymptotic normality for the problem of estimation of global model parameters under the stochastic block model. In the present work, we consider the case of networks with missing links that is important in application and show that the variational approximation to the maximum likelihood estimator converges at the minimax rate. This provides the first minimax optimal and tractable estimator for the problem of parameter estimation for the stochastic block model. We complement our results with numerical studies of simulated and real networks, which confirm the advantages of this estimator over current methods.

This is joint work with Adrian Roellin.
Visit talk page