Schedule for: 24w5214 - Computational Complexity of Statistical Inference
Beginning on Sunday, February 25 and ending Friday, March 1, 2024
All times in Banff, Alberta time, MST (UTC-7).
Sunday, February 25 | |
---|---|
16:00 - 17:30 | Check-in begins at 16:00 on Sunday and is open 24 hours (Front Desk - Professional Development Centre) |
17:30 - 19:30 |
Dinner ↓ A buffet dinner is served daily between 5:30pm and 7:30pm in Vistas Dining Room, top floor of the Sally Borden Building. (Vistas Dining Room) |
20:00 - 22:00 |
Informal gathering ↓ Meet and Greet at the BIRS Lounge, located on the 2nd floor of PDC. (Other (See Description)) |
Monday, February 26 | |
---|---|
07:00 - 08:45 |
Breakfast ↓ Breakfast is served daily between 7 and 9am in the Vistas Dining Room, the top floor of the Sally Borden Building. (Vistas Dining Room) |
08:45 - 09:00 |
Introduction and Welcome by BIRS Staff ↓ A brief introduction to BIRS with important logistical information, technology instruction, and opportunity for participants to ask questions. (TCPL 201) |
09:00 - 10:00 |
Frederic Koehler: Computational and statistical limits of inference from Gaussian fields? ↓ I will talk about recent works concerning learning and testing
Gaussian graphical models and important special cases such as
Gaussian free fields on unknown graphs. As I will discuss, this is closely related
to the problem of sparse linear regression and involves potential computational-statistical
gaps and connections to sparse PCA. Based on joint works with Raghu Meka, Ankur Moitra,
Jon Kelner, and Dhruv Rohatgi. (TCPL 201) |
10:00 - 10:30 | Coffee Break (TCPL Foyer) |
10:30 - 11:00 |
Alex Wein: Fine-Grained Extensions of the Low-Degree Testing Framework ↓ The low-degree polynomial framework has emerged as a versatile tool for probing the computational complexity of statistical problems by studying the power and limitations of a restricted class of algorithms: low-degree polynomials. Focusing on the setting of hypothesis testing, I will discuss some extensions of this method that allow us to tackle finer-grained questions than the standard approach.
First, for the task of detecting a planted clique in a random graph, we ask not merely when this can be done in polynomial time O(n^c), but seek the optimal exponent c as a function of the clique size. To this end, we consider algorithms that make non-adaptive edge queries and then apply a low-degree test, and we determine the number of queries required. This is joint work with Jay Mardia and Kabir Verchand.
Second, in the spiked Wigner model with any iid spike prior, we seek the precise optimal tradeoff curve between type I and type II error rates. Conditional on an appropriate strengthening of the "low-degree conjecture," we show that tests based on the spectrum achieve the best possible tradeoff curve among poly-time algorithms, while exponential-time non-spectral tests can do better. This is joint work with Ankur Moitra. (TCPL 201) |
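The spiked Wigner testing problem above is easy to simulate. Below is a minimal sketch, not the paper's construction: it assumes a Rademacher spike prior and scales the noise so the bulk spectrum ends near 2, then runs the simplest spectral test (threshold the top eigenvalue). For signal strength above 1 the top eigenvalue separates from the bulk (the BBP transition); the threshold 2.05 is an ad hoc choice for this toy size.

```python
import numpy as np

def sample_wigner(n, rng):
    """Symmetric Gaussian noise matrix whose bulk spectrum ends near 2."""
    G = rng.normal(size=(n, n)) / np.sqrt(n)
    return (G + G.T) / np.sqrt(2)

def sample_spiked(n, lam, rng):
    """Spiked Wigner: (lam/n) x x^T + noise, with an i.i.d. Rademacher spike x."""
    x = rng.choice([-1.0, 1.0], size=n)
    return lam * np.outer(x, x) / n + sample_wigner(n, rng)

def spectral_test(Y, threshold=2.05):
    """Reject the null when the top eigenvalue exceeds the bulk edge (~2)."""
    return np.linalg.eigvalsh(Y)[-1] > threshold

rng = np.random.default_rng(0)
n = 500
null = [spectral_test(sample_wigner(n, rng)) for _ in range(20)]
planted = [spectral_test(sample_spiked(n, 1.5, rng)) for _ in range(20)]
print(np.mean(null), np.mean(planted))  # false-positive vs. detection rate
```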
11:20 - 11:50 | Introductions & Icebreakers (Online) |
11:30 - 13:00 |
Lunch ↓ Lunch is served daily between 11:30am and 1:30pm in the Vistas Dining Room, the top floor of the Sally Borden Building. (Vistas Dining Room) |
11:45 - 12:05 |
Group Photo ↓ Meet in foyer of TCPL to participate in the BIRS group photo. The photograph will be taken outdoors, so dress appropriately for the weather. Please don't be late, or you might not be in the official group photo! (TCPL Foyer) |
13:15 - 14:00 |
David Steurer: Private block model and graphon estimation via sum-of-squares ↓ We develop differentially private algorithms for learning the parameters of stochastic block models and for graphon estimation.
They are the first to achieve pure node-differential privacy in polynomial running time for any constant number of blocks.
Their statistical utility guarantees match those of the previous best information-theoretic node-private mechanisms for these problems that have exponential running time.
The algorithm is based on a score function defined in terms of a sum-of-squares relaxation whose level depends on the number of blocks.
The key ingredients of our results are
(1) a characterization of the distance between two block graphons in terms of a quadratic optimization over the polytope of doubly stochastic matrices,
(2) a general sum-of-squares convergence result for polynomial optimization over arbitrary polytopes, and
(3) a general approach to perform Lipschitz extensions of score functions as part of the sum-of-squares algorithmic paradigm.
This talk is based on joint work with Hongjie Chen, Jingqiu Ding, Tommaso D'Orsi, Yiding Hua, Chih-Hung Liu, and David Steurer. (TCPL 201) |
14:20 - 14:50 |
David Gamarnik: From Sparse Random Graphs to Mean-field Models and Back ↓ Mean-field models such as spin glasses have turned out to be an excellent tool for studying optimization problems on sparse random graphs, such as MAX-CUT and MIN-bisection. Surprisingly, the opposite is also true: some problems which are hard to analyze for mean-field models turn out far more tractable when formulated on sparse random graphs with growing degrees, and pushed back to mean-field models using Lindeberg's interpolation method.
We will demonstrate this idea for the problem of counting local maxima in spin glasses, which is related both to the so-called metastable states of Bray and Moore and to the so-called friendly partitions of a random graph, widely studied in combinatorics. Using this sparse-to-dense method we rigorously confirm, in particular, the Bray-Moore threshold.
Joint work with Yatin Dandi (EPFL) and Lenka Zdeborova (EPFL). (TCPL 201) |
14:50 - 15:20 |
Aukosh Jagannath: Finding Planted Clique via Gradient Descent ↓ Is hardness for "black box" methods such as gradient descent or natural Markov dynamics reasonable evidence of hardness-on-average? This question was posed by Jerrum in his landmark work on the planted clique problem (PC), where he argued that if the Metropolis process on the space of cliques cannot solve PC for sqrt(n)-sized cliques it would be an "indictment" of this popular perspective. Recently, Chen-Mossel-Zadik '23 gave a stunning resolution of Jerrum's question by showing that the Metropolis chain cannot solve this problem even for sublinear-sized cliques and even from the most natural initialization (the empty clique)! We revisit the power of "black box"/Markov chain approaches in this talk. We show that the most basic black-box algorithm/Markov chain for constrained optimization problems (adding a Lagrange multiplier for the constraint and running gradient descent) solves the planted clique problem to the correct order in linear time when initialized from the full graph. Importantly, the method still fails if initialized from the empty clique, as it cannot escape mediocrity from "below". Joint work with Reza Gheissari (Northwestern) and Yiming Xu (Waterloo) (TCPL 201) |
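A toy illustration of the Lagrangian idea described above; this is a sketch under assumptions, not the authors' algorithm. The penalty weight `lam`, step size, and readout (ranking vertices by how long their coordinate survives the dynamics) are ad hoc choices for a small instance.

```python
import numpy as np

def planted_clique_graph(n, k, rng):
    """Adjacency matrix of G(n, 1/2) with a clique planted on vertices 0..k-1."""
    A = np.triu(rng.random((n, n)) < 0.5, 1)
    A = (A | A.T).astype(float)
    A[:k, :k] = 1.0
    np.fill_diagonal(A, 0.0)
    return A

def lagrangian_descent_ranking(A, lam, lr=5e-4, max_steps=20000):
    """Projected gradient ascent on f(x) = x^T A x - lam * sum(x) over [0,1]^n,
    initialized at the all-ones vector (the 'full graph' start). Non-clique
    coordinates tend to hit 0 sooner, so we record each vertex's death time."""
    n = A.shape[0]
    x = np.ones(n)
    death = np.full(n, np.inf)
    for step in range(max_steps):
        x = np.clip(x + lr * (2.0 * A @ x - lam), 0.0, 1.0)
        death[(x == 0) & np.isinf(death)] = step
        if not np.isinf(death).any():
            break
    return death

rng = np.random.default_rng(0)
n, k = 400, 80
A = planted_clique_graph(n, k, rng)
death = lagrangian_descent_ranking(A, lam=n + k / 2)  # ad hoc penalty scale
top_k = np.argsort(death)[-k:]                        # longest-surviving vertices
print("fraction of planted clique recovered:", np.mean(top_k < k))
```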
15:00 - 15:30 | Coffee Break (TCPL Foyer) |
15:45 - 16:15 |
Jane Lange: Agnostic Proper Learning of Monotone Functions: Beyond the Black-Box Correction Barrier ↓ We give the first agnostic, efficient, proper learning algorithm for monotone Boolean functions. Given 2^Õ(sqrt(n)/ε) uniformly random examples of an unknown function f:{±1}^n→{±1}, our algorithm outputs a hypothesis g:{±1}^n→{±1} that is monotone and (opt+ε)-close to f, where opt is the distance from f to the closest monotone function. The running time of the algorithm (and consequently the size and evaluation time of the hypothesis) is also 2^Õ(sqrt(n)/ε), nearly matching the lower bound of Blais et al. (RANDOM '15). We also give an algorithm for estimating, up to additive error ε, the distance of an unknown function f to monotone using a running time of 2^Õ(sqrt(n)/ε). Previously, for both of these problems, sample-efficient algorithms were known, but these algorithms were not run-time efficient. Our work thus closes this gap between run-time and sample complexity. This work builds upon the improper learning algorithm of Bshouty and Tamon (JACM '96) and the proper semiagnostic learning algorithm of Lange, Rubinfeld, and Vasilyan (FOCS '22), which obtains a non-monotone Boolean-valued hypothesis, then "corrects" it to monotone using query-efficient local computation algorithms on graphs. This black-box correction approach can achieve no error better than 2opt+ε information-theoretically; we bypass this barrier by a) augmenting the improper learner with a convex optimization step, and b) learning and correcting a real-valued function before rounding its values to Boolean. Our real-valued correction algorithm solves the "poset sorting" problem of [LRV22] for functions over general posets with non-Boolean labels. (TCPL 201) |
16:15 - 16:45 |
Sitan Chen: Provably learning a multi-head attention layer ↓ Despite the widespread empirical success of transformers, little is known about their learnability from a computational perspective. In practice these models are trained with SGD on a certain next-token prediction objective, but in theory it remains a mystery even to prove that such functions can be learned efficiently at all. In this work, we give the first nontrivial provable algorithms and computational lower bounds for this problem. Our results apply in a realizable setting where one is given random sequence-to-sequence pairs that are generated by some unknown multi-head attention layer. Our algorithm, which is centered around using examples to sculpt a convex body containing the unknown parameters, is a significant departure from existing provable algorithms for learning multi-layer perceptrons, which predominantly exploit fine-grained algebraic and rotation invariance properties of the input distribution. (TCPL 201) |
16:45 - 17:15 |
Florent Krzakala: Learning multi-index functions with gradient(s) descent(s) ↓ How do two-layer neural networks learn multi-index functions
from data over time? I will present and review an extensive set of
theoretical results answering this question, and discuss in particular
the interaction between batch size, number of iterations, the
(crucial) fact that one reuses or not each batch of data, and the
complexity of the function to learn. Based on arXiv:2202.00293,
arXiv:2305.18270, arXiv:2402.03220, and arXiv:2402.04980. (TCPL 201) |
17:30 - 19:30 |
Dinner ↓ A buffet dinner is served daily between 5:30pm and 7:30pm in Vistas Dining Room, top floor of the Sally Borden Building. (Vistas Dining Room) |
19:30 - 21:00 | Tselil Schramm: Lightning Talks (TCPL 201) |
Tuesday, February 27 | |
---|---|
07:00 - 08:45 |
Breakfast ↓ Breakfast is served daily between 7 and 9am in the Vistas Dining Room, the top floor of the Sally Borden Building. (Vistas Dining Room) |
09:00 - 10:00 |
Pravesh Kothari: The Kikuchi Matrix Method, part I ↓ In two talks, Pravesh and Peter will present a recently developed method to reduce combinatorial problems to establishing the unsatisfiability of k-sparse linear equations mod 2 (aka k-XOR formulas) with a limited amount of randomness. This latter task is then accomplished by understanding the spectral properties of certain "Kikuchi" graphs associated with the k-XOR formulas. Kikuchi graphs are appropriately chosen induced subgraphs of Cayley graphs on the hypercube (or sometimes variants such as products of hypercubes) generated by the coefficient vectors of the given equations and understanding their spectrum reduces to certain natural combinatorial properties of the input formula.
We will discuss the following applications in this talk:
1. Proving Feige's conjectured hypergraph Moore bound -- the optimal trade-off between the size of the smallest linearly dependent subset of a system of k-sparse linear equations modulo 2 and the total number of equations -- which generalizes the famous irregular Moore bound of Alon, Hoory and Linial (2002) for graphs (equivalently, 2-sparse linear equations mod 2),
2. Proving a cubic lower bound on 3-query locally decodable codes, improving on a quadratic lower bound of Kerenidis and de Wolf (2004),
3. Proving an exponential lower bound on linear 3-query locally correctable codes (this result establishes a sharp separation between 3-query LCCs and 3-query LDCs that admit a construction of sub-exponential length and is the first result to obtain a super-polynomial lower bound for > 2-query local codes).
Time permitting, we may also discuss applications to a strengthening of Szemerédi's theorem on arithmetic progressions in dense subsets of integers. (TCPL 201) |
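To make the Kikuchi construction concrete, here is a small sketch for even k (the simplest case): the level-ℓ Kikuchi matrix is indexed by ℓ-subsets of the variables, and entry (S, T) carries a clause's sign exactly when the symmetric difference of S and T equals that clause's support. The brute-force enumeration and the parameters below are only for toy sizes; the refutation guarantees in the talks require specific scalings of ℓ.

```python
import itertools
import numpy as np

def kikuchi_matrix(n, ell, clauses):
    """Level-ell Kikuchi matrix of a k-XOR instance with k even."""
    subsets = [frozenset(S) for S in itertools.combinations(range(n), ell)]
    signs = {frozenset(C): b for C, b in clauses}
    K = np.zeros((len(subsets), len(subsets)))
    for i, S in enumerate(subsets):
        for j, T in enumerate(subsets):
            K[i, j] = signs.get(S ^ T, 0)  # sign b_C if S-symmetric-diff-T = C
    return K

# A tiny random 4-XOR instance: clauses x_a x_b x_c x_d = b over n variables.
rng = np.random.default_rng(0)
n, k, m, ell = 10, 4, 15, 2
clauses = [(tuple(int(v) for v in rng.choice(n, size=k, replace=False)),
            int(rng.choice([-1, 1]))) for _ in range(m)]
K = kikuchi_matrix(n, ell, clauses)
# Refutation route: a small spectral norm of K certifies that no assignment
# satisfies much more than half the clauses (made quantitative in the talks).
print("spectral norm:", np.abs(np.linalg.eigvalsh(K)).max())
```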
10:00 - 10:30 | Coffee Break (TCPL Foyer) |
10:30 - 11:00 |
Peter Manohar: The Kikuchi Matrix Method, part II ↓ Second of the two talks described above; the framework and the three applications are as in the abstract for part I. (TCPL 201) |
11:00 - 11:30 |
Aaron Potechin: Sum of Squares Lower Bounds for Non-Gaussian Component Analysis ↓ Non-Gaussian Component Analysis (NGCA) is the task of finding a non-Gaussian direction in a high-dimensional dataset. NGCA has been extensively studied and hardness for NGCA implies hardness for many other learning problems including learning mixture models, robust mean/covariance estimation, robust linear regression, and learning simple neural networks and generative models.
Prior work has shown statistical query and low-degree polynomial lower bounds for NGCA but not sum of squares (SoS) lower bounds.
In this work, we show a superconstant degree SoS lower bound for distinguishing between the following two distributions (which is a distinguishing variant of NGCA):
Random distribution: Take m samples from N(0,Id).
Planted distribution: First choose a hidden direction v. Then take m samples which have some non-Gaussian distribution A in the hidden direction v and have distribution N(0,1) in directions which are orthogonal to v.
In this talk, I will show how low-degree polynomial lower bounds for this problem can be shown using graph matrices, describe why proving SoS lower bounds is significantly more challenging and requires new techniques, and sketch how we overcome these challenges to prove our SoS lower bound.
This is joint work with Ilias Diakonikolas, Sushrut Karmalkar, and Shuo Pang. (TCPL 201) |
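A minimal sampler for the two distributions in the distinguishing problem above. The choice of Rademacher for the non-Gaussian law A is an arbitrary illustration, not the one used in the paper; the planted sampler simply replaces each sample's component along the hidden direction v by a draw from A, which leaves the orthogonal complement standard Gaussian.

```python
import numpy as np

def sample_ngca(m, d, planted, rng):
    """m samples from N(0, I_d) (random case) or, in the planted case, samples
    that follow a non-Gaussian law A along a hidden unit direction v and are
    standard Gaussian in the orthogonal complement."""
    X = rng.normal(size=(m, d))
    if planted:
        v = rng.normal(size=d)
        v /= np.linalg.norm(v)
        a = rng.choice([-1.0, 1.0], size=m)   # coordinates drawn from A (Rademacher here)
        X += np.outer(a - X @ v, v)           # swap the v-component for a
    return X

rng = np.random.default_rng(0)
X_random = sample_ngca(5000, 20, planted=False, rng=rng)
X_planted = sample_ngca(5000, 20, planted=True, rng=rng)
print(X_random.shape, X_planted.shape)
```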
11:30 - 12:00 | Group assignment for visioning session (TCPL Foyer) |
11:30 - 13:00 |
Lunch ↓ Lunch is served daily between 11:30am and 1:30pm in the Vistas Dining Room, the top floor of the Sally Borden Building. (Vistas Dining Room) |
13:00 - 17:30 | Unstructured time for collaboration (TCPL Lounge) |
15:00 - 15:30 | Coffee Break (TCPL Foyer) |
17:30 - 19:30 |
Dinner ↓ A buffet dinner is served daily between 5:30pm and 7:30pm in Vistas Dining Room, top floor of the Sally Borden Building. (Vistas Dining Room) |
19:30 - 21:00 | Visioning session (TCPL 201) |
Wednesday, February 28 | |
---|---|
07:00 - 08:45 |
Breakfast ↓ Breakfast is served daily between 7 and 9am in the Vistas Dining Room, the top floor of the Sally Borden Building. (Vistas Dining Room) |
09:00 - 10:00 |
Yuval Ishai: Cryptography from Planted Graphs: Security with Logarithmic-Size Messages ↓ I will discuss recent cryptographic applications of natural conjectures about the hardness of detecting planted graphs in random ambient graphs, which naturally extend standard conjectures about the planted clique problem.
The applications include new secret-sharing schemes and secure computation protocols, in which public information is used to obtain (conjectured) computational security with only logarithmic communication.
Our work motivates several open questions about the possibility of hiding a given planted graph in a small ambient graph that can be arbitrarily distributed.
Joint work with Damiano Abram, Amos Beimel, Eyal Kushilevitz, and Varun Narayanan (TCPL 201) |
10:00 - 10:30 | Coffee Break (TCPL Foyer) |
10:30 - 11:15 |
Siqi Liu: Detection and recovery of latent geometry in random graphs ↓ In recent years, random graph models with latent geometry have received increasing attention. In general, these graphs are generated by first sampling random points from some metric space and then sampling every edge independently with probability depending on the distance between its two endpoints. The computational problems here are to detect the underlying geometry and to recover the coordinates of the vertices from the sampled graph. These models have richer structure that allows correlations between edges and admits uncountable alphabet sizes. They are more realistic models for real-world networks and data, but so far we have limited understanding of their information and computation thresholds for both detection and recovery. In this talk we will go over known algorithmic and hardness results for several random geometric graph models and discuss some open directions. (TCPL 201) |
11:15 - 12:00 |
Mitali Bafna: Polynomial Time Power-Sum Decomposition of Polynomials ↓ We give efficient algorithms for finding the power-sum decomposition of an input polynomial P(x) = sum_i p_i(x)^d with components p_i. The case of linear p_i is equivalent to the well-studied tensor decomposition problem, while the quadratic case occurs naturally in learning Gaussian mixtures from low-order moments. Unlike tensor decomposition, both the unique identifiability and algorithms for this problem are not well understood. Our algorithm succeeds in decomposing a sum of m = O(n) generic quadratic p_i's for d = 3, and more generally the d-th power-sum of m = n^{O(d)} generic degree-K polynomials for any K > 2. Our algorithm relies only on basic numerical linear algebraic primitives, is exact (up to numerical precision errors), and tolerates inverse-polynomial noise when the p_i's have random Gaussian coefficients. (TCPL 201) |
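To make the input model concrete, here is a small sketch of the d = 3 quadratic setting described above, with the instance exposed through an evaluation oracle. The quadratics are taken homogeneous for simplicity, dimensions are toy choices, and the decomposition algorithm itself (built from numerical linear-algebraic primitives) is not reproduced here.

```python
import numpy as np

def power_sum_instance(n, m, d, rng):
    """P(x) = sum_i p_i(x)^d with random Gaussian homogeneous quadratics
    p_i(x) = x^T Q_i x; returns an evaluation oracle for P and the Q_i."""
    Qs = [rng.normal(size=(n, n)) for _ in range(m)]
    Qs = [(Q + Q.T) / 2 for Q in Qs]              # symmetrize each quadratic form

    def P(x):
        return sum((x @ Q @ x) ** d for Q in Qs)

    return P, Qs

rng = np.random.default_rng(0)
n, m, d = 8, 10, 3
P, Qs = power_sum_instance(n, m, d, rng)
x = rng.normal(size=n)
print(P(x))   # one evaluation of the power-sum polynomial
```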
11:30 - 13:00 |
Lunch ↓ Lunch is served daily between 11:30am and 1:30pm in the Vistas Dining Room, the top floor of the Sally Borden Building. (Vistas Dining Room) |
13:30 - 14:15 |
Adam Klivans: TDS Learning ↓ We revisit the fundamental problem of learning with distribution shift, in which a learner is given labeled samples from a training distribution D and unlabeled samples from a test distribution D′, and is asked to output a classifier with low test error. The standard approach in this setting is to bound the loss of a classifier in terms of some notion of distance between D and D′. These distances, however, seem difficult to compute and do not lead to efficient algorithms.
We depart from this paradigm and define a new model called testable learning with distribution shift (TDS learning), where we can obtain provably efficient algorithms for certifying the performance of a classifier on a test distribution. In this model, a learner outputs a classifier with low test error whenever samples from D and D′ pass an associated test; moreover, the test must accept if the marginal of D equals the marginal of D′. We give several positive results for learning well-studied concept classes such as halfspaces, intersections of halfspaces, and decision trees. Prior to our work, no efficient algorithms for these basic cases were known without strong assumptions on D′. (TCPL 201) |
14:30 - 15:00 |
Ankur Moitra: Reinforcement Learning without Intractable Oracles? ↓ Much of reinforcement learning theory is built on top of oracles that are computationally hard to implement. For example, consider the problem of learning near-optimal policies in Partially Observable Markov Decision Processes (POMDPs). Existing algorithms either need to make strong assumptions about the model dynamics (e.g. deterministic transitions) or assume access to an oracle for solving a hard optimistic planning or estimation problem as a subroutine. Are there natural and well-motivated assumptions that make learning easy? Our main result is a quasi-polynomial time learning algorithm under the assumption of observability, which stipulates that well-separated distributions on states lead to well-separated distributions on observations. We hope (and have some evidence) that this is just the first step in a broader agenda of identifying new structural assumptions in reinforcement learning that enable strong end-to-end algorithmic guarantees.
This is based on joint work with Noah Golowich and Dhruv Rohatgi. (TCPL 201) |
15:00 - 17:30 | Unstructured time for collaboration (TCPL 201) |
17:30 - 19:30 |
Dinner ↓ A buffet dinner is served daily between 5:30pm and 7:30pm in Vistas Dining Room, top floor of the Sally Borden Building. (Vistas Dining Room) |
19:45 - 20:15 |
David Wu: Robust recovery for stochastic block models, simplified and generalized ↓ We study the problem of robust community recovery: efficiently recovering communities in
sparse stochastic block models in the presence of adversarial corruptions. In the absence of
adversarial corruptions, there are efficient algorithms when the signal-to-noise ratio exceeds
the Kesten–Stigum (KS) threshold, widely believed to be the computational threshold for this
problem. The question we study is: does the computational threshold for robust community recovery
also lie at the KS threshold? We answer this question affirmatively, providing an algorithm for
robust community recovery for arbitrary stochastic block models on any constant number of
communities, generalizing the work of Ding, d’Orsi, Nasser & Steurer [DdNS22] on an efficient
algorithm above the KS threshold in the case of 2-community block models.
There are three main ingredients to our work:
1. The Bethe Hessian of the graph is defined as $H_G(t) \triangleq (D_G - I)t^2 - A_Gt + I$, where $D_G$ is the diagonal matrix of degrees and $A_G$ is the adjacency matrix. Empirical work suggested that
the Bethe Hessian for the stochastic block model has outlier eigenvectors corresponding
to the communities right above the Kesten-Stigum threshold [KMM+13, SKZ14].
We formally confirm the existence of outlier eigenvalues for the Bethe Hessian, by explicitly
constructing outlier eigenvectors from the community vectors.
2. We develop an algorithm for a variant of robust PCA on sparse matrices. Specifically,
an algorithm to partially recover top eigenspaces from adversarially corrupted sparse
matrices under mild delocalization constraints.
3. A rounding algorithm to turn vector assignments of vertices into a community assignment,
inspired by the algorithm of Charikar & Wirth [CW04] for 2XOR. (TCPL 201) |
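A quick numerical illustration of ingredient 1: build a two-community block model, form the Bethe Hessian from the formula above with a common choice of parameter, t = 1/sqrt(mean degree), and look for outlier (most negative) eigenvalues whose eigenvectors correlate with the communities. The parameters are illustrative and the choice of which eigenvector carries the communities is heuristic.

```python
import numpy as np

def sbm_adjacency(n, a, b, rng):
    """Two-community stochastic block model: edge probability a/n within a
    community and b/n across. Returns the adjacency matrix and labels."""
    sigma = np.repeat([1.0, -1.0], n // 2)
    probs = np.where(np.outer(sigma, sigma) > 0, a / n, b / n)
    A = np.triu(rng.random((n, n)) < probs, 1)
    A = (A | A.T).astype(float)
    return A, sigma

def bethe_hessian(A, t):
    """H_G(t) = (D_G - I) t^2 - A_G t + I, as in the abstract above."""
    n = A.shape[0]
    D = np.diag(A.sum(axis=1))
    return (D - np.eye(n)) * t**2 - A * t + np.eye(n)

rng = np.random.default_rng(0)
n, a, b = 2000, 7.0, 1.0                           # above the KS threshold
A, sigma = sbm_adjacency(n, a, b, rng)
H = bethe_hessian(A, t=1 / np.sqrt((a + b) / 2))   # t = 1/sqrt(mean degree)
vals, vecs = np.linalg.eigh(H)
print("bottom eigenvalues:", vals[:3])             # expect negative outliers
print("correlation with communities:", abs(vecs[:, 1] @ sigma) / np.sqrt(n))
```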
20:30 - 21:00 |
Dmitriy Kunisky: Universal lower bounds against low-degree polynomials ↓ I will present new techniques for proving lower bounds against the class of low-degree polynomial algorithms for hypothesis testing between high-dimensional probability distributions. Unlike most previous analyses, these techniques do not require knowledge of explicit orthogonal polynomials for the underlying distributions, and therefore are more broadly useful. As a general application, I will show how to derive such lower bounds for detecting a sufficiently "dilute" signal through a noisy channel that are universal over many channels, in particular including additive noise with essentially arbitrary distribution. This in turn will give evidence that the statistical-to-computational gaps in models like spiked matrices and tensors with Gaussian noise occur also under generic noise models, and so are robust and general phenomena. (TCPL 201) |
Thursday, February 29 | |
---|---|
07:00 - 08:45 |
Breakfast ↓ Breakfast is served daily between 7 and 9am in the Vistas Dining Room, the top floor of the Sally Borden Building. (Vistas Dining Room) |
09:00 - 10:00 | Aayush Jain (TCPL 201) |
10:00 - 10:30 | Coffee Break (TCPL Foyer) |
10:30 - 11:00 |
Rachel Lin: Sparse Learning Parity with Noises and Cryptographic Applications ↓ The sparse Learning Parity with Noise (LPN) problems are variants of the LPN problems in which the public coefficient matrix is restricted to be k-sparse (i.e., each column has Hamming weight exactly k). Different versions of the sparse LPN problems over F_2 have long been studied in the literature on average-case complexity. More recently, the generalization to larger fields F_q has been introduced and has quickly found interesting applications in cryptography. In this talk, we will discuss the hardness of the sparse LPN problems. Additionally, we will demonstrate how the sparsity feature enables the construction of multi-party homomorphic secret sharing and secure multiparty computation protocols with succinct communication that is sublinear in the complexity of the computation.
Joint work with Yuval Ishai, Aayush Jain, and Quang Dao. (TCPL 201) |
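A sampler for the sparse LPN instance described above, assuming the convention that samples are the columns of the public matrix (so each column is exactly k-sparse, as in the abstract); the field size, noise rate, and dimensions are arbitrary toy choices.

```python
import numpy as np

def sparse_lpn_instance(n, m, k, q, eps, rng):
    """A sparse LPN instance over F_q: each of the m columns of the public
    matrix A (one per sample) has Hamming weight exactly k, and
    y = s^T A + e, where each coordinate of e is nonzero with probability eps."""
    A = np.zeros((n, m), dtype=np.int64)
    for j in range(m):
        support = rng.choice(n, size=k, replace=False)  # k-sparse column
        A[support, j] = rng.integers(1, q, size=k)      # nonzero F_q entries
    s = rng.integers(0, q, size=n)                      # secret vector
    noisy = rng.random(m) < eps
    e = np.where(noisy, rng.integers(1, q, size=m), 0)  # sparse noise
    y = (s @ A + e) % q
    return A, y, s

rng = np.random.default_rng(0)
A, y, s = sparse_lpn_instance(n=40, m=100, k=3, q=7, eps=0.1, rng=rng)
print(A.shape, int((A != 0).sum(axis=0).max()))         # every column has weight 3
```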
11:00 - 11:30 | Andrej Bogdanov (TCPL 201) |
11:30 - 13:00 |
Lunch ↓ Lunch is served daily between 11:30am and 1:30pm in the Vistas Dining Room, the top floor of the Sally Borden Building. (Vistas Dining Room) |
11:30 - 12:00 |
Elena Grigorescu: List Learning with Attribute Noise ↓ We introduce and study the model of list learning with attribute noise. Learning with attribute noise was introduced by Shackelford and Volper (COLT 1988) as a variant of PAC learning in which the algorithm has access to noisy examples and uncorrupted labels, and the goal is to recover an accurate hypothesis. Sloan (COLT 1988) and Goldman and Sloan (Algorithmica 1995) discovered information-theoretic limits to learning in this model, which have impeded further progress. In this article we extend the model to that of list learning, drawing inspiration from the list-decoding model in coding theory, and its recent variant studied in the context of learning. On the positive side, we show that sparse conjunctions can be efficiently list learned under some assumptions on the underlying ground-truth distribution. On the negative side, our results show that even in the list-learning model, efficient learning of parities and majorities is not possible, regardless of the representation used.
Joint work with Mahdi Cheraghchi, Brendan Juba, Karl Wimmer and Ning Xie (Appeared in AISTATS 2021). (TCPL 201) |
13:00 - 17:30 | Unstructured time for collaboration / free afternoon in Banff national park (TCPL Foyer) |
15:00 - 15:30 | Coffee Break (TCPL Foyer) |
17:30 - 19:30 |
Dinner ↓ A buffet dinner is served daily between 5:30pm and 7:30pm in Vistas Dining Room, top floor of the Sally Borden Building. (Vistas Dining Room) |
20:00 - 20:30 |
Jiaming Xu: Sharp statistical limits for shuffled linear regression ↓ This presentation focuses on the problem of shuffled linear regression, where the correspondence between predictors and responses in a linear model is obfuscated by a latent permutation, and one needs to jointly estimate the permutation and the regression coefficients. This problem finds applications in areas ranging from robotics, multi-target tracking, and signal processing to data integration and de-anonymization. While the statistical and computational facets of shuffled linear regression have been a subject of extensive research over the past decade, the precise statistical thresholds for recovery have remained elusive. In this talk, I will describe our latest progress in establishing these precise thresholds for both perfect and near-perfect recovery. Our proof builds upon a careful analysis of the maximum likelihood estimator (MLE), divided into a large-error regime and a small-error regime. From the computational perspective, the MLE represents a specific instance of the NP-hard quadratic assignment problem. No polynomial-time algorithms are known to succeed close to the statistical limits, except in noiseless or low-dimensional settings. The talk is based on joint work with Leon Lufkin and Yihong Wu from Yale University. The preprint is available at https://arxiv.org/abs/2402.09693 (TCPL 201) |
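A sketch of the shuffled regression model, plus the one observation that makes the MLE concrete: for a fixed coefficient vector, the best permutation simply matches the sorted responses to the sorted predictions (a one-dimensional assignment problem). This is an illustration under toy parameters, not the paper's analysis.

```python
import numpy as np

def shuffled_regression_instance(n, d, sigma, rng):
    """y = Pi (X beta) + noise for a latent permutation Pi: the rows of y are
    observed in an order unrelated to the rows of X."""
    X = rng.normal(size=(n, d))
    beta = rng.normal(size=d)
    pi = rng.permutation(n)                       # latent correspondence
    y = (X @ beta)[pi] + sigma * rng.normal(size=n)
    return X, y, beta, pi

def best_permutation_residual(X, y, beta):
    """min over permutations of ||y - Pi (X beta)||^2, solved by sorting
    (the rearrangement inequality); for fixed beta this is the inner step
    of the quadratic-assignment structure mentioned in the abstract."""
    return np.sum((np.sort(y) - np.sort(X @ beta)) ** 2)

rng = np.random.default_rng(0)
X, y, beta, pi = shuffled_regression_instance(n=200, d=5, sigma=0.1, rng=rng)
print(best_permutation_residual(X, y, beta))      # small at the true beta
```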
20:30 - 21:00 |
Kiril Bangachev: On The Fourier Coefficients of High-Dimensional Random Geometric Graphs ↓ The random geometric graph $\mathsf{RGG}(n,\mathbb{S}^{d-1}, p)$ is formed by sampling $n$ i.i.d. vectors $\{V_i\}_{i = 1}^n$ uniformly on $\mathbb{S}^{d-1}$ and placing an edge between pairs of vertices $i$ and $j$ for which $\langle V_i,V_j\rangle \ge \tau^p_d,$ where $\tau^p_d$ is such that the expected density is $p.$ We study the low-degree Fourier coefficients of the distribution $\mathsf{RGG}(n,\mathbb{S}^{d-1}, p).$
Our main conceptual contribution is a novel two-step strategy for bounding Fourier coefficients which we believe is more widely applicable to studying latent space distributions. First, we localize the dependence among edges to few fragile edges. Second, we partition the space of latent vector configurations $(\mathbb{S}^{d-1})^{\otimes n}$ based on the set of fragile edges and, on each subset of configurations, we define a noise operator acting independently on edges not incident (in an appropriate sense) to fragile edges.
We apply the resulting bounds to: 1) Settle the low-degree polynomial complexity of distinguishing spherical random geometric graphs from Erdős–Rényi graphs, both in the case of observing a complete set of edges and in the non-adaptively chosen mask $\mathcal{M}$ model recently introduced by [MVW24]; 2) Exhibit a statistical-computational gap for distinguishing $\mathsf{RGG}$ and the planted coloring model [KVWX23] in a regime where $\mathsf{RGG}$ is distinguishable from Erdős–Rényi; 3) Reprove known bounds on the second eigenvalue of random geometric graphs.
Joint work with Guy Bresler (TCPL 201) |
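A generative sketch of $\mathsf{RGG}(n,\mathbb{S}^{d-1},p)$ matching the definition above; as an implementation shortcut, the threshold $\tau^p_d$ is estimated by Monte Carlo here rather than computed from its closed form.

```python
import numpy as np

def sample_sphere(n, d, rng):
    """n i.i.d. uniform points on the unit sphere S^{d-1}."""
    V = rng.normal(size=(n, d))
    return V / np.linalg.norm(V, axis=1, keepdims=True)

def rgg(n, d, p, rng, n_mc=200_000):
    """RGG(n, S^{d-1}, p): edge between i and j iff <V_i, V_j> >= tau, with
    tau estimated so that the expected edge density is p."""
    inner = np.sum(sample_sphere(n_mc, d, rng) * sample_sphere(n_mc, d, rng),
                   axis=1)                       # inner products of random pairs
    tau = np.quantile(inner, 1 - p)              # Monte Carlo estimate of tau_d^p
    V = sample_sphere(n, d, rng)
    A = (V @ V.T >= tau).astype(float)
    np.fill_diagonal(A, 0.0)
    return A

rng = np.random.default_rng(0)
A = rgg(n=300, d=50, p=0.1, rng=rng)
print("empirical edge density:", A.sum() / (A.shape[0] * (A.shape[0] - 1)))
```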
Friday, March 1 | |
---|---|
07:00 - 08:45 |
Breakfast ↓ Breakfast is served daily between 7 and 9am in the Vistas Dining Room, the top floor of the Sally Borden Building. (Vistas Dining Room) |
09:00 - 10:00 |
Cris Moore: Tensor networks, tensor cumulants, and tensor PCA ↓ Consider the following problem in high-dimensional statistics: there is a hidden vector v, and we observe v’s p-th outer product with itself, giving a tensor of arity p, with Gaussian noise in every entry. Given this noisy tensor, our goal is to reconstruct v, as well as to reject the null hypothesis that the tensor consists only of noise. To a physicist, this is a planted p-spin model, where p is the arity of the tensor, and where the coupling matrix is correlated with the hidden vector. To a computer scientist, it is a tensor version of PCA (principal component analysis). But we lack a theory of linear algebra for tensors, so we can’t simply look at the dominant eigenvector and hope it is correlated with v. What should we do instead? What is the “spectrum” of a tensor anyway?
Many algorithms for this problem first "flatten" the tensor into a matrix, and then look at its spectrum. But a more natural approach is to form a tensor network with copies of the observed tensor, and contract it to obtain a scalar or vector. I'll describe how this yields a simple, graphical, and combinatorial proof of bounds on the low-degree likelihood ratio, and explains why many tensor networks provide optimal low-degree algorithms. Perhaps more importantly, our work reveals a tantalizing extension of free probability theory and free cumulants from random matrices to random tensors.
This is joint work with Tim Kunisky and Alex Wein. (TCPL 201) |
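A minimal tensor PCA sketch for p = 3, using the "flattening" baseline mentioned above rather than a tensor-network contraction; the signal strength is set comfortably above the flattening threshold for this toy size, and all parameters are illustrative.

```python
import numpy as np

def spiked_tensor(n, p, lam, rng):
    """Order-p spiked tensor: lam * v^{tensor p} + i.i.d. Gaussian noise."""
    v = rng.normal(size=n)
    v /= np.linalg.norm(v)
    T = rng.normal(size=(n,) * p)
    spike = v
    for _ in range(p - 1):
        spike = np.multiply.outer(spike, v)   # build v (x) v (x) ... (x) v
    return T + lam * spike, v

def flatten_and_recover(T, n):
    """Flatten the order-3 tensor to an n x n^2 matrix and take the top left
    singular vector (a tensor-network contraction can do better, per the talk)."""
    M = T.reshape(n, n * n)
    u, _, _ = np.linalg.svd(M, full_matrices=False)
    return u[:, 0]

rng = np.random.default_rng(0)
n, p, lam = 50, 3, 100.0
T, v = spiked_tensor(n, p, lam, rng)
v_hat = flatten_and_recover(T, n)
print("overlap |<v_hat, v>| =", abs(v_hat @ v))
```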
10:00 - 10:30 | Coffee Break (TCPL Foyer) |
10:30 - 11:00 |
Checkout by 11AM ↓ 5-day workshop participants are welcome to use BIRS facilities (TCPL) until 3 pm on Friday, although participants are still required to check out of the guest rooms by 11AM. (Front Desk - Professional Development Centre) |
10:30 - 11:00 |
Brice Huang: A Constructive Proof of the Spherical Parisi Formula ↓ The Parisi formula for the free energy is among the crown jewels in the theory of spin glasses. We present a new proof of the lower bound for the spherical mean-field model. Our method is constructive: we obtain an ultrametric tree of pure states, each with approximately the same free energy as the entire model, which are hierarchically arranged in accordance with the Parisi ansatz. We construct this tree "layer by layer" given the minimizer to Parisi's variational problem. On overlap intervals with full RSB, the tree is built by an optimization algorithm due to Subag. On overlap intervals with finite RSB, the tree is constructed by a new truncated second moment argument, which also characterizes the free energy of the resulting pure states.
Joint work with Mark Sellke (TCPL 201) |
12:00 - 13:30 | Lunch from 11:30 to 13:30 (Vistas Dining Room) |