# Schedule for: 23w5108 - Perspectives on Matrix Computations: Theoretical Computer Science Meets Numerical Analysis

Beginning on Sunday, March 5 and ending Friday March 10, 2023

All times in Banff, Alberta time, MST (UTC-7).

Sunday, March 5 | |
---|---|

16:00 - 17:30 | Check-in begins at 16:00 on Sunday and is open 24 hours (Front Desk - Professional Development Centre) |

17:30 - 19:30 |
Dinner ↓ A buffet dinner is served daily between 5:30pm and 7:30pm in Vistas Dining Room, top floor of the Sally Borden Building. (Vistas Dining Room) |

20:00 - 22:00 | Informal gathering (TCPL Foyer) |

Monday, March 6 | |
---|---|

07:00 - 08:45 |
Breakfast ↓ Breakfast is served daily between 7 and 9am in the Vistas Dining Room, the top floor of the Sally Borden Building. (Vistas Dining Room) |

08:45 - 09:00 |
Introduction and Welcome by BIRS Staff ↓ A brief introduction to BIRS with important logistical information, technology instruction, and opportunity for participants to ask questions. (TCPL 201) |

09:00 - 10:00 | Yousef Saad: Iterative methods: basics and framework (Tutorial talk) (TCPL 201) |

10:00 - 10:30 | Coffee Break (TCPL Foyer) |

10:30 - 11:30 | Joel Tropp: Randomization / RMT in NLA (Tutorial talk) (TCPL 201) |

11:30 - 13:00 |
Lunch ↓ Lunch is served daily between 11:30am and 1:30pm in the Vistas Dining Room, the top floor of the Sally Borden Building. (Vistas Dining Room) |

13:00 - 13:20 |
Group Photo ↓ Meet in foyer of TCPL to participate in the BIRS group photo. The photograph will be taken outdoors, so dress appropriately for the weather. Please don't be late, or you might not be in the official group photo! (TCPL Foyer) |

13:20 - 15:00 | Problem Session (Open Mic) (TCPL 201) |

15:00 - 15:30 | Coffee Break (TCPL Foyer) |

15:20 - 15:50 |
Stefano Pozza: The $\star$-product approach for linear ODEs ↓ Solving systems of non-autonomous ordinary differential equations (ODE) is a crucial and often challenging problem. Recently a new approach was introduced based on a generalization of the Volterra composition. The method transforms the ODE system into the solution of a structured linear system. Numerical examples illustrate the method's efficacy and its properties. (TCPL 201) |

15:50 - 16:10 |
Raphael Meyer: On the Unreasonable Effectiveness of Single Vector Krylov for Low-Rank Approximation ↓ A common task in NLA is low-rank approximation, which can be done by approximating the top eigenvectors of a matrix. Block Krylov Iteration gives the best known theoretical guarantees and empirical performance for this task. Though, it's conceptually unclear how we should choose the block size. The best theoretical guarantees in the prior work requires using large block sizes, at least block size k when finding a rank-k approximation -- and possibly much larger. However, in practice, a small constant block size independent of k typically performs much better than such large block sizes. When the block size is exactly 1, we call this "Single Vector Krylov".
We resolve this unclarity and this theory-practice gap by proving that, when counting matrix-vector products instead of iteration complexity, Single Vector Krylov matches the rate of convergence of the best possible choice of block size, up to just a mild logarithmic dependence on eigenvalue gaps. Moreover, for a wide family of matrices, Single Vector Krylov converges faster, improving from $O( k \cdot log(\frac1\varepsilon) )$ MatVecs to $O( k+log(\frac1\varepsilon) )$ MatVecs. The core of our proof is the observation that the Krylov Subpsace generated by Single Vector Krylov initialized from random vector is exactly equal to the Krylov Subspace generated by Block Krylov Iteration initialized to a structured starting block.
We conclude with implications of this theory for simplifying algorithms, smoothed analysis, and experimental validation. (TCPL 201) |

16:10 - 16:30 |
Stefano Massei: A nested divide-and-conquer for tensor Sylvester equations with hierarchically semiseparable coefficients ↓ Linear systems with a tensor product structure arise naturally when considering the discretization of Laplace type differential equations or, more generally, multidimensional operators with separable coefficients.
In this work, we focus on the numerical solution of linear systems of the form
$$ \left(A_1\otimes I\otimes\dots\otimes I+\dots + I\otimes\dots I\otimes A_d\right)x=b,
$$
where the matrices $A_t\in\mathbb R^{n\times n}$ are Hermitian positive definite and belong to the class of hierarchically semiseparable matrices.
We propose and analyze a nested divide-and-conquer scheme, based on the technology of low-rank updates, that attains the quasi-optimal computational cost $\mathcal O(n^d\log(n))$. Our theoretical analysis highlights the role of inexactness in the nested calls of our algorithm and provides worst case estimates for the amplification of the residual norm. The performances are validated on 2D and 3D case studies. (TCPL 201) |

16:30 - 16:50 | Break (Online/In-person) |

16:50 - 17:10 |
Richard Peng: Worst-case complexity of (structured) linear systems ↓ Over the past two decades, areas of algorithms traditionally considered combinatorial have benefitted enormously from the study of provably worst-case efficient numerical solvers for structured linear systems. This talk aims to give a picture of the current state-of-the-art for solving linear systems under various structural assumptions, resource goals, and representations of numbers. (TCPL 201) |

17:10 - 17:30 |
Stephen Vavasis: Computational complexity of decomposing a symmetric matrix as a sum of positive semidefinite and diagonal matrices ↓ We study several variants of decomposing a symmetric matrix into a sum of a low-rank positive semidefinite matrix and a diagonal matrix. Such decompositions have applications in factor analysis and they have been studied for many decades. On the one hand, we prove that when the rank of the positive semidefinite matrix in the decomposition is bounded above by an absolute constant, the problem can be solved in polynomial time. On the other hand, we prove that, in general, these problems as well as their certain approximation versions are all NP-hard. Finally, we prove that many of these low-rank decomposition problems are complete in the first-order theory of the reals; i.e., given any system of polynomial equations, we can write down a low-rank decomposition problem in polynomial time so that the original system has a solution iff our corresponding decomposition problem has a feasible solution of certain (lowest) rank. (TCPL 201) |

17:30 - 17:50 |
John Urschel: Some New Results on the Growth Factor in Gaussian Elimination ↓ In this talk, we study the growth factor in Gaussian elimination under complete pivoting and show a number of new results, including improved lower bounds, estimates for growth in floating point arithmetic, and bounds for growth for matrices with restricted entries. (TCPL 201) |

17:50 - 19:30 |
Dinner ↓ A buffet dinner is served daily between 5:30pm and 7:30pm in Vistas Dining Room, top floor of the Sally Borden Building. (Vistas Dining Room) |

Tuesday, March 7 | |
---|---|

07:00 - 08:45 |
Breakfast ↓ Breakfast is served daily between 7 and 9am in the Vistas Dining Room, the top floor of the Sally Borden Building. (Vistas Dining Room) |

09:00 - 10:00 | Peter Bürgisser: Condition numbers (Tutorial talk) (Online) |

10:00 - 10:30 | Coffee Break (TCPL Foyer) |

10:30 - 11:30 | Jim Demmel: Numerical stability in NLA (Tutorial talk) (Online) |

11:30 - 13:00 |
Lunch ↓ Lunch is served daily between 11:30am and 1:30pm in the Vistas Dining Room, the top floor of the Sally Borden Building. (Vistas Dining Room) |

13:00 - 14:00 | Working Groups (TCPL 201) |

14:00 - 15:00 | Working Groups (TCPL 201) |

15:00 - 15:30 | Coffee Break (TCPL Foyer) |

15:30 - 15:50 |
Heather Wilber: Trigonometric rational functions and signal reconstruction ↓ A framework is introduced for fitting trigonometric rational functions to data in a way that is robust to various forms of corruption, including additive Gaussian noise, perturbed sampling grids, and missing data. Our approach combines a variant of Prony’s method with a modified version of the AAA algorithm. Using representations in both frequency and time space, a collection of algorithms is described for adaptively computing with trigonometric rationals. This includes procedures for algebraic operations, differentiation, convolution, and more. (TCPL 201) |

15:50 - 16:10 |
Jorge Garza-Vargas: Global convergence of the Hessenberg QR algorithm ↓ We will discuss recent theoretical results about the global rapid convergence of the QR algorithm, which in particular imply the existence of a randomized diagonalization algorithm that runs in $\tilde{O}(n^3)$ operations on any arbitrary input. This is joint work with Jess Banks and Nikhil Srivastava. (TCPL 201) |

16:10 - 16:30 |
Rasmus Kyng: Laplacian solvers ↓ TBA (TCPL 201) |

16:30 - 16:50 | Break (Online/In-person) |

16:50 - 17:10 | Michael Mahoney: RMT, NLA, AND MML (TCPL 201) |

17:10 - 17:30 |
Thomas Trogdon: What can random matrices tell us about algorithms? ↓ This talk explores the limitations and the successes of understanding algorithms via their performance on random matrices. For example, the conjugate gradient algorithm produces highly concentrated output from random input, indicating robustness. On the other hand, the power iteration produces much less concentrated output. This can be explained via the dependence of the algorithms on modified moments versus classical moments for a matrix-dependent measure. Yet, random input can give potentially misleading conclusions as the well-known instabilities of the Lanczos iteration are often missing in random situations. This is joint work with Percy Deift, Tyler Chen, Elliot Paquette and Xiucai Ding. (TCPL 201) |

17:30 - 19:30 |
Dinner ↓ A buffet dinner is served daily between 5:30pm and 7:30pm in Vistas Dining Room, top floor of the Sally Borden Building. (Vistas Dining Room) |

Wednesday, March 8 | |
---|---|

07:00 - 08:45 |
Breakfast ↓ Breakfast is served daily between 7 and 9am in the Vistas Dining Room, the top floor of the Sally Borden Building. (Vistas Dining Room) |

09:00 - 10:00 | Julien Langou: Software for NLA (Tutorial talk) (TCPL 201) |

10:00 - 10:30 | Coffee Break (TCPL Foyer) |

10:30 - 11:30 | Christopher Musco: Sketching in NLA (Tutorial talk) (TCPL 201) |

11:30 - 13:00 |
Lunch ↓ Lunch is served daily between 11:30am and 1:30pm in the Vistas Dining Room, the top floor of the Sally Borden Building. (Vistas Dining Room) |

13:00 - 14:00 |
Tour of Banff Centre ↓ Join for 1hr long Tour of the Iconic Banff Centre. Jim Olver will meet you outside the Professional Development Centre. Dress for the weather. (PDC Entrance) |

14:00 - 17:30 | Free Afternoon (Banff National Park) |

17:30 - 19:30 |
Dinner ↓ |

Thursday, March 9 | |
---|---|

07:00 - 08:45 |
Breakfast ↓ |

09:00 - 10:00 | Michal Derezinski: Putting Randomness into LAPACK and Next Generation RandNLA Theory (Tutorial talk) (TCPL 201) |

10:00 - 10:30 | Coffee Break (TCPL Foyer) |

10:30 - 11:30 | Laura Grigori: Communication and NLA (Tutorial talk) (Online) |

11:30 - 13:00 |
Lunch ↓ |

13:00 - 14:00 | Working Groups (TCPL 201) |

14:00 - 15:00 | Working Groups (TCPL 201) |

15:00 - 15:30 | Coffee Break (TCPL Foyer) |

15:30 - 16:30 | Moderated Discussion (Groups Summary) (TCPL 201) |

16:30 - 17:30 | Moderated Discussion (Groups Summary) (TCPL 201) |

17:30 - 19:30 |
Dinner ↓ |

Friday, March 10 | |
---|---|

07:00 - 08:45 |
Breakfast ↓ |

09:00 - 09:20 |
Tyler Chen: Stochastic trace estimation and quantum typicality: a case study in interdisciplinary research ↓ In recent years, stochastic trace estimation has proven to be a powerful algorithmic tool in numerical linear algebra, and theoretical computer science, machine learning, and computational quantum physics. However, despite the popularity of such estimators, the literature on the topic is highly fragmented. Indeed, while NLA and TCS typically attribute the estimators to several papers appearing around 1990, algorithms based on the same ideas date back to at least the 1970s. Moreover, stochastic trace estimation is, in some moral sense, a special instance of quantum typicality, an idea dating back at least to work of Schr\"odinger and von Neumann in the late 1920s and which regained attention in the mid 2000s.
This talk uses trace estimation/typicality as a case study in research at the intersection of nearby fields. We discuss provide some history into the topic, as well as speculate as to whether this level of fragmentation is an inherent consequence of modern research, or whether it can been mitigated. (TCPL 201) |

09:20 - 09:40 |
David Persson: Randomized low-rank approximation of monotone matrix functions ↓ This work is concerned with computing low-rank approximations of a matrix function $f(A)$ for a large symmetric positive semi-definite matrix $A$, a task that arises in, e.g., statistical learning and inverse problems. The application of popular randomized methods, such as the randomized singular value decomposition or the Nyström approximation, to $f(A)$ requires multiplying $f(A)$ with a few random vectors. A significant disadvantage of such an approach, matrix-vector products with $f(A)$ are considerably more expensive than matrix-vector products with $A$, even when carried out only approximately via, e.g., the Lanczos method. In this work, we present and analyze funNyström, a simple and inexpensive method that constructs a low-rank approximation of $f(A)$ directly from a Nyström approximation of $A$, completely bypassing the need for matrix-vector products with $f(A)$. It is sensible to use funNyström whenever $f$ is monotone and satisfies $f(0) = 0$. Under the stronger assumption that $f$ is operator monotone, which includes the matrix square root $A^{1/2}$ and the matrix logarithm $\log(I+A)$, we derive probabilistic bounds for the error in the Frobenius, nuclear, and operator norms. These bounds confirm the numerical observation that funNyström tends to return an approximation that compares well with the best low-rank approximation of $f(A)$. Furthermore, compared to existing methods, funNyström requires significantly fewer matrix-vector products with $A$ to obtain a low-rank approximation of $f(A)$, without sacrificing accuracy or reliability. Our method is also of interest when estimating quantities associated with $f(A)$, such as the trace or the diagonal entries of $f(A)$. In particular, we propose and analyze funNyström++, a combination of funNyström with the recently developed Hutch++ method for trace estimation. (TCPL 201) |

09:40 - 10:00 |
Anne Greenbaum: Optimal and Near Optimal Krylov Space Approximations to $f(A)b$ ↓ We describe a Lanczos or Arnoldi-based algorithm for approximating the product of a rational function of a matrix $A$ with a given vector $b$. This algorithm is optimal over the Krylov subspace, in a norm induced by the denominator of the rational function, and the approximation can be computed using information from a slightly larger Krylov subspace. It requires storage of just a few additional vectors, proportional to the degree of the denominator of the rational function. It can be used to construct near-optimal approximations for other matrix functions such as the matrix sign function. (TCPL 201) |

10:00 - 10:30 | Coffee Break (TCPL Foyer) |

10:30 - 11:00 |
Checkout by 11AM ↓ 5-day workshop participants are welcome to use BIRS facilities (TCPL ) until 3 pm on Friday, although participants are still required to checkout of the guest rooms by 11AM. (Front Desk - Professional Development Centre) |

12:00 - 13:30 | Lunch from 11:30 to 13:30 (Vistas Dining Room) |