Schedule for: 23w6006 - Modern Statistical and Machine Learning Approaches for High-Dimensional Compound Spatial Extremes

Beginning on Sunday, May 7 and ending on Friday, May 12, 2023

All times in Granada, Spain time, CEST (UTC+2).

Sunday, May 7
16:00 - 18:00 Check-in begins at 16:00 on Sunday and is open 24 hours (Front Desk - Hotel Granada Center)
20:00 - 21:00 Informal gathering (Other (See Description))
Monday, May 8
07:00 - 08:45 Breakfast (Restaurant - Hotel Granada Center)
08:45 - 08:55 Introduction and Welcome by IMAG Staff (Main Meeting Room - Calle Rector López Argüeta)
08:55 - 09:00 Introduction and Welcome by Conference Organizers (Main Meeting Room - Calle Rector López Argüeta)
09:00 - 10:30 Douglas Nychka: Short course I: Spatial Statistical Learning (co-taught with Soutir Bandyopadhyay)
Large and innovative spatial data are now ubiquitous across science and engineering ranging from the microscale properties of 3D printed materials to the exposure of populations to pollutants to the global views of our planet from satellites. The challenge to statistical science is to adapt methods from geostatistics to these new problems. This course will provide a hands-on and modern introduction to spatial data, followed by methods for large and nonstationary data. It will be co-taught by Douglas Nychka and Soutir Bandyopadhyay, two active researchers in this area who have contributed to theory, new methods, and maintain software that makes spatial data analysis easy and accessible.
(Main Meeting Room - Calle Rector López Argüeta)
10:30 - 11:00 Coffee Break (Main Meeting Room - Calle Rector López Argüeta)
11:00 - 12:30 Raphael Huser: Short Course II: Advances in Statistical Modeling of Spatial Extremes
The classical modeling of spatial extremes relies on asymptotic models (i.e., max-stable processes or r-Pareto processes) for block maxima or peaks over high thresholds, respectively. However, at finite levels, empirical evidence often suggests that such asymptotic models are rigidly constrained, and that they do not always adequately capture the situation where the most severe events tend to be spatially localized. Another well-known limitation of classical spatial extremes models is that they are either computationally prohibitive to fit in high dimensions, or they need to be fitted using less efficient techniques. In this short course, we will start by describing classical asymptotic models for univariate and spatial extremes defined as block maxima and threshold exceedances. Then, in the second part, we will describe recent progress in the modeling and inference for spatial extremes, focusing on new models that have more flexible tail structures that can bridge asymptotic dependence classes, and that are more easily amenable to likelihood-based inference for large datasets. In particular, we will discuss various types of random scale constructions, as well as the conditional spatial extremes model, which have recently been getting increasing attention within the statistics of extremes community. We will illustrate the practical usefulness of some of these spatial extreme-value models on various environmental applications.
(Main Meeting Room - Calle Rector López Argüeta)
13:00 - 14:30 Lunch (Restaurant - Hotel Granada Center)
14:30 - 15:00 Francis Zwiers: Detection and Attribution of Human Influence on Extreme Precipitation Events
The question of whether human influence on the climate system has affected the frequency and intensity of extreme events can be posed in different ways. This talk considers methods that are used to ask whether there is evidence in observations of precipitation extremes that human-induced changes that are expected to have happened over the industrial era have actually occurred. This question is addressed using so-called detection and attribution (D&A, or fingerprinting) methods that look for the presence of signals derived from historical climate change simulations in long-term observations. Studies using such techniques find essentially unequivocal evidence that humans have altered the frequency and intensity of temperature extremes and find increasingly strong evidence that humans are also altering the frequency and intensity of extreme precipitation events. D&A techniques can be used in multiple ways. When expected historical change signals are obtained from a multi-model ensemble, the question focuses on whether those signals are present in observations. Results from such studies can be used in some types of extreme event attribution studies, and can also be used to constrain projections of future change. In addition, when expected historical change signals are obtained from individual climate models, D&A results can serve as an aid to model evaluation.
(Main Meeting Room - Calle Rector López Argüeta)
15:00 - 15:30 Lyndsay Shand: A Multivariate Space-Time Dynamic Model for Characterizing Downstream Impacts of the 1991 Mt Pinatubo Volcanic Eruption
Downstream impacts of climate events, such as changes in the earth’s net radiative balance and temperatures that occur following a volcanic eruption, are inherently correlated processes. The relationship of such dependent processes at a global scale is often asymmetric and spatially-varying. We propose a model suitable for characterizing space-time correlations between climate impacts following the 1991 Mt. Pinatubo Eruption. We propose a novel multivariate dynamic linear model using a multiresolution basis function representation to jointly model downstream climate impacts following the eruption. Spatial variation is modeled using the flexible set of multiresolution basis functions known as latticeKrig, while multivariate correlations are accounted for via a Vector Autoregression (VAR) model on the basis coefficients. Our model is estimated within a Bayesian hierarchical framework and for computational tractability, we rely on Kalman filtering methods to estimate our time-varying basis coefficients. The resulting model allows us to characterize the changes in the dependent climate processes across space during and following the Mt. Pinatubo eruption. We demonstrate the usefulness of our method on both simulated and observed datasets.
(Main Meeting Room - Calle Rector López Argüeta)
15:30 - 16:00 Ana Cebrian: Spatio-Temporal Analysis of the Extent of Extreme Heat Events
Modeling for extreme heat events (EHEs), defined as exceedances of a suitable local threshold, is customarily implemented using time series of temperatures collected at a set of locations. Since spatial dependence is anticipated in the occurrence of EHEs, a joint model for the time series that incorporates spatial dependence is needed. We develop a space-time model based on a point-referenced collection of temperature time series that enables the prediction of both the incidence and characteristics of EHEs occurring at any location in a study region. Specifically, our model employs a two-state model for EHEs with local thresholds to fit the daily maximum temperature data. The model switches between two observed states, one that defines extreme heat days (those above the temperature threshold) and the other that defines non-extreme heat days. This two-state structure allows for temporal dependence in the observations and also allows the parameters that control the spatial dependence to differ between the two states. The transition probabilities between the two states are driven by a two-state Markovian switching model. Each sub-model includes seasonal terms, covariates and intercepts modeled as Gaussian processes. We also introduce a formal definition of the notion of the spatial extent of an extreme heat event and we illustrate how it can be calculated using the output from the previous model. For a specified region and a given day, the definition of spatial extent takes the form of a block average of indicator functions over the region. Our risk assessment examines extents for the Comunidad Autónoma de Aragón in northeastern Spain. We generalize our definition to capture extents of persistence of extreme heat and make comparisons across decades to reveal evidence of increasing extent over time.
(Main Meeting Room - Calle Rector López Argüeta)
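The block-average definition of spatial extent in the abstract above can be illustrated with a small sketch. It assumes the region has been discretized into equal-area grid cells, each with its own local threshold; the function name `spatial_extent` and the toy numbers are hypothetical and are not taken from the talk.

```python
from typing import Sequence


def spatial_extent(temps: Sequence[float], thresholds: Sequence[float]) -> float:
    """Fraction of grid cells whose temperature exceeds its local threshold.

    This is a discrete block average of indicator functions over the region,
    assuming equal-area cells (an assumption of this sketch).
    """
    if len(temps) != len(thresholds) or len(temps) == 0:
        raise ValueError("temps and thresholds must be non-empty and of equal length")
    # Indicator of an extreme heat day at each cell.
    exceed = [1 if t > u else 0 for t, u in zip(temps, thresholds)]
    return sum(exceed) / len(exceed)


# Toy example: daily maximum temperatures (deg C) at five cells on one day.
temps = [36.2, 34.1, 38.0, 33.5, 37.4]
thresholds = [35.0, 35.0, 36.5, 34.0, 37.5]
extent = spatial_extent(temps, thresholds)  # two of five cells exceed
```

In a model-based analysis the temperatures would presumably be predictive draws over a fine grid rather than fixed observations, so the extent would itself have a distribution; that step is not shown here.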
16:00 - 16:30 Steve Sain: Extremes and Climate Risk Analytics: Some Applications and Some Open Problems
The study of the impacts of climate change has provided the foundation for the emerging area of climate risk analytics. Applied statistics and data science are playing a key role in quantifying the impact of a changing climate on perils such as flood, heat, and fire. In this talk, I will present a quick overview of climate risk analytics, including how extremes and methods for extremes are integrating into our product pipelines. I’ll also highlight some challenges in incorporating extremes, including scalability, spatial dependence, and multivariate extremes.
(Main Meeting Room - Calle Rector López Argüeta)
16:30 - 17:00 Coffee Break (Main Meeting Room - Calle Rector López Argüeta)
17:00 - 18:00 Roundtable Discussion I: Compound Extremes in Climate Science (Leads: Steve Sain and Michael Wehner) (Main Meeting Room - Calle Rector López Argüeta)
19:30 - 21:30 Dinner (Restaurant - Hotel Granada Center)
Tuesday, May 9
07:00 - 09:00 Breakfast (Restaurant - Hotel Granada Center)
09:00 - 09:30 Abhi Datta: Combining Machine Learning with Traditional Geospatial Models
Spatial generalized linear mixed-models, consisting of a linear covariate effect and a Gaussian Process (GP) distributed spatial random effect, are widely used for analyses of geospatial data. We consider the setting where the covariate effect is non-linear and propose modeling it using a flexible machine learning algorithm like random forests or deep neural networks. We propose well-principled extensions of both random forests and neural networks, for estimating non-linear covariate effects in spatial mixed models where the spatial correlation is still modeled using GP. The basic principle is guided by how ordinary least squares extends to generalized least squares for linear models to account for dependence. We demonstrate how the same extension can be done for these machine learning approaches like random forests and neural networks. We provide extensive theoretical and empirical support for the methods and show how they fare better than naïve or brute-force approaches to use machine learning algorithms for spatially correlated data. We demonstrate the RandomForestsGLS R-package that implements this extension for random forests.
(Main Meeting Room - Calle Rector López Argüeta)
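The step from ordinary to generalized least squares that the abstract above uses as its guiding principle can be sketched for the linear case. This is only the linear-model baseline under an assumed exponential covariance with a nugget; it does not reproduce the random forest or neural network extensions, and it is not the RandomForestsGLS interface. All function names and parameter values are hypothetical.

```python
import numpy as np


def exponential_cov(coords, sigma2=1.0, phi=0.5, nugget=0.1):
    """Exponential spatial covariance plus a nugget (assumed form for this sketch)."""
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    return sigma2 * np.exp(-d / phi) + nugget * np.eye(len(coords))


def gls_fit(X, y, cov):
    """GLS estimate computed as OLS on data whitened by the Cholesky factor of cov."""
    L = np.linalg.cholesky(cov)      # cov = L @ L.T
    Xw = np.linalg.solve(L, X)       # whitened design matrix
    yw = np.linalg.solve(L, y)       # whitened response
    beta, *_ = np.linalg.lstsq(Xw, yw, rcond=None)
    return beta


rng = np.random.default_rng(0)
n = 200
coords = rng.uniform(size=(n, 2))
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([1.0, -2.0])
cov = exponential_cov(coords)
# Spatially correlated errors drawn from N(0, cov).
y = X @ beta_true + np.linalg.cholesky(cov) @ rng.normal(size=n)

beta_gls = gls_fit(X, y, cov)
beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
```

Whitening turns the correlated problem into an uncorrelated one, so any least-squares-type loss can be reused on the transformed data; this is the general idea the abstract refers to when extending machine learning methods to dependent data.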
09:30 - 10:00 Andrew Zammit Mangion: Neural Point Estimation for Fast Optimal Likelihood-Free Inference
Neural point estimators are neural networks that map data to parameter point estimates. They are fast, likelihood free and, due to their amortised nature, amenable to fast bootstrap-based uncertainty quantification. In this talk I give an overview of this relatively new inferential tool, giving particular attention to the ubiquitous problem of making inference from replicated data, which we address in the neural setting using permutation-invariant neural networks. Through extensive simulation studies we show that these neural point estimators can quickly and optimally (in a Bayes sense) estimate parameters in weakly-identified and highly-parameterised models, such as models of spatial extremes, with relative ease. We demonstrate their applicability through an analysis of extreme sea-surface temperature in the Red Sea where, after training, we obtain parameter estimates and bootstrap-based confidence intervals from hundreds of spatial fields in a fraction of a second. This is joint work with Matthew Sainsbury-Dale and Raphaël Huser.
(Main Meeting Room - Calle Rector López Argüeta)
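One common way to make a network invariant to the ordering of replicates is to summarize each replicate separately, pool the summaries with a symmetric operation such as the mean, and map the pooled summary to a parameter estimate. The sketch below shows that construction with untrained random weights; the abstract does not specify the architecture used in the talk, so the layer sizes and the names `init_params` and `neural_point_estimate` are assumptions made for illustration.

```python
import numpy as np


def init_params(rng, d_in, d_hidden, d_out):
    """Random (untrained) weights for a two-stage network."""
    return {
        "W_phi": rng.normal(scale=0.5, size=(d_in, d_hidden)),
        "b_phi": np.zeros(d_hidden),
        "W_rho": rng.normal(scale=0.5, size=(d_hidden, d_out)),
        "b_rho": np.zeros(d_out),
    }


def neural_point_estimate(replicates, params):
    """Map an (m, d_in) array of m replicates to a d_out-dimensional estimate."""
    # Inner network applied to each replicate independently.
    h = np.tanh(replicates @ params["W_phi"] + params["b_phi"])
    # Mean pooling is symmetric in the replicates, giving permutation invariance.
    pooled = h.mean(axis=0)
    # Outer (here linear) map from the pooled summary to the estimate.
    return pooled @ params["W_rho"] + params["b_rho"]


rng = np.random.default_rng(1)
params = init_params(rng, d_in=16, d_hidden=8, d_out=2)
fields = rng.normal(size=(30, 16))           # 30 replicates of a 16-site field
estimate = neural_point_estimate(fields, params)
shuffled = neural_point_estimate(fields[rng.permutation(30)], params)
```

Because the pooling step ignores ordering, `estimate` and `shuffled` agree up to floating-point rounding, and the same weights can be applied to any number of replicates.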
10:00 - 10:30 Mikael Kuusela: Neural Likelihood Surface Estimation for Intractable Spatial Models
Likelihood-based inference tends to be computationally intensive or wholly intractable for many common models in spatial statistics. Examples include Gaussian processes for large data sets and models for spatial extremes. Recent work has used neural networks to predict parameters in these models, circumventing the intractability of likelihood computations. Prediction, however, depends on the choice of a prior on the parameters and does not provide a straightforward means for frequentist uncertainty quantification. In this talk, I will demonstrate how to use tools from likelihood-free inference to learn the likelihood function of intractable spatial processes using convolutional neural networks. In cases where the exact likelihood is available, the method provides similar point estimation and uncertainty quantification performance as exact likelihood computations at a fraction of the computational cost. When the likelihood is unavailable, this method can learn the otherwise intractable likelihood function, providing inferences that are superior to existing approximations. The method is applicable to any spatial process on a regular grid for which fast forward simulations are available.
(Main Meeting Room - Calle Rector López Argüeta)
10:30 - 11:00 Coffee Break (Main Meeting Room - Calle Rector López Argüeta)
11:00 - 11:06 Sweta Rai: Fast Parameter Estimation of GEV Distribution Using Neural Networks
The generalized extreme-value (GEV) distribution is commonly used to model extreme events, such as floods, precipitation, and maximum temperature, due to its heavy-tailed behavior. It is classified into three forms based on its shape: Fréchet, Gumbel, and Weibull. The goal is to fit the GEV distribution to the sample of extreme values and model these values using estimated parameters. The maximum likelihood (ML) method is the conventional approach for parameter estimation, but it can be computationally intensive for large simulation studies. To overcome this limitation, we use a neural network (NN) for efficient and likelihood-free estimation. The network is trained using a set of chosen extreme quantiles, along with the quartiles Q1, Q2, and Q3, as inputs. The NN provides GEV parameter estimates with similar accuracy to ML but with a computational speedup. This NN estimator is applied to 1000-year annual maximum temperature from the Community Climate System Model version 3 (CCSM3) across North America for three atmospheric concentrations: pre-industrial (289 ppm CO2) and future conditions (700 ppm CO2 and 1400 ppm CO2), and compared with the ML approach. To account for estimation uncertainty, we employ parametric bootstrapping, inherent in the trained network.
(Main Meeting Room - Calle Rector López Argüeta)
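The quantile-based inputs described in the abstract above can be sketched as follows: simulate a GEV sample by inverse transform and reduce it to the three quartiles plus a few upper quantiles. The abstract does not state which extreme quantile levels are used, so the levels 0.90, 0.95 and 0.99 below, and the function names, are hypothetical. The network itself is not shown.

```python
import math
import random
import statistics


def gev_quantile(p, mu, sigma, xi):
    """Quantile function of the GEV distribution (location mu, scale sigma, shape xi)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if sigma <= 0.0:
        raise ValueError("sigma must be positive")
    y = -math.log(p)
    if abs(xi) < 1e-12:               # Gumbel limit as xi -> 0
        return mu - sigma * math.log(y)
    return mu + sigma * (y ** (-xi) - 1.0) / xi


def simulate_gev(n, mu, sigma, xi, seed=0):
    """Draw n GEV variates by applying the quantile function to uniforms."""
    rng = random.Random(seed)
    sample = []
    while len(sample) < n:
        u = rng.random()
        if u > 0.0:                   # the quantile function needs u in (0, 1)
            sample.append(gev_quantile(u, mu, sigma, xi))
    return sample


def summary_inputs(sample):
    """Q1, Q2, Q3 and the 90th, 95th, 99th percentiles (hypothetical choice of levels)."""
    pct = statistics.quantiles(sample, n=100)   # 99 cut points: 1%, ..., 99%
    return [pct[24], pct[49], pct[74], pct[89], pct[94], pct[98]]


sample = simulate_gev(2000, mu=0.0, sigma=1.0, xi=0.1)
inputs = summary_inputs(sample)
```

Pairs of simulated parameters and their `inputs` vectors would form the training set for a regression network; because the inputs have fixed length, the trained network can be applied to samples of any size.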
11:06 - 11:12 Jordan Richards: Neural Bayes Estimators for Fast and Efficient Inference with Spatial Peaks-Over-Threshold Models
Likelihood-based inference for spatial extremal dependence models is often infeasible in moderate or high dimensions, due to an intractable likelihood function and/or the need for computationally-expensive censoring to reduce estimation bias. Neural Bayes estimators are a promising recent approach to inference that use neural networks to transform data into parameter estimates. They are likelihood free, inherit the optimality properties of Bayes estimators, and are substantially faster than classical methods. In this work, we adapt neural Bayes estimators for peaks-over-threshold dependence models; in particular, we develop methodology for coping with the computational challenges often encountered when modelling spatial extremes (e.g., censoring). We demonstrate substantial improvements in computational and statistical efficiency relative to conventional likelihood-based approaches using popular extremal dependence models, including max-stable and r-Pareto processes, and random scale mixture models.
(Main Meeting Room - Calle Rector López Argüeta)
11:12 - 11:18 Jonathan Koh: Predicting Risks of Temperature Extremes using Large-scale Circulation Patterns with r-Pareto Processes
Many severe weather patterns in the mid-latitudes have been found to be connected to a particular atmospheric pattern known as blocking. This pattern obstructs the prevailing westerly large-scale atmospheric flow, changing flow anomalies in the vicinity of the blocking system to sustain weather conditions in the immediate region of its occurrence. The presence and characteristics of blocking are thus important for the development of temperature extremes, which are rarely isolated in space, so one must account not just for their occurrence probabilities and intensities but also for their spatial dependencies when assessing their associated risk. Here we propose a methodology that does so by combining tools from the spatial extremes and machine learning literature, to incorporate 500 hPa geopotential (Z500) anomalies over the North Atlantic and European region as covariates to predict surface temperature extremes. This involves fitting generalized r-Pareto processes with appropriate risk functionals to high-impact positive and negative temperature anomaly events across central Europe from 1979–2020, using loss functions motivated by extreme-value theory in a boosting algorithm. We check by simulation that the model parameters are identifiable and can be estimated adequately. We find which circulation patterns in the Euro-Atlantic sector are most important in determining the characteristics of these extremes, and show how they affect them.
(Main Meeting Room - Calle Rector López Argüeta)
11:18 - 11:24 Lydia Kakampakou: Modelling Temporal Changes in Spatial Extremal Dependence via a Conditional Framework
With climate change being one of the biggest crises of our time, concentrated efforts are being made to develop statistical models able to adequately capture and predict the behaviour of natural processes affected by this phenomenon. These efforts are of particular significance when such processes are potentially catastrophic at extreme levels. In the case of spatio-temporal environmental datasets, the effect of climate change on marginal trends is well documented and several methods have been proposed to capture this kind of non-stationarity. However, this is not the case for changes in the dependence structure. Most available spatio-temporal models for extremes assume stationarity in this feature, which may be unrealistic in a changing climate. We propose an extension of the spatial conditional extremes modelling framework of Wadsworth and Tawn (2022) to accommodate non-stationary spatial dependence and apply this extended framework to a range of spatio-temporal environmental datasets.
(Main Meeting Room - Calle Rector López Argüeta)
11:24 - 11:30 Silius Mortensønn Vandeskog: Efficient and Robust Modelling of High-Dimensional Spatial Conditional Extremes
A successful general modelling framework for spatial extremes should be able to describe both weakening extremal dependence at increasing levels and changes in the type of extremal dependence class. It should also allow for computationally efficient inference in high dimensions, and it should be robust towards large deviations from the model assumptions in the data. We develop a general modelling framework for spatial extremes, based upon the spatial conditional extremes model. Inference is performed using integrated nested Laplace approximations (INLA), which allows for computationally efficient inference for high-dimensional problems. A post hoc transformation is applied after inference, which adjusts for model misspecification and leads to more robust estimates. The modelling framework is applied in a simulation study and in a case study of modelling extreme precipitation, and it displays great success in both settings.
(Main Meeting Room - Calle Rector López Argüeta)
11:30 - 11:36 Man Ho Suen: Aggregated Data Approach with inlabru
Spatial misalignment between observed responses and covariate data is not uncommon in a point data setting. This poster presents a novel approach to aggregating them within the INLA-SPDE framework. We will discuss how to conceptualize the domain and samplers during the mesh construction.
(Main Meeting Room - Calle Rector López Argüeta)
11:36 - 11:42 Xuanjie Shao: Deep Compositional Models for Nonstationary Extremal Dependence in Space
Modeling the nonstationarity and anisotropy that often prevail in the extremal dependence of spatial data can be challenging. Inference for stationary and isotropic models is considerably easier, but the assumptions that underpin these models are not typically met by data observed over large or topographically complex domains. A simple approach to accommodating spatial non-stationarity in Gaussian processes, proposed by Sampson and Guttorp (1992), is to warp the original spatial domain to a latent space where stationarity and isotropy can be reasonably assumed. However, estimation of the warping function can be computationally expensive and the transformation is not guaranteed to be injective, which can lead to physically unrealistic transformations. Zammit-Mangion et al. (2021) overcame these issues by exploiting deep Gaussian processes, where the transformation is constructed using a deep composition of injective mappings. We present an extension of this methodology to model non-stationarity in extremal dependence of data, by leveraging popularly-applied parametric models for spatial extremal processes.
(Main Meeting Room - Calle Rector López Argüeta)
11:42 - 11:48 Lambert De Monte: A Geometric Investigation of the Hüsler–Reiss Family of Distributions
Recent developments in the probability theory and statistical inference of extremal dependence structures via geometric approaches exploit gauge functions and their associated limit sets. Under the current scheme of study, the Hüsler–Reiss family of distributions forms a subclass of distributions that leads to degenerate limit sets, and statistical inference methods fail to capture their properties. In this line of work, we consider new transformations and scalings that lead to non-degenerate limit sets and possible avenues for statistical inference.
(Main Meeting Room - Calle Rector López Argüeta)
11:48 - 11:54 Maggie Bailey: Temporal Downscaling for Solar Radiation Using a Diurnal Template Model
Global and regional climate model projections are useful for gauging future patterns of climate variables, including solar radiation, but data from these models is often too spatio-temporally coarse for local use. Within the context of solar radiation, the changing climate may have an effect on photovoltaic (PV) production, especially as the PV industry moves to extend plant lifetimes to 50 years. Predicting PV production while taking into account a changing climate requires data at a resolution that is useful for building PV plants. We present a novel method to downscale global horizontal irradiance (GHI) data from daily averages to hourly profiles, while maintaining spatial correlation of parameters characterizing the diurnal profile of GHI. The method focuses on the use of a diurnal template which can be shifted and scaled according to the time of year and location. Variability in the profile is later added to account for clouds if the daily average value indicates a cloudy day. This analysis is applied to data from the National Solar Radiation Database provided by the National Renewable Energy Lab and a case study of the mentioned methods over California is presented.
(Main Meeting Room - Calle Rector López Argüeta)
11:54 - 12:00 Zhongwei Zhang: Extremal Dependence of Stochastic Processes Driven by Exponential-Tailed Lévy Noise
Stochastic processes driven by exponential-tailed Lévy noise constitute important extensions of their Gaussian counterparts in order to capture deviations from Gaussianity, more flexible dependence structures, and sample paths with jumps. Popular examples include non-Gaussian Ornstein-Uhlenbeck (OU) processes and type-G Matérn stochastic partial differential equation (SPDE) random fields. This paper is concerned with the open problem of determining the extremal dependence induced by these processes. Both process types admit stochastic integral representations and have approximations on grids or triangulations that are used in practice for efficient simulations or inference. We first show that these approximations can be expressed as special cases of a class of linear transformations of independent, exponential-tailed random variables that bridges asymptotic dependence and independence in a novel, tractable way. This result is of independent interest since models that can capture both extremal dependence regimes are scarce and the construction of such flexible models is an active area of research. Based on this fundamental result, we show that the exponential-tailed non-Gaussian OU process is asymptotically independent, but with a different residual tail dependence function than its Gaussian counterpart. Furthermore, we show that the finite element approximation of the type-G SPDE model is asymptotically independent provided that the mesh is fine enough, and we conjecture that asymptotic independence is preserved in the limiting process. The computational advantage of the SPDE-based formulation of non-Gaussian processes is thus readily applicable to modeling spatial extremes. Our results are illustrated by a small simulation study.
(Main Meeting Room - Calle Rector López Argüeta)
12:00 - 12:30 Open Forum
Time for participants to engage in open discussion, ask questions, share ideas, and make connections with others in attendance.
(Main Meeting Room - Calle Rector López Argüeta)
13:00 - 14:30 Lunch (Restaurant - Hotel Granada Center)
14:30 - 16:00 Poster Presentations (Main Meeting Room - Calle Rector López Argüeta)
16:00 - 16:30 Coffee Break (Main Meeting Room - Calle Rector López Argüeta)
16:30 - 17:30 Roundtable Discussion II: Modeling Spatial Data and Extremes with Machine Learning (Leads: Sebastian Engelke and Andrew Zammit-Mangion) (Main Meeting Room - Calle Rector López Argüeta)
19:30 - 21:30 Dinner (Restaurant - Hotel Granada Center)
Wednesday, May 10
07:00 - 09:00 Breakfast (Restaurant - Hotel Granada Center)
09:00 - 09:30 Janine Illian: Realistically Complex Spatial Models – Communication and Accessibility
These days more and more data are being collected, analysed and interpreted to inform decisions, as we have seen, for example, in the recent COVID-19 pandemic. Here, statisticians have a responsibility to support users who are interpreting the results of a statistical analysis. At the same time, increasingly complex analysis tools are being developed, and are now easily accessible to non-statisticians through R packages. We as developers of complex statistical methods have a related responsibility to support the adequate use of our methods by often quantitatively trained, yet non-specialist, scientists. But are these truly accessible? When introducing these users to our methodology we need to strike the right balance between treating methodology as a mere black box and explaining every single technical detail, while providing an adequate understanding of the methodology that allows users to independently decide on appropriate model choices. This is needed to encourage the use of our methods as well as to establish a fruitful dialogue with the users to improve and successfully build on exciting methods. In addition, when we develop statistical modelling approaches, it is important to ensure that these are relevant to the users and that they take into account the specific needs of the user community. This involves exploring and engaging with the specific data structures and associated scientific questions typically addressed within a field. We will focus here on the context of ecology and discuss specific data structures and questions arising within that field.
(Main Meeting Room - Calle Rector López Argüeta)
09:30 - 10:00 Soumendra Lahiri: On Locally Stationary Spatial Processes
We give a triangular array formulation of local stationarity for a spatial process, extending the seminal framework of R. Dahlhaus for time series. We propose a nonparametric estimator of the covariance function of a locally stationary random field based on irregularly spaced spatial data and establish its consistency. Some related issues on the spatial prediction will also be discussed.
(Main Meeting Room - Calle Rector López Argüeta)
10:00 - 10:30 David Bolin: Gaussian Random Fields on Compact Metric Graphs
There is an increasing interest in extreme value analysis of data on compact metric graphs such as street or river networks. In this work, we provide a first step towards well-defined extreme value models in continuous space on such domains by introducing a new class of Gaussian random fields on compact metric graphs. The proposed models, the Whittle–Matérn fields, are defined via a fractional stochastic partial differential equation on the compact metric graph and are a natural extension of Gaussian fields with Matérn covariance functions on Euclidean domains to the non-Euclidean metric graph setting. Existence of the processes, as well as some of their main properties, such as sample path regularity, are derived. The model class in particular contains differentiable processes. To the best of our knowledge, this is the first construction of a differentiable Gaussian process on general compact metric graphs. The model class also contains Markov processes, for which we can evaluate likelihoods and perform spatial prediction exactly and computationally efficiently for large datasets. We finish with an application to model traffic data, where the ability to allow for differentiable processes greatly improves the model fit.
(Main Meeting Room - Calle Rector López Argüeta)
10:30 - 11:00 Coffee Break (Main Meeting Room - Calle Rector López Argüeta)
11:00 - 12:00 Roundtable Discussion III: Complex Spatial and Spatiotemporal Models (Leads: David Bolin and Jorge Mateu) (Main Meeting Room - Calle Rector López Argüeta)
12:00 - 12:30 Group Photo (Main Meeting Room - Calle Rector López Argüeta)
13:00 - 14:30 Lunch (Restaurant - Hotel Granada Center)
14:30 - 19:30 Free Afternoon (Other (See Description))
19:30 - 21:30 Dinner (Restaurant - Hotel Granada Center)
Thursday, May 11
07:00 - 09:00 Breakfast (Restaurant - Hotel Granada Center)
09:00 - 09:30 Sebastian Engelke: Extremal Graphical Models: a Review of Recent Progress
Engelke and Hitz (2020, JRSSB) introduce a new notion of conditional independence and graphical models for the most extreme observations of a multivariate sample. This enables the analysis of complex extreme events on network structures (e.g., floods) or large-scale spatial data (e.g., heat waves). Recent results show that this notion of extremal conditional independence arises as a special case of a much more general theory for limits of sums and maxima of independent random vectors. We review several recent results on statistical inference for extremal graphical models, including estimation of model parameters on general graph structures and data-driven structure learning algorithms. Theoretical guarantees based on concentration inequalities pave the way to high-dimensional settings where the dimension is much larger than the sample size. We discuss how sparse graphs can enable efficient computations and simulation in these situations.
(Main Meeting Room - Calle Rector López Argüeta)
09:30 - 10:00 Marco Oesting: Extremes in High Dimensions
Extreme-value theory has been explored in considerable detail for univariate and low-dimensional observations, but the field is still in an early stage regarding high-dimensional multivariate observations. In this paper, we focus on Hüsler-Reiss models and their domain of attraction, a popular class of models for multivariate extremes that exhibit some similarities to multivariate Gaussian distributions. We devise novel estimators for the parameters of this model based on score matching and equip these estimators with state-of-the-art theories for high-dimensional settings and with exceptionally scalable algorithms. We perform a simulation study to demonstrate that the estimators can estimate a large number of parameters reliably and fast; for example, we show that Hüsler-Reiss models with thousands of parameters can be fitted within a couple of minutes on a standard laptop.
(Main Meeting Room - Calle Rector López Argüeta)
10:00 - 10:30 Peter Braunsteins: Linking SPDEs and Spatial Extremes
When evaluated at a finite number of locations, existing models for spatial extremes do not exhibit any sparsity properties, such as conditional independence patterns. This means that the resulting likelihood functions do not factorise, limiting inference to a moderate number of dimensions. The goal of this work is to develop a class of models for spatial extremes whose finite dimensional distributions can be closely approximated by Hüsler-Reiss models with a sparse extremal conditional dependence structure. We do this by adapting the stochastic partial differential equation (SPDE) approach of Lindgren, Rue, and Lindström (2011) to the setting of extremes.
(Main Meeting Room - Calle Rector López Argüeta)
10:30 - 11:00 Coffee Break (Main Meeting Room - Calle Rector López Argüeta)
11:00 - 11:30 Simon Brown: Future Changes in Heatwave Severity, Duration and Frequency due to Climate Change for the Most Populous Cities
A novel approach to quantify the present and future heatwave hazard is presented which can discern characteristics beyond what can be achieved from current approaches, such as distributions of severity, duration and frequency, including very low probability events that may not have been seen. A statistical model is built that represents the seasonal cycle, climate change, magnitude and temporal behaviour of all temperatures above a moderately high time varying threshold at a site from which very large samples of temperature time series can be drawn. From these, user defined heatwaves can be extracted and precise empirical statistics calculated, allowing application to a wide range of problems such as heatwave impact on the pollination of food crops. Fitting the heatwave model to climate model simulations allows the changing severity, duration and frequency of heatwaves from the past to the future to be quantified. This approach is validated by reproducing the heatwave climatology of a pre-industrial 3 500 year GCM control run from a 110 year future emission run of the same GCM. Using this methodology, future heatwave changes for the most populous cities from 20 countries are derived from a 44 member ensemble of 28 GCMs from the CMIP5 archive forced with the RCP 8.5 emission scenario. Compared to 2006, absolute temperatures of 4–10 day long heatwaves are projected to be between 3.4 and 6.6 °C hotter in 2099, though the magnitude of the increases is independent of heatwave rarity and duration. For 13 of the cities no significant future changes in the distribution of durations are found relative to a contemporaneous threshold. Six cities show future heatwaves will have a tendency to be longer and one to be shorter. Half of the cities show increases in severity, the time integrated temperature anomalies above the contemporaneous threshold. The largest changes are found for Paris with 100 year return level severity changes equivalent to mean heatwave temperatures increasing by 3.4 °C or by 1.7 °C for 5 and 10 day heatwaves respectively, and the rate of 10 day heatwaves increasing by 130%.
(Main Meeting Room - Calle Rector López Argüeta)
11:30 - 12:00 Michael Wehner: Some Examples of Climate Science Machine Learning at Berkeley Lab (Main Meeting Room - Calle Rector López Argüeta)
12:00 - 12:30 Finn Lindgren: Statistical Climate Reconstruction Modelling in the EUSTACE Project
The EUSTACE project aimed to reconstruct air temperature at a daily timescale across the globe from 1850 to the present day. I will discuss the statistical modelling insights obtained in the project, including large scale hierarchical spatio-temporal models from diverse data sources, and aspects of spatial and seasonal variability in diurnal temperature range distribution that would impact long term extreme value analysis.
(Main Meeting Room - Calle Rector López Argüeta)
13:00 - 14:30 Lunch (Restaurant - Hotel Granada Center)
14:30 - 15:00 Emma Simpson: High-Dimensional Modeling of Spatial Conditional Extremes Using INLA and Gaussian Markov Random Fields
The conditional extremes framework allows for event-based stochastic modeling of dependent extremes, and has recently been extended to a spatial setting. After standardizing the marginal distributions and applying an appropriate linear normalization, certain non-stationary Gaussian processes can be used as asymptotically-motivated models for the process conditioned on threshold exceedances at a fixed reference location. In this work, we adapt existing conditional extremes models to allow for the handling of large spatial datasets. This involves specifying the model for spatial observations at d locations in terms of a latent m
(Main Meeting Room - Calle Rector López Argüeta)
15:00 - 15:30 Léo Belzile: Modelling of Sparse Conditional Spatial Extremes Processes Subject to Left-Censoring
The conditional spatial extremes model of Wadsworth and Tawn, which focuses on extreme events given threshold exceedance at a site, has garnered a lot of attention as a flexible way to model large-scale spatio-temporal events. We consider extensions that combine Gaussian Markov random field residual processes along with data augmentation schemes for dealing with left-censored realizations, exploiting the sparsity of the precision matrix obtained through the basis function approximation of the Gaussian process. Models are fitted using Markov chain Monte Carlo methods and we showcase the scalability of the approach using precipitation data from British Columbia.
(Main Meeting Room - Calle Rector López Argüeta)
15:30 - 16:00 Jordan Richards: High-dimensional Quantile Regression of Spatiotemporal Extreme Wildfires via Partially-Interpretable Neural Networks
Risk management in many environmental settings requires an understanding of the mechanisms that drive extreme events. Useful metrics for quantifying such risk are extreme quantiles of response variables conditioned on predictor variables that describe, e.g., climate, biosphere and environmental states. Typically these quantiles lie outside the range of observable data and so, for estimation, require specification of parametric extreme value models within a regression framework. Classical approaches in this context utilise linear or additive relationships between predictor and response variables and suffer in either their predictive capabilities or computational efficiency; moreover, their simplicity is unlikely to capture the truly complex structures that lead to the creation of extreme wildfires. In this paper, we propose a new methodological framework for performing extreme quantile regression using artificial neural networks, which are able to capture complex non-linear relationships and scale well to high-dimensional data. The "black box" nature of neural networks means that they lack the desirable trait of interpretability often favoured by practitioners; thus, we unify linear and additive regression methodology with deep learning to create partially-interpretable neural networks that can be used for statistical inference but retain high prediction accuracy. To complement this methodology, we further propose a novel point process model for extreme values which overcomes the finite lower-endpoint problem associated with the generalised extreme value class of distributions. Efficacy of our unified framework is illustrated on U.S. wildfire data with a high-dimensional predictor set and we illustrate vast improvements in predictive performance over linear and spline-based regression techniques.
(Main Meeting Room - Calle Rector López Argüeta)
16:00 - 16:30 Coffee Break (Main Meeting Room - Calle Rector López Argüeta)
16:30 - 17:00 Anna Kiriliouk: Estimating Probabilities of Multivariate Failure Sets Based on Pairwise Tail Dependence Coefficients
An important problem in extreme-value theory is the estimation of the probability that a high-dimensional random vector falls into a given extreme failure set. This paper provides a parametric approach to this problem, based on a generalization of the tail pairwise dependence matrix (TPDM). The TPDM gives a partial summary of tail dependence for all pairs of components of the random vector. We propose an algorithm to obtain an approximate completely positive decomposition of the TPDM. The decomposition is easy to compute and applicable to moderate to high dimensions. Based on the decomposition, we obtain parameter estimates of a max-linear model whose TPDM is equal to that of the original random vector. We apply the proposed decomposition algorithm to maximal wind speeds to illustrate its applicability.
(Main Meeting Room - Calle Rector López Argüeta)
17:00 - 17:30 Thomas Opitz: Bridges from Spatial Extreme-value Theory to Classical Geostatistics
Classical geostatistics leverages Gaussian processes, whereas classical spatial extreme-value theory leverages max-stable processes. As a consequence, widely-implemented geostatistical tools such as variograms or Gaussian likelihoods and simulation algorithms cannot be directly transferred to extreme-value analysis, where extreme-value structures call for more specific and often more technical and computer-intensive representations. In this talk, I will review recent advances in spatial peaks-over-threshold modeling of extreme-event episodes using r-Pareto processes, and then detail a large class of models related to log-Gaussian spectral processes where classical geostatistical tools such as variograms or Gaussian likelihoods can be applied. A simulation study will illustrate fast and reliable estimation of parameters, and we fit such models to extreme summer weather episodes in the wildfire-prone region of Mediterranean France.
(Main Meeting Room - Calle Rector López Argüeta)
20:30 - 22:30 Dinner
Restaurant: Carmen de la Victoria
(Other (See Description))
Friday, May 12
07:00 - 09:00 Breakfast (Restaurant - Hotel Granada Center)
09:00 - 09:30 Amanda Lenzi: Towards Black-box Parameter Estimation
As soon as we move away from Gaussian processes as the canonical model for dependent data, likelihood computation becomes effectively impossible, and inference is too complicated for traditional estimation methods. Consider, for instance, datasets from finance or climate science, where skewness and jumps are commonly present, and calculating the likelihood in closed form is often impossible, even with small datasets. Recently, deep learning algorithms have been shown to be a successful alternative for estimating parameters of statistical models for which simulation is easy, but likelihood computation is challenging. This talk presents new developments in black-box procedures to estimate parameters of statistical models based only on weak parameter structure assumptions. These approaches can successfully estimate and quantify the uncertainty of parameters from non-Gaussian models with complex spatial and temporal dependencies. The success of these methods is a first step towards a fully flexible automatic black-box estimation framework.
(Main Meeting Room - Calle Rector López Argüeta)
09:30 - 10:00 Likun Zhang: Emulating Complex Climate Models via Integrating Variational Autoencoder and Spatial Extremes
Many real-world processes have complex tail dependence structures that are difficult to characterize using classical Gaussian processes. More flexible spatial extremes models such as Gaussian scale mixtures and single-station conditioning models exhibit appealing extremal dependence properties but are computationally prohibitive to fit. In this paper, we develop a new spatial extremes model that has space-scale aware and non-stationary dependence properties, and integrate it in the encoding-decoding structure of a variational autoencoder (extVAE). The extVAE can be used as a spatio-temporal emulator that characterizes the distribution of potential climate output states and produces outputs that have the same properties as the inputs, especially at the tail. Through simulation studies, we show that our extVAE is vastly more time-efficient than traditional Bayesian inference while also outperforming many spatial extremes models with a stationary dependence structure. To further demonstrate the computational power of the extVAE, we analyze a high-resolution satellite-derived dataset of sea surface temperature in the Red Sea, which includes daily measurements at 16703 grid cells.
(Main Meeting Room - Calle Rector López Argüeta)
10:00 - 10:30 Yao Xie: Spatiotemporal Point Processes with Deep Kernels
Discrete events are sequential observations that record event time, location, and possibly additional information called "marks." Such event data is common in modern applications such as police reports, seismic activities, and COVID-19 data. Point process models (temporal, spatiotemporal, over networks) have become popular in statistics and machine learning for modeling this data. Modern large-scale data requires more powerful models that can capture the complex spatial-temporal dependence in the data. We develop a new deep influence kernel that can model non-stationary spatio-temporal point processes. The main idea is to approximate the influence kernel with a novel and general low-rank decomposition, which enables efficient representation through deep neural networks and yields both computational efficiency and better performance. We also take a new approach to maintain the non-negativity constraint of the conditional intensity by introducing a log-barrier penalty. We demonstrate our proposed method's good performance and computational efficiency compared with the state-of-the-art on simulated and real data.
(Main Meeting Room - Calle Rector López Argüeta)
10:30 - 11:00 Checkout by 11AM (Front Desk - Hotel Granada Center)
10:30 - 11:00 Coffee Break (Main Meeting Room - Calle Rector López Argüeta)
11:00 - 12:00 Roundtable Discussion IV: Statistics of Extremes: Where are We Heading? (Lead: Anthony Davison) (Main Meeting Room - Calle Rector López Argüeta)
12:00 - 12:10 Soutir Bandyopadhyay: Closing (Main Meeting Room - Calle Rector López Argüeta)
13:00 - 14:30 Lunch (Restaurant - Hotel Granada Center)