2016-2017 Statistics Seminar
Department of Mathematics
Washington University in St. Louis

Organizers: Todd Kuffner and Nan Lin
Location: All seminars will be in Room 199, Cupples I.
Contact: Professor Todd Kuffner ( kuffner followed by @ followed by wustl dot edu )

Title and Abstract
Friday February 17
Jose Figueroa-Lopez
Assoc. Professor, Dept. of Mathematics
Washington University in St. Louis
Asymptotic methods in financial mathematics

Asymptotic analyses of financial problems have a wide spectrum of applications, ranging from nonparametric estimation methods based on high-frequency data to near-expiration characterizations of option prices and implied volatilities, and to Monte Carlo based methods for path-dependent options. These methods are especially useful for studying models with jumps, where tractable formulas and efficient numerical procedures are lacking. In this talk, I will discuss some recent advances in the area and illustrate their broad relevance in several contexts.
Friday February 24
Liberty Vittert
Asst. Professor, School of Mathematics and Statistics
University of Glasgow
An Investigation into the Human Face: Statistical Models for Manifold Data

Three-dimensional surface imaging, through laser-scanning or stereo-photogrammetry, provides high-resolution data defining the surface shape of objects. Using a human face as this object, each image corresponds to an observation, a manifold, represented by a triangulated point cloud. In an anatomical setting this can provide invaluable quantitative information. Particular applications vary widely, including success or failure of cosmetic/reconstructive plastic surgery, facial recognition, facial asymmetry, concepts of sexual dimorphism, and even the survival of mussels (food we consume) given climate change. However, the initial challenge is to characterize these complex surfaces without laborious manual intervention. Surface curvature provides the key information in doing this, allowing for the creation of a surface “mask” replicable throughout all these objects. Once the full surface representation has been obtained, the new issue arises of how best to characterize and visualize the differences in shape. The issues involved with analysis of these data and multiple visualization methods will be discussed and illustrated.
Friday March 3
Todd Kuffner
Asst. Professor, Dept. of Mathematics
Washington University in St. Louis
Optimal block bootstrap estimation for sample quantiles under weak dependence

When considering smooth functionals of dependent data, block bootstrap methods have enjoyed considerable success in theory and application. For nonsmooth functionals of dependent data, such as sample quantiles, the theory is less well-developed. In this talk, I will present a general theory of consistency and optimality for block bootstrap distribution estimation for sample quantiles under mild strong mixing assumptions. In contrast to existing results, we study the block bootstrap for varying numbers of blocks. This corresponds to a hybrid between the subsampling bootstrap and the moving block bootstrap (MBB). Examples of time series models illustrate the benefits of optimally choosing the number of blocks.
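
For readers unfamiliar with the idea, the following is a minimal sketch of a block bootstrap for a sample quantile in which the number of resampled blocks is a free parameter. It is not the speaker's procedure: the function names, the AR(1) toy series, and all tuning values are assumptions made for illustration, and the centering and scaling needed for distribution estimation are omitted. Taking n_blocks = 1 resamples a single block, in the spirit of subsampling, while n_blocks close to n / block_len gives the usual MBB resample size.

```python
import math
import random


def sample_quantile(xs, p):
    """Empirical p-quantile: the order statistic x_(k) with k = ceil(p * n)."""
    s = sorted(xs)
    k = max(1, math.ceil(p * len(s)))
    return s[k - 1]


def block_bootstrap_quantiles(xs, p, block_len, n_blocks, n_boot, rng):
    """Return n_boot bootstrap replicates of the p-quantile.

    Each replicate concatenates n_blocks overlapping blocks of length
    block_len, drawn uniformly with replacement from the series xs.
    """
    starts = range(len(xs) - block_len + 1)  # all admissible block starts
    reps = []
    for _ in range(n_boot):
        resample = []
        for _ in range(n_blocks):
            s = rng.choice(starts)
            resample.extend(xs[s:s + block_len])
        reps.append(sample_quantile(resample, p))
    return reps


# Toy weakly dependent series (assumption): AR(1) with coefficient 0.5.
rng = random.Random(0)
series = [0.0]
for _ in range(199):
    series.append(0.5 * series[-1] + rng.gauss(0.0, 1.0))

# Bootstrap the median with 5 blocks of length 10 (illustrative values only).
replicates = block_bootstrap_quantiles(series, 0.5, 10, 5, 200, rng)
```

Varying n_blocks for a fixed block_len moves the resample size between the two extremes described in the abstract.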
Thursday March 23
Rina Foygel Barber
Asst. Professor, Dept. of Statistics
University of Chicago
Optimization with nonconvex functions and nonconvex constraints

Nonconvex optimization arises in many applications of high-dimensional statistics and data analysis, including medical imaging via computed tomography (CT) scans where the physical model for data acquisition is inherently nonconvex. While convex programs for structured signal recovery have been widely studied, comparatively little is known about the theoretical properties of nonconvex optimization methods. In this talk I will discuss two types of optimization problems where nonconvexity plays a key role: first, projected gradient descent over nonconvex constraints, where the local geometry of the constraint set is closely tied to its convergence behavior, and second, composite optimization problems, where we must simultaneously minimize multiple terms that may all be nonconvex and nondifferentiable. Image reconstruction results on real data from spectral CT scans, where undersampling poses a substantial challenge, demonstrate the benefit of working with nonconvex models.
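
As a rough illustration of projected gradient descent over a nonconvex constraint set, the sketch below uses a sparsity constraint (at most s nonzero entries), whose projection is hard thresholding. The choice of constraint, the toy quadratic objective, the step size, and all names are assumptions made for this example; they are not taken from the talk.

```python
def hard_threshold(x, s):
    """Project x onto the nonconvex set of vectors with at most s nonzeros
    by keeping the s largest-magnitude entries and zeroing the rest."""
    keep = set(sorted(range(len(x)), key=lambda i: -abs(x[i]))[:s])
    return [x[i] if i in keep else 0.0 for i in range(len(x))]


def projected_gradient_descent(grad, project, x0, step, n_iter):
    """Iterate x <- project(x - step * grad(x)) for n_iter steps."""
    x = list(x0)
    for _ in range(n_iter):
        g = grad(x)
        x = project([xi - step * gi for xi, gi in zip(x, g)])
    return x


# Toy objective (assumption): f(x) = 0.5 * ||x - y||^2, so grad f(x) = x - y.
y = [3.0, 0.1, -2.0, 0.05]


def grad(x):
    return [xi - yi for xi, yi in zip(x, y)]


# Minimize f subject to at most 2 nonzero entries, starting from zero.
x_hat = projected_gradient_descent(
    grad, lambda z: hard_threshold(z, 2), [0.0] * 4, step=0.5, n_iter=50
)
```

In this toy problem the iterates keep the two largest entries of y and approach them geometrically; for less benign objectives, convergence depends on the local geometry of the constraint set, as the abstract notes.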
Tuesday March 28
Hongyuan Cao
Asst. Professor, Dept. of Statistics
University of Missouri Columbia
Analysis of asynchronous longitudinal data with partially linear models

We study partially linear models for asynchronous longitudinal data to incorporate nonlinear time trend effects. Local and global estimating equations are developed for estimating the parametric and nonparametric effects. We show that with a proper choice of the kernel bandwidth parameter, one can obtain consistent and asymptotically normal parameter estimates for the linear effects. Asymptotic properties of the estimated nonlinear effects are established. Extensive simulation studies provide numerical support for the theoretical findings. Data from an HIV study are used to illustrate our methodology.
Friday March 31
Nicole Lazar
Professor, Dept. of Statistics
University of Georgia
Topological Data Analysis for Functional Magnetic Resonance Imaging Data

Topological data analysis (TDA) is a relatively new approach for the analysis of high-dimensional data of complex structure.  Functional magnetic resonance imaging (fMRI) is one source of such data.  fMRI, which provides a window into the working human brain, yields high-dimensional, noisy data with complex temporal and spatial correlation structures.  In this talk, I will first give an overview of fMRI data, highlighting some of the challenges for statistical analysis and how those challenges have traditionally been handled.  A major drawback of many of the standard approaches is that they are "massively univariate," that is, they are performed at the level of the volume element, which has no physiological or scientific meaning.  Such analysis paths furthermore induce a serious multiple testing problem.  TDA is one modern attempt to move away from a data array perspective to fMRI analysis.  The second part of the talk will give a gentle introduction to TDA, along with the results of initial attempts at application to fMRI data from a schizophrenia study.
Friday April 7
Ryan Martin
Assoc. Professor, Dept. of Statistics
NC State University

On valid prior-free probabilistic inference

Using probabilities to describe uncertainty in a statistical inference problem is a very reasonable approach. Getting probabilities is easy, but ensuring that they are scientifically meaningful/interpretable is not. Indeed, we all take for granted what it means for a probability to be "small" or "large", but I argue that this is actually a practically important issue that requires serious care. Examples will be presented that highlight a fundamental but subtle issue concerning the interpretation of (default-prior) Bayesian posterior probabilities. In light of these concerns, perhaps we need to look beyond Bayes/probability to describe this kind of uncertainty. Towards this, I will introduce a new approach, called inferential models (IMs), built around the theory of random sets, which provides provably valid prior-free probabilistic inference under very general conditions. The IM construction and its key properties will be discussed, along with some examples and further insights.
Friday April 14
Aixin Tan
Asst. Professor, Dept. of Statistics and Actuarial Science
University of Iowa

Assessment of multiple-chain importance sampling estimators

In Bayesian data analysis, there is often the need to compare many different possible models and priors. If the data are highly informative for the model parameters, the choice of prior will have little effect on the posterior. Otherwise, if the data only provide indirect information about the parameters of interest, priors have to be chosen with care according to certain criteria, say, based on the Bayes Factor.

It is a challenging computing problem to calculate various posterior quantities and Bayes Factors among the different Bayesian models. In this talk, we consider an importance sampling (IS) technique that efficiently combines Markov chain Monte Carlo (MCMC) samples from multiple posterior distributions. An important yet difficult problem for general MCMC estimators is assessing their standard errors. Such assessment is even more challenging for estimators that are constructed with multiple Markov chains. We provide an easy-to-implement tool to evaluate the standard errors of the multiple-chain IS estimators.

The multiple-chain IS technique will be illustrated with two data analysis problems: one in Bayesian variable selection, the other in Bayesian spatial modeling.
Friday April 21
Meggie Wen
Assoc. Professor, Dept. of Mathematics & Statistics
Missouri University of Science & Technology
Sufficient dimension reduction for multiple populations

Two topics in the area of dimension reduction for multiple populations will be explored. We will first propose a link-free test for testing whether two (or more) multi-index models share identical indices via the sufficient dimension reduction approach. Test statistics are developed based upon sufficient dimension reduction methods. The asymptotic null distributions of our test statistics are derived. Next, we will propose a two-step dimension reduction method for multi-population data. Our method is the first one in the area which can conduct a joint analysis while still retaining the population-specific effects. Though partial dimension reduction (Chiaromonte et al., 2002) can be adopted to deal with multi-population dimension reduction, it encloses the related directions for all populations, so population-specific effects are ignored. On the other hand, unlike the conditional analysis, which is carried out separately within each individual population, our method makes use of the information across the multiple populations, which greatly improves the estimation accuracy. Simulations and a real data example are given to illustrate our methodology.
Wednesday May 10
Cecilia Mancini
Professor, Universita di Firenze, Italy
Spot volatility estimation using delta sequences
We introduce a unifying class of nonparametric spot volatility estimators based on delta sequences and conceived to include many of the existing estimators in the field as special cases. The full limit theory is first derived when unevenly sampled observations under infill asymptotics and fixed time horizon are considered, and the state variable is assumed to follow a Brownian semimartingale. We then extend our class of estimators to include Poisson jumps or financial microstructure noise in the observed price process. This work makes different approaches (kernels, wavelets, Fourier) comparable. For example, we explicitly illustrate some drawbacks of the Fourier estimator. Specific delta sequences are applied to data from the S&P 500 stock index futures market.