WHOA-PSI-2019

Workshop Description

The Fourth Workshop on Higher-Order Asymptotics and Post-Selection Inference (WHOA-PSI)^{4} seeks to build upon the success of the first, second and third workshops by presenting the latest developments in post-selection inference and discussing how tools from higher-order asymptotics can both elucidate important properties of post-selection inference procedures and suggest new directions which may ultimately yield more accurate small-sample performance. The workshop format is intended to encourage collaboration and lively discussion, and to give a voice to all participants through online discussion forums (the result of a successful experiment at the first two workshops). More specific details will soon be posted below. Contact: Todd Kuffner, email: kuffner@wustl.edu

This conference supports the Welcoming Environment Statement of the Association for Women in Mathematics (AWM).

This workshop is being co-sponsored by the National Institute of Statistical Sciences. See more details below under `Availability of Funding'.

Location and travel instructions: click here	A picture of the room: click here	Parking instructions: click here	Restaurant information: click here
Meeting logistics (no schedule here): click here	Todd's Stuffed Animal Notification System: click here	Discussion Forum: click here	Titles and abstracts: click here

Organizing Committee

John Kolassa
Rutgers University

Todd Kuffner (lead organizer)
Washington University in St. Louis

Click here for the schedule as a webpage
Click here for the .pdf schedule with Panda times

Speakers

Rina Foygel Barber
University of Chicago
Pallavi Basu
Indian School of Business
Yuval Benjamini
Hebrew University of Jerusalem
Florentina Bunea
Cornell University
Brian Caffo
Johns Hopkins University
Emmanuel Candes
Stanford University
Daniela De Angelis
MRC Biostatistics, Cambridge
Julia Fukuyama
Indiana University
Irina Gaynanova
Texas A&M University
Ed George
University of Pennsylvania
Iain Johnstone
Stanford University
Mladen Kolar
University of Chicago
Vladimir Koltchinskii
Georgia Tech
Ioannis Kosmidis
University of Warwick
Arun Kumar Kuchibhotla
University of Pennsylvania
Stephen M.S. Lee
The University of Hong Kong
Xihong Lin
Harvard University
Kristin Linn
University of Pennsylvania
Miles Lopes
UC Davis
Xiao-Li Meng
Harvard University
Art Owen
Stanford University
Snigdha Panigrahi
University of Michigan
Annie Qu
UIUC
Aaditya Ramdas
Carnegie Mellon University
Veronika Rockova
University of Chicago
Cynthia Rush
Columbia University
Richard Samworth
University of Cambridge
Ulrike Schneider
TU Wien
Peter Song
University of Michigan
Weijie Su
University of Pennsylvania
Jonathan Taylor
Stanford University
Robert Tibshirani
Stanford University
Ryan Tibshirani
Carnegie Mellon University
Jingshen Wang
UC Berkeley
Daniel Yekutieli
Tel Aviv University
Alastair Young
Imperial College London
Linda Zhao
University of Pennsylvania

Registration Details
Click here to go to the registration website. Registration will close on Wednesday 24th July at 11:59pm US Central time.

Dates, Times, and Location

The workshop runs for three full days. Talks will begin around 8:30am on Saturday August 17th and will end by 5:30pm on Monday August 19th, 2019. The workshop will take place at a conference center on the campus of Washington University in St. Louis, Missouri, USA.

Availability of Funding

The National Institute of Statistical Sciences (NISS) is co-sponsoring this workshop. Many statistics departments in the US are `Affiliates' of NISS. If you are from one of these departments, you can apply for funding from NISS to attend this workshop. Click here for a list of affiliates and click here for more information about requesting funding for this workshop.

Poster Session

There will be one or two poster sessions. Anyone wishing to present a poster should indicate this on the online registration form. The topic of the poster should be related to the content of the workshop. Please check with the organizers if you are unsure about the suitability of your poster topic.

Lodging Information

(a) The Knight Center is holding a block of rooms until July 26th, at a special rate ($119/night) for the nights of Aug. 15, 16, 17, 18, and 19. Guests will need to call the Knight Center directly at +1 314-933-9400 or toll free 866-933-9400 and ask for rooms under the room block code: `Camel Statistics'. All of these rooms have one queen bed. Everyone seemed to like staying here in the past; it's quite convenient and comfortable.

(b) Other nearby hotels include: (1) The Moonrise Hotel, which is within walking distance. (2) Clayton Plaza Hotel, which offers a free shuttle service.

Local Information

For those arriving early or thinking about staying longer, St. Louis is a lovely place to visit. Besides the iconic Gateway Arch and the nearby Old Courthouse, which houses exhibits on the Dred Scott case, St. Louis has a stunning botanical garden and a high density of good restaurants (BBQ is a specialty), and is close to many rivers (Missouri, Mississippi and Meramec) which are great for float trips. There are many nearby parks and nature reserves which are excellent for hiking, as well as a wolf sanctuary. Mark Twain's boyhood home lies an hour north of the city. Anheuser-Busch is headquartered in St. Louis and offers tours of the brewery (advance booking is required due to popularity). For those unfamiliar with the institution, Washington University in St. Louis is ranked 20th in the world in the 2018 Academic Ranking of World Universities. Our statistics presence is concentrated in the Dept. of Mathematics and Statistics. You are encouraged to look around this beautiful campus on the western edge of St. Louis, which faces Forest Park, the site of the 1904 World's Fair and home to the Saint Louis Zoo and Saint Louis Art Museum (both free admission, walking distance from campus).

Potential Topics include (but are certainly not limited to): Participants: feel free to send me updates!

Principles and general views of post-selection inference, for example
Benjamini (2010). `Simultaneous and selective inference: current successes and future challenges', Biometrical Journal 52, 708-721.
Taylor & Tibshirani (2015), `Statistical learning and selective inference', Proceedings of the National Academy of Sciences 112, 7629-7634.
Leeb & Potscher (2005), `Model selection and inference: facts and fiction', Econometric Theory 21, 21-59.

Comparisons of naive intervals and post-selection inference, for example
Zhao, Shojaie & Witten (2017), `In defense of the indefensible: a very naive approach to high-dimensional inference', arXiv: 1705.05543.
Leeb, Potscher & Ewald (2015), `On various confidence intervals post-model-selection', Statistical Science 30, 216-227.

Incorporating resampling and asymptotic refinements into inference procedures relevant for this workshop, for example
Stephen M.S. Lee and Yilei Wu (2017). Resampling-based post-model-selection inference for linear regression models.
Andreas Buja and Werner Stuetzle (2017). Smoothing effects of bagging: von Mises expansions of bagged statistical functionals, arXiv: 1612.02528.
Noureddine El Karoui and Elizabeth Purdom (2015). Can we trust the bootstrap in high dimensions? Submitted.
Noureddine El Karoui and Elizabeth Purdom (2016). The bootstrap, covariance matrices, and PCA in moderate and high-dimensions. Submitted.
McCarthy, Zhang, Brown, Berk, Buja, George & Zhao (2017). Calibrated Percentile Double Bootstrap for Robust Linear Regression Inference, Statistica Sinica, accepted.
Mayya Zhilova (2016). Non-classical Berry-Esseen inequality and accuracy of the weighted bootstrap, arXiv: 1611.02686.
Mayya Zhilova (2015). Simultaneous likelihood-based bootstrap confidence sets for a large number of models, arXiv: 1506.05779.
Ian McKeague and Min Qian (2015). An adaptive resampling test for detecting the presence of significant predictors (with discussion). J. Amer. Statist. Assoc. 110, 1422-1433.

Cross-Validation, AIC, inference and prediction post-selection, for example
Jing Lei (2017). Cross-validation with confidence, arXiv: 1703.07904.
Ali Charkhi & Gerda Claeskens (2017). Asymptotic post-selection inference for Akaike's information criterion.
Lukas Steinberger and Hannes Leeb (2016). Leave-one-out prediction intervals in linear regression models with many variables, arXiv: 1602.05801.
Liang Hong, Todd Kuffner & Ryan Martin (2018). On overfitting and post-selection uncertainty assessments. Biometrika 105(1), 221-224.
Liang Hong, Todd Kuffner & Ryan Martin (2017). On prediction of future insurance claims when the model is uncertain. Submitted.
Francois Bachoc, Hannes Leeb & Benedikt Potscher (2017). Valid confidence intervals for post-model-selection predictors, arXiv: 1412.4605.
Hannes Leeb (2009). Conditional predictive inference post model selection. Annals of Statistics 37(5B), 2838-2876.
Hannes Leeb (2008). Evaluation and selection of models for out-of-sample prediction when the sample size is small relative to the complexity of the data-generating process. Bernoulli 14(3), 661-690.

Assumption-lean and distribution-free inference, conformal prediction and robustness, for example
Buja, Berk, Brown, George, Kuchibhotla & Zhao. Models as Approximations II: A General Theory of Model-Robust Regression. arXiv: 1612.03257.
Anru Zhang, Larry Brown & Tony Cai (2016). Semi-supervised inference: general theory and estimation of means, arXiv: 1606.07268.
Lei, G'Sell, Rinaldo, Tibshirani & Wasserman (2017). Distribution-free predictive inference for regression, J. Amer. Statist. Assoc., to appear.
Fan Yang and Rina Foygel Barber (2017). Contraction and uniform convergence of isotonic regression. arXiv: 1706.01852.
Azriel, Brown, Sklar, Berk, Buja & Zhao (2016). Semi-supervised linear regression, arXiv: 1612.02391.

Statistical efficiency and inference in machine learning, for example
Susan Athey & Stefan Wager (2017). Efficient policy learning, arXiv: 1702.02896.
Qingyuan Zhao & Trevor Hastie (2017). Causal interpretations of black-box models, J. of Business & Economic Statistics, to appear.

Model-based clustering and cluster-based models and inference, for example
Bunea, Eisenbach, Ning, Dinicu and Liu (2017). Inference in cluster-based graphical models.
Bunea, Ning and Wegkamp (2017). Overlapping clustering with statistical guarantees, arXiv: 1704.06977.

Selective inference (conditional approaches), for example
Azais, de Castro & Mourareau (2018). Power of the spacing test for least-angle regression. Bernoulli 24(1), 465-492.
Qingyuan Zhao, Dylan Small and Ashkan Ertefaie (2017). Selective inference for effect modification via the lasso, arXiv: 1705.08020.
Yuval Benjamini, Jonathan Taylor & Rafael Irizarry (2016). Selection corrected statistical inference for region detection with high-dimensional throughput assays, bioRxiv preprint.
Hyun, G'Sell & Tibshirani (2016), `Exact post-selection inference for changepoint detection and other generalized lasso problems', arXiv: 1606.03552
Taylor & Tibshirani (2016), `Post-selection inference for L1-penalized likelihood models', arXiv: 1602.07358
Fithian, Taylor, Tibshirani & Tibshirani (2015+), `Selective sequential model selection', arXiv: 1512.02565
Tibshirani, Taylor, Lockhart, Tibshirani (2015+), `Exact post-selection inference for sequential regression procedures', J. Amer. Statist. Assoc., to appear.
Lockhart, Taylor, Tibshirani & Tibshirani (2014), `A significance test for the lasso', Annals of Statistics 42, 413-468.
Tibshirani, Rinaldo, Tibshirani & Wasserman (2015), `Uniform asymptotic inference and the bootstrap after model selection', arXiv: 1506.06266
Tian & Taylor (2015), `Asymptotics of selective inference', arXiv: 1501.03588
Lee, Sun, Sun & Taylor (2016), `Exact post-selection inference with the lasso', to appear in the Annals of Statistics.

Simultaneous inference, false discovery rates (FDR), false coverage statement rates (FCR), family-wise error rates (FWER), for example
Katsevich & Ramdas (2018). Towards ``simultaneous selective inference'': post-hoc bounds on the false discovery proportion, arXiv: 1803.06790
Ramdas, Barber, Wainwright & Jordan (2017). A unified treatment of multiple testing with prior knowledge using the p-filter, arXiv: 1703.06222
Lihua Lei, Aaditya Ramdas & Will Fithian (2017). STAR: a general interactive framework for FDR control under structural constraints, arXiv: 1710.02776
Bachoc, Preinerstorfer & Steinberger (2017). Uniformly valid confidence intervals post-model-selection, arXiv: 1611.01043.
Berk, Brown, Buja, Zhang & Zhao (2013), `Valid post-selection inference', Annals of Statistics 41, 802-837.
Benjamini (2010), `Discovering the false discovery rate', J. Roy. Statist. Soc. Ser. B 72, 405-416.
Benjamini & Yekutieli (2005), `False discovery rate-adjusted multiple confidence intervals for selected parameters', J. Amer. Statist. Assoc. 100, 71-93.
G'Sell, Wager, Chouldechova & Tibshirani (2015+), `Sequential selection procedures and false discovery rate control', J. Roy. Statist. Soc. Ser. B, to appear.
Barber & Candes (2015), `Controlling the false discovery rate via knockoffs', Annals of Statistics 43, 2055-2085.
Su, Bogdan & Candes (2016+), `False discoveries occur early on the Lasso path', arXiv: 1511.01957.

Bayesian post-selection inference, for example
Panigrahi, Taylor & Weinstein (2016). `Bayesian post-selection inference in the linear model', arXiv: 1605.08824
Yekutieli (2012). `Adjusted Bayesian inference for selected parameters', J. Roy. Statist. Soc. Ser. B, 74(3), 515-541.

Bagging and Boosting, for example
Bradic (2016). `Randomized maximum-contrast selection: subagging for large-scale regression', Elec. J. Statist. 10(1), 121-170.
Li & Bradic (2015). `Boosting in the presence of outliers: adaptive classification with non-convex loss functions', arXiv: 1510.01064.
Efron (2014), `Estimation and accuracy after model selection', J. Amer. Statist. Assoc. 109, 991-1007.
Buhlmann & Yu (2002), `Analyzing bagging', Annals of Statistics 30, 927-961.

High-dimensional inference, for example
Po-Ling Loh (2017). Statistical consistency and asymptotic normality for high-dimensional robust M-estimators, Annals of Statistics 45(2), 866-896.
Fan, Shao & Zhou (2015), `Are discoveries spurious? Distributions of Maximum Spurious Correlations and their applications', arXiv: 1502.04237
Cai & Guo (2015), `Confidence intervals for high-dimensional linear regression: minimax rates and adaptivity', arXiv: 1506.05539
Ning & Liu (2015), `A general theory of hypothesis tests and confidence regions for sparse high dimensional models', arXiv: 1412.8765
Ning, Zhao & Liu (2015), `A likelihood ratio framework for high dimensional semiparametric regression', arXiv: 1412.2295
Shah & Samworth (2013), `Variable selection with error control: another look at stability selection', J. Roy. Statist. Soc. Ser. B 75, 55-80.
Meinshausen & Buhlmann (2010), `Stability selection', J. Roy. Statist. Soc. Ser. B 72, 417-473.
van de Geer, Buhlmann, Ritov & Dezeure (2014), `On asymptotically optimal confidence regions and tests for high-dimensional models', Annals of Statistics 42, 1166-1202.
Javanmard & Montanari (2015+), `Hypothesis testing in high-dimensional regression under the Gaussian random design model: asymptotic theory', IEEE Trans. Inform. Theory, to appear.
Liu & Yu (2013), `Asymptotic properties of Lasso+mLS and Lasso+Ridge in sparse high-dimensional linear regression', Electronic J. Statist. 7, 3124-3169.
Zhang & Zhang (2014), `Confidence intervals for low-dimensional parameters in high-dimensional linear models', J. Roy. Statist. Soc. Ser. B 76, 217-242.
Belloni, Chernozhukov & Hansen, `Inference methods for high-dimensional sparse econometric models', Advances in Economics & Econometrics, Econometric Society World Congress 2010.

Selection and inference for weak signals, for example
Shi & Qu (2016). `Weak signal identification and inference in penalized model selection', Annals of Statistics, to appear.
Jeng (2016). `Detecting weak signals in high dimensions', J. Multivariate Anal. 147, 234-246.

The aspects of the above topics and other post-selection inference procedures which will be emphasized in the workshop are those related to higher-order asymptotics, including both analytic- and resampling-based tools and refinements, some of which are described in:

Small (2010), Expansions and Asymptotics for Statistics, Chapman & Hall.

Young (2009), `Routes to higher-order accuracy in parametric inference', Austral. N.Z. J. Statist. 51, 115-126.

Brazzale & Davison (2008), `Accurate parametric inference for small samples', Statistical Science 23, 465-484.

Brazzale, Davison & Reid (2007), Applied Asymptotics: Case Studies in Small-Sample Statistics, Cambridge University Press.

Butler (2007), Saddlepoint Approximations with Applications, Cambridge University Press.

Bedard, Fraser & Wong (2007), `Higher accuracy for Bayesian and frequentist inference: large sample theory for small sample likelihood', Statistical Science 22, 301-321.

Yi & Fraser (2007), `Higher order asymptotics: an intrinsic difference between univariate and multivariate models', J. Statist. Research 41, 1-20.

Kolassa (2006), Series Approximation Methods in Statistics, 3rd edition, Springer.

Young & Smith (2005), Essentials of Statistical Inference, Cambridge University Press.

Reid (2003), `Asymptotics and the theory of inference', Annals of Statistics 31, 1695-1731.

Severini (2000), Likelihood Methods in Statistics, Oxford University Press.

Pace & Salvan (1997), Principles of Statistical Inference from a Neo-Fisherian Perspective, World Scientific.

Jensen (1995), Saddlepoint Approximations, Oxford University Press.

Ghosh (1994), Higher Order Asymptotics, Institute of Mathematical Statistics.

Barndorff-Nielsen & Cox (1994), Inference and Asymptotics, Chapman & Hall.

Hall (1992), The Bootstrap and Edgeworth Expansion, Springer.

Field & Ronchetti (1990), Small Sample Asymptotics, Institute of Mathematical Statistics.

McCullagh (1987), Tensor Methods in Statistics, Chapman & Hall.

Some recent references for post-selection inference include
Chapter 3 of Fithian (2015), Topics in Adaptive Inference, Ph.D. thesis, Stanford University.
Chapter 6 of Hastie, Tibshirani & Wainwright (2015), Statistical Learning with Sparsity: The Lasso and Generalizations, Chapman & Hall.
Chapters 10-11 of Buhlmann & van de Geer (2011), Statistics for High-Dimensional Data, Springer.