Yuling Yao


I am a sixth-year PhD student in the Department of Statistics at Columbia University, advised by Professor Andrew Gelman. Before coming to Columbia, I completed my undergraduate studies in Mathematics at Tsinghua University.

I expect to graduate in early 2021.

My general research interest lies in Bayesian statistics and machine learning. My current research involves:

  • Uncertainty in an M-open world: how to do model averaging and model evaluation when the models are wrong; cross-validation and marginal likelihood; when these model evaluation methods are themselves valid, and how to remedy them when they are not.
  • Reliable inference and computation: how to diagnose variational inference; metastability in MCMC sampling algorithms; importance sampling and normalization constant.
  • Better inference through a predictive lens: Bayesian procedure is coherent but not automatically optimal. Can we tune the exact posterior to make the prediction more robust?
  • Statistics requires extrapolation: from sample data to population, from control to treatment group, and from measurements to underlying constructs of interest. How do we know the extent to which our inference is sensitive to these extrapolations?
  • Post-processing of MCMC: should we throw away joint densities in Monte Carlo integrals? (I think the answer is no.) Can we compute the 1/10000 quantile based on 1000 Monte Carlo draws?

I am also interested in applying Bayesian methods to real data problems. Projects I have been involved in include the replication crisis in psychology, groundwater arsenic in South Asia, soil lead after the Notre-Dame fire, free energy computation, deep Bayes nets on ImageNet, and Covid-19 and election predictions.

I share my random thoughts on statistics and machine learning on my Blog, and some other more ad hoc stuff on another personal blog.

Published and Submitted Papers  

    Bayesian Methodology

Yuling Yao, Collin Cademartori, Aki Vehtari, Andrew Gelman. [2020]
Adaptive Path Sampling in Metastable Posterior Distributions. under review.
[preprint]   [Package]   [Blog]

From importance sampling to adaptive importance sampling to path sampling to adaptive path sampling, and from Rao–Blackwell to Wang-Landau to Jarzynski-Crooks: all about free energy and simulated tempering.

Yuling Yao, Aki Vehtari, Andrew Gelman. [2020]
Stacking for Non-mixing Bayesian Computations: The Curse and Blessing of Multimodal Posteriors. under review.
[preprint]   [Code]   [Blog]

The result from multi-chain stacking is not necessarily equivalent, even asymptotically, to fully Bayesian inference, but it serves many of the same goals. Under misspecified models, stacking can give better predictive performance than full Bayesian inference, hence the multimodality can be considered a blessing rather than a curse.

Andrew Gelman, Yuling Yao. [2020]
Holes in Bayesian Statistics. Journal of Physics G, in press.

This does not mean that we think Bayesian inference is a bad idea, but it does mean that there is a tension between Bayesian logic and Bayesian workflow which we believe can only be resolved by considering Bayesian logic as a tool, a way of revealing inevitable misfits and incoherences in our model assumptions, rather than as an end in itself.

Yuling Yao. [2019+]
Bayesian Aggregation. under review.

Aki Vehtari, Daniel Simpson, Andrew Gelman, Yuling Yao, Jonah Gabry. [2019+]
Pareto Smoothed Importance Sampling. under review.

How to run importance sampling with efficiency and reassurance.
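
As a rough illustration of what "reassurance" can mean here, the sketch below runs plain self-normalized importance sampling and reports the effective sample size as a simple diagnostic. This is not Pareto smoothed importance sampling itself, and nothing in it comes from the paper: the standard normal target, the wider normal proposal, and the function names are all assumptions chosen for the example.

```python
import math
import random


def normal_logpdf(x, mean, sd):
    """Log density of a normal distribution (helper for this sketch)."""
    z = (x - mean) / sd
    return -0.5 * z * z - math.log(sd) - 0.5 * math.log(2.0 * math.pi)


def importance_estimate(h, n_draws=20000, proposal_sd=2.0, seed=1):
    """Estimate E[h(x)] under a N(0, 1) target using draws from N(0, proposal_sd).

    Returns the self-normalized estimate and the effective sample size.
    Target and proposal are illustrative assumptions, not the paper's setup.
    """
    rng = random.Random(seed)
    draws = [rng.gauss(0.0, proposal_sd) for _ in range(n_draws)]
    # Log importance ratios: log target density minus log proposal density.
    log_w = [normal_logpdf(x, 0.0, 1.0) - normal_logpdf(x, 0.0, proposal_sd)
             for x in draws]
    # Subtract the maximum before exponentiating for numerical stability;
    # the self-normalized estimate is unchanged by this rescaling.
    max_log_w = max(log_w)
    w = [math.exp(lw - max_log_w) for lw in log_w]
    total = sum(w)
    estimate = sum(wi * h(x) for wi, x in zip(w, draws)) / total
    # Effective sample size: (sum of weights)^2 / (sum of squared weights).
    ess = total * total / sum(wi * wi for wi in w)
    return estimate, ess


estimate, ess = importance_estimate(lambda x: x * x)
print("estimate of E[x^2]:", estimate)
print("effective sample size:", ess)
```

An effective sample size far below the number of draws signals that a few large weights dominate the estimate, which is the situation that smoothing the largest weights is meant to address.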

Aki Vehtari, Daniel Simpson, Yuling Yao, Andrew Gelman [2018]
Limitations of "Limitations of Bayesian leave-one-out cross-validation for model selection". Computational Brain & Behavior.

Yuling Yao, Aki Vehtari, Daniel Simpson, Andrew Gelman [2018]
Yes, But Did it Work?: Evaluating Variational Inference. Proceedings of the 35th International Conference on Machine Learning.
[Online]   [Blog]   [Code]

The Pareto-smoothed importance sampling diagnostic gives a goodness-of-fit measurement for the joint variational approximation, while simultaneously improving the error in the estimate.

Yuling Yao, Aki Vehtari, Daniel Simpson, Andrew Gelman [2018]
Using Stacking to Average Bayesian Predictive Distributions (With Discussion and Rejoinder). Bayesian Analysis, 13, 917-1003.
[Online]   [Code]   [R package]
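
The idea behind stacking can be sketched in a few lines: given each model's pointwise leave-one-out log predictive densities, choose simplex weights that maximize the log score of the weighted mixture. The code below is a minimal sketch under stated assumptions, not the implementation in the linked code or R package. The optimizer is a swapped-in EM-style fixed-point update, and the function name and toy inputs are hypothetical.

```python
import math


def stacking_weights(lpd, n_iter=500):
    """Simplex weights maximizing sum_i log(sum_k w_k * exp(lpd[i][k])).

    lpd[i][k] is assumed to hold the leave-one-out log predictive density
    of data point i under model k (hypothetical input for this sketch).
    """
    n_models = len(lpd[0])
    # Rescale each row by its maximum; the update below only uses ratios
    # within a row, so this changes nothing except numerical stability.
    dens = []
    for row in lpd:
        row_max = max(row)
        dens.append([math.exp(v - row_max) for v in row])
    w = [1.0 / n_models] * n_models
    for _ in range(n_iter):
        new_w = [0.0] * n_models
        for row in dens:
            mix = sum(wk * pk for wk, pk in zip(w, row))
            for k in range(n_models):
                # Share of the mixture density at this point due to model k.
                new_w[k] += w[k] * row[k] / mix
        w = [v / len(dens) for v in new_w]
    return w


# Toy input: model 0 predicts every held-out point better than model 1.
toy_lpd = [
    [math.log(0.9), math.log(0.1)],
    [math.log(0.8), math.log(0.2)],
    [math.log(0.7), math.log(0.3)],
]
weights = stacking_weights(toy_lpd)
print(weights)
```

Because the objective is concave in the weights, this simple update moves steadily toward the maximizer, though a general-purpose constrained optimizer would usually converge faster on larger problems.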

"Remember that using Bayes' Theorem doesn't make you a Bayesian. Quantifying uncertainty with probability makes you a Bayesian."

    Applied Statistics

Andrew Gelman, Aki Vehtari, Daniel Simpson, Charles Margossian, Bob Carpenter, Yuling Yao, Paul-Christian Bürkner, Lauren Kennedy, Jonah Gabry, Martin Modrák. [2020]
Bayesian workflow. preprint.

Theoretical statistics indeed is the theory of applied statistics.

Alexander van Geen, Yuling Yao, Tyler Ellis, Andrew Gelman. [2020]
Fallout of Lead over Paris from the 2019 Notre-Dame Cathedral Fire. GeoHealth.
[Online]   [Code]   [Media coverage (Le Monde)]   [Media coverage 2]

How much lead was there after the fire?

Oscar Chang, Yuling Yao, David Williams-King, Hod Lipson. [2019+]
Ensemble Model Patching: A Parameter-Efficient Variational Bayesian Neural Network. arxiv preprint.
[Online]   [Blog]  

running BNN on ImageNet: more expressive than MC-Dropout, more affordable than meanfield VI

Maarten Marsman, Felix D Schönbrodt, Richard D Morey, Yuling Yao, Andrew Gelman, Eric-Jan Wagenmakers [2016]
A Bayesian bird's eye view of ‘Replications of important results in social psychology’. Royal Society Open Science, 4, 160426.

Yu-Sung Su, Yuling Yao [2015+]
Happy Generations, Depressed Generations: How and Why Chinese People’s Life Satisfactions Vary across Generations, under review.

Yu-Sung Su, Yuling Yao [2015]
Is the rice dumpling sweet or salty? Adjusting the selection bias of online surveys by multilevel regression and poststratification (in Chinese). Journal of Tsinghua University, 03, 43.
[Download]



R package for efficient approximate leave-one-out cross-validation (LOO) using Pareto smoothed importance sampling (PSIS), a new procedure for regularizing importance weights.
[Source]   [CRAN]


R package for adaptive path sampling. It iteratively reduces the gap between the proposal and the target density, and provides a reliable estimate of the normalizing constant, with practical diagnostics based on importance sampling theory.

last updated: Nov 1 2020
© Yuling Yao