Two approaches for online updates in the election forecast

Posted by Yuling Yao on Mar 03, 2021. Tag: modeling

Here, an online update refers to updating a statistical model after some of the modeled outcomes have been observed. A concrete example is election forecasting: state results come in sequentially, and that is when some websites have to offer a “real-time update of our prediction”.

It seems there are two ways to do this. Approach 1 is a model-based update, or maybe we should call it a Bayesian update: the model provides a posterior predictive density of the outcomes, Pr(CA, NY, …). An online update becomes a conditional estimate: Pr(CA $\vert$ NY = observed outcome). In practice, we only need to collect the posterior simulation draws that lead to this outcome.
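As a minimal sketch of this filtering step, assume the posterior draws are stored as an array with one row per draw and one column per state, holding the Democratic two-party vote share. The function name `condition_on_winner`, the column order, and all numbers below are hypothetical and only for illustration.

```python
import numpy as np

def condition_on_winner(draws, observed_col, dem_won):
    """Keep the joint draws whose simulated winner in one state matches the observed winner."""
    sim_dem_won = draws[:, observed_col] > 0.5
    return draws[sim_dem_won == dem_won]

# Hypothetical joint posterior draws for (NY, CA); not real forecast output.
rng = np.random.default_rng(0)
cov = np.array([[0.0016, 0.0012],
                [0.0012, 0.0016]])
draws = rng.multivariate_normal([0.60, 0.62], cov, size=20000)

# Suppose NY (column 0) is called for the Democrat.
kept = condition_on_winner(draws, observed_col=0, dem_won=True)

# Conditional estimate Pr(D wins CA | D wins NY) from the surviving draws.
p_ca = np.mean(kept[:, 1] > 0.5)
```

The update is just a row filter, so it can be repeated as each state is called, with fewer draws left after every step.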

If the outcome is continuous (shares of the vote), the probability is zero for any simulation draw to match the exact observation. We could use some ABC (approximate Bayesian computation) method here, and replace the exact conditional probability by Pr(CA $\mid$ NY $\approx$ observed outcome), equipped with some chosen distance metric.
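One simple version of this idea is rejection ABC: keep the draws whose simulated NY share falls within a tolerance of the observed share. The sketch below assumes an absolute-difference distance and a hand-picked tolerance `tol`; both are illustrative choices, as are the function name and the simulated draws.

```python
import numpy as np

def abc_condition(draws, observed_col, observed_value, tol):
    """Rejection ABC: keep draws whose simulated value is within `tol` of the observation."""
    dist = np.abs(draws[:, observed_col] - observed_value)
    return draws[dist <= tol]

# Hypothetical joint posterior draws of vote share for (NY, CA).
rng = np.random.default_rng(1)
cov = np.array([[0.0016, 0.0012],
                [0.0012, 0.0016]])
draws = rng.multivariate_normal([0.60, 0.62], cov, size=20000)

# Suppose the observed NY share is 0.57, below its predicted mean.
kept = abc_condition(draws, observed_col=0, observed_value=0.57, tol=0.005)

# Approximate conditional mean of the CA share given NY ≈ 0.57.
ca_mean = kept[:, 1].mean()
```

A smaller `tol` gets closer to the exact conditional but keeps fewer draws, which is the efficiency trade-off in this approach.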

The problem with the simulation-based approach is that if we observe some tail event, say R winning NY, there is hardly any simulation draw that matches this event. Or simply, when the number of events is large, each update discards some simulation draws. Whichever the reason, the update efficiency is limited by the number of simulation draws. A quick fix is to further approximate the posterior outcome model Pr(CA, NY, …) by a multivariate normal model, such that any conditional update comes in closed form, at the cost of less modeling flexibility.
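For reference, the closed form is the standard multivariate normal conditioning rule: writing $o$ for the observed states and $r$ for the rest, the conditional mean is $\mu_r + \Sigma_{ro}\Sigma_{oo}^{-1}(\tilde y_o - \mu_o)$ and the conditional covariance is $\Sigma_{rr} - \Sigma_{ro}\Sigma_{oo}^{-1}\Sigma_{or}$. A minimal sketch follows. The function name `mvn_condition` is hypothetical, and in practice the mean and covariance would presumably be estimated from the posterior draws (for example with `draws.mean(axis=0)` and `np.cov(draws, rowvar=False)`).

```python
import numpy as np

def mvn_condition(mean, cov, obs_idx, obs_values):
    """Condition a multivariate normal on observed coordinates.

    Returns the indices of the unobserved coordinates with their
    conditional mean and conditional covariance.
    """
    mean = np.asarray(mean, dtype=float)
    cov = np.asarray(cov, dtype=float)
    obs_idx = np.asarray(obs_idx)
    rest_idx = np.setdiff1d(np.arange(mean.size), obs_idx)

    S_oo = cov[np.ix_(obs_idx, obs_idx)]
    S_ro = cov[np.ix_(rest_idx, obs_idx)]
    S_rr = cov[np.ix_(rest_idx, rest_idx)]

    resid = np.asarray(obs_values, dtype=float) - mean[obs_idx]
    cond_mean = mean[rest_idx] + S_ro @ np.linalg.solve(S_oo, resid)
    cond_cov = S_rr - S_ro @ np.linalg.solve(S_oo, S_ro.T)
    return rest_idx, cond_mean, cond_cov

# Toy check: two standardized outcomes with correlation 0.5.
rest_idx, cond_mean, cond_cov = mvn_condition(
    mean=[0.0, 0.0],
    cov=[[1.0, 0.5], [0.5, 1.0]],
    obs_idx=[1],
    obs_values=[1.0],
)
```

Unlike the draw-filtering update, this never runs out of draws, even after a tail event, because it only uses the fitted mean and covariance.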

Another way to do this update is a regression approach. Say we have a point prediction for each state, $y_{NY}, y_{CA}, \dots$, and sequentially we observe the actual outcomes, $\tilde y_{NY}, \tilde y_{CA}, \dots$. We could run a regression $\tilde y_{i} = \beta_{1} y_i + \beta_{0} + \epsilon_i, \epsilon_i \sim \mathrm{normal}(0, \sigma).$ Then the online update task becomes the standard parameter update problem as more and more data come in. This approach is compatible with point predictions, and has the advantage of adjusting for systematic “polling bias” (think about 2016).
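A minimal sketch of this regression update, assuming we simply refit ordinary least squares on the states reported so far and apply the fitted line to the remaining states. A Bayesian treatment would put priors on $\beta_0, \beta_1$ and update their posterior instead. The function name and all numbers are hypothetical.

```python
import numpy as np

def fit_bias_regression(pred, actual):
    """Least-squares fit of actual = beta1 * pred + beta0 on the reported states."""
    pred = np.asarray(pred, dtype=float)
    actual = np.asarray(actual, dtype=float)
    X = np.column_stack([np.ones_like(pred), pred])
    coef, *_ = np.linalg.lstsq(X, actual, rcond=None)
    return coef[0], coef[1]  # beta0, beta1

# Hypothetical point predictions for five states.
pred = np.array([0.60, 0.45, 0.52, 0.38, 0.63])

# Suppose the first three states have reported, each 2 points below its prediction.
actual_so_far = np.array([0.58, 0.43, 0.50])

beta0, beta1 = fit_bias_regression(pred[:3], actual_so_far)

# Updated predictions for the two states still outstanding.
updated = beta0 + beta1 * pred[3:]
```

With only a handful of reported states the fit is noisy, so in practice one might fix $\beta_1 = 1$ or regularize it until more results arrive.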

A further question is how to combine these two approaches. In particular, approach 1 (simulation draws) can make use of the posterior correlation between state outcomes. A plausible way is to replace the regression model by

\[\tilde y_{i} = y_i + \beta_0 + \epsilon_i,\]

but instead of the iid residuals in the regression model, this time we model the $\epsilon_i$ as jointly multivariate normal, with covariance adapted from the posterior predictions,

\[\mathrm{Corr}(\epsilon_i, \epsilon_j)= \mathrm{Corr}(y_i, y_j).\]

The extra $\beta_0$ term is still the systematic “polling bias” that was not captured by the existing model.
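One possible way to compute with this combined model is sketched below. It rests on assumptions the text above does not fix: the error covariance $\Sigma$ is taken to be the covariance of the posterior predictive draws, $\beta_0$ is estimated by generalized least squares on the reported states (equivalent to a flat prior), and the remaining states are then predicted with the multivariate normal conditioning rule. The function name and the numbers are hypothetical.

```python
import numpy as np

def combined_update(pred, cov, obs_idx, actual_obs):
    """Estimate a common shift beta0 with correlated errors, then predict unreported states."""
    pred = np.asarray(pred, dtype=float)
    cov = np.asarray(cov, dtype=float)
    obs_idx = np.asarray(obs_idx)
    rest_idx = np.setdiff1d(np.arange(pred.size), obs_idx)

    S_oo = cov[np.ix_(obs_idx, obs_idx)]
    S_ro = cov[np.ix_(rest_idx, obs_idx)]

    # Residuals of the reported states relative to the forecast.
    resid = np.asarray(actual_obs, dtype=float) - pred[obs_idx]

    # GLS estimate of the common shift: (1' S^-1 r) / (1' S^-1 1).
    ones = np.ones(obs_idx.size)
    w = np.linalg.solve(S_oo, ones)
    beta0 = (w @ resid) / (w @ ones)

    # Shift every remaining state by beta0, then add the correlated-error correction.
    correction = S_ro @ np.linalg.solve(S_oo, resid - beta0 * ones)
    updated = pred[rest_idx] + beta0 + correction
    return beta0, rest_idx, updated

# Hypothetical forecast for three states with equicorrelated errors (sd 0.04, corr 0.5).
pred = np.array([0.60, 0.50, 0.40])
cov = 0.0016 * np.array([[1.0, 0.5, 0.5],
                         [0.5, 1.0, 0.5],
                         [0.5, 0.5, 1.0]])

# The first two states report, both 2 points below the forecast.
beta0, rest_idx, updated = combined_update(pred, cov, obs_idx=[0, 1], actual_obs=[0.58, 0.48])
```

With a single reported state, the whole residual is absorbed by $\beta_0$ and the correlation term vanishes, so the shift and the correlated errors are only separately informative once several states have reported.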

Certainly this new model is not ideal: the first two moments are an oversimplified description of a multidimensional density; the normal model cannot adapt to heavy-tailed predictions; the tail correlation is not necessarily the same as the overall correlation, etc.