Back to January 18 when there were 60 cases globally

Posted by Yuling Yao on May 09, 2020.       Tag: decision-theory  

On January 18, 2020, in the pre-pandemic era, when the stock markets in both the US and China were still busy celebrating the phase-one trade deal, I saw a story in the China section of the BBC about a brand new yet underemphasized virus in China:

There have been more than 60 confirmed cases of the new coronavirus (globally).

The BBC reported an estimate by ICL suggesting that the number of cases was likely understated; in fact,

experts estimate a figure nearer 1,700.

Indeed, I remember I was quite horrified when I read their prediction:

The virus ‘will have infected hundreds’.

So I immediately checked their model, which was effectively a capture-recapture model based on 3 positive cases confirmed outside China. You could estimate the total number of cases in Wuhan by dividing 3 by the probability that a patient leaves Wuhan on an international trip, which can in turn be estimated by the total outbound traffic count divided by the city population. There were many simplifications, but that was fine with me.
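To make the arithmetic concrete, here is a minimal sketch of that scale-up. This is not the ICL code. The function name, the traveler count, the population, and the 10-day detection window are all assumptions for illustration, chosen only so that the result lands near the 1,700 figure quoted above. The detection window is an extra ingredient not spelled out in the description above: it turns a per-day travel probability into the probability of traveling while still infected.

```python
def estimate_total_cases(exported_cases, daily_outbound_travelers,
                         catchment_population, detection_window_days):
    """Scale up exported cases by the chance that a single case travels abroad."""
    # Chance that any one resident flies out internationally on a given day.
    daily_travel_prob = daily_outbound_travelers / catchment_population
    # Chance that a case travels abroad at some point while still detectable
    # (a linear approximation, reasonable when the daily probability is tiny).
    export_prob = daily_travel_prob * detection_window_days
    return exported_cases / export_prob


# Illustrative inputs only (assumptions, not data from this post):
# 3 exported cases, 3,300 outbound travelers per day,
# a catchment population of 19 million, a 10-day detection window.
estimate = estimate_total_cases(3, 3_300, 19_000_000, 10)
print(round(estimate))
```

The point estimate is just a ratio, so all of its uncertainty sits in the numerator: with only 3 exported cases, one more or one fewer case moves the answer by a third.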

I knew Jonathan Auerbach had used a clever negative binomial model to count the number of rats in NYC, which is mathematically equivalent to this covid model. So I emailed Jonathan:

The inference based on 3 positive cases seems not convincing, yet should I buy more Lysol now?

For a statistician, this could have been a reasonable question to ask, as the whole estimate was driven by three data points. Given that we deal with data on the scale of millions in our daily research, kind of making fun of it in a harmless email was the limit of a proportionate response to three data points.

Jon replied:

You should blog about it! You can write a decision theory paper about how much lysol you should buy based on the data.

Except I didn’t.

Except four months later, the whole world and its 7 billion human beings have been fundamentally changed, not only by three batches of viruses, but also by our ignorance and indifference in the early stage, to which I probably contributed too.

Except I did (unusually) purchase puts on LQD on a monthly rolling basis starting from December.

Except that was for a completely different reason: I had heard someone talking about credit market risks amid sky-high public spending.

Except in retrospect, I don’t know what we should really learn from this tragedy. Would I be more alert the next time I hear the word coronavirus?

Sure, except such a response is overfitting.

Would I be alerted next time by some analysis using three data points?

I doubt it.

In a Bayesian updating regime, the posterior (for the future) would hardly change given the overwhelming evidence drawn from everywhere else, even if that evidence is data we have deliberately drawn to make life sound fully promising and hopeful.

As statisticians, we are trained to make inferences from any given dataset. But having collected all the genes of all creatures in this universe, or monitored all satellite images of all shopping malls, does not eliminate the room for agnosticism.