Bipartisan vaccination?

Posted by Yuling Yao on Apr 17, 2021. Tags: modeling, causal

I read a NYT graphics article entitled “Least Vaccinated U.S. Counties Have Something in Common: Trump Voters”. Beyond the beautiful visualizations, its graphical comparisons seem persuasive enough to support two conclusions:

  1. States with larger Trump vote shares are likely to have more adults who are vaccine hesitant.
  2. States with larger Trump vote shares have a smaller share of fully vaccinated adult residents.

These associations are legitimate. But I am concerned that the article is misleading in several respects.

  1. The partisan gap, or the income gap, or the urban-rural gap? The share of Trump voters is associated with many other variables. By slightly rearranging which variable is shown, one could equally draw a conclusion such as “States with lower average incomes are likely to have more adults who are vaccine hesitant”, or a conspiracy theorist might change the title to a thrilling “Rural states are ignored in vaccine distributions”. All these variables are closely related. The root of conspiracy theories is to attribute all variation to one single variable, whether it is race or voting.
  2. The state level and individual level relation. To be fair, the article did not suggest that “individual Trump voters are likely to be more vaccine hesitant”, but I am afraid many readers would interpret the state level associations in this way. Although this micro level explanation is plausible, we simply cannot draw an individual level relation from group level data. A famous example is that income is generally positively associated with education, yet universities and colleges are probably among the lower-paying sectors in terms of average salary, despite their employees' high average level of education: the clustering by industry is itself correlated with the outcome variable. The same reasoning applies to individual level inference here.
  3. What can the county level comparison tell? The article did do more analysis on county level comparisons, with a section titled “Counties where more residents voted for Trump often have lower vaccination rates”. Based on the graph, the evidence is weak. If you look into states like Virginia, Oregon, or New Jersey, you would probably draw the opposite conclusion. Again, county level relations generally do not have to point in the same direction as the individual level or the state level ones. Think about income for example: richer states are more blue while richer voters are more Republican. In between, the county level relation can mix these two ends: many parts of Long Island are redder and rich because rich people choose to cluster there, while other red upstate counties are poor because they are rural. Again, both of these mechanisms are outcome-dependent clustering, but they result in divergent signs in county level relations. In this example, the county level comparison is mostly driven by (a) the rural-urban distinction and (b) the share of the elderly population. I guess that is why the pattern in Vermont is mostly random.
  4. Vaccine hesitancy vs vaccine rollout. The article mixes these two outcome variables: vaccine hesitancy and vaccine rollout. But is the correlation between them a fixed constant? Since April is still an early phase of nationwide vaccine distribution, is vaccine trust really the main bottleneck of the rollout? I think the answer is no, and this is more evident at the county level. Look at upstate NY: vaccine hesitancy there is relatively high within the state, but the rollout rates are good. Hamilton County has 65% fully vaccinated adults (remember the optimal acceptance rate of an HMC sampler is indeed 0.651), the highest in the state. Sure, in the longer run, when there is enough supply, the vaccine rollout rate will eventually become (1 - vaccine hesitancy rate), and the two will thereby be perfectly correlated. But in the short term there are many other factors too. It seems the authors have tried to fix this problem by showing that the average share of delivered doses reported as used is lower in the 10 most hesitant states, but this usage efficiency itself would be largely correlated with the rural-urban distinction.
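
The sign flip between aggregation levels described in points 2 and 3 is easy to reproduce on simulated data. Below is a minimal sketch, not an analysis of the NYT data: the variables `x` and `y`, the group structure, and every coefficient are made-up assumptions, chosen so that the relation between group averages is negative while the relation between individuals inside each group is positive.

```python
import numpy as np

rng = np.random.default_rng(0)
n_groups, n_per_group = 50, 200  # hypothetical "states" and "residents"

# Assumed data-generating process (purely illustrative):
# a group-level factor pushes y DOWN, while an individual's deviation
# from the group level pushes y UP.
group_level = rng.normal(size=n_groups)
group_id = np.repeat(np.arange(n_groups), n_per_group)
x = group_level[group_id] + rng.normal(size=group_id.size)
y = (-2.0 * group_level[group_id]
     + 1.0 * (x - group_level[group_id])
     + rng.normal(size=group_id.size))


def slope(a, b):
    """Least-squares slope of b regressed on a (with intercept)."""
    a_centered = a - a.mean()
    return float(np.sum(a_centered * (b - b.mean())) / np.sum(a_centered ** 2))


# Group averages, as one would see them in a state-level scatter plot.
x_bar = np.bincount(group_id, weights=x) / n_per_group
y_bar = np.bincount(group_id, weights=y) / n_per_group

between_slope = slope(x_bar, y_bar)                             # group level
within_slope = slope(x - x_bar[group_id], y - y_bar[group_id])  # individual level

print(f"slope across group averages: {between_slope:.2f}")
print(f"slope within groups:         {within_slope:.2f}")
```

Under these assumptions the first slope comes out negative and the second positive, so a reader who only sees the group-level plot would guess the wrong sign for individuals.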

It is convenient and eye-catching to explain everything by the partisan gap. But we need more modeling to see the full picture.
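
As a toy illustration of what a little more modeling can change, here is a hedged sketch on simulated data. The variable names (`rural`, `trump_share`, `vax_rate`) and all coefficients are hypothetical, and by construction vote share has no direct effect on vaccination: rurality drives both. It only shows that a strong raw association can shrink toward zero once a correlated variable is included; it says nothing about what the real data would show.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500  # hypothetical regions, not real states or counties

# Made-up data-generating process: rurality drives both variables,
# and vote share has NO direct effect on the vaccination rate.
rural = rng.uniform(0.0, 1.0, size=n)
trump_share = 0.3 + 0.4 * rural + rng.normal(scale=0.05, size=n)
vax_rate = 0.6 - 0.3 * rural + rng.normal(scale=0.05, size=n)

ones = np.ones(n)

# Regression of vaccination rate on vote share alone.
coef_marginal, *_ = np.linalg.lstsq(
    np.column_stack([ones, trump_share]), vax_rate, rcond=None
)

# Same regression, now adjusting for rurality.
coef_adjusted, *_ = np.linalg.lstsq(
    np.column_stack([ones, trump_share, rural]), vax_rate, rcond=None
)

marginal_slope = float(coef_marginal[1])
adjusted_slope = float(coef_adjusted[1])

print(f"vote-share slope, unadjusted:        {marginal_slope:.2f}")
print(f"vote-share slope, rurality adjusted: {adjusted_slope:.2f}")
```

In this simulation the unadjusted slope is clearly negative while the adjusted one is close to zero. With real data the answer could of course go either way, which is exactly why the adjustment is worth running.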