How many X do you know

Posted by Yuling Yao on Sep 26, 2018.       Tag: modeling   causal  

One day I was chatting with A, and somehow we got to talking about certain “celebrated” family names, the kind easily associated with the old continent, dogmatic aristocracy, enormous heritage, occasional conspiracy theories, and anything in that direction. No, I am not going to advocate any stereotype analysis; indeed, it is quite hilarious that Google pops up the search result “These Last Names May Reveal That You Have Royal Blood” when I type in some such names on purpose.

However, my point here is that the existence of such dominant family names, even if purely psychological or folkloric, should not be taken for granted from a comparative-culture perspective. A name that can be traced back to some prestigious celebrity does not automatically earn the urban legend: think about how many Charleses and Henrys there are in the British royal family, which is still not enough to produce any “These Names May Reveal That You Have Royal Blood”. We probably have to normalize the prestige by the total number of people with that name, so that only names with a small population, whose bearers are more likely to be direct descendants, stand out.

Take China for example. It is easy for an outsider to wonder whether Chinese parents only recycle a dozen or so family names for their newborns, given that the common Chinese surnames are just too common. According to Wikipedia, Wang is the most common surname in China, shared by 92.8 million people, or about 7% of the total Chinese population, followed by the second most popular, Li, with a similar percentage. Such prevalence makes them immune to any folkloric conspiracy, even though the surname Li is actually rooted in the royal family of the Tang dynasty (7th to 10th century). By comparison, Smith, the most popular English surname, accounts for less than 1% of the US population. Moreover (or should I say Lessover), according to the website “how many of me”, there are fewer than 100 Rockefellers, and even fewer Rothschilds, in the US.

There must be a reason behind such a striking difference. I googled a little and found a Wikipedia page on the Galton–Watson process, which arose from Francis Galton’s statistical investigation of the extinction of family names:

if the average number of a man’s sons is 1 or less, then their surname will almost surely die out, and if it is more than 1, then there is more than zero probability that it will survive for any given number of generations.
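
In symbols, the quoted statement can be sketched roughly as follows. The notation ($Z_n$, $\xi_{n,i}$, $p_k$, $f$, $m$, $q$) is introduced here only for illustration and is not taken from the quote, and the sketch assumes every man has an independent, identically distributed number of sons.

$$
\begin{aligned}
& Z_0 = 1, \qquad Z_{n+1} = \sum_{i=1}^{Z_n} \xi_{n,i}, \qquad \Pr(\xi_{n,i} = k) = p_k, \\
& f(s) = \sum_{k \ge 0} p_k s^k, \qquad m = f'(1) = \sum_{k \ge 0} k \, p_k, \\
& q = \Pr(Z_n = 0 \text{ for some } n) = \text{the smallest root of } s = f(s) \text{ in } [0, 1], \\
& m \le 1 \;\Rightarrow\; q = 1 \quad (\text{unless } p_1 = 1), \qquad m > 1 \;\Rightarrow\; q < 1 .
\end{aligned}
$$

Here $Z_n$ counts the male-line bearers of the surname in generation $n$, $m$ is the average number of sons, and $q$ is the probability that the surname eventually dies out.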

So one possible explanation is that all families around the globe shared the same birth intensity, and if it was smaller than 1, then most family names would eventually die out in the long run. Since China might have adopted the surname system earlier than Britain, such extinction simply happened earlier there.
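
To get a feel for how much the age of a surname system could matter under this explanation, here is a minimal sketch. It assumes, purely for illustration, that each man has a Poisson-distributed number of sons with a fixed mean; the function name `extinct_by_generation` and the mean of 0.95 are made up for the example and are not estimates of anything real. It uses the standard recursion that the probability of extinction by generation n+1 equals the offspring generating function evaluated at the probability of extinction by generation n.

```python
import math


def extinct_by_generation(mean_sons: float, generations: int) -> float:
    """Probability that a surname founded by one man has died out
    within `generations` generations.

    Assumption: each man has Poisson(mean_sons) sons, so the offspring
    generating function is f(s) = exp(mean_sons * (s - 1)).
    The recursion q_{n+1} = f(q_n) starts from q_0 = 0.
    """
    q = 0.0
    for _ in range(generations):
        q = math.exp(mean_sons * (q - 1.0))
    return q


# Hypothetical comparison: a younger versus an older surname system.
for n in (20, 40, 80, 160):
    print(n, "generations:", round(extinct_by_generation(0.95, n), 4))
```

With a mean below 1 the printed probabilities creep toward 1 as the number of generations grows, which is the "older systems have lost more names" intuition; with a mean above 1 the same function levels off below 1.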

Of course, it makes no sense to assume either that the birth rate is constant or that it is smaller than 1 (in which case the species would die out too). On the other hand, the Galton–Watson process only models the absolute number, not the relative prevalence.
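
One rough way to look at relative prevalence is to run many independent Galton–Watson lines side by side and track each surname's share of the total. The sketch below is a toy under loud assumptions: every surname starts from a single founder at the same time, every man has Poisson-distributed sons with mean exactly 1 (so the population neither explodes nor collapses on average), and the names `poisson` and `simulate_surnames` plus all parameter values are hypothetical choices for illustration, not a model fitted to any data.

```python
import math
import random
from collections import Counter


def poisson(rng: random.Random, lam: float) -> int:
    """Draw one Poisson(lam) sample using Knuth's multiplication method.

    Only suitable for small lam, which is all this sketch needs.
    """
    threshold = math.exp(-lam)
    k, product = 0, rng.random()
    while product > threshold:
        k += 1
        product *= rng.random()
    return k


def simulate_surnames(founders=1000, mean_sons=1.0, generations=50, seed=0):
    """Return a Counter mapping surname id -> male bearers after `generations`.

    Assumption: one founder per surname, independent Poisson(mean_sons) sons.
    """
    rng = random.Random(seed)
    counts = Counter({name: 1 for name in range(founders)})
    for _ in range(generations):
        next_counts = Counter()
        for name, bearers in counts.items():
            sons = sum(poisson(rng, mean_sons) for _ in range(bearers))
            if sons > 0:  # extinct surnames are dropped
                next_counts[name] = sons
        counts = next_counts
    return counts


counts = simulate_surnames()
total = sum(counts.values())
print("surviving surnames:", len(counts))
print("population:", total)
if total:
    top_name, top_count = counts.most_common(1)[0]
    print("share of most common surname:", round(top_count / total, 3))
```

Running it for more generations leaves fewer surviving surnames holding larger shares, which is at least qualitatively the pattern the Wang and Li figures suggest, though nothing here says this is the actual mechanism.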

I found more papers on stochastic population models that treat everything rigorously, but I quickly got lost in all the math notation, so I will stop here. The prevalence and richness of family names is a very simple object, but it seems very profound mathematically, which turns out to be the conclusion I draw every time I am lost in the literature.