Statistical testing of the Lindy effect with real data
TL;DR Network protocols adhere to the Lindy effect more strongly than Linux distributions
The Lindy effect is a theory that the future life expectancy of some non-perishable things like a technology or an idea is proportional to their current age, so that every additional period of survival implies a longer remaining life expectancy. — Wikipedia
I first found out about the Lindy effect via the Stephan Livera podcast. It was mentioned in passing with a brief definition similar to the Wikipedia one above. Yet, it captured my attention.
The idea is simple and testable. I ran an experiment against real-world data: the history of Linux distributions. A Linux distribution, also known as a “distro”, is a collection of open source programs frozen in time at specific versions so that the programs can be tested, distributed, and upgraded together. There is a community for each distro.
The experiment itself is posted here https://github.com/alevchuk/lindy-effect and shows that simply picking the oldest survivor and betting that it will survive at least double its current life span would have resulted in being correct 59.8% of the time.
Specifically, the algorithm was correct by picking:
- MCC Interim from 1992 to 1994
- Deutsche Linux-Distribution (DLD) for 2 short months in 1996
- Slackware from 1999 to 2005
Slackware is alive and well today in 2019, yet I could only run the experiment up to 2005. A pick can be scored only once enough time has passed to see whether the subject survived to 2x its age at the time of the pick.
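The oldest-survivor rule described above can be sketched in a few lines of Python. This is only an illustration, not the code from the repository: the names `Distro`, `oldest_survivor`, `lindy_bet_wins` and `score` are hypothetical, the two toy entries are made up, and the sketch assumes yearly granularity, which may differ from the real experiment.

```python
from typing import NamedTuple, Optional

class Distro(NamedTuple):
    name: str
    born: int             # year of first release
    died: Optional[int]   # year discontinued, None if still alive

def oldest_survivor(distros, year):
    """Return the oldest distro alive in `year`, or None if there is none."""
    alive = [d for d in distros
             if d.born < year and (d.died is None or d.died > year)]
    return min(alive, key=lambda d: d.born) if alive else None

def lindy_bet_wins(d, year, now):
    """True if `d` survived to at least twice the age it had in `year`."""
    target = year + (year - d.born)   # the year in which its age has doubled
    if target > now:
        # Same reason the experiment has to stop early: the outcome is unknown.
        raise ValueError("outcome not observable yet")
    return d.died is None or d.died >= target

def score(distros, start, end, now):
    """Fraction of years in [start, end] where the oldest-survivor bet won."""
    picks = [(y, oldest_survivor(distros, y)) for y in range(start, end + 1)]
    results = [lindy_bet_wins(d, y, now) for y, d in picks if d is not None]
    return sum(results) / len(results)

# Made-up toy data, NOT real distro history.
toy = [Distro("alpha", 1990, 1994), Distro("beta", 1991, None)]
print(score(toy, 1991, 2000, now=2020))  # 0.9
```

In the toy run the rule loses only in 1993, when it picks "alpha" at age 3 and "alpha" is gone a year later. Nothing but birth and death years goes into the decision, which is the whole point of the experiment.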
Charts
About the results
59.8% does not seem like a lot. Yet, it’s better than random. In his TED talk, Hans Rosling articulates how even getting 50% is sometimes hard for educated humans. Here, a lead of 9.8 percentage points over randomness was achieved by a simple rule that knew nothing about Linux. To be precise, it knew strictly nothing other than the current age of each distro.
Better examples of the Lindy effect
Linux distribution data attracted me because of the awesome visualization.
The SVG file lets you zoom in and pan around this vast map of life and death. Open https://upload.wikimedia.org/wikipedia/commons/1/1b/Linux_Distribution_Timeline.svg and press Ctrl and + (or ⌘ and +) for a deep dive.
However, there is not much of a network effect between Linux distributions. If my server runs Ubuntu and your server runs Red Hat, the differences are strictly local and not visible to the outside world. Moreover, while running different distros, our servers can still talk to each other over the Internet.
It would be much more interesting to evaluate the Lindy effect on network protocols. Today we know that TCP/IP is the winner in world communications, yet there were others before it. For example, the Network Control Program (NCP) was ARPANET’s protocol before the transition to the Transmission Control Protocol (TCP) in 1983.
… let’s do a back of the envelope calculation now!
ARPANET’s first successful message was sent in 1969, fifty years ago. TCP is known to have dominated from 1983 to 2019. Let’s say we run the experiment for the first 25 years, up to 1994. Let’s define the Lindy effect as being correct when the picked protocol goes on to live to 2x its current age. Assuming NCP was in use for 14 years before the switch to TCP, the Lindy effect is correct for the first 7 years of ARPANET (at age 7 or less, NCP still reaches double its age), then wrong for 7 years, then correct again for 11 years from 1983 to 1994 (at age 11 in 1994, TCP only needs to last until 2005). That makes it correct for 18 out of 25 years, or 72% of the time.
Network protocols have strong effects on each other because a new protocol is not compatible with the rest of the world. Once TCP/IP was in use and had no catastrophic failures, the benefits of newer protocols were not nearly enough to outweigh the benefit of staying connected to everyone else.
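The back-of-the-envelope numbers above can be checked with a short Python sketch. It uses only the years given in the text; the variable names are mine, and the assumption that NCP covers the whole 1969 to 1983 span is the same simplification the estimate itself makes.

```python
ARPANET_START = 1969    # first successful message
TCP_SWITCH = 1983       # NCP replaced by TCP
EXPERIMENT_END = 1994   # end of the 25-year experiment window
TODAY = 2019

ncp_lifespan = TCP_SWITCH - ARPANET_START   # 14 years

# A bet on NCP at age a wins only if NCP reaches age 2a, i.e. 2a <= 14.
ncp_correct = ncp_lifespan / 2              # first 7 years
ncp_wrong = ncp_lifespan - ncp_correct      # remaining 7 years

# Every bet on TCP wins as long as the latest one (placed in 1994,
# at age 11) can already be observed: 1994 + 11 = 2005 <= 2019.
tcp_correct = EXPERIMENT_END - TCP_SWITCH   # 11 years
assert EXPERIMENT_END + tcp_correct <= TODAY

total_years = EXPERIMENT_END - ARPANET_START   # 25 years
hit_rate = (ncp_correct + tcp_correct) / total_years
print(hit_rate)  # 0.72
```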