The THINKerry

Diffusion Models Go Viral

Understanding COVID-19, Diffusion Models, and Data Visualization


By Kerry Edelstein
March 17, 2020

As researchers who appreciate the importance of data-driven insights and the power of thoughtful data visualization, we’re paying close attention to COVID-19 and the data surfacing daily. While charts and graphs of diffusion curves are getting shared across social media, this also seems a good time for a quick review of the math of what’s happening around us.

We recently came across this data from Johns Hopkins University on the incidence of COVID-19 cases in China vs. the rest of the world. Pay particular attention to the red line, which represents confirmed cases in China.

You’ll note the shape of the curve looks like an S – flat at the bottom, then a steep increase in incidence, then a leveling out of cases as the disease hits its upper limit. Outside of public health, we often colloquially call these distributions “S Curves” because of that shape. (Statisticians will recognize what looks like a standard cumulative distribution function.)
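For readers who like to see the shape in code, here is a minimal sketch of one common S curve, the logistic function. The parameter names and values (`upper_limit`, `growth_rate`, `midpoint`) are our own illustrative assumptions and are not fitted to the Johns Hopkins data.

```python
import math

def logistic(t, upper_limit=1.0, growth_rate=1.0, midpoint=0.0):
    """Logistic S curve: flat at first, then steep, then leveling off at upper_limit."""
    return upper_limit / (1.0 + math.exp(-growth_rate * (t - midpoint)))

# Sample a hypothetical curve every 10 time units (illustrative values only).
samples = [
    round(logistic(t, upper_limit=100.0, growth_rate=0.5, midpoint=20.0), 1)
    for t in range(0, 41, 10)
]
print(samples)
```

The early samples sit near zero, the middle sample is exactly half of the upper limit, and the late samples flatten out just below it, which is the "S" described above.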

In market research, S Curves are also sometimes casually referred to as “adoption curves,” as consumer adoption of new products and technologies often follows a similar trajectory. Several years ago, we plotted the adoption of various new media technologies, and while some curves were steeper than others, nearly all those technologies followed an “S” shape of consumer adoption.


Exponential Growth Stage

As indicated in the chart below, S Curves have an exponential growth stage. That means population incidence (or “saturation” on the Y axis below) multiplies by a constant factor every X days (or other time period). In the case of COVID-19, this has varied by country, with estimates ranging from a doubling of diagnosed cases every 2 days to a doubling every 6 days, until an upper limit is reached (“stabilization/upper asymptote” below).
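To put a doubling time in more familiar terms, it can be converted into a percentage increase per day. The sketch below assumes clean, constant exponential growth; the function name `daily_growth_rate` is hypothetical and just for illustration.

```python
def daily_growth_rate(doubling_time_days):
    """Fractional growth per day implied by a given doubling time."""
    return 2 ** (1.0 / doubling_time_days) - 1.0

# The 2-day and 6-day doubling times mentioned above.
for days in (2, 6):
    rate = daily_growth_rate(days)
    print(f"doubling every {days} days is about {rate:.1%} growth per day")
```

Under that assumption, doubling every 2 days works out to roughly 41% more cases each day, while doubling every 6 days is roughly 12% more each day.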

As of publishing this post on March 17th, in many US cities it might feel baffling that we’re stuck inside with the kids home from school. Many of us still don’t know anyone who’s been diagnosed. That’s in large part because there’s limited testing, but it’s also because we’re still at the “slow growth” stage above.

Even if we’re consistently doubling the diagnosed cases every week (and right now we’re doubling much faster than that, even with low testing rates), it can feel like “no big deal.” Especially in smaller cities, it’s only 2 cases, then 4, then 8, then 16, then 32. But keep doing that math for about another month, and we’re over 1,000 cases. Do that math for three more months, and we’re over a million cases. Such is the nature of exponential growth.
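The weekly doubling arithmetic is easy to check with a few lines of Python. This is a minimal sketch, assuming a single starting case and one clean doubling per week; the function name `weeks_to_exceed` is ours, purely for illustration.

```python
def weeks_to_exceed(threshold, initial_cases=1):
    """Count weekly doublings until the case count is strictly above threshold."""
    cases = initial_cases
    weeks = 0
    while cases <= threshold:
        cases *= 2   # one doubling per week (assumed)
        weeks += 1
    return weeks

print(weeks_to_exceed(1_000))      # weeks until more than 1,000 cases
print(weeks_to_exceed(1_000_000))  # weeks until more than 1,000,000 cases
```

Under these assumptions it takes 10 weekly doublings to pass 1,000 cases and 20 to pass a million, so the second thousand-fold jump takes no longer than the first.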

You’ve probably seen this image going around.

The red and blue distributions are both diffusion curves that scale up exponentially, and then eventually level out and drop off as people recover. What’s different is how fast they escalate.

Two main variables impact this escalation rate – how contagious something is (partially controllable via handwashing and other hygiene measures) and how likely an infected person is to interact with others (which we can reduce via social distancing, closures, and quarantines). Washing our hands, disinfecting surfaces, and curbing the likelihood of social interaction all make the exponential growth much less steep.

Bringing this back to math, in hypothetical layman’s terms, that might look something like “the # of cases doubles every month” instead of “the # of cases doubles every two days.”
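To see how those two variables flatten the curve, here is a rough sketch using a basic SIR (susceptible, infected, recovered) model, a standard textbook epidemic model that we are swapping in for illustration; it is not the model behind the image above. Every number here (contacts per day, transmission probability, recovery time, population) is a made-up assumption, not an estimate for COVID-19.

```python
def simulate_sir(contacts_per_day, transmission_prob, recovery_days=10.0,
                 population=1_000_000, initial_infected=10, days=365):
    """Return the number of active infections per day from a simple daily-step SIR model."""
    beta = contacts_per_day * transmission_prob  # new infections per infected person per day
    gamma = 1.0 / recovery_days                  # fraction of infected who recover each day
    susceptible = float(population - initial_infected)
    infected = float(initial_infected)
    active = []
    for _ in range(days):
        new_infections = beta * susceptible * infected / population
        new_recoveries = gamma * infected
        susceptible -= new_infections
        infected += new_infections - new_recoveries
        active.append(infected)
    return active

# Hypothetical scenarios: same contagiousness, half as many daily contacts.
baseline = simulate_sir(contacts_per_day=10, transmission_prob=0.03)
distanced = simulate_sir(contacts_per_day=5, transmission_prob=0.03)

for label, curve in (("baseline", baseline), ("distanced", distanced)):
    peak = max(curve)
    print(f"{label}: peak of about {peak:,.0f} active cases on day {curve.index(peak)}")
```

In this toy setup, halving daily contacts produces a peak that is both much lower and much later, which is the whole idea behind spreading cases out so they stay under hospital capacity.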


Why does that matter?

It matters because hospitals have capacity constraints – on beds, on supplies, and on personnel – and they can only care for so many people at once. Exceeding hospital capacity means more deaths from COVID-19, and diminished ability to care for other illnesses as well. That’s why movements like #stayhome and #flattenthecurve have surfaced, why certain states and cities are shutting down schools and businesses, and why Research Narrative has been implementing additional precautions.

If you’re interested in getting a little wonkier about this topic, here are some additional resources and pandemic analyses we found both insightful and compelling.

  • The story of COVID-19: Kevin MD published this thoughtful overview of COVID-19, backed with charts, data, and even an embedded YouTube video on diffusion model mathematics. 
  • The impact of social distancing: This no-paywall analysis and data simulator from the Washington Post visualizes what happens in different social distancing and quarantine scenarios.  
  • The post that started it all? This epic Medium post recently made the rounds of research circles. We don’t know the author, and we can’t vouch for him. (Though we did look him up on LinkedIn, and his background doesn’t disappoint.) But this is certainly the most comprehensive review of pandemic data and implications we’ve seen so far, and great context for why so many schools and businesses are choosing to shut down prior to federal intervention. It also appears to be the source of many “flatten the curve” and “social distancing” charts making the social media rounds.


Stay safe, everyone.
