Fig 1. Sparsification of the U.S. mobility network. On the left is the original network with about 26 million edges. On the right, a sparsified network based on effective resistance sampling.

Simulations that help determine how a large-scale pandemic will spread can take weeks or even months to run. A recent study in PLOS Computational Biology offers a new approach to epidemic modeling that could drastically speed up the process. 

The study uses sparsification, a method from graph theory and computer science, to identify which links in a network are the most important for the spread of disease.

By focusing on critical links, the authors found they could reduce the computation time for simulating the spread of diseases through highly complex social networks by 90% or more. 

“Epidemic simulations require substantial computational resources and time to run, which means your results might be outdated by the time you are ready to publish,” says lead author Alexander Mercier, a former Undergraduate Research Fellow at SFI and now a Ph.D. student at the Harvard T.H. Chan School of Public Health. “Our research could ultimately enable us to use more complex models and larger data sets while still acting on a reasonable timescale when simulating the spread of pandemics such as COVID-19.”

For the study, Mercier, with SFI researchers Samuel Scarpino and Cristopher Moore, used data from the U.S. Census Bureau to develop a mobility network describing how people across the country commute. 

Then, they applied several different sparsification methods to see if they could reduce the network’s density while retaining the overall dynamics of a disease spreading across the network. 

The most successful sparsification technique they found was effective resistance. This technique comes from computer science and is based on the total resistance between two endpoints in an electrical circuit. In the new study, effective resistance works by prioritizing the edges, or links, between nodes in the mobility network that are the most likely avenues of disease transmission while ignoring links that can be easily bypassed by alternate paths.

“It’s common in the life sciences to naively ignore low-weight links in a network, assuming that they have a small probability of spreading a disease,” says Scarpino. “But as in the catchphrase ‘the strength of weak ties,’ even a low-weight link can be structurally important in an epidemic — for instance, if it connects two distant regions or distinct communities.”

Using their effective resistance sparsification approach, the researchers created a network containing 25 million fewer edges — or about 7% of the original U.S. commuting network — while preserving overall epidemic dynamics.

“Computer scientists Daniel Spielman and Nikhil Srivastava had shown that sparsification can simplify linear problems, but discovering that it works even for nonlinear, stochastic problems like an epidemic was a real surprise,” says Moore.

While still in an early stage of development, the research not only helps reduce the computational cost of simulating large-scale pandemics but also preserves important details about disease spread, such as the probability of a specific census tract getting infected and when the epidemic is likely to arrive there.