Spatial and Temporal, Small and Big: Using wastewater data to monitor the spread of Covid-19

Have you been following the news about Covid-19 obsessively, wondering when the economy will reopen and whether it’s safe to go out and resume normal activities?


If you have, you’ve probably heard a lot about how testing people for the virus, tracing its spread through contacts and monitoring outbreak clusters are critical to telling how the pandemic is progressing and whether it’s safe to resume normal activities and open up the economy. But in many countries, including the United States, testing has been a bottleneck - either there haven’t been enough tests, or the infection has spread so widely that testing people and tracing their contacts simply isn’t feasible anymore.


Further, even in countries like Germany and South Korea that have successfully deployed testing and tracing strategies, it is still expensive to conduct these tests and continue tracing contacts. And until a vaccine and/or some form of treatment is developed, it looks like our main tool in containing the outbreaks is going to be some form of monitoring. So, how can that be done effectively and cheaply?


Enter wastewater epidemiology - the intersection of public health, the environment and data science. It’s a relatively new field where scientists have been collecting wastewater samples in order to monitor the presence of chemicals and viruses in the human population. It’s been used successfully in many countries in Europe and Asia to understand the prevalence of certain drugs, the exposure of people to pesticides and to monitor outbreaks of viruses like polio and hepatitis A. While it doesn’t replace traditional public health initiatives like testing and tracing for individuals, it can be used to supplement these efforts and help in narrowing down the areas and communities where the outbreaks are occurring and thus focus the testing and tracing efforts.


So, how exactly does it work? Let’s take a look at what has been happening with SARS-CoV-2, the virus that causes Covid-19.


The first question that needs to be answered is - is the virus present in the wastewater? That means finding out whether the virus is present in human fecal matter and can therefore end up in wastewater. This is also one of the first steps in identifying transmission routes in people - how is it spread? Is it spread through air, through fecal matter, and so on? In the case of SARS-CoV-2, researchers from China found that infected people were shedding the virus in stool. Relatively high levels of viral RNA were found in stool samples up to 21 days after the infection started, compared to about 6-7 days in nasal samples. While fecal discharge isn’t the primary route of transmission, enough viral material was found to make it a plausible way to monitor the spread of the infection.


Next question - can the virus be detected at the levels found in the wastewater sample? Wastewater samples collected for this purpose are raw samples taken at the intake point of the treatment plant or at a maintenance access point such as a manhole. Since these samples are collected at these points and not at individual homes or offices, they are aggregate samples of all the individuals living in that area or community. So if a few people are infected and many others are not, the amount of virus found in the wastewater sample will be significantly diluted. It may still be possible to detect the virus, or it may have been so diluted that existing techniques cannot pick it up. Luckily, in the case of SARS-CoV-2, studies from the Netherlands, Australia, and the United States found that the virus was detectable in the wastewater using a standard technique known as reverse transcriptase quantitative polymerase chain reaction (RT-qPCR).
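To make the dilution argument concrete, here is a back-of-the-envelope sketch in Python. It ignores decay and losses in the sewer, and every number in it (the shedding rate, the per-person water use and the detection limit) is an illustrative assumption rather than a measured value.

```python
def expected_concentration(infected, population,
                           copies_shed_per_person_per_day,
                           litres_per_person_per_day):
    """Average viral RNA concentration (copies per litre) at the sampling point.

    Assumes everything shed by infected people is mixed evenly into the
    wastewater produced by the whole population.
    """
    total_copies = infected * copies_shed_per_person_per_day
    total_litres = population * litres_per_person_per_day
    return total_copies / total_litres


def is_detectable(concentration, detection_limit):
    """True if the concentration is at or above the assay's detection limit."""
    return concentration >= detection_limit


# Illustrative assumptions only, not measured values.
SHEDDING = 1e9          # RNA copies shed per infected person per day
WATER_USE = 200.0       # litres of wastewater per person per day
DETECTION_LIMIT = 1e3   # copies per litre the assay can pick up

# One infected person in communities of increasing size.
for population in (1_000, 100_000, 10_000_000):
    conc = expected_concentration(1, population, SHEDDING, WATER_USE)
    print(population, conc, is_detectable(conc, DETECTION_LIMIT))
```

With these made-up inputs, a single infected person shows up in a community of a thousand but is lost in a community of a hundred thousand. Change any of the three assumptions and that threshold moves, which is why local conditions matter so much.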


In this technique, the RNA from the virus is first converted to its complementary DNA fragment. The DNA fragment is then fluorescently labelled (usually with cyanine based fluorescent dyes) and amplified in about 30-45 cycles using an enzyme. In every cycle, the number of short specific sections of DNA is doubled, leading to an exponential amplification of targets. The amount of DNA fragments in the sample is proportional to the amount of fluorescence during the PCR cycles. In other words, if a sample contains more targets the fluorescence will be detected in earlier cycles and that lets us quantify the amount of virus present.
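The link between "detected in earlier cycles" and "more virus" can be sketched in a few lines of Python. A common approach is to run samples with known amounts of target, record the cycle at which each crosses the fluorescence threshold (often called Ct), and fit a straight line of Ct against the log of the starting amount. The standards below are synthetic and assume perfect doubling in every cycle; the function names and the intercept of 40 are hypothetical, chosen only for illustration.

```python
import math


def fit_standard_curve(known_copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in known_copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate starting copies from a Ct."""
    return 10 ** ((ct - intercept) / slope)


def amplification_efficiency(slope):
    """Efficiency implied by the slope; 1.0 means perfect doubling per cycle."""
    return 10 ** (-1 / slope) - 1


# Synthetic standards: with perfect doubling, each 2x increase in starting
# copies crosses the threshold one cycle earlier.
standards = [1e2, 1e3, 1e4, 1e5, 1e6]
cts = [40 - math.log2(c) for c in standards]

slope, intercept = fit_standard_curve(standards, cts)
print(round(slope, 3), round(amplification_efficiency(slope), 3))

# A sample that crosses the threshold at this cycle maps back to its copies.
sample_ct = 40 - math.log2(5000)
print(round(copies_from_ct(sample_ct, slope, intercept)))
```

Real standards are noisy and real reactions are rarely perfectly efficient, so the fitted slope and the resulting estimates carry uncertainty that this sketch does not show.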


So, at this stage, there are analytical techniques to sample the wastewater and detect the virus, but a single standard that all labs must use still needs to be developed and there are still questions that need to be researched about the sensitivity of the analytical technique to different conditions. However, it’s promising enough to go to the next stage.


Next question - can models be developed using the raw wastewater data so that the spread in the community can be tracked and the most affected areas highlighted, so that intensive public health measures like individual testing and contact tracing can be deployed in those areas?


This is where much of the modeling and data science work comes in. It combines models of the sewer systems and the flow through them, models of how the virus is transported and decays in the sewer system, and maps of neighborhoods and communities. It’s a typical problem in most clean technology sectors, where the data come from many sources, models are a combination of physical systems and statistics, and data are spatial and temporal in nature. In the case of SARS-CoV-2 in wastewater, the initial models are being built, but they have a number of assumptions built into them. While tools for modeling flow in sewer systems exist, combining them with models for a novel virus like SARS-CoV-2 means that assumptions about how the virus will behave at different temperatures, how dilution from rain and other leakages will affect the movement through the system, and how many individuals contribute to the collected sample all need to be built in.
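Two of those assumptions, decay in the sewer and dilution from rain, can be illustrated with a small Python sketch. It assumes simple first-order (exponential) decay with a single rate constant and a single travel time, which is a deliberate simplification; in practice the rate would depend on temperature and other conditions, and the values used here are made up.

```python
import math


def decay_corrected_concentration(measured, decay_rate_per_hour,
                                  travel_time_hours):
    """Back-calculate the concentration at the source.

    Assumes first-order decay: measured = source * exp(-k * t).
    """
    return measured * math.exp(decay_rate_per_hour * travel_time_hours)


def daily_load_per_person(concentration_copies_per_litre,
                          flow_litres_per_day, population):
    """Viral load per person per day (copies), normalised by flow.

    Multiplying concentration by flow cancels out dilution, so a rainy
    day and a dry day with the same amount of shedding give the same load.
    """
    return concentration_copies_per_litre * flow_litres_per_day / population


# Illustrative assumptions only.
POPULATION = 100_000
DECAY_RATE = 0.05     # per hour, hypothetical
TRAVEL_TIME = 6.0     # hours from homes to the sampling point, hypothetical

dry_day = daily_load_per_person(1000.0, 2e7, POPULATION)
wet_day = daily_load_per_person(500.0, 4e7, POPULATION)  # rain doubles flow
print(dry_day, wet_day)

print(decay_corrected_concentration(1000.0, DECAY_RATE, TRAVEL_TIME))
```

The raw concentration on the wet day is half that of the dry day, yet the per-person load is identical, which is why models work with flow-normalised loads rather than raw concentrations.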


And the data are spatial - so, researchers need to figure out how to build these models to account for the interdependence, the clusters and other spatial features of the data. And these data have to overlay population maps, neighborhood and community maps to figure out the populations that are likely to be impacted and how to model interactions between communities. That’s cluster analysis, network modeling and spatial statistics! 
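One standard way to check whether neighboring areas tend to have similar values is Moran's I, a measure of spatial autocorrelation. The Python sketch below is a minimal version, assuming four hypothetical sewer catchments arranged in a row where only adjacent catchments count as neighbors. It is meant as an illustration of the idea, not as the method any particular research group uses.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of values and a square weight matrix.

    weights[i][j] > 0 means areas i and j are neighbours. Values near +1
    suggest clustering of similar values, values near -1 suggest a
    checkerboard pattern, and values near 0 suggest no spatial pattern.
    """
    n = len(values)
    mean = sum(values) / n
    deviations = [v - mean for v in values]
    total_weight = sum(sum(row) for row in weights)
    cross = sum(weights[i][j] * deviations[i] * deviations[j]
                for i in range(n) for j in range(n))
    variance_sum = sum(d * d for d in deviations)
    return (n / total_weight) * (cross / variance_sum)


# Four hypothetical catchments in a row: 0 - 1 - 2 - 3.
ADJACENCY = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

clustered = [10.0, 9.0, 1.0, 0.0]     # high values sit next to each other
alternating = [10.0, 0.0, 10.0, 0.0]  # high and low values alternate

print(morans_i(clustered, ADJACENCY))
print(morans_i(alternating, ADJACENCY))
```

A real analysis would build the weight matrix from the sewer network or from community maps and would test whether the statistic is significantly different from what chance alone would produce.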


Also, since we’re tracking the movement of the disease over time, the data are temporal. This means that researchers have to be able to build models that can detect sudden changes in viral load compared to earlier samples, identify clusters and hot-spots and then figure out the affected communities. That’s time series analysis at its best!
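A very simple way to flag a sudden change is to compare each new measurement against a trailing baseline. The Python sketch below does this on log-transformed loads with a rolling z-score. The window length, the threshold and the sample series are all illustrative assumptions, and real surveillance systems would likely use more robust methods.

```python
import math
from statistics import mean, stdev


def flag_surges(loads, window=7, threshold=3.0):
    """Return indices where the load jumps well above the trailing baseline.

    Each value is compared, on a log10 scale, with the mean and standard
    deviation of the previous `window` values.
    """
    logs = [math.log10(v) for v in loads]
    flagged = []
    for i in range(window, len(logs)):
        baseline = logs[i - window:i]
        sigma = stdev(baseline)
        if sigma == 0:
            continue  # a perfectly flat baseline gives no usable scale
        z = (logs[i] - mean(baseline)) / sigma
        if z > threshold:
            flagged.append(i)
    return flagged


# Hypothetical daily viral loads: stable for nine days, then a jump.
daily_loads = [100, 110, 95, 105, 98, 102, 107, 101, 99, 400, 450]
print(flag_surges(daily_loads))
```

Note that once a jump enters the trailing window it inflates the baseline, so later high values may no longer stand out. That is one reason a simple rule like this is only a starting point.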


So, researchers have started creating these combined models using several different tools, generating initial estimates and then tweaking different parameters to see how these estimates will change if conditions change. The preliminary results suggest that approximately 2.1 billion people, or more than a quarter of the world's population, could be monitored globally using 105,600 sewage treatment plants. Theoretically, one infected individual who is asymptomatic can be detected among 100 - 2,000,000 non-infected people in the community, depending on local conditions. And while wastewater monitoring cannot replace testing and tracing efforts, using this method as a supplement could save millions to billions of dollars in costs.


These results suggest that these models can indeed be used as first approximations to help focus public health efforts - but as assumptions get tested and models refined, we’ll get better results in the future.


Of course, a system like this can only be deployed in places where wastewater is sent to treatment plants - not in areas with septic tanks and point of use treatment options. But it’s a promising opportunity for many cities and countries.


A few startups and non-profits have begun partnering with cities around the world to test and deploy this approach. Biobot, a US startup based in Massachusetts, is one of the pioneers in this field and has been testing this approach in several cities in the US.


Cities and countries around the world have also started plans to monitor sewage systems for SARS-CoV-2. Pending budget approval, the provinces of Alicante, Castellon and Valencia in Spain will be monitoring 250 wastewater treatment plants, covering a population of approximately 5 million residents, twice a week. And in June, the state of Victoria in Australia will begin rolling out a large-scale wastewater monitoring effort that could target infection rates at a suburb scale.


So, here’s to all the researchers and scientists who are looking at innovative ways to get us out of this pandemic!
