Snippets in Clean Technology and Data Science: Agriculture and Food
In today’s post, we’ll take a look at a few problems in Agriculture and Food that are being solved using machine learning, computer vision, social networks, and satellite data, among other data science tools.
What do social networks, sensors, food and farms have in common?
Social media immediately makes us think about Facebook, Snapchat, Instagram, WhatsApp, Google and all the different ways in which we human beings connect with each other today. All these apps use graph and network theory to understand how people may be connected to each other, the links between them and how strong or weak those connections are. Reid Hoffman of LinkedIn famously said that “we’re all six weak connections apart from each other” and in today’s connected world, that number looks like it’s getting smaller and smaller.
So what would social media have in common with agriculture?
First, there’s always the way in which farmers and workers interact and connect with each other. Several startups like the Farmers Business Network are trying to build connected networks of farmers, buyers and equipment owners to make it easy for farmers to manage their operations. These startups typically use recommendation systems, local collaboration groups and optimization techniques to improve farmers’ operations. The data science used here is similar to that used by Amazon and other ecommerce companies in shipping and optimizing their operations.
Then there are the informal networks of workers looking for seasonal or daily work on the farm, something that doesn’t really show up on an app or a computer yet, but maybe will someday.
But what’s even more interesting is how social media and its underlying algorithms can impact the actual business of food production, especially when it comes to the spread of diseases in plants and animals.
An interesting study has come out of Italy, where scientists were interested in figuring out how the movement of people was contributing to the spread of diseases from farms. After the spread of mad-cow disease in Europe, there were several efforts to monitor farm health by tagging the animals themselves. The theory behind this is that the movement of the animals directly contributes to the way diseases spread from one farm to another. Now, monitoring the movement of animals is in itself an interesting data science problem: you have to build the sensor to be attached to the animal, collect and store the data from the animal in some sort of database or in the cloud, analyze how the animals are moving together with any information about the presence of diseases and the farms they originate in, and finally visualize the data on a map or in some other format that lets people intuitively understand what’s going on.
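To make the "analyze" step a little more concrete, here is a minimal sketch in Python. It is not the pipeline used in the study: the tag readings, the farm and animal names, and the set of infected farms are all made up for illustration. It simply turns a log of tag readings into farm-to-farm movement counts and flags farms that received animals directly from a farm known to be infected.

```python
from collections import Counter

# Hypothetical tag readings: (animal_id, farm_id, day), in any order.
readings = [
    ("cow_1", "farm_1", 1), ("cow_1", "farm_2", 5),
    ("cow_2", "farm_1", 2), ("cow_2", "farm_2", 6), ("cow_2", "farm_3", 9),
    ("cow_3", "farm_3", 3),
]
# Assumption: infected farms are already known, e.g. from inspections.
infected_farms = {"farm_1"}

def farm_to_farm_moves(readings):
    """Count animal moves between consecutive farms in each animal's history."""
    history = {}
    # Sort by animal, then by day, so each animal's farms are in time order.
    for animal, farm, _day in sorted(readings, key=lambda r: (r[0], r[2])):
        history.setdefault(animal, []).append(farm)
    moves = Counter()
    for farms in history.values():
        for src, dst in zip(farms, farms[1:]):
            if src != dst:
                moves[(src, dst)] += 1
    return moves

def exposed_farms(moves, infected):
    """Farms that received at least one animal directly from an infected farm."""
    return {dst for (src, dst) in moves if src in infected and dst not in infected}

moves = farm_to_farm_moves(readings)
print(moves[("farm_1", "farm_2")])           # animals moved from farm_1 to farm_2
print(exposed_farms(moves, infected_farms))  # farms directly exposed
```

The movement counts are exactly the kind of table you would then plot on a map, with each farm as a point and each count as the thickness of the line between two farms.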
What the researchers did was add an extra layer of complexity to the problem. They looked at how the movement of people, specifically veterinarians, from farm to farm contributed to the spread of diseases. The question they were interested in was this: how much of an impact do indirect disease vectors like people have, as opposed to direct disease vectors like sick animals?
The scientists used an extremely clever approach to solve this problem, one that marries social media and agriculture. They built a network of the farms and the veterinarians, using each individual farm and vet as a node. The links between the nodes captured the number of times a vet went to each farm and the number of vets that visited a particular farm. Using graph theory in this fashion, they were able to show that the movement of people amplifies the spread of diseases locally, while the movement of animals has a greater impact on the larger spatial distribution of the disease. Not just that, they were also able to highlight the “strong actors” in the network: the farms that had multiple links to other farms through visits by the vet, as well as the farms where a number of vets stopped by before proceeding to other farms. These are the farms where, if a disease were to break out, it would spread quickly to the rest of the area because of how well connected they are.
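Here is a rough Python sketch of what such a farm-and-vet network could look like. This is an illustration under simple assumptions, not the researchers' actual method: the visit log is invented, and "strong actors" are approximated by counting how many other farms a farm is linked to through a shared vet, which is only one of several ways to measure how well connected a node is.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical visit log: one (vet_id, farm_id) entry per visit.
visits = [
    ("vet_a", "farm_1"), ("vet_a", "farm_2"), ("vet_a", "farm_2"),
    ("vet_b", "farm_2"), ("vet_b", "farm_3"),
    ("vet_c", "farm_2"), ("vet_c", "farm_4"),
]

def build_visit_graph(visits):
    """Return {(vet, farm): visit_count}, the weighted vet-farm links."""
    weights = defaultdict(int)
    for vet, farm in visits:
        weights[(vet, farm)] += 1
    return dict(weights)

def farm_projection(visit_weights):
    """Link two farms when the same vet visited both; weight = number of shared vets."""
    farms_by_vet = defaultdict(set)
    for vet, farm in visit_weights:
        farms_by_vet[vet].add(farm)
    shared = defaultdict(int)
    for farms in farms_by_vet.values():
        for a, b in combinations(sorted(farms), 2):
            shared[(a, b)] += 1
    return dict(shared)

def rank_farms(visit_weights):
    """Rank farms by the number of other farms they reach through shared vets."""
    neighbours = defaultdict(set)
    for a, b in farm_projection(visit_weights):
        neighbours[a].add(b)
        neighbours[b].add(a)
    all_farms = {farm for _, farm in visit_weights}
    # Most connected first; ties broken alphabetically for a stable order.
    return sorted(all_farms, key=lambda f: (-len(neighbours[f]), f))

weights = build_visit_graph(visits)
ranking = rank_farms(weights)
print(ranking[0])  # the best connected farm in this toy network
```

In this toy data every vet passes through farm_2, so it comes out on top of the ranking. That is the kind of farm you would want to inspect first, since an outbreak there has the most routes to other farms.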
Now, this type of research combines data science and clean technology to solve a problem that has a huge impact on two trillion-dollar markets: food and health. Imagine if this kind of network were built for each country and then linked to form a global network. We could customize how resources for inspection and monitoring of food and farms are deployed, and prevent the spread of contaminated food before it becomes a huge problem.