Have you ever worked with a real-world problem where you have all the data that you need in a form that you could easily use to build models?
In the case of most problems, we find that data are missing, or there are errors in how the data are measured, or we’re faced with different types of data that need to be integrated. That’s been especially true in many clean technology fields - water, energy, climate, sustainability, ecosystem restoration and agriculture among them.
So, how do we deal with data with so many challenges?
One way is to see if there are alternative ways of measuring the data. One possibility is to identify surrogate datasets that can be calibrated and used as alternatives for the primary measurement. A second possibility is to use cheaper, more widely distributed sensors, such as PurpleAir sensors for air quality monitoring, in combination with the primary data sources so that models can be developed. A third alternative is to use modeling techniques like Bayesian networks, which can accommodate missing data points by estimating how much the missing data contribute to uncertainty in the model predictions.
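The calibration idea behind the first two options can be sketched in a few lines of Python. This is only a minimal illustration, assuming a simple straight-line relationship between a low-cost sensor and a reference monitor. The function names and the readings are made up for the example, and real calibrations often need more than a straight line.

```python
from statistics import mean


def fit_linear_calibration(sensor, reference):
    """Least-squares fit of: reference ~ slope * sensor + intercept.

    Time steps where either reading is missing (None) are skipped.
    """
    pairs = [(s, r) for s, r in zip(sensor, reference)
             if s is not None and r is not None]
    if len(pairs) < 2:
        raise ValueError("need at least two paired readings")
    xs = [s for s, _ in pairs]
    ys = [r for _, r in pairs]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("sensor readings are all identical")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def apply_calibration(sensor, slope, intercept):
    """Convert sensor readings to the reference scale, keeping gaps as None."""
    return [None if s is None else slope * s + intercept for s in sensor]


# Illustrative numbers only (not real measurements).
# None marks a time step where that instrument reported nothing.
low_cost_sensor = [10.0, 20.0, None, 40.0, 50.0]
reference_monitor = [6.0, 11.0, 16.0, None, 26.0]

slope, intercept = fit_linear_calibration(low_cost_sensor, reference_monitor)
calibrated = apply_calibration(low_cost_sensor, slope, intercept)
```

Once fitted, the calibrated low-cost readings can stand in for the reference monitor at times when the reference has no data, as at the fourth time step here.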
The first method was used by scientists at the University of Illinois, Urbana-Champaign in order to estimate how much corn and soybean were planted in an area in Illinois. Normally, it takes 4-6 months after the crops are harvested for the US Department of Agriculture to provide estimates of the number of acres planted with corn and soybean. This means that decisions about policies on conservation, agricultural aid and so on are made using state estimates that have greater uncertainty in their values. The same is true of pricing and managing agricultural futures in the markets. So any method that can provide quicker, more accurate estimates is extremely valuable from an economic and policy standpoint.
However, the challenge is that it’s difficult to distinguish between corn and soybean using standard remote sensing data. Remote sensing data, or data collected by satellites from space, are collected across a range of wavelengths. The wavelengths that are usually used in estimating crops and crop acreage belong to the visible spectrum - the RGB wavelengths. In addition to the difficulty of telling corn from soybean with these data, there are often locations and times when data cannot be collected because of clouds or other issues with the satellite sensors - leading to missing data points.
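To make the missing-data problem concrete, here is a small Python sketch of one common way to patch cloud gaps in a satellite time series: linear interpolation between the nearest clear observations. This is an assumption for illustration, not necessarily what the researchers did. The function name and the values are hypothetical.

```python
def fill_gaps_linear(values):
    """Fill None gaps by linear interpolation between the nearest valid neighbors.

    Gaps at the very start or end have only one neighbor, so they stay None.
    """
    filled = list(values)
    known = [i for i, v in enumerate(values) if v is not None]
    for left, right in zip(known, known[1:]):
        span = right - left
        for i in range(left + 1, right):
            frac = (i - left) / span
            filled[i] = values[left] + frac * (values[right] - values[left])
    return filled


# Illustrative reflectance-like values for one field over seven satellite passes.
# None marks passes that were lost to cloud cover (made-up numbers).
observations = [0.30, None, None, 0.24, None, 0.20, None]
gap_filled = fill_gaps_linear(observations)
```

Interpolation like this only works when the gaps are short compared with how fast the crop is changing. Long cloudy stretches are better treated as genuinely unknown.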
To solve this problem, the researchers found that there’s a secondary wavelength band that can be even more effective in distinguishing corn and soybean at very early stages of crop growth. Measuring the short-wave infrared (SWIR) wavelength reveals a clear difference between corn and soybean plants, because the SWIR wavelength measures the water content in plant leaves, which is very different in corn and soybean plants when they start growing. By building a deep learning neural network to analyze 15 years of satellite SWIR data at a 30 m resolution, the scientists were able to identify corn and soybean acreage with 95% accuracy by the end of July for each field - just about 2-3 months after planting and well before harvest.
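The general idea of classifying a field from its SWIR time series can be shown with a toy Python example. To be clear, this is not the deep neural network the researchers built. It swaps in a much simpler nearest-centroid classifier, and every number below is invented for illustration, so nothing here says which crop actually has the higher SWIR reflectance.

```python
from math import dist


def centroid(rows):
    """Average several equal-length time series, time step by time step."""
    return [sum(col) / len(col) for col in zip(*rows)]


def train_centroids(samples):
    """samples maps a crop label to a list of SWIR time series for labeled fields."""
    return {label: centroid(rows) for label, rows in samples.items()}


def classify(series, centroids):
    """Return the label whose average time series is closest (Euclidean distance)."""
    return min(centroids, key=lambda label: dist(series, centroids[label]))


# Hypothetical SWIR values at three early-season dates for fields with known crops.
# The numbers are placeholders, not measurements.
training_fields = {
    "corn": [[0.30, 0.22, 0.15], [0.32, 0.24, 0.17]],
    "soybean": [[0.30, 0.28, 0.25], [0.28, 0.26, 0.23]],
}
centroids = train_centroids(training_fields)

field_a = classify([0.31, 0.27, 0.23], centroids)
field_b = classify([0.29, 0.21, 0.15], centroids)
```

A deep network plays the same role as `classify` here, but it learns far richer patterns from many years of imagery than a simple average curve can capture.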
As you can see, this is a significant improvement over traditional methods and will aid policy makers, farmers and traders in making decisions and optimizing allocation of resources - which in turn results in economic benefits.
This kind of combination of data sources, machine learning and economic analyses is what makes data science in agriculture such an exciting field to be in, in terms of technical advancement, economic benefits and job creation!
The last couple of months have been interesting from a climate viewpoint - we’ve seen a record number of climate-related disasters around the globe - drought, floods, fires, heat waves... and it looks like this is probably going to be what our planet will look like in the near future. Add to that the COP26 conference that is scheduled for October 31st - and climate, sustainability and technology are front page news! So, let’s talk about one of the technologies in the news - artificial intelligence (AI) and its impact on climate, water, agriculture, energy, forestry, ecosystems and other sectors in clean technology. AI and its subset of tools - machine learning (ML), data science and statistics - are being touted as among the key technologies in solving the problems facing the planet today. And while these technologies are certainly powerful, applying them effectively to solve problems in clean tech is another issue altogether. AI has been used by scientists in different clean tech se
We're in the process of building a couple of fantastic new offerings that many folks in our community have asked for - so blog posts will be limited for a few months. Our jobs portal will still be updated regularly to make sure that all our members can keep up with what's happening in the sector. We can't wait to share what's happening at our end!
Will AI transform water, energy, agriculture, climate and all the other clean tech sectors? Can AI transform these sectors? Some version of these questions always gets asked at any meeting or conference in clean technology. Of course, part of that is because there’s been so much hype around AI and the whole “software is eating the world” interviews that came out a couple of years ago. But part of it is also because these tools are so powerful that professionals working in these sectors can see the potential - but just aren’t sure if it’s applicable to their sector yet. So, let’s start by asking a couple of fundamental questions. Why do we need AI at all? Or any models for that matter? Models are used to understand the world - to estimate the impacts of changes in systems and to try and predict what will happen in the future. Typically, the approaches used in building models can be classified into three broad categories - physical or mechanistic approaches, statistical approaches and