If you had to pick a buzzword for 2019 in clean technology and data science, it would be “Smart Cities”!
This year, we’ve heard about Alphabet’s Sidewalk Labs and their efforts to design one in Toronto; India and China have announced plans to redesign over 100 cities into “Smart Cities”; European countries like Norway and Finland highlight the fact that much of what is touted as a “Smart City” already exists in their systems; and plenty of people in Silicon Valley have their own ideas of what Smart Cities should be like and what they should do.
A couple of features do stand out in many of the conferences and presentations about Smart Cities: 1) opportunities abound, with an estimated market size of $237 billion by 2025 according to one study, and 2) there’s a wide range of interpretations of what exactly makes a city “Smart”.
Simply put, the “Smart City” is envisioned as one where many of the services we currently use are easier to access and function more efficiently. This can happen at several levels. First, improve efficiency by increasing digitization of, and access to, existing records and services - which is what most cities are doing. Second, improve or retrofit existing systems using sensors and other connected devices so that cities can be made resilient to climate change, more energy efficient, less prone to flooding, with better transit options, more green space and other amenities. This is where we’re seeing some of the new IoT features like smart streetlights, efficient buildings and better engineering. Third, look at how existing systems can be completely transformed to meet the needs of the city. This is where we’re hearing about autonomous vehicles transforming the cityscape, drones for delivery, or virtual reality systems instead of shops and malls.
The first level is where many US, European and Asian cities are today. For example, many services - getting a license, paying taxes, accessing waste and other municipal services - can be handled online. Paper records have been digitized, it’s easier to access and track records, and payments can be made at the click of a button.
At the same time, think of all the data that has been collected in these systems! While many of the databases behind these systems are older, hold structured data and are likely siloed in different departments, they still represent a vast treasure trove of information that can help us understand how cities have changed and are changing, where the current fault lines are in terms of services required, and how things can be improved. The data are spatially distributed - think of city maps with different services at different locations - and vary over time. So you can evaluate changes in city composition, services and finances over different time periods.
As a data scientist, this presents some fascinating problems - ranging from “how much of the data are valuable and available?” to “how can different data sets be modeled over space and time to answer questions about service improvements for people living in the city?”. Think of all the data science tools that can be deployed - from spatial statistics to graph theory and neural networks.
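As a rough sketch of that first kind of question, here is what an initial look at digitized city records might involve. The table of service requests below is entirely made up for illustration - the neighborhood names, services and dates are invented, and a real city data set would be far larger and messier:

```python
import pandas as pd

# Hypothetical 311-style service requests: each row is one request,
# tagged with a neighborhood and the date it was opened.
requests = pd.DataFrame({
    "neighborhood": ["Downtown", "Downtown", "Riverside", "Riverside", "Downtown"],
    "service": ["pothole", "streetlight", "pothole", "flooding", "flooding"],
    "opened": pd.to_datetime([
        "2018-03-01", "2018-07-15", "2019-01-20", "2019-04-02", "2019-06-30",
    ]),
})

# Count requests per neighborhood per year: a first look at how demand
# for services is distributed over space and time.
counts = (
    requests
    .assign(year=requests["opened"].dt.year)
    .groupby(["neighborhood", "year"])
    .size()
    .unstack(fill_value=0)
)
print(counts)
```

Even this trivial aggregation is a space-by-time view of the city: rows are places, columns are periods, and the patterns (or gaps) in the counts suggest where to dig deeper.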
And this is before the second level of “Smart”! Once you add connected devices and sensors, you get data that are collected much more frequently and at a much finer spatial resolution, and that opens up a whole other toolbox. That allows people to tackle harder problems - for example, in a water-scarce region, where is most of the city’s water going, and how can access to water be improved? Or how can buildings be made more efficient so that they use less energy while still providing the same services?
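To make the finer-grained data concrete, here is a minimal sketch using entirely synthetic readings from hypothetical smart water meters (the zones, intervals and usage numbers are all invented). The point is the kind of question sensor-level data can answer - peak demand by hour - which coarse monthly billing data simply cannot:

```python
import numpy as np
import pandas as pd

# Synthetic smart-meter readings: litres used per 15-minute interval
# for two metered zones of a city, over one day.
idx = pd.date_range("2019-07-01", periods=96, freq="15min")
rng = np.random.default_rng(0)
usage = pd.DataFrame({
    "residential": rng.uniform(5, 15, size=96),
    "industrial": rng.uniform(20, 40, size=96),
}, index=idx)

# Aggregate the fine-grained readings to hourly totals, then find the
# peak-demand hour per zone.
hourly = usage.resample("1h").sum()
peak_hours = hourly.idxmax()
print(peak_hours)
```

With real meters reporting every few minutes across thousands of connections, the same resample-and-aggregate pattern scales up to questions like where the city’s water is actually going.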
But building these kinds of systems raises its own concerns and issues.
For example, let’s look at smart devices like the Nest or Ring video doorbells. These devices replace the physical doorbell with a camera, smart lock and doorbell. The data collected here are temporally very dense (every few minutes) and, depending on how many devices are located in an area, can also have very fine spatial coverage. As an individual homeowner, it’s great because you can track and monitor what’s happening at your house at all times.
But what happens when the data are stored and made accessible to police and government agencies? You suddenly have a lot of information about how people use the neighbourhood - who walks by, what the regular routines are, what the typical composition of these communities is. In emergencies like fires, floods or disasters, this type of information can be invaluable - you can use it to make sure that the elderly and children are taken care of, figure out which areas are most vulnerable and direct resources there, and optimize your response.
However, this is also where you start seeing biases and other human failings come into play. Algorithms are, after all, built by people, and are only as good as the data that are collected. So, if the data indicate that people from a certain community are arrested more frequently, the algorithm will call for greater policing based on that fact - regardless of why that particular fact occurred. This means that human bias gets coded into the system, unless the data scientists doing the work are aware of this and make efforts to avoid such model failures. That could mean anything from building a model that ignores community as a feature, to creating something that compensates for the bias seen in the data.
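A toy sketch of that failure mode, with fully synthetic numbers, shows how a naive score inherits the bias in the data - and how one simple (and admittedly imperfect) correction might look. Everything here is invented for illustration; real bias auditing is far more involved:

```python
import pandas as pd

# Entirely synthetic scenario: community A has historically been
# patrolled twice as heavily, so twice as many incidents were
# *recorded* there, even if underlying behaviour is identical.
incidents = pd.DataFrame({
    "community": ["A"] * 40 + ["B"] * 20,
    "patrol_hours": [200] * 40 + [100] * 20,
})

# A naive "risk score" based on raw recorded counts simply mirrors
# the policing effort: A appears twice as risky as B.
counts = incidents.groupby("community").size()

# Normalising by patrol hours - one possible, imperfect correction -
# removes the artefact: both communities have the same rate.
hours = incidents.groupby("community")["patrol_hours"].first()
per_hour = counts / hours
print(counts.to_dict(), per_hour.to_dict())
```

Note that simply dropping the community column would not be enough in practice, since other features can act as proxies for it - which is why awareness of the data-collection process matters as much as the modeling.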
Understandably, this is where many cities start grappling with issues of data consolidation, privacy and security. This is where questions like “who decides what people in the city want?”, “how much should be budgeted?”, “what happens to citizens’ data and expectations of privacy?”, and “who benefits, and what happens to the most vulnerable citizens?” keep coming up. Cities, after all, are not sandboxes for companies to play in - they are where people live, work and form communities. And understanding how those communities function, what drives them and what role technology should play are questions that people need to work together to solve - not have imposed on them.