Data and Smart Cities

If you had to pick a buzzword for 2019 in clean technology and data science, it would be “Smart Cities”!


This year, we’ve heard about Alphabet’s Sidewalk Labs and their efforts to design one in Toronto; India and China have announced plans to redesign over 100 cities into “Smart Cities”; European countries like Norway and Finland highlight the fact that much of what is touted as a “Smart City” already exists in their systems; and plenty of people in Silicon Valley have their own ideas of what Smart Cities should be like and what they should do. 


Two features stand out in many of the conferences and presentations about Smart Cities: 1) opportunities abound, with an estimated market size of $237 billion by 2025 according to one study, and 2) there’s a wide range of interpretations of what exactly makes a city “Smart”.


The most conservative definition, and the one that governments and city organizations highlight, is “where traditional networks and services are made more efficient with the use of digital and telecommunication technologies for the benefit of its inhabitants and business”. The one that venture capitalists and technologists tout is “the intersection of digital technology, disruptive innovation and urban environments” or the “21st century neighbourhood”.


But what do all these terms mean?


Simply put, the “Smart City” is envisioned to be one where many of the services that we currently use are easier to use and function more efficiently. This can be done at several levels. The first focuses on improving efficiency by increasing digitization and access to existing records and services - which is what most cities are doing. The second looks at improving or retrofitting existing systems using sensors and other connected devices so that cities can be made more resilient to climate change, more energy efficient and less prone to flooding, with better transit options, more green space and other amenities. This is where we’re seeing some of the new IoT features like smart streetlights, efficient buildings and better engineering. The third looks at how existing systems can be completely transformed in order to meet the requirements of the city. This is where we’re hearing about autonomous vehicles transforming the cityscape, drones for delivery, or virtual reality systems instead of shops and malls.


The first level is where many US, European and Asian cities are today. For example, many services such as getting a license, paying taxes, and accessing waste and other municipal services can be done online. Paper records have been digitized, records are easier to access and follow, and payments can be made at the click of a button.


At the same time, think of all the data that have been collected in these systems! While many of the databases behind these systems are older, hold structured data and are likely siloed in different departments, they still represent a vast treasure trove of information that can help us understand how cities have changed and are changing, where the current fault lines are in terms of services required and how things can be improved. The data are spatially distributed - think of city maps with different services at different locations - and vary over time. So you can evaluate changes in city composition, services and finances over different time periods.
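To make the idea of data that vary over space and time a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the district names, the years, the request counts and the function names are hypothetical and do not come from any real city database.

```python
from collections import defaultdict

# Hypothetical records: (district, year, number of service requests).
# The values are invented purely to illustrate the idea.
RECORDS = [
    ("north", 2017, 120),
    ("north", 2018, 150),
    ("south", 2017, 200),
    ("south", 2018, 180),
]


def totals_by_district_and_year(records):
    """Sum request counts for each (district, year) pair."""
    totals = defaultdict(int)
    for district, year, count in records:
        totals[(district, year)] += count
    return dict(totals)


def change_between(records, start_year, end_year):
    """Return the change in requests per district between two years."""
    totals = totals_by_district_and_year(records)
    districts = sorted({district for district, _, _ in records})
    return {
        district: totals.get((district, end_year), 0)
        - totals.get((district, start_year), 0)
        for district in districts
    }


if __name__ == "__main__":
    print(change_between(RECORDS, 2017, 2018))
```

In practice the records would come from several departmental databases, and much of the work would be in joining them on a shared notion of location and time before any comparison like this is possible.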


For a data scientist, this presents some fascinating problems - from “how much of the data are valuable and available?” to “how can different data sets be modeled over space and time to answer a question about a service improvement for people living in the city?”. Think of all the data science tools that can be deployed - from spatial statistics to graph theory and neural networks.


And this is before the second level of “Smart”! Once you add connected devices and sensors, you get data that are collected much more frequently and at a much finer spatial resolution, and that opens up a whole other toolbox. That allows people to look at solving harder problems - for example, in a water-scarce region, where is most of the city’s water going and how can access to water be improved? Or how can buildings be made more efficient so that they use less energy while still providing services?
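As a rough illustration of the “where is most of the city’s water going?” question, the sketch below adds up flow-meter readings by zone and reports each zone’s share of the total. The zones, hours and litre values are hypothetical, and a real deployment would deal with far messier data (missing readings, leaks, meters that disagree).

```python
from collections import defaultdict

# Hypothetical flow-meter readings: (zone, hour of day, litres).
# All values are made up for illustration.
READINGS = [
    ("industrial", 0, 400.0),
    ("industrial", 1, 600.0),
    ("residential", 0, 250.0),
    ("residential", 1, 250.0),
    ("parks", 0, 300.0),
    ("parks", 1, 200.0),
]


def usage_share_by_zone(readings):
    """Return each zone's fraction of total water use."""
    totals = defaultdict(float)
    for zone, _hour, litres in readings:
        totals[zone] += litres
    grand_total = sum(totals.values())
    if grand_total == 0:
        # Avoid dividing by zero when no water use was recorded.
        return {zone: 0.0 for zone in totals}
    return {zone: litres / grand_total for zone, litres in totals.items()}


if __name__ == "__main__":
    shares = usage_share_by_zone(READINGS)
    # List zones from the largest share to the smallest.
    for zone, share in sorted(shares.items(), key=lambda item: -item[1]):
        print(f"{zone}: {share:.0%}")
```

The same grouping could be done by hour instead of by zone to see when demand peaks, which is the kind of question that only becomes answerable once sensors report frequently.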


But building these kinds of systems raises its own concerns and issues.


For example, let’s look at smart devices like the Nest front doorbell or Ring’s doorbell. These devices replace the physical doorbell with a camera, smart lock and doorbell. The data collected here are temporally very dense (every few minutes) and, depending on how many devices are located in an area, can also have very fine spatial coverage. For an individual homeowner, that’s great because you can track and monitor what’s happening at your house at all times.


But what happens when the data are stored and accessible to police and government agencies? You suddenly have a lot of information about how people use the neighbourhood - who walks by, what the regular routines are, what the typical composition of these communities is. In emergencies like fires, floods or other disasters, this type of information can be invaluable - you can use it to make sure that the elderly and children are taken care of, figure out which areas are most vulnerable and point resources there, and optimize your response.

However, this is also where you start seeing biases and other human failings come into play. Algorithms are, after all, built by people, and are only as good as the data that are collected. So, if the data indicate that people from a certain community are arrested more frequently, the algorithm will ask for greater policing based on that fact - regardless of why that particular fact occurred. This means that human bias gets coded into the system, unless the data scientists doing the work are aware of this and make efforts to avoid such model failures. That could mean anything from building a model that ignores community as a feature to creating something that compensates for the bias seen in the data.
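Here is a toy, deterministic sketch of how that kind of bias can get locked in. It is not a model of any real policing system: the two communities, the arrest counts, the rates and the allocation rule are all hypothetical. Patrols are assigned in proportion to recorded arrests, and new recorded arrests depend on how many patrols are present, even though both communities are given the same underlying rate.

```python
def simulate_patrols(initial_arrests, true_rate, total_patrols, rounds):
    """Allocate patrols in proportion to recorded arrests, round after round.

    initial_arrests: recorded arrests per community (assumed positive).
    true_rate: arrests recorded per patrol in each community.
    Returns the patrol allocation used in each round.
    """
    arrests = dict(initial_arrests)
    history = []
    for _ in range(rounds):
        total = sum(arrests.values())
        # The "algorithm": more recorded arrests means more patrols.
        patrols = {c: total_patrols * n / total for c, n in arrests.items()}
        history.append(patrols)
        # New records depend on where the patrols were sent.
        arrests = {c: arrests[c] + patrols[c] * true_rate[c] for c in arrests}
    return history


if __name__ == "__main__":
    history = simulate_patrols(
        initial_arrests={"A": 60, "B": 40},  # historical disparity
        true_rate={"A": 0.1, "B": 0.1},      # identical underlying rates
        total_patrols=100,
        rounds=5,
    )
    for round_number, patrols in enumerate(history, start=1):
        print(round_number, {c: round(p, 1) for c, p in patrols.items()})
```

With these made-up numbers, the split between the two communities never moves away from the historical 60/40, even though the underlying rates are identical. The data keep confirming the allocation that produced them, which is why someone has to question where the original numbers came from.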


Understandably, this is where many cities start grappling with issues of data consolidation, privacy and security. This is where questions like “who decides what people in the city want?”, “how much should be budgeted?”, “what happens to citizens’ data and expectations of privacy?”, and “who benefits and what happens to the most vulnerable citizens?” have been coming up. Cities, after all, are not sandboxes for companies to play in - they are where people live, work and form communities. And understanding how those communities function, what drives them and what role technology should play are questions that people need to work together to solve - not have imposed on them.

