In the last few posts, we’ve talked about the types of careers and skills needed to become a clean tech data scientist. The field is expanding rapidly right now, with openings in almost every clean tech sector and across a wide range of organizations. Just check out our jobs portal to see the range of positions available right now - and these are just the ones we’ve highlighted!
But how do you become a clean tech data scientist?
Since this is a relatively new field, there really haven’t been degrees or courses so far with titles like “Masters in Environmental Data Science” or “PhD in Data Science and Clean Technology”! Of course, this is changing, with a few universities beginning to offer programs in earth systems and data science or clean technology and data science, but these are still few in number and have started only in the last couple of years.
If you’re starting your career and are interested in this field, the universities that offer specific coursework in this field at this time are
But what if these specific university options are not available to you?
Let’s take a look at what most people with a background in this field did. They probably did a Masters or PhD in a traditional engineering or earth sciences curriculum and, as part of that, did research on a problem that required data science. Most universities around the world have faculty doing research on problems that need this expertise: building a machine learning model, getting data from satellite systems, designing field experiments and creating statistical models to understand the results, or building sensors and robots to collect data. Going this route means that in addition to your knowledge of the earth system, you learn how to code (probably in R, Python and Fortran), get the basics of how machine learning works, and learn how to present the results of your research so that people who are not familiar with the details of your specific problem can still understand them.
The missing pieces in becoming a full-fledged clean tech data scientist are the ones that are typically not taught in universities, and these gaps are common among entry-level data scientists in all fields. These include both technical aspects and business/management aspects, such as:
How good is your data? What do you need to do to make sure that high-quality data is fed into your model or, if that isn’t possible, how can you modify your model and assumptions to account for it?
How do you make sure your machine learning model or statistical model will function correctly as the problem scales?
How do you create the data storage system and pipeline to store and access data so that it can be robust and efficient? If you’re not the creator, do you know how to access the data?
How can you make your code modular, efficient and scalable? What tests do you need to write into your code to ensure this?
Can you present your results and your code to other teams (software, finance, operations, legal) so that your work can be used across the organization? What do you need to understand about the organization and the teams you are working with?
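To make the data quality and testing questions above a little more concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the function name `check_readings`, the idea of a plausible range, and the sample temperature values are all hypothetical and not taken from any particular project or library. It shows one small, modular check that separates usable sensor readings from missing or implausible ones, along with a simple test for it.

```python
import math


def check_readings(readings, low, high):
    """Split readings into usable values and a list of (index, problem) pairs.

    `low` and `high` are an assumed plausible range for the sensor;
    real thresholds depend on the instrument and the site.
    """
    clean, problems = [], []
    for i, value in enumerate(readings):
        # Treat None and NaN as missing data.
        if value is None or (isinstance(value, float) and math.isnan(value)):
            problems.append((i, "missing"))
        # Flag values outside the plausible range instead of silently using them.
        elif not low <= value <= high:
            problems.append((i, "out of range"))
        else:
            clean.append(value)
    return clean, problems


def test_check_readings():
    # Hypothetical air temperature readings in degrees Celsius.
    clean, problems = check_readings([12.5, None, 480.0, 14.1], low=-40.0, high=60.0)
    assert clean == [12.5, 14.1]
    assert problems == [(1, "missing"), (2, "out of range")]


test_check_readings()
```

Keeping a check like this in its own function means it can be reused in a data pipeline and tested separately from the model, and the list of problems gives you something to show other teams when they ask how far the data can be trusted.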
If additional technical skills are needed, many people also take courses in these specific technologies to augment their skill set. Courses on coding in Python or SQL, algorithms for efficient coding, and introductory machine learning are all widely available on platforms like Coursera, Udemy and edX, and in bootcamps. People may also do challenges on Kaggle to apply their new skills to relatively large datasets. All of these are great ways to get working on real data and real problems and to develop a deeper understanding of how to use new skills.
The challenge, though, is that these courses and datasets are typically focused on problems in other sectors, like social media or finance, where the data and problems are different from those faced by people working in the clean tech sector. Figuring out which courses are immediately useful, which ones provide skills that will be useful in the future, and how to make them work for your clean tech sector becomes a full-time job in itself!
We’ll be talking more about the tools and skills needed for the clean tech data scientist in our blog here this month, but in the meantime check out our free plan if you want to get started!
Our online community space is now open to anyone who has signed up for a free or paid course on our website! In addition to everyone who signed up for our cohort-based courses, we're now expanding it to all the members of our community. If you've already signed up for any of our courses, check your email for the invitation to the space. It's where we'll get together to talk about all things data science and clean technology, discuss the latest research, network and make connections with other professionals in the sector. It's an invitation-only space, with no bots and no trolls allowed - so come on over! Here's where you can check out our courses and join our community!