Coding, Databases, GIS and other tools for a clean tech data scientist
As we saw in the last post, a data scientist's role requires the ability to capture, process, analyze and visualize data. While some off-the-shelf software tools exist, most applications in the clean tech and data science space require knowledge of a programming language to perform many of these tasks effectively. The popular choices for a clean tech data scientist are:

1. Python: Python is probably the single most critical element in the data scientist’s toolkit. It’s a flexible, easily learnt programming language that is powerful because of the large stack of libraries that have been developed for it. Do you need to figure out how to get data from a website, or train a machine learning algorithm? The chances are that there is an existing Python library that can be plugged into your code. The main libraries needed for almost any data science use case are scipy, numpy, statsmodels and pandas. These can be used for building a predictive model using