Software Review

Applications of Python to evaluate environmental data science problems

Journal

ENVIRONMENTAL PROGRESS & SUSTAINABLE ENERGY
Volume 36, Issue 6, Pages 1580-1586

Publisher

WILEY
DOI: 10.1002/ep.12786

Keywords

anaconda; python; pandas; scikit-learn; spyder; data science; indoor air quality; in-bus air contaminants; biodiesel; public transportation buses

Funding

  1. United States Department of Transportation
  2. Toledo Area Regional Transit Authority (TARTA)

The research community shows strong, converging interest in developing and applying software resources that can run the relevant mathematical algorithms and deliver scalable results for data science problems. Anaconda is one of many open source platforms that facilitate the use of open source programming languages (R, Python) for large-scale data processing, predictive analytics, and scientific computing. Environmental researchers may adopt either R or Python to analyze data science problems on the Anaconda platform. This study demonstrates the application of Scikit-learn (a Python machine learning library) on the Anaconda platform to analyze in-bus carbon dioxide concentrations by (i) importing the data into Spyder (Python 3.6) in Anaconda, (ii) performing an exploratory data analysis, (iii) performing dimensionality reduction through RandomForestRegressor feature selection, (iv) developing statistical regression models, and (v) generating regression decision tree models with DecisionTreeRegressor. Readers may adopt the methods discussed in this article, including the Python code, to address their own data science problems. (c) 2017 American Institute of Chemical Engineers Environ Prog, 36: 1580-1586, 2017
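The five steps listed in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' code. The data are synthetic, and the column names (`passengers`, `temperature_c`, `humidity_pct`, `noise`, `co2_ppm`), the choice to keep the top two features, and the model settings are all assumptions made for the example. With real measurements, the synthetic table would be replaced by something like `pd.read_csv(...)` on the reader's own file.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# (i) Import the data. A synthetic stand-in is built here (hypothetical
# columns); real data would normally be loaded from a file instead.
rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "passengers": rng.integers(0, 60, n),
    "temperature_c": rng.normal(22, 3, n),
    "humidity_pct": rng.uniform(20, 80, n),
    "noise": rng.normal(0, 1, n),  # deliberately uninformative column
})
# Assumed relationship, for illustration only.
df["co2_ppm"] = (400 + 15 * df["passengers"] + 2 * df["humidity_pct"]
                 + rng.normal(0, 20, n))

# (ii) Exploratory data analysis: summary statistics and correlations.
summary = df.describe()
correlations = df.corr()["co2_ppm"].drop("co2_ppm")

# (iii) Rank features with a random forest and keep the strongest ones.
X = df.drop(columns="co2_ppm")
y = df["co2_ppm"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)
forest = RandomForestRegressor(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)
importances = pd.Series(forest.feature_importances_, index=X.columns)
importances = importances.sort_values(ascending=False)
selected = importances.index[:2].tolist()  # top-2 cutoff is arbitrary

# (iv) Fit a statistical (linear) regression on the selected features.
linear = LinearRegression().fit(X_train[selected], y_train)
linear_r2 = linear.score(X_test[selected], y_test)

# (v) Fit a regression decision tree on the same features.
tree = DecisionTreeRegressor(max_depth=4, random_state=0)
tree.fit(X_train[selected], y_train)
tree_r2 = tree.score(X_test[selected], y_test)

print(summary)
print(correlations)
print(importances)
print(f"linear R^2: {linear_r2:.3f}, tree R^2: {tree_r2:.3f}")
```

Holding out a test split before ranking features keeps the reported R^2 values from being inflated by the selection step. How many features to keep is a judgment call that depends on the data set.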
