Journal
ENVIRONMENTAL PROGRESS & SUSTAINABLE ENERGY
Volume 36, Issue 6, Pages 1580-1586
Publisher
WILEY
DOI: 10.1002/ep.12786
Keywords
anaconda; python; pandas; scikit-learn; spyder; data science; indoor air quality; in-bus air contaminants; biodiesel; public transportation buses
Funding
- United States Department of Transportation
- Toledo Area Regional Transit Authority (TARTA)
Research communities are converging on the development and application of software resources that can handle the relevant mathematical algorithms and provide scalable information for solving data science problems. Anaconda is one of many open-source platforms that facilitate the use of open-source programming languages (R, Python) for large-scale data processing, predictive analytics, and scientific computing. The environmental research community may adopt either R or Python for analyzing data science problems on the Anaconda platform. This study demonstrated the use of Scikit-learn (a Python machine learning library) on the Anaconda platform to analyze in-bus carbon dioxide concentrations by (i) importing the data into Spyder (Python 3.6) in Anaconda, (ii) performing an exploratory data analysis, (iii) performing dimensionality reduction through RandomForestRegressor feature selection, (iv) developing statistical regression models, and (v) generating regression decision tree models with DecisionTreeRegressor. Readers may adopt the methods (including the Python code) discussed in this article to address their own data science problems. (c) 2017 American Institute of Chemical Engineers Environ Prog, 36: 1580-1586, 2017
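The five steps listed in the abstract can be sketched roughly as follows. This is a minimal illustration and not the article's code. The column names (`passengers`, `temperature_c`, `humidity_pct`, `door_open_frac`, `co2_ppm`), the synthetic data generator, and the helper names `make_synthetic_bus_data` and `run_workflow` are all assumptions made for the example. With real measurements, the generator would presumably be replaced by a call such as `pandas.read_csv` on the study's data file.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor


def make_synthetic_bus_data(n_rows=300, seed=0):
    """Stand-in for step (i): build a hypothetical in-bus data set.

    All columns and coefficients are invented for illustration only.
    """
    rng = np.random.default_rng(seed)
    df = pd.DataFrame({
        "passengers": rng.integers(0, 40, n_rows),
        "temperature_c": rng.normal(22, 3, n_rows),
        "humidity_pct": rng.uniform(20, 80, n_rows),
        "door_open_frac": rng.uniform(0, 1, n_rows),
    })
    # Assumed relationship: CO2 rises with occupancy, falls with ventilation.
    df["co2_ppm"] = (
        400
        + 25 * df["passengers"]
        - 150 * df["door_open_frac"]
        + rng.normal(0, 20, n_rows)
    )
    return df


def run_workflow(df, target="co2_ppm", n_keep=2, seed=0):
    """Steps (ii) to (v): explore, select features, fit two model types."""
    # (ii) Exploratory data analysis: summary statistics per column.
    summary = df.describe()

    X = df.drop(columns=[target])
    y = df[target]

    # (iii) Rank features by random forest importance and keep the top n_keep.
    forest = RandomForestRegressor(n_estimators=100, random_state=seed)
    forest.fit(X, y)
    importances = pd.Series(
        forest.feature_importances_, index=X.columns
    ).sort_values(ascending=False)
    selected = list(importances.index[:n_keep])

    # (iv) Statistical regression model on the selected features.
    linear = LinearRegression().fit(X[selected], y)

    # (v) Regression decision tree on the same features.
    tree = DecisionTreeRegressor(max_depth=3, random_state=seed)
    tree.fit(X[selected], y)

    return {
        "summary": summary,
        "importances": importances,
        "selected": selected,
        "linear_r2": linear.score(X[selected], y),  # in-sample R^2
        "tree_r2": tree.score(X[selected], y),      # in-sample R^2
    }


if __name__ == "__main__":
    result = run_workflow(make_synthetic_bus_data())
    print(result["importances"])
    print("Selected features:", result["selected"])
```

The scores returned here are in-sample values. A real analysis would likely hold out a test set, for example with `sklearn.model_selection.train_test_split`, before comparing the regression and decision tree models.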