Article

Predicting the structural condition of individual sanitary sewer pipes with random forests

Journal

CANADIAN JOURNAL OF CIVIL ENGINEERING
Volume 41, Issue 4, Pages 294-303

Publisher

CANADIAN SCIENCE PUBLISHING
DOI: 10.1139/cjce-2013-0431

Keywords

data mining; inspection; management; pipe; random forests; sewer; wastewater

Funding

  1. University of Guelph
  2. Natural Sciences and Engineering Research Council of Canada
  3. Canada Research Chairs program

Closed-circuit television inspections of sewer condition deterioration as required for proactive management are expensive and hence limited to portions of a sewer network. The data mining approach presented herein is shown capable of unlocking information contained within inspection records and enhances existing pipe inspection practices currently used in the wastewater industry. Predictive models developed using the random forests algorithm are found capable of predicting individual sewer pipe condition so that uninspected pipes in a sewer network with the greatest likelihood of being in a structurally defective condition state are identified for future rounds of inspection. Complications posed by imbalance between classes common within inspection datasets are overcome by first establishing the classification task in a binary format (where pipes are in either good or bad structural condition) and then using the receiver-operating characteristic (ROC) curve to establish alternative cutoffs for the predicted class probability. The random forests algorithm achieved a stratified test set false negative rate of 18%, false positive rate of 27% and an excellent area under the ROC curve of 0.81 in a case study application to the City of Guelph, Ontario, Canada. The novel inclusion of condition information of pipes attached at either the upstream or downstream manholes of an individual pipe enhances the predictive power for bad pipes representing the minority class of interest (reducing the false negative rate to 11%, reducing the false positive rate to 25% and increasing the area under the ROC curve to 0.85). An area under the ROC curve > 0.80 indicates random forests are an excellent choice for predicting the condition of individual pipes in a sewer network.
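
The cutoff-selection step described in the abstract can be illustrated with a minimal sketch. It assumes a binary task (1 = bad structural condition, 0 = good) and a predicted probability of "bad" for each pipe, such as the fraction of trees in a random forest voting "bad". The abstract does not state which criterion was used to pick the alternative cutoff, so Youden's J statistic (true positive rate minus false positive rate) is used here purely as an assumption. The function names and the ten data points are hypothetical and are not taken from the Guelph case study.

```python
from typing import List, Tuple


def rates_at_cutoff(y_true: List[int], p_bad: List[float], cutoff: float) -> Tuple[float, float]:
    """Return (false negative rate, false positive rate) when every pipe
    with p_bad >= cutoff is labelled bad (1) and all others good (0)."""
    fn = sum(1 for y, p in zip(y_true, p_bad) if y == 1 and p < cutoff)
    fp = sum(1 for y, p in zip(y_true, p_bad) if y == 0 and p >= cutoff)
    n_bad = sum(y_true)
    n_good = len(y_true) - n_bad
    return fn / n_bad, fp / n_good


def choose_cutoff(y_true: List[int], p_bad: List[float]) -> float:
    """Scan every observed probability as a candidate cutoff and keep the one
    that maximises Youden's J = TPR - FPR (an assumed criterion)."""
    best_cutoff, best_j = 0.5, -1.0
    for c in sorted(set(p_bad)):
        fnr, fpr = rates_at_cutoff(y_true, p_bad, c)
        j = (1.0 - fnr) - fpr
        if j > best_j:
            best_cutoff, best_j = c, j
    return best_cutoff


def auc(y_true: List[int], p_bad: List[float]) -> float:
    """Area under the ROC curve, computed as the probability that a randomly
    chosen bad pipe scores higher than a randomly chosen good pipe
    (ties count as one half)."""
    bad = [p for y, p in zip(y_true, p_bad) if y == 1]
    good = [p for y, p in zip(y_true, p_bad) if y == 0]
    wins = sum(1.0 if b > g else 0.5 if b == g else 0.0 for b in bad for g in good)
    return wins / (len(bad) * len(good))


# Hypothetical, imbalanced toy data: 8 good pipes, 2 bad pipes.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
p_bad = [0.05, 0.10, 0.10, 0.15, 0.20, 0.25, 0.30, 0.45, 0.35, 0.60]

default_rates = rates_at_cutoff(y_true, p_bad, 0.5)    # default majority-vote cutoff
cutoff = choose_cutoff(y_true, p_bad)                  # ROC-based alternative cutoff
adjusted_rates = rates_at_cutoff(y_true, p_bad, cutoff)
area = auc(y_true, p_bad)
```

On this toy data the default cutoff of 0.5 misses one of the two bad pipes, while the lower ROC-derived cutoff catches both at the cost of one false positive. That is the trade-off the abstract describes for the minority class of bad pipes.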
