Hydrology and Earth System Sciences An interactive open-access journal of the European Geosciences Union
Preprints
https://doi.org/10.5194/hess-2020-305
© Author(s) 2020. This work is distributed under
the Creative Commons Attribution 4.0 License.

Submitted as: research article 23 Jun 2020

Review status
This preprint is currently under review for the journal HESS.

Evaluation of Random Forest for short-term daily streamflow forecast in rainfall and snowmelt driven watersheds

Leo T. Pham1, Lifeng Luo2, and Andrew O. Finley1,2
  • 1Department of Forestry, Michigan State University, East Lansing, Michigan, USA
  • 2Department of Geography, Environment, and Spatial Sciences, Michigan State University, East Lansing, Michigan, USA

Abstract. In recent decades, data-driven machine learning (ML) models have emerged as promising tools for short-term streamflow forecasting. Among other qualities, the popularity of ML for such applications is due to the methods' competitive performance compared with alternative approaches, ease of application, and relative lack of strict distributional assumptions. Despite the encouraging results, most applications of ML to streamflow forecasting have been limited to watersheds where rainfall is the major source of runoff. In this study, we evaluate the potential of Random Forest (RF), a popular ML method, to make streamflow forecasts at a 1-day lead time at 86 watersheds in the Pacific Northwest. These watersheds span a range of climatic conditions and physiographic settings and exhibit varied contributions of rainfall and snowmelt to their streamflow. Watersheds are classified into three hydrologic regimes (rainfall-dominated, transient, and snowmelt-dominated) based on the timing of the center of annual flow volume. RF performance is benchmarked against Naive and multiple linear regression (MLR) models and evaluated using four metrics: coefficient of determination, root mean squared error, mean absolute error, and Kling–Gupta efficiency (KGE). Model evaluation metrics suggest that RF performs better in snowmelt-driven watersheds. The largest improvements in forecasts, compared to benchmark models, are found among rainfall-driven watersheds. We obtain KGE scores in the range of 0.62–0.99. RF performance deteriorates with increasing catchment slope and increasing soil sandiness. We note disagreement between two popular measures of RF variable importance and recommend considering these measures jointly with the physical processes under study. These and other results provide new insights for the effective application of RF-based streamflow forecasting.
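The four evaluation metrics named in the abstract can be sketched in a few lines of NumPy. This is an illustrative sketch only, not the authors' code: the function names are hypothetical, the coefficient of determination is assumed here to be the squared Pearson correlation, KGE is assumed to follow the commonly used form 1 - sqrt((r - 1)^2 + (alpha - 1)^2 + (beta - 1)^2) with alpha the ratio of standard deviations and beta the ratio of means, and the Naive benchmark is assumed to be a persistence forecast (tomorrow's flow equals today's), which the abstract does not spell out.

```python
import numpy as np


def rmse(obs, sim):
    """Root mean squared error between observed and simulated flow."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return float(np.sqrt(np.mean((sim - obs) ** 2)))


def mae(obs, sim):
    """Mean absolute error between observed and simulated flow."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return float(np.mean(np.abs(sim - obs)))


def r_squared(obs, sim):
    """Coefficient of determination, assumed here to be squared Pearson r."""
    r = np.corrcoef(np.asarray(obs, float), np.asarray(sim, float))[0, 1]
    return float(r ** 2)


def kge(obs, sim):
    """Kling-Gupta efficiency (assumed form; 1.0 is a perfect score)."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    r = np.corrcoef(obs, sim)[0, 1]        # linear correlation
    alpha = np.std(sim) / np.std(obs)      # variability ratio
    beta = np.mean(sim) / np.mean(obs)     # bias ratio
    return float(1.0 - np.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2))


def naive_forecast(flow):
    """Assumed persistence benchmark: the forecast for day t+1 is the flow on day t.

    Returns forecasts aligned with flow[1:].
    """
    flow = np.asarray(flow, float)
    return flow[:-1]


if __name__ == "__main__":
    # Made-up daily flows, for illustration only.
    flow = np.array([10.0, 12.0, 15.0, 14.0, 11.0, 9.0])
    obs = flow[1:]
    sim = naive_forecast(flow)
    print("RMSE:", rmse(obs, sim))
    print("MAE: ", mae(obs, sim))
    print("R2:  ", r_squared(obs, sim))
    print("KGE: ", kge(obs, sim))
```

Scoring an RF or MLR forecast with the same functions and comparing against the persistence scores gives the kind of benchmark comparison the abstract describes, under the assumptions stated above.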

Interactive discussion

Status: open (until 19 Aug 2020)

Viewed

Total article views: 153 (including HTML, PDF, and XML)
  • HTML: 116
  • PDF: 35
  • XML: 2
  • Total: 153
  • Supplement: 9
  • BibTeX: 1
  • EndNote: 1
Views and downloads (calculated since 23 Jun 2020)

Viewed (geographical distribution)

Total article views: 56 (including HTML, PDF, and XML) Thereof 56 with geography defined and 0 with unknown origin.

Latest update: 13 Jul 2020
Short summary
Model evaluation metrics suggest that RF performs better in snowmelt-driven watersheds. The largest improvements in forecasts, compared to benchmark models, are found among rainfall-driven watersheds. RF performance deteriorates with increasing catchment slope and increasing soil sandiness. We note disagreement between two popular measures of RF variable importance and recommend considering these measures jointly with the physical processes under study.