Journal metrics

  • IF: 4.256
  • IF 5-year: 4.819
  • CiteScore: 4.10
  • SNIP: 1.412
  • SJR: 2.023
  • IPP: 3.97
  • h5-index: 58
  • Scimago H index: 99
https://doi.org/10.5194/hess-2018-427
© Author(s) 2018. This work is distributed under
the Creative Commons Attribution 4.0 License.

Research article 28 Aug 2018

Review status
This discussion paper is a preprint. It is a manuscript under review for the journal Hydrology and Earth System Sciences (HESS).

Identifying rainfall-runoff events in discharge time series: A data-driven method based on Information Theory

Stephanie Thiesen, Paul Darscheid, and Uwe Ehret
  • Institute of Water Resources and River Basin Management, Karlsruhe Institute of Technology – KIT, Karlsruhe, Germany

Abstract. In this study, we propose a data-driven approach to automatically identify rainfall-runoff events in discharge time series. The core of the concept is to construct and apply discrete multivariate probability distributions to obtain probabilistic predictions of each time step being part of an event. The approach permits any data to serve as predictors, and it is non-parametric in the sense that it can handle any kind of relation between the predictor(s) and the target. Each choice of a particular predictor data set is equivalent to formulating a model hypothesis. Among competing models, the best is found by comparing their predictive power in a training data set with user-classified events. For evaluation, we use measures from Information Theory such as Shannon Entropy and Conditional Entropy to select the best predictors and models and, additionally, measure the risk of overfitting via Cross Entropy and Kullback–Leibler Divergence. As all these measures are expressed in bits, we can combine them to identify models with the best trade-off between predictive power and robustness given the available data.
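The four measures named in the abstract can be illustrated with a minimal sketch. This is not the authors' published code (see the Zenodo record under "Model code and software"); the function names, the toy data, and the three-bin discharge predictor are assumptions made purely for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy H(Y) in bits of a sequence of discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def conditional_entropy(predictor, labels):
    """H(Y|X) in bits: label entropy inside each predictor bin,
    weighted by the relative frequency of that bin."""
    n = len(labels)
    groups = {}
    for x, y in zip(predictor, labels):
        groups.setdefault(x, []).append(y)
    return sum(len(ys) / n * entropy(ys) for ys in groups.values())


def cross_entropy(p, q):
    """H(p, q) in bits for two discrete distributions over the same bins."""
    return -sum(pi * math.log2(qi) for pi, qi in zip(p, q) if pi > 0)


def kl_divergence(p, q):
    """D_KL(p || q) in bits; the extra cost of using q when p is true."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical toy data: binned discharge (0 = low, 1 = medium, 2 = high)
# and a user classification of each time step (1 = event, 0 = no event).
discharge_bin = [0, 0, 0, 0, 1, 1, 2, 2]
is_event = [0, 0, 0, 0, 0, 1, 1, 1]

h_prior = entropy(is_event)                             # uncertainty without a predictor
h_model = conditional_entropy(discharge_bin, is_event)  # uncertainty given the predictor

# Hypothetical distributions: p from a large reference sample,
# q estimated from a smaller training sample.
p = [0.5, 0.5]
q = [0.25, 0.75]
h_cross = cross_entropy(p, q)
d_kl = kl_divergence(p, q)
```

In this toy case only the medium-discharge bin is ambiguous, so `h_model` is 0.25 bits. Because cross entropy equals the entropy of `p` plus the divergence from `q`, all quantities share one unit and can be added or compared directly, which is the property the abstract relies on.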

We applied the method to data from the Dornbirnerach catchment in Austria, distinguishing three model types: models relying on discharge data, models using both discharge and precipitation data, and recursive models, i.e., models using their own predictions of a previous time step as an additional predictor. In the case study, the additional use of precipitation reduced predictive uncertainty only by a small amount, likely because the information provided by precipitation is already contained in the discharge data. More generally, we found that the robustness of a model dropped quickly as the number of predictors increased (an effect well known as the Curse of Dimensionality), such that, in the end, the best model was a recursive one applying four predictors (three standard and one recursive): discharge from two distinct time steps, the relative magnitude of discharge in a 65-hour time window, and event predictions from the previous time step. Applying the model reduced the uncertainty about event classification by 77.8%, decreasing Conditional Entropy from 0.516 to 0.114 bits.
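Two quantities in the paragraph above lend themselves to a quick numerical sketch: how fast a discrete joint distribution grows with the number of predictors, and the relative uncertainty reduction implied by the two reported entropies. The bin count and sample size below are hypothetical and not taken from the paper; only the entropies 0.516 and 0.114 bits come from the abstract.

```python
def joint_bins(bins_per_predictor, n_predictors):
    """Number of cells in a discrete joint distribution over the predictors."""
    return bins_per_predictor ** n_predictors


def relative_reduction(h_before, h_after):
    """Fraction of the initial uncertainty removed by the model."""
    return (h_before - h_after) / h_before


# Hypothetical setup: 10 bins per predictor and 50,000 time steps of data.
n_samples = 50_000
for k in range(1, 6):
    cells = joint_bins(10, k)
    print(f"{k} predictor(s): {cells} cells, "
          f"{n_samples / cells:.1f} samples per cell on average")

# Entropies as reported in the abstract (bits).
reduction = relative_reduction(0.516, 0.114)
print(f"relative uncertainty reduction: {reduction:.1%}")
```

With a fixed amount of data, the average number of samples per cell shrinks by a factor of ten for each added predictor in this setup, which is one way to see why robustness drops. The reduction computed from the rounded entropies comes out close to 78%, in line with the reported 77.8%; the small difference presumably stems from rounding of the published values.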

Given enough data, the potential of data-driven models lies in the way they learn and exploit relations between data, unconstrained by functional or parametric assumptions and choices. Beyond that, using these models to reproduce a hydrologist's way of identifying rainfall-runoff events is just one of many potential applications.

Interactive discussion
Status: open (until 23 Oct 2018)
Model code and software

Event Detection Method Based on Information Theory. S. Thiesen, P. Darscheid, and U. Ehret. https://doi.org/10.5281/zenodo.1404638

Viewed
Total article views: 390 (including HTML, PDF, and XML)
  • HTML: 300
  • PDF: 87
  • XML: 3
  • Total: 390
  • BibTeX: 5
  • EndNote: 7
Views and downloads calculated since 28 Aug 2018.
Viewed (geographical distribution)
Of the 390 total article views, 387 have a defined geography and 3 are of unknown origin.
Latest update: 24 Sep 2018
Short summary
We present a data-driven approach designed to exploit the full information content of data sets, reducing the information loss that comes with imposing equations or parametric assumptions. The evaluation is based on concepts from Information Theory, which provide an objective measure of information and uncertainty. The approach was applied to automatically identify rainfall-runoff events in discharge time series; however, it is generic enough to be adapted to other practical applications.