Development of a hydrological ensemble prediction system and a visualization approach for improved interpretation during typhoon events

Sheng-Chi Yang1, Tsun-Hua Yang1, Ya-Chi Chang1,*, Cheng-Hsin Chen1, Mei-Ying Lin1, Jui-Yi Ho1 and Kwan-Tun Lee1,2
1Taiwan Typhoon and Flood Research Institute (TTFRI), National Applied Research Laboratories (NARLabs), Taipei, Taiwan
2Department of River and Harbor Engineering, National Taiwan Ocean University, Keelung, Taiwan
*Correspondence to: 11 F, No. 97, Sec. 1, Roosevelt Rd., Zhongzheng Dist., Taipei City 10093, Taiwan (R.O.C.)
E-mail: rachel.ev91@gmail.com

ABSTRACT

Box' visualization methodology that assists in interpreting the forecast results for operational purposes. A small watershed with an area of 100 km² and four typhoons that occurred from 2012 to 2015 were selected to evaluate the performance of these tools.
The results showed that the modified visualization approach improved the intelligibility of forecasts of the peak stages and peak times compared to that of approaches previously described in the literature. The new approach includes all available forecasts to increase the sample size. The capture rate is greater than 50%, which is considered practical for decision makers. The proposed system and the modified visualization approach have demonstrated their potential for both decreasing the uncertainty of numerical rainfall forecasts and improving the performance of flood forecasts.
KEY WORDS Hydrological ensemble prediction system; peak flow; decision support; visualization.

INTRODUCTION
Numerical weather prediction (NWP) models generate different precipitation forecasts for specified locations and times due to the incompleteness of the input observations, the approximate nature of the forecast models and their parameterizations, and the random errors that result from perturbing the initial atmospheric conditions (Palmer, 2001; Hostache et al., 2011). Ensemble prediction systems (EPSs), which consist of an adequate number of equiprobable NWP models, have been established to provide probabilistic precipitation forecasts instead of a single deterministic forecast (Cloke and Pappenberger, 2009). An EPS provides predictions with greater skill than those obtained from individual runs of NWP models or deterministic model runs, especially for longer lead times (Demeritt et al., 2007; Cuo et al., 2011).
Effective communication of ensemble forecasts requires a clear expression of the uncertainties associated with hydrological ensemble prediction systems (HEPSs), so that end-users can easily respond to the information provided during operations (Demeritt et al., 2010; Ramos et al., 2010; Pappenberger et al., 2013; Zappa et al., 2013; Pagano et al., 2014). Pagano et al. (2014) noted that defining effective methods for the communication of ensemble forecasts is a challenge for future operational river forecasting and represents a future research opportunity. Pappenberger et al. (2013)
This approach has been used to obtain quantitative and qualitative insights, such as the timing, water level, and discharge associated with peak flow. This information is crucial for end-users and decision makers. Zappa et al. (2013) applied an operational HEPS, namely, the IFKIS-HYDRO hydrological nowcasting system, to five different basins in Switzerland to evaluate the performance of the 'Peak-Box' methodology. The sizes of the basins ranged from 186 km² to 1696 km². The study found that, of 485 operational forecasts performed from June 2007 through November 2008, 30% to 55% of the observed peaks fell outside the 'Peak-Box'.
Typhoons are common natural events that cause severe damage in countries at the edge of the northwestern Pacific Ocean, such as Japan, the Philippines, and Taiwan. For example, based on records covering 1958 to 2010, an average of 3.4 typhoons affect Taiwan annually, and these events cause an annual average loss of more than 500 million U.S. dollars (Li et al., 2004). These losses are caused by typhoon-related flood events.
If they provide early warnings with sufficient lead time, flood forecasts from a HEPS can help authorities prepare disaster prevention and mitigation measures. A customized visualization method for typhoons is also necessary to make the ensemble flood forecasts generated by HEPS meaningful for emergency responders. Therefore, this study presents a HEPS that can provide ensemble flood forecasts during typhoon events and proposes a customized visualization approach especially for typhoons to simplify the forecast information. This approach is an extension of the one presented by Zappa et al. (2013); it has been modified to increase the percentage of observed peaks that fall within the predicted range during typhoon events. The remainder of this paper is organized as follows. Section 2 includes the details of the proposed HEPS. Section 3 briefly describes the study area and typhoon events used in the study. Section 4 compares the original 'Peak-Box' approach with the proposed extended version. Finally, Sects. 5 and 6 present the results, discussion, and conclusions.
Hydrol. Earth Syst. Sci. Discuss., doi:10.5194/hess-2017-264, 2017. Manuscript under review for journal Hydrol. Earth Syst. Sci. Discussion started: 29 May 2017. © Author(s) 2017. CC-BY 3.0 License.

SYSTEM
This study proposes a HEPS that integrates various models. These models include NWP models that provide ensemble precipitation forecasts, a rainfall-runoff model that generates upstream boundary conditions, a storm surge model that generates downstream boundary conditions, and a flood routing model that simulates river flows.
The data processing is shown in Figure 1. The HEPS produces ensemble flood forecasts with a 72-hour lead time four times a day. The models used in the HEPS are described in the following subsections.

Ensemble precipitation forecasts
The Taiwan Cooperative Precipitation Ensemble Forecast Experiment (TAPEX) began in 2010. It is a collective effort among academic institutes and government agencies, such as the National Taiwan University, the National Central University, the National Taiwan Normal University, the Chinese Culture University, the Central Weather Bureau (CWB), the National Center for High-Performance Computing, the Taiwan Typhoon and Flood Research Institute (TTFRI), and the National Science and Technology Center for Disaster Reduction. TAPEX is the first attempt to design a high-resolution (5-km) numerical ensemble model in Taiwan. This effort applies various NWP models, such as the Weather Research and Forecasting Model (WRF), the Fifth-Generation Penn State/NCAR Mesoscale Model (MM5), the Cloud-Resolving Storm Simulator (CReSS), and the Hurricane Weather Research and Forecasting Model (HWRF). It also considers different setups in terms of the model initial conditions, data assimilation processes and model physics. TAPEX generates four runs a day and provides ensemble predictions of the wind and pressure fields and quantitative estimates of precipitation with a lead time of 72 hours. Further information can be found in Hsiao et al. (2013). A typhoon's average impact duration is 73.68 hours (Huang et al., 2012) and the average lag between observed peak precipitation and flooding in Taiwan is between 2 and 10 hours (Jang et al., 2012). This study focuses on a one-way coupling in which TAPEX provides rainfall forecasts to the rainfall-runoff model; feedback from the rainfall-runoff model to TAPEX is not considered.

Rainfall-runoff model
The HEPS uses the surface runoff forecast generated by a kinematic-wave-based geomorphologic instantaneous unit hydrograph model (the KW-GIUH model) as its upstream boundary condition. The KW-GIUH model, which was developed by Lee and Yen (1997), can reflect the effects of watershed geomorphology, land cover conditions, soil characteristics and rainfall intensity on runoff. It has been successfully applied to many Taiwanese catchments (Lee et al., 2001, 2006).

Storm surge model
Storm surges are abnormal increases in water levels above those expected from astronomical tides. They are generated by strong winds and atmospheric pressure changes and affect water levels downstream (near estuaries) during typhoons. The HEPS uses the storm surge and tide forecasts generated by the Princeton Ocean Model (POM) and the TOPEX-POSEIDON global tidal model (TPXO6.2) as downstream boundary conditions. The POM model, which was developed by Blumberg and Mellor (1987), is a three-dimensional, nonlinear, primitive equation finite difference ocean model. It has been applied to simulate a wide range of ocean problems, including coastal storm surge in Taiwan (Ou et al., 2008; Chiou, 2010). In this study, the TAPEX

Flood routing model
The Numerical Model Simulating Water Flow and Contaminant and Sediment Transport in WAterSHed Systems of 1D Stream/River Networks, 2D Overland Regimes, and 3D Subsurface Media (WASH123D) was developed by Yeh et al. (1998) to simulate one-dimensional channel networks, two-dimensional overland flow, and three-dimensional variably saturated subsurface flow. It has been applied successfully in Taiwan and around the world, and it was chosen by the US Army Corps of Engineers as the core computational code used in modeling the Lower East Coast (LEC) Wetland Watershed (e.g., Yeh et al., 2006; Yeh and Shih, 2011; Shih et al., 2012; Hsiao et al., 2013). The HEPS uses the one-dimensional channel model of WASH123D as its flood routing model to simulate water stages in rivers.

Study area
This study selected the Yilan River in northeastern Taiwan as the study area (Figure 2). The river flows through the city of Yilan and has a main stream length of approximately 24.4 km and a watershed area of 149.06 km². It has four main tributaries,

Typhoon events
Figure 3 shows the tracks of the different typhoons that have affected Taiwan, according to historical records (Huang et al., 2012). Of the ten categories, Type-2 and Type-3 typhoons account for approximately 28% of all typhoons and bring heavy rainfall to the Yilan River Basin. For instance, a rainfall of 158 mm in 4 hours was observed at rainfall gauging station C1U610 (shown in Figure 2) during Typhoon Soulik. Table 1 shows all of the typhoons that affected Taiwan from 2012 through 2015.
Five of these events are Type-2 and Type-3 typhoons, which have the biggest impact on the Yilan River Basin. Therefore, this study selected the typhoons Saola (2012), Soulik (2013), Soudelor (2015), and Dujuan (2015) to calibrate the HEPS and test its performance. Typhoon Matmo, a Type-3 typhoon that occurred in 2014, was not included due to its weak intensity. This study used historical observations of rainfall, river stage, and tide to validate the parameters in the proposed HEPS.

FLOW FORECASTS DURING TYPHOON EVENTS
This study modified the 'Peak-Box' approach originally proposed by Zappa et al. (2013) to provide better communication of HEPS forecasts during typhoon events. In practice, it contained from 12.5% to 37.5%, due to the distribution of ensemble members (Zappa et al., 2013). Using the mean and the standard deviation (the 'SD-Box') results in a larger area that, for normally distributed and independent peak water levels and peak times, includes 46.60% of the ensemble forecasts (68.27% of the peak water levels and 68.27% of the peak times) and has a greater chance of including the observed peaks. Yang et al. (2016) showed that the performance of NWP models is independent of the length of the lead time during typhoon events. Therefore, in order to expand the sample size, this study includes present (t) and previous forecasts (t-1, t-2, t-3, …, t-n, where n is the number of available forecasts when the system is initiated) to provide ensemble flow forecasts. As shown in the right panel of Figure 4, the green area illustrates the 'SD-Box'. The black and gray solid dots represent the current and previous peak-flow forecasts, respectively.
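As a minimal sketch of the two enveloping rectangles discussed here, the following Python snippet builds an 'IQR-Box' (25th to 75th percentiles) and an 'SD-Box' (mean plus and minus one standard deviation) from ensemble peak times and peak stages, and checks whether an observed peak falls inside each. The function names, the sample values, the quantile interpolation rule, and the use of the sample standard deviation are assumptions made for illustration; they are not taken from the study.

```python
import statistics

# Hypothetical ensemble peaks for one forecast (11 members).
peak_times_h = [20, 22, 23, 24, 24, 25, 26, 27, 28, 30, 31]               # hours after issue
peak_stages_m = [3.1, 3.4, 3.5, 3.6, 3.8, 3.9, 4.0, 4.1, 4.3, 4.6, 4.9]  # metres


def iqr_box(times, stages):
    """Rectangle spanned by the 25th and 75th percentiles in time and stage."""
    t_q = statistics.quantiles(times, n=4, method="inclusive")
    s_q = statistics.quantiles(stages, n=4, method="inclusive")
    return (t_q[0], t_q[2]), (s_q[0], s_q[2])


def sd_box(times, stages):
    """Rectangle spanned by mean +/- one sample standard deviation (assumed)."""
    t_mu, t_sd = statistics.mean(times), statistics.stdev(times)
    s_mu, s_sd = statistics.mean(stages), statistics.stdev(stages)
    return (t_mu - t_sd, t_mu + t_sd), (s_mu - s_sd, s_mu + s_sd)


def contains(box, t_obs, s_obs):
    """True if the observed peak (time, stage) lies inside the rectangle."""
    (t_lo, t_hi), (s_lo, s_hi) = box
    return t_lo <= t_obs <= t_hi and s_lo <= s_obs <= s_hi


# Hypothetical observed peak.
observed_time_h, observed_stage_m = 22.5, 4.35
in_iqr = contains(iqr_box(peak_times_h, peak_stages_m), observed_time_h, observed_stage_m)
in_sd = contains(sd_box(peak_times_h, peak_stages_m), observed_time_h, observed_stage_m)
```

With these illustrative numbers the observed peak lies outside the narrower 'IQR-Box' but inside the 'SD-Box'; for other samples the two rectangles can behave differently.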

Performance evaluation criteria
This study applied two performance measures, the root mean square error (RMSE) and the skill-spread ratio, to evaluate the proposed HEPS performance. For a well-designed HEPS, the spread of ensemble forecasts will be large enough to cover the prediction uncertainty. This statement implies that the spread should be the same as or larger than the RMSE. The RMSE, which is commonly referred to as skill, measures the difference between the observations and the ensemble mean without considering the direction. The closer the RMSE is to zero, the better the ensemble mean is as a forecast. The RMSE is defined in terms of the following quantities: μ is the ensemble mean of ensemble peak-flow forecasts; Opeak is the observation of peak flow; Ppeak,i is the prediction of peak flow of the ith member; m is the number of ensemble members; and σ is the standard deviation of ensemble peak-flow forecasts.
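One way to write these quantities that is consistent with the definitions above is sketched below. The single-forecast form of the RMSE and the m − 1 normalization in σ are assumptions, not confirmed details of the study.

```latex
\mu = \frac{1}{m}\sum_{i=1}^{m} P_{\mathrm{peak},i}, \qquad
\sigma = \sqrt{\frac{1}{m-1}\sum_{i=1}^{m}\left(P_{\mathrm{peak},i}-\mu\right)^{2}}, \qquad
\mathrm{RMSE} = \sqrt{\left(\mu - O_{\mathrm{peak}}\right)^{2}}
```

For a single forecast and a single observed peak, this RMSE reduces to the absolute error of the ensemble mean, |μ − Opeak|.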
The skill-spread score (hereinafter referred to as the score), which ranges from zero to infinity, is the ratio of the RMSE (skill) to the standard deviation of the ensemble peak-flow forecasts (spread) (Wilks, 2006). Scores less than one mean that the spread of the ensemble forecasts is large enough to cover the prediction uncertainty. It is defined as Score = RMSE / σ.
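The score can be illustrated with a short Python sketch. It takes the score as the RMSE divided by the ensemble standard deviation, so that values below one indicate sufficient spread, in line with the interpretation given above. The function name, the sample values, and the use of the sample standard deviation are assumptions for illustration only.

```python
import math
import statistics


def skill_spread_score(member_peaks, observed_peak):
    """Return RMSE / sigma for one ensemble forecast of a peak value."""
    mu = statistics.mean(member_peaks)        # ensemble mean
    sigma = statistics.stdev(member_peaks)    # sample standard deviation (assumed)
    rmse = math.sqrt((mu - observed_peak) ** 2)
    return rmse / sigma


# Hypothetical ensemble peak stages (m) from five members.
members = [3.0, 3.5, 4.0, 4.5, 5.0]

score_inside = skill_spread_score(members, 4.5)   # observation within one sigma of the mean
score_outside = skill_spread_score(members, 5.5)  # observation well above the ensemble mean
```

In this example the first observation gives a score below one and the second a score above one, which mirrors the way the scores are read in the text.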

Model calibration and validation
Two parameters of the KW-GIUH model in the proposed HEPS were calibrated using in situ observations made during typhoon events. These parameters are the roughness coefficient for overland flow (n0) and the roughness coefficient for channel
Typhoons Saola, Soulik, and Soudelor to calibrate the parameters of the KW-GIUH model. Figure 5 shows that the percent errors in the peak discharges of the selected typhoons were 4.59%, 2.07%, and -5.89% at the Hsincheng Bridge, and 14.88%, 5.28%, and -3.05% at the Yuanshan Bridge, respectively. All of the errors in the peak times were less than one hour. The results show that the KW-GIUH model is capable of providing reliable predictions of peak time as well as peak discharge.
The WASH123D model adopted the most recent available cross-sectional bathymetry of the Yilan River, which was measured in 2010, as its input topography
Figure 6 shows that the percent errors in the peak stage for Typhoons Saola, Soulik, and Soudelor were 2.1%, 5.7%, and 10.6% at Zhongshan; 12.9% and 2.2% at Leawood; and 7.4%, 6.0%, and 2.1% at Jhuangwei, respectively.
There was one data gap at Leawood due to incomplete data collection during Typhoon Soudelor. Nevertheless, all of the errors in the peak times were less than one hour. The results show that WASH123D is capable of providing reliable predictions of peak times as well as peak stages.

Comparison of enveloping rectangles defined using the 'SD-Box' and the 'IQR-Box' methods for supporting the interpretation of ensemble peak-flow results
The proposed HEPS initiates when CWB issues a sea warning and ends when the next ensemble forecast is six hours less than the left edge of the 'SD-Box'. In that regard, 93 forecasts are available for the four selected typhoons.
timing, respectively. Among all of the forecasts, there is only one forecast for which the 'IQR-Box' score is less than one, and the score of the 'SD-Box' is not. This situation occurs at the Zhuangwei Bridge during Typhoon Soudelor. However, the score for the 'SD-Box' method is still very close to one (1.01), which means that it nearly captures the observed peak. Overall, the 'SD-Box' method yielded average scores of 1.18 for the peak stages and 1.08 for the peak times. In comparison with the 'IQR-Box' method, which yielded scores of 2.06 for both the peak stages and the peak times, the results show that the enveloping rectangles defined using the 'SD-Box' method are more reliable during typhoon events.

Including all forecasts with different lead times during an event to expand the sample size
The sample size has a strong effect in terms of determining whether a result is statistically significant. In other words, the number of available ensemble members is important for both the 'SD-Box' and 'IQR-Box' methods. For example, the number of available ensemble members for each forecast ranged from 11 to 14 for the proposed HEPS during operation. Thus, the descriptive statistics were calculated using insufficient sample sizes (less than 30). The same issue exists in other studies that employ HEPSs (e.g., Yang and Yang, 2014; Zappa et al., 2013). It is difficult to increase the number of ensemble members used in HEPSs, due to the limited computational resources that are available. Therefore, this study proposes a method for including present and previous forecasts in order to expand the sample size during the estimation process.
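The pooling idea can be sketched in a few lines of Python. The snippet below accumulates the member peaks of the current and all previous forecast cycles before computing the mean and standard deviation. The member values are synthetic, the choice of 11 members per cycle follows the lower end of the range quoted above, and the helper names are hypothetical.

```python
import statistics

# Synthetic peak-stage forecasts (m): three forecast cycles, 11 members each.
cycles = [[3.0 + 0.1 * i + 0.05 * c for i in range(11)] for c in range(3)]


def pooled_peaks(forecast_cycles, current_index):
    """Pool member peaks from the first cycle up to and including the current one."""
    pooled = []
    for cycle in forecast_cycles[: current_index + 1]:
        pooled.extend(cycle)
    return pooled


def sd_range(peaks):
    """Mean +/- one sample standard deviation of the pooled peaks."""
    mu = statistics.mean(peaks)
    sd = statistics.stdev(peaks)
    return mu - sd, mu + sd


# Sample size available at each successive forecast cycle.
sample_sizes = [len(pooled_peaks(cycles, k)) for k in range(len(cycles))]
stage_range_latest = sd_range(pooled_peaks(cycles, len(cycles) - 1))
```

Each new cycle adds its members to the pool, so the sample on which the descriptive statistics are based grows with every forecast issued during the event.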
It must be shown that the forecast performance is independent of time before all available forecasts can be included in the estimation process. The time of concentration of the peak flow at the Zhongshan Bridge is approximately 4 hours. This study calculated the error in the maximum 4-hour rainfall between the average forecasts and the average observations at the watershed upstream of the Zhongshan Bridge. Figure 7 shows that there is no obvious trend in the errors in stage and timing, regardless of the length of the lead time. The correlation coefficients were -0.09 and 0.11, respectively, and these values indicate that no significant correlations exist between errors in stage or timing on the one hand and lead time on the other. For example, the best and worst forecasts during Typhoon Dujuan in terms of stage error were the 1st and 5th forecasts, respectively. However, the 6th forecast was better than the 5th, which implies that there is no trend in the cascading forecasting process. Based on these results, this study assumed that the performance of the HEPS is independent of lead time during typhoon events. Therefore, it is reasonable to include all available forecasts during an event to expand the sample size.
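The kind of check described above can be sketched as a Pearson correlation between lead time and forecast error. The lead times and errors below are invented for illustration and are not the study's data; the function is a plain textbook implementation rather than the authors' code.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical lead times (h) and peak-stage errors (m) for eight forecasts.
lead_times_h = [6, 12, 18, 24, 30, 36, 42, 48]
stage_errors_m = [0.3, -0.2, 0.1, -0.4, 0.4, -0.1, 0.2, -0.3]

r_stage = pearson_r(lead_times_h, stage_errors_m)
```

A coefficient close to zero indicates only that there is no linear trend of error with lead time; it does not by itself rule out other forms of dependence.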
Figure 8 illustrates the comparisons between using the 'SD-Box' method with one forecast and using the 'SD-Box' method including all available forecasts (hereinafter indicated as 'SD-Box Single' and 'SD-Box All') at the Zhongshan Bridge. The performance of 'SD-Box All' was more consistent than that of 'SD-Box Single' in terms of both stage and timing. For example, the scores for stage during Typhoon Soudelor ranged from 0 to 5 when the 'SD-Box Single' method was used, but they were below or close to 1 with 'SD-Box All'. The results showed that the inclusion of all available forecasts in the calculation process decreased the variation among the forecasts; in other words, the uncertainty of the forecasts decreased. Figure 9 illustrates the scores of all of the forecasts for the different typhoon events. The 'SD-Box Single' contained 47.1% of the observed peaks in terms of stage (37.3% + 9.8%), whereas 'SD-Box All' contained 63.7% (57.8% + 5.9%) of the observed peaks. Furthermore, the 'SD-Box Single' contained 58.9% (37.3% + 21.6%) of the observed peaks in terms of timing, whereas 'SD-Box All' contained 71.5% (57.8% + 13.7%). The results show that the 'SD-Box All' method can capture more of the observed peaks in terms of both stage and timing. In particular, 'SD-Box All' improved the forecast performance and increased the capture rate from 37.3% to 57.8% for both stage and timing.

CONCLUSIONS
This study proposed a HEPS that employs NWP models to perform rainfall forecasts and hydrologic models to produce ensemble flood forecasts during typhoon events. Because the communication of ensemble forecasts is critical for helping end-users to respond, a modified version of the 'Peak-Box' visualization method, which was originally described by Zappa et al. (2013), was also proposed to support the interpretation of ensemble forecast results for operational purposes. Four typhoon events during the period 2012-2015 and observations collected in the Yilan Experimental Watershed were used to evaluate the performance of these techniques. A total of 93 forecasts and two performance measures were considered. The results showed that the proposed HEPS is able to provide flood forecasts during the selected typhoon events. In addition, the 'SD-Box' visualization approach, which considers the mean and the standard deviation instead of the 25th and 75th percentiles, captured more of the observed peaks during typhoon events. The average skill-spread scores of the 'SD-Box' method for the selected events were 1.18 and 1.08 in terms of stage and timing, respectively. These results represent a significant improvement over the original 'Peak-Box' method, which resulted in scores of 2.06 for both peak stage and peak
Descriptive statistics, such as the quartile deviation and the standard deviation, are susceptible to outliers when calculated using an insufficient number of observations.
Adding more ensemble members is expensive in terms of computer resources. This study proposed a method that enables increasing the sample size, leading to statistically significant results. This method involves including present and previous available forecasts in the calculation process. For example, the proposed HEPS generated 11 available ensemble members at each forecast during Typhoon Dujuan. By including all of the present and previous available forecasts (the 'SD-Box All' method), the sample size increased to 22 for the second forecast, 33 for the third forecast, and so on. The results showed that the 'SD-Box All' made more consistent predictions. This result can be explained by the inclusion of all available forecasts in the calculation process decreasing the uncertainty of the forecasts. As a result, the rectangles defined by the 'SD-Box All' method contained 57.8% of the observed peaks in stage and timing.
Coughlan de Perez et al. (2016) suggested that a HEPS that produces a false alarm rate below 50% is tolerable for decision makers in terms of the economic and practical consequences of taking action. However, this study assumed that the forecast performance of the proposed HEPS is independent of the length of the lead time and conducted an experiment to support this assumption. Other studies, such as that of Zappa et al. (2013), have claimed that the most accurate forecasts were obtained for lead times of two or more days. Such statements imply that the performance of a HEPS does not improve with shorter lead times or is independent of lead time, and Yang et al. (2016) found that the best performance is obtained before a typhoon makes landfall. This assumption may still depend on the topography of the applied area and the type of extreme event being considered. Further investigation of various conditions must be performed before firm conclusions can be drawn. Regardless, the proposed HEPS and the modified visualization approach have been shown to produce convincing peak-stage and peak-timing forecasts for operational purposes during a typhoon.

AUTHOR CONTRIBUTION
Ya-Chi Chang, Mei-Ying Lin, and Jui-Yi Ho calibrated and verified the parameters of the WASH123D, POM, and KW-GIUH models. Cheng-Hsin Chen handled the data processing of the models and performed the simulations. Sheng-Chi Yang and Ya-Chi
argued that the uncertainty information provided by HEPSs sometimes results in resistance on the part of the public if experts or nonexperts cannot easily understand the information provided. At present, HEPSs still rely on conventional visualization techniques, such as 'spaghetti diagrams' or box plots, to display the distributions of forecast results. Pappenberger et al. (2013) focused on expert users of HEPSs and the communication among these experts and identified key information for the public, such as discharge, lead time, warning levels, return periods, worst/best scenario, etc. Zappa et al. (2013) proposed the 'Peak-Box' visualization approach to support the interpretation and verification of HEPS results.
model provides ensemble pressure field and wind field forecasts to POM and the TPXO6.2 model and obtains tidal level predictions. As with TAPEX, it generates four runs a day, and each run has a 72-hour lead time.
which are the Wushi River, the Dahu River, the Dajiao River and the Xiaojiao River. The Water Resource Agency (WRA) and TTFRI have selected this river as one of two watersheds where long-term monitoring experiments are being carried out (the other is the Dianbao Creek basin in southwestern Taiwan). The purpose of the experimental watersheds is to generate long-term and high-density hydrological monitoring data that can be used for scientific studies, including the development of hydrological and hydraulic models and the study of environmental changes. In total, 11 rainfall gauging stations, 16 water-stage gauging stations, five river-velocity gauging stations, and 36 inundation-depth gauging stations have been installed in the Yilan River Basin. Figure 2 shows the locations of the water-stage and rainfall gauging stations that collected the data that we used in this study. The monitoring data have been carefully collected and processed. For full information and to download the available data, please refer to the official website (http://wraew.ttfri.narl.org.tw/index.php).
TAPEX provides 72-hour rainfall forecasts for five rainfall gauges in the upstream portion of the Yilan River Basin. The KW-GIUH model calculates the surface runoff and estimates river flow at the Hsincheng and Yuanshan Bridges. This study uses the POM and TPXO6.2 models to forecast the tides at Suao and to estimate the water stages at the Kemalan Bridge. WASH123D then generates ensemble flow forecasts using flows at the bridges mentioned above as the upstream boundary condition and the water stage at the Kemalan Bridge as the downstream boundary condition. The detailed locations of these places are shown in Figure 2.

Figure 4 compares the two approaches, and the modifications are described in detail
d. Include all forecasts with different lead times in the rectangle. Descriptive statistics, such as the quartile deviation and the standard deviation, are susceptible to outliers when calculated using insufficient sample sizes. Adding extra ensemble members to produce more forecasts consumes computer resources. Yang et al.
flow (nc). The proposed HEPS used data from five rainfall gauges (LTGX, YSGZ, C1U610, C0U520 and C1U630; see Figure 2 for locations) and the Thiessen polygon method (Thiessen, 1911) to estimate the hourly spatial-average rainfall intensities in order to provide rainfall input data to the KW-GIUH model. The topographic data used in KW-GIUH are contained within a digital elevation model with a resolution of 5 m obtained using aerial photographs. Kuo et al. (2016) used in situ observations of flow discharges made at the Hsincheng and Yuanshan Bridges during
data. The upstream boundary of the model is set at the Hsincheng and Yuanshan Bridges, and the downstream boundary of the model is set at the Kemalan Bridge. Field measurements at the Hsincheng and Yuanshan Bridges from Kuo et al. (2016) and observed water stages at the Kemalan Bridge were used as the upstream and downstream boundary conditions, respectively. Hourly field records of water stage at the Zhongshan, Leawood, and Jhuangwei Bridges were used to calibrate the value of Manning's roughness coefficient (n) in the WASH123D model and to validate the performance of the model.
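The Thiessen polygon averaging mentioned above amounts to an area-weighted mean of the gauge readings, with each gauge weighted by the area of its polygon. The sketch below uses the five gauge names given in the text, but the polygon areas and the hourly rainfall values are hypothetical placeholders chosen only to show the arithmetic.

```python
def thiessen_average(rain_mm, polygon_area_km2):
    """Area-weighted mean rainfall over the gauges present in rain_mm."""
    total_area = sum(polygon_area_km2[gauge] for gauge in rain_mm)
    weighted_sum = sum(rain_mm[gauge] * polygon_area_km2[gauge] for gauge in rain_mm)
    return weighted_sum / total_area


# Hypothetical Thiessen polygon areas (km^2); not values from the study.
polygon_area_km2 = {"LTGX": 10.0, "YSGZ": 20.0, "C1U610": 30.0, "C0U520": 25.0, "C1U630": 15.0}

# Hypothetical rainfall for one hour (mm) at each gauge.
rain_mm = {"LTGX": 12.0, "YSGZ": 8.0, "C1U610": 20.0, "C0U520": 15.0, "C1U630": 5.0}

basin_rain_mm = thiessen_average(rain_mm, polygon_area_km2)
```

Repeating the calculation for every hour yields the hourly spatial-average rainfall series that a rainfall-runoff model would take as input.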
timing. Scores of less than one indicate that the spread of the ensemble forecasts is large enough to contain the prediction uncertainty. Since the average score achieved by the 'SD-Box' method was close to one, it has been shown to be more reliable than the original 'Peak-Box' method during typhoon events. The results satisfy the statement "One of the main objectives of ensemble flood forecasts is the representation of the full spectrum of forecast uncertainty and/or predictability in [the] form of different hydrological responses to the input of the various members obtained from an atmospheric EPS" made by Zappa et al. (2013).

Figure 2 Figure 3
Figure 2 Study area and locations of streamflow gauges. Black dots and triangles indicate the locations of water-stage gauging stations and rain gauge stations, respectively.

Figure 4 The left panel shows a graphical explanation of the 'Peak-Box' approach. The outer rectangle is the 'Peak-Box,' and the internal rectangle (the yellow area) is the 'IQR-Box'. The solid dots represent all of the ensemble forecasts. The right panel shows a graphical explanation of the proposed extension of the 'Peak-Box' approach. The enveloping rectangle is the 'SD-Box' (the green area). The solid black and gray dots represent current and previous peak-flow forecasts, respectively.

Figure 5 Figure 6
Figure 5 Comparison of simulated discharges (red circles) and recorded discharges (solid lines) for model calibration (Typhoons Saola and Soulik) and validation (Typhoon Soudelor) experiments at Hsincheng (left) and Yuanshan (right). The blue bars are the hourly spatial-average rainfall intensities measured in the watershed upstream of Hsincheng and Yuanshan.

Figure 7 Figure 8 Figure 9
Figure 7 Box-and-whisker plot at the watershed upstream of the Zhongshan Bridge during the four selected typhoon events. The blue dots indicate the ensemble means. The inverted triangles indicate the time of occurrence of the maximum 4-hour rainfall. The results show that there is no obvious trend in lead time for the errors in either the stage or timing.

Table 2
Leawood Bridge during Typhoon Soudelor due to the lack of complete observations. The scores that are less than one in the table are highlighted. These values indicate that the spread of the ensemble members is large enough to contain the prediction uncertainty. The rectangles defined using the 'IQR-Box' method contain 33.