Statistical bias correction for climate change impact on the basin scale precipitation in Sri Lanka, Philippines, Japan and Tunisia

1. The method of selecting GCMs adopted in this study is not convincing. Addressing the uncertainties reflected by the disagreements among model simulations is a topic of extensive research. The authors tend to oversimplify this issue by selecting GCMs based on the "scores" used in this study. I understand that some criteria might be needed to narrow down the list of data sets to be analyzed. However, those criteria should not appear as the main focus of this study. Given the way the GCM selection approach is emphasized in the abstract and in the first paragraph of the introduction (from line 23 on page 1 to line 4 on page 2), readers might take it to be one of the main objectives of this study.

method (Hay et al., 2000; Graham et al., 2007), power laws or non-linear correction (Bordoy and Burlando, 2013; Leander and Buishand, 2007; Hurkmans et al., 2010) and distribution mapping, also known as quantile mapping or histogram equalization or matching (Wood et al., 2004; Schoetter et al., 2012; Piani and Haerter, 2012; Maurer et al., 2013). The simple linear method treats only the monthly mean intensity error and does not account for differences in the frequency distribution of precipitation, whereas non-linear correction, applied over either twelve months or a window of some days, can adjust both the mean and the spread (coefficient of variation) of the rainfall distribution. Moreover, it removes the bias of intermediate rainfall very well, but corrected extreme rainfall is still underestimated. None of the aforementioned methods addresses wet-day frequency errors. In the distribution-based correction method using an empirical or gamma distribution, the biases of the mean and standard deviation, together with wet-day frequency and intensity errors, are corrected simultaneously at the monthly scale. This approach is in widespread use and shows superior performance to other methods because higher-moment biases, including wet-day frequency biases, are effectively removed (Ines and Hansen, 2006; Li et al., 2010; Piani et al., 2010a, b; Argüeso et al., 2013; Lafon et al., 2013; Teutschbein and Seibert, 2013).
Because it is more straightforward and incurs less computational burden, this approach has become a popular and conventional tool for GCM bias correction. However, its performance depends on how well the observed and GCM precipitation fit the specified distribution (e.g. extreme rainfall does not fit a normal distribution).
For this reason, the proposed method mixes two probability functions, one for tail intensities that usually do not conform to a normal distribution and the other for the monthly series of normal precipitation, including adjustment of wet-day frequency errors. As it is a statistics-based correction method, it does not account for the temporal correlation of the rainfall series, and the mismatch in temporal evolution may affect subsequent impact studies based on hydrological simulation. Moreover, the differences between the downscaling methods and the performance of the bias correction approaches tend to vary from one catchment to another (Sunyer Pinya et al., 2015). According to Sunyer et al. (2012), it is better to test the performance of different downscaling methods while acknowledging their limitations and advantages as well as the downscaling uncertainties. Furthermore, statistical bias correction procedures should be applied on a case-by-case basis in line with the objectives of the climate change study (Onyutha et al., 2016), e.g. when dealing with moderate or extreme hydro-meteorological events."
(2001), climate is defined in the glossary as "Climate in a narrow sense is usually defined as the average weather, or more rigorously, as the statistical description in terms of the mean and variability of relevant quantities over a period of time ranging from months to thousands or millions of years. The classical period for averaging these variables is 30 years, as defined by the World Meteorological Organization. The relevant quantities are most often surface variables such as temperature, precipitation and wind. Climate in a wider sense is the state, including a statistical description, of the climate system. In various chapters in this report different averaging periods, such as a period of 20 years, are also used."
http://www.ipcc.ch/publications_and_data/ar4/wg1/en/annex1sglossary-a-d.html
Accordingly, simulation results for the two 20-year periods 1981-2000 and 2046-2065 are also within the tolerable range for effective GCM simulations, although 20 years is not the classical 30-year period for climate change analysis. In this study, most of the ground station data from the different river basins are available from 1981 to 2000.
At Matsuyama station in Japan, daily precipitation is available from 1961 to 2000. For June, the month of highest rainfall, the multi-GCM mean climatology is 250.41 mm for the 20-year average (1961-1980) and 241.76 mm for the 30-year average (1961-1990), as shown in the following figure, which compares the 20-year and 30-year seasonal climatological means. The 1961-1980 (20-year) climatology is not significantly different from the 1961-1990 (30-year) climate statistics. Therefore, GCM performance assessed over 20-year periods is not significantly different from that of a 30-year analysis.
AC: The main reason for selecting GCMs from CMIP3 rather than CMIP5 is that full sets of CMIP3 GCM data have been archived on the DIAS server, whereas archiving of CMIP5 GCM data is still ongoing because of the very large data volume. Full sets of CMIP5 GCM data have not yet been completely uploaded to the DIAS server. This paper focuses on the development of a comprehensive bias correction method using ground observations and on the criteria for excluding GCMs that cannot represent the regional climate pattern over the target basin. Although most CMIP5 GCMs have improved performance, they still need further advancement to resolve some issues. Therefore, the improved CMIP5 GCMs also need to be evaluated under the same basic principles for omitting GCMs. One additional remark regarding GCMs and their simulation runs: for CMIP5, the ensemble mean of the different simulation runs of each GCM is considered for the selection. After that, each run is treated as a separate simulation in the statistical bias correction.
[Retrieved from https://climatedataguide.ucar.edu/climate-data/aphrodite-asianprecipitation-highly-resolved-observational-data-integration-towards.]
Another substitute is ERA-Interim data, a third-generation reanalysis available from 1979 to present at 0.75° × 0.75° spatial resolution with sub-daily and daily precipitation.
[https://climatedataguide.ucar.edu/climate-data/era-interim.] However, these data cannot be used where the catchment area is smaller than one grid cell.

ii) If the validity of high-resolution, gridded, freely available data (FAD), e.g. reanalysis or interpolated series, can be verified, can augmenting the observed or historical datasets with such FAD enhance the applicability of the proposed method?
AC: In Rasmy et al. (2013), the proposed method (statistical bias correction) is applied to dynamically downscaled (DD) rainfall produced by the Weather Research and Forecasting (WRF) model, using initial and boundary conditions from ERA-Interim data over Shikoku Island, Japan, for the present climate. Owing to the direct influence of the enhanced spatial resolution in WRF, it was able to reproduce patterns and statistics similar to those of the observed rainfall. However, biases remain in the form of overestimation, underestimation and wet-day frequency errors. These biases are much smaller than those of GCM rainfall, and they are eliminated by the statistical approach. Finally, future changes will be assessed using Pseudo Global Warming Downscale (PGW-DS), which uses the monthly mean difference between the future and present climates simulated by GCMs (Kawase et al., 2009). Therefore, it can be concluded that the proposed method improves high-resolution gridded FAD, interpolated series or reanalysis rainfall for impact assessment studies and is applicable to global FAD. Therefore, there may be some doubt in the final decision on the future urban design storm.
However, the range obtained from the bias-corrected GCMs (the proposed method) appears narrow enough to ease decision-making, as in Fig. 16 (b, d, f, h) (Fig. 1 in the interactive comment). Moreover, low-exceedance-probability rainfall (i.e., extremes) above a certain threshold is identified from all analysis years (not at the monthly scale as in Willems and Vrac, 2011) to construct the heavy-tailed representative distribution mapping for bias correction. Therefore, the bias correction function based on the CDF transfer between observed and historical GCM rainfall is key to removing some of the systematic GCM biases. Hence, maximum probable rainfall estimates from extremes for different return periods may be more informative for future planning than a monthly based analysis. On the other hand, the effectiveness of distribution-mapping bias correction may depend on how well the data fit the assumed distribution.
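
As a rough illustration of the CDF-transfer idea discussed above, the following Python sketch applies a purely empirical quantile mapping with a wet-day threshold. It is not the authors' three-step method, which treats extremes and normal rainfall with separate fitted distributions. The function names, the plotting position and the assumption that the GCM has more wet days than the observations are all illustrative choices.

```python
import numpy as np

def wet_day_threshold(obs, gcm_hist):
    """GCM rainfall threshold that leaves the same wet-day fraction as observed.

    Assumes the GCM has at least as many wet days as the observations.
    """
    obs_wet_fraction = np.mean(obs > 0.0)
    return np.quantile(gcm_hist, 1.0 - obs_wet_fraction)

def empirical_quantile_map(obs, gcm_hist, gcm_target):
    """Map GCM rainfall onto the observed wet-day distribution (hypothetical helper)."""
    thr = wet_day_threshold(obs, gcm_hist)
    obs_wet = np.sort(obs[obs > 0.0])
    hist_wet = np.sort(gcm_hist[gcm_hist > thr])

    corrected = np.zeros_like(gcm_target, dtype=float)
    wet = gcm_target > thr  # values at or below the threshold stay zero

    # Non-exceedance probability of each target value in the historical GCM wet-day CDF.
    p = np.searchsorted(hist_wet, gcm_target[wet], side="right") / (hist_wet.size + 1)
    p = np.clip(p, 1.0 / (obs_wet.size + 1), obs_wet.size / (obs_wet.size + 1))

    # Read the observed rainfall at the same probability.
    corrected[wet] = np.quantile(obs_wet, p)
    return corrected

# Synthetic example: a "drizzly" GCM with too many wet days and too little intensity.
rng = np.random.default_rng(0)
n = 5000
obs = np.where(rng.random(n) < 0.3, rng.gamma(0.8, 12.0, n), 0.0)
gcm = np.where(rng.random(n) < 0.7, rng.gamma(0.8, 4.0, n), 0.0)
corrected = empirical_quantile_map(obs, gcm, gcm)
```

In this sketch the wet-day frequency and the intensity distribution are corrected together, but the upper tail is limited to the range of the observed sample, which is one reason a fitted heavy-tailed distribution may be preferred for extremes.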

AC:
We modified lines 14-18 (page 6) as follows: "Hosking and Wallis (1987) proposed different methods, such as the maximum likelihood (ML), method of moments (MOM) and probability weighted moments (PWM) estimators, for estimating the parameters of the GPD in the case κ > -0.5; PWM and MOM are extremely simple to compute. The PWM method may be appropriate for κ < -0.2, which means a GPD with an extremely long tail, and the ML method should be used in the case of κ > 0.2 with large sample sizes to avoid a high rate of inconsistency with the data. For small to moderate sample sizes, the MOM or the PWM estimators perform better than the ML estimators (Tajvidi, 2003). The MOM is suggested when the sample has neither an extremely long tail nor an extremely short tail (de Zea Bermudez and Turkman, 2003), and it is the most efficient method for negative shape parameters of the GPD model for estimating the quantile of a T-year return period (Madsen et al., 1997). Therefore, it seems relevant to use the shape parameter κ in the range -0.5 < κ < 0.5 (Rosbjerg et al., 1992) for any practical application. The same limitation applies in this study, and the shape and scale parameters given by the MOM method are;"
COMMENT No. 7 As proposed by Onyutha and Willems (2015), one way to minimize the bias in the quantiles from the tail of the GPD is to select the scale parameter in an optimal way, using a graphical approach to identify the key event above which the mean squared error of the GPD calibrated to the extreme events is minimal. The authors' adequate implementation of this proposal, based on Figure 6 a-b, was a very good step in their proposed bias correction approach.

However, some key parameters seem to be missing in equations (4) and (5).
It is well known that: a) if the GPD threshold parameter ξ is known, using the method of moments (as adopted by the authors), the shape (k) and scale parameters can be computed using [equations A1 and A2], which are expressed in terms of the sample mean and standard deviation; b) if ξ is unknown, method-of-moments estimates of k and the scale parameter can be obtained using an iteration scheme (e.g. Newton-Raphson) from [equations A3 to A5].

Can the authors check the correctness of their equations (4) and (5) on page 6 in comparison with those provided above, i.e. A1 to A5?

AC:
The equations are valid when the GPD threshold parameter ξ is known and the shape parameter κ lies in the range -0.5 < κ < 0.5, as in [...] and Martins and Stedinger (2001).
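
For readers who want to check the known-threshold case numerically, the sketch below implements the standard method-of-moments estimators for the GPD in the Hosking and Wallis (1987) parameterization, where a negative κ gives a long tail. The manuscript's equations (4) and (5) are not reproduced in this comment, so this is an assumed textbook form rather than a copy of them; the scale symbol `alpha` and the function names are illustrative.

```python
import numpy as np

def gpd_mom(sample, threshold):
    """Method-of-moments estimates of GPD shape (kappa) and scale (alpha).

    Uses the Hosking-Wallis form F(x) = 1 - (1 - kappa*(x - xi)/alpha)**(1/kappa),
    for which the excess y = x - xi has
        mean = alpha / (1 + kappa)
        var  = alpha**2 / ((1 + kappa)**2 * (1 + 2*kappa)),  kappa > -0.5.
    """
    y = np.asarray(sample, dtype=float) - threshold
    mean = y.mean()
    var = y.var(ddof=1)
    ratio = mean**2 / var
    kappa = 0.5 * (ratio - 1.0)
    alpha = 0.5 * mean * (ratio + 1.0)
    return kappa, alpha

def gpd_sample(n, threshold, kappa, alpha, rng):
    """Draw GPD values by inverting the CDF (kappa != 0)."""
    u = rng.random(n)
    return threshold + alpha * (1.0 - u**kappa) / kappa

# Synthetic check with assumed parameter values (not taken from the manuscript).
rng = np.random.default_rng(42)
x = gpd_sample(200_000, threshold=50.0, kappa=0.1, alpha=10.0, rng=rng)
kappa_hat, alpha_hat = gpd_mom(x, threshold=50.0)
```

With a large synthetic sample the estimates should land close to the generating values; for small samples or κ near -0.5 the sample variance becomes unstable, which is consistent with restricting κ to a moderate range.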

TC 4 Instead of only mentioning the number of stations considered, the authors should clearly specify the resolution of the data (e.g. daily, monthly, etc.) obtained from the different catchments. In the same vein, the temporal domain of the data for each hydro-meteorological variable used should also be included in Table 1.
AC: Added "Daily" in Lines 15, 20, 25 and 29 on page 3. Analysis period and time scale columns are added to Table 1.
The answers to these questions should be presented clearly.
AC: All GCMs and their simulation runs were considered separately in the proposed method. This means that gfdl_cm2_1 and gfdl_cm2_0 were considered as two simulation runs of a GCM, and any GCM run was discarded if it did not have a precipitation score of 1. GCMs with large bias should be discarded because a low total score means that the GCM-simulated parameters, including precipitation, show awkward or unmanageable output over the target basin. To obtain a convincing future trend of change in the basin-level assessment, poor GCMs with total scores lower than 4 or 5 (out of 7 or 8 parameters) are neglected.
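
The exclusion rule described in this reply can be written as a simple filter. The sketch below is only an illustration: the record layout, the function name and the score values are hypothetical, and the minimum total score is left as a parameter because the reply gives it as 4 or 5 depending on the number of parameters scored.

```python
from dataclasses import dataclass

@dataclass
class GcmScore:
    name: str
    precipitation_score: int  # 1 means the run passes the precipitation check
    total_score: int          # sum over all scored parameters

def select_gcms(scores, min_total=4):
    """Keep runs with a precipitation score of 1 and a total score of at least min_total."""
    return [
        s.name
        for s in scores
        if s.precipitation_score == 1 and s.total_score >= min_total
    ]

# Hypothetical scores for illustration only; not values from the study.
candidates = [
    GcmScore("gfdl_cm2_0", precipitation_score=1, total_score=6),
    GcmScore("gfdl_cm2_1", precipitation_score=1, total_score=3),
    GcmScore("example_run_c", precipitation_score=0, total_score=7),
]
selected = select_gcms(candidates, min_total=4)
```

Each run is scored and filtered on its own, matching the statement that simulation runs are considered separately.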
TC 11 Lines 18 (page 5), 33 (page 2), 1 and 20 (page 3), 6 (page 4), 1 (page 1), etc.: "We did this and that ...": such colloquial wording has no place in papers published by a top journal like HESS.
AC: All sentences are changed to the passive form.
Line 18 (page 5) "A three-step statistical bias correction was proposed to eliminate these major GCM flaws."
Line 33 (page 2) "In this way, a comprehensive and integrated statistical bias correction and spatial downscaling method, which also addresses the concern of poor GCMs, was developed."
Line 1  TC 27 Line 24 (page 7): replace "We solved this problem" with "Attempt to apply bias correction for the frequency of wet days was made by". AC: Replaced.
TC 28 Line 26 (page 7): change "rain days" to "wet days". Implement this correction throughout the manuscript. AC: Changed "rain days" to "wet days" through the manuscript.
TC 29 Line 27 (page 7): Is the word "beyond" the same as "above"? If so, change it accordingly. AC: Modified as "If GCM rainfall is smaller than that threshold, it is changed to zero for correction."
TC 30 Line 22 (page 8): replace "just get rid of" with "minimize". AC: Replaced.
TC 31 Line 9 (page 8): replace "perfect" with "reasonable". AC: In Line 3 on page 9, "perfect" is replaced with "reasonable".
TC 32 Line 29 (page 9): what is "?a?"? AC: Modified to "a".
For the reason mentioned above, Figure 5 is modified to compare the exceedance probabilities of the two series, before and after bias correction, with the observed data at CLSU station.
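
An exceedance probability comparison of this kind can be built from the ranks of each series. The following sketch is an assumption-laden illustration: it uses the Weibull plotting position i/(n+1), which the response does not specify, and the function name is hypothetical.

```python
import numpy as np

def exceedance_probability(values):
    """Empirical exceedance probability of each value (Weibull plotting position).

    The largest value gets rank 1 and probability 1/(n+1).
    Ties are ranked in order of appearance.
    """
    x = np.asarray(values, dtype=float)
    order = np.argsort(-x, kind="stable")  # indices from largest to smallest
    ranks = np.empty(x.size, dtype=int)
    ranks[order] = np.arange(1, x.size + 1)
    return ranks / (x.size + 1)

# Small illustrative series (mm/day); not data from the study.
rain = [10.0, 30.0, 20.0]
prob = exceedance_probability(rain)
```

Plotting rainfall against `prob` for the observed, raw GCM and bias-corrected series on the same axes gives the comparison described, and the same quantity can replace a ranking order on a horizontal axis.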
TC 35 Line 29 (page 13): replace "eliminates" with "reduces". AC: Replaced.
Figures 1 and 16-19 should be presented with clearly marked grids and graticules to show locations (degrees of latitude and longitude) in geographic coordinates. AC: Modified Fig. 1 (Fig. 3 in the interactive comment) and Figs. 16-19 with decimal-degree grids; Figs. 16-19 are combined as Fig. 17 (Fig. 4 in the interactive comment).
TC 37 For Figures 9 and 10, it cannot be understood that the letters a, b, ... and g on the horizontal axis represent the IDs of the GCMs as presented in Table 3. This should be clarified within the text of the second paragraph of Section 3.4. AC: Modified line 14 on page 10 as follows: "In both figures, label "a" on the x-axis is for observation and "b, c, d, e, f, g" on that axis refer to the 6 selected GCMs from Table 3. The GCM lists under Yoshino (the third column) are for Matsuyama and those under Medjerda (the last column) are for Oued Mellegue." Added to the Figure 9 caption as follows: "Label "a" on the x-axis is for observation and "b, c, d, e, f, g" on that axis refer to the 6 selected GCMs under Yoshino (the third column) in Table 3." Added to the Figure 10 caption as follows: "Label "a" on the x-axis is for observation and "b, c, d, e, f, g" on that axis refer to the 6 selected GCMs under Medjerda (the last column) in Table 3."
TC 38 Figure 14: compute the exceedance probability of each extreme rainfall event and use it to replace the ranking order plotted on the horizontal axis. I also recommend that, throughout the manuscript, the expression "ranking order statistics" be replaced with "exceedance probability".