Improving river flow generation over Great Britain in a land surface model required for coupled land-atmosphere interactions

Land surface models (LSMs) represent terrestrial hydrology in weather and climate modelling operational systems and research studies. We have designed a procedure to select hydrological parameters within the Joint UK Land Environment Simulator (JULES) LSM that is suitable for distributed hydrological modelling within the new land-atmosphere-ocean coupled prediction system, UKC2 (UK regional Coupled environmental prediction system 2). Using river flow observations from gauge stations, we study the capability of JULES to simulate river flow over 13 catchments in Great Britain, each representing different climatic and topographic characteristics, at 1 km spatial resolution. A series of tests, carried out to identify where the model results are sensitive to the scheme and parameters chosen for runoff production, suggests that different catchments require different parameters and even different runoff schemes to produce the best results. From these results, we introduce a new topographical parametrization that produces the best daily river flow results (in terms of Nash-Sutcliffe efficiency and mean bias) for all 13 catchments. The new parametrization introduces a dependency on terrain slope, constraining surface runoff production to wet soil conditions over flatter regions (like the Thames catchment; Nash-Sutcliffe efficiency above 0.8), whereas over steeper regions the model produces surface runoff for every rainfall event regardless of the soil wetness state. This new parametrization improves the model capability in regional (Great Britain wide) assessments. The new choice of parameters is reinforced by examining the amplitude and phase of the modelled versus observed river flows, via cross-spectral analysis for time scales longer than daily.
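For readers unfamiliar with the two headline metrics named in the abstract, the following is a minimal sketch of how they are commonly computed from paired daily flow series. The function names are illustrative, and the relative form of the mean bias is an assumption; the manuscript may define the bias differently.

```python
def nash_sutcliffe(observed, modelled):
    """Nash-Sutcliffe efficiency: 1 - SSE / variance of observations.

    1.0 is a perfect match; 0.0 means the model is no better than
    always predicting the observed mean.
    """
    if len(observed) != len(modelled) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    squared_error = sum((o - m) ** 2 for o, m in zip(observed, modelled))
    obs_variance = sum((o - mean_obs) ** 2 for o in observed)
    if obs_variance == 0:
        raise ValueError("observed series has zero variance")
    return 1.0 - squared_error / obs_variance


def mean_bias(observed, modelled):
    """Relative mean bias (assumed definition): (mean_mod - mean_obs) / mean_obs."""
    if len(observed) != len(modelled) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    total_obs = sum(observed)
    if total_obs == 0:
        raise ValueError("observed mean is zero")
    return (sum(modelled) - total_obs) / total_obs
```

A negative bias under this definition corresponds to the model underestimating river flow on average.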

In general, the results are presented well and the work is a potentially useful contribution to improving river flow generation in land surface models. However, the paper suffers from two major limitations in its current form: (1) the tests are carried out at a small number of locations across GB, and (2) the wider applicability and novelty of the methods and results presented are poorly stated.

General Comments
Firstly, the title promises 'Improving river flow generation over Great Britain'; however, this is tested at very few locations (13 catchments) across Great Britain. The model needs to be tested against a larger number of catchments to offer a robust test of the model, the scheme and parameters tested and, ultimately, to support the conclusions of the paper. It is difficult to see the wider applicability of the results (and how well they might perform across GB!) having such a small sample of catchments. Given the availability of a comprehensive flow dataset provided by the NRFA of over 1000 GB catchments, I found it surprising that the model was tested against so few catchments.
The selection of catchments also does not represent a wide diversity of catchments in the UK as the authors state. There are no drier catchments from the East of England, only one catchment with a high BFI (which gets excluded from some of the subsequent analysis as the model does not perform well in this catchment) and no catchments with a low BFI less than 0.3. There is a mix of both natural and human influenced catchments but then no discussion of what impacts this might have on the robustness of the results.

Authors:
We stress here that our aim is to improve the hydrological representation of the JULES Land Surface Model in the context of building a system for running the coupled UKC2 (UK regional coupled environmental prediction) system. Therefore we need to design a system for selecting hydrological parameters that is generic enough for good hydrological performance within UKEP. The spatial resolution used here needs to be consistent with the atmospheric model coupled to it. In this case, we have for the first time driven the model with a new 1 km² spatial resolution meteorological dataset (CHESS-met, Robinson et al., 2017a) to mimic the approximately 1 km² atmospheric forcing of the coupled UKEP system. However, we focus on assessing performance in medium to large GB catchments since this is a distributed model that integrates the hydrological processes of an area. Most catchments in NRFA have an area too small for our 1 km² distributed modelling. Of course we have to find a compromise in terms of computing time for this type of experiment with a large number of model integrations (JULES is more computationally expensive than typical hydrological models), and we decided that the set of 13 catchments that Crooks et al. (2014) used to set the national hydrological modelling framework was appropriate for a first attempt to run the JULES model over Great Britain at the km-scale resolution in standalone mode. In addition, for 1 km² distributed modelling a series of river network parameters are required that are not publicly available, and we are grateful to Helen Davies and Vicky Bell from CEH who provided such data from Davies and Bell (2009) for the studied catchments. We have added the following line to the Abstract: "We have designed a procedure to select hydrological parameters within the Joint UK Land Environment Simulator (JULES) LSM that is suitable for distributed hydrological modelling within the new land-atmosphere-ocean coupled prediction system UKC2 (UK regional Coupled environmental prediction system 2)." We have added the following line to the text (page 4, line 8): "We acknowledge the availability of river flow data for a higher number of catchments in the NRFA archive, however we focus on catchments large enough for the JULES model to integrate hydrological processes at the km-scale."
Concerning the human-influenced catchments, we have included a paragraph in the "Discussion" section as follows: "The JULES model does not incorporate anthropogenic effects on river flow in its current state. We acknowledge that human activities (groundwater abstractions, dams, reservoirs) affect the observed river flow in Great Britain and therefore JULES outputs of natural river flow are not expected to reproduce exactly the observed NRFA records. We included as mentioned in Section 2.2.2 the naturalised flow records for the Thames catchment as it is the only catchment with natural flow availability for the studied period. However, the human activities effects on the river flow are difficult to quantify given the lack of data and heterogeneity of activities in the studied catchments. A recent study, for instance, showed increase of drought duration in GB catchments affected by groundwater abstractions and varying effects on drought occurrence depending on the activities (Tijdeman et al., 2018)."

Secondly, I struggle to see the wider applicability and novelty of the methods and results presented. This is not well formulated in either the discussion or conclusions and so it is difficult to understand the relevance of your results to the wider research community. All of the discussion focuses on the improvements to be made to the JULES model. It is important that model improvement papers not only focus on the model in question but also what we can learn for improving similar models.

Authors:
The first novelty of this work is that we used (for the first time to our knowledge) the CHESS-met database (Robinson et al., 2017a; Robinson et al., 2017b). This database, with gridded precipitation from CEH-GEAR (from gauged rainfall data) and meteorological variables at 1 km² spatial resolution, is unique and is the tool that allows us to use a Land Surface Model at such high resolution and investigate GB hydrology. A recent study has been carried out looking at further consequences of our model development for the whole of GB (particularly evapotranspiration), as we mentioned in the "model development" section (Blyth et al., 2018). We added the following lines to the beginning of the "Discussion" section: "To our knowledge, this is the first study using the CHESS-met dataset (Robinson et al., 2017a; Robinson et al., 2017b) to drive a LSM over a wide region (the 13 selected catchments). This dataset availability opens new possibilities to study land surface hydrology and interactions with the atmosphere using LSMs (that typically require gridded forcing datasets) at the km-scale driven by gridded rainfall derived from gauge stations. A recent study (Blyth et al., 2018) investigates evapotranspiration trends and components in Great Britain over the last 55 years using CHESS-met and the JULES runoff development described in this paper. These authors find that, when compared to flux tower data, the model overestimates evapotranspiration rates. The new runoff development reduced the negative runoff bias as shown here, mostly from increased surface runoff during the rainy season over mountainous regions. Hence, the evapotranspiration rates in the Blyth et al. (2018) study have been impacted in the right direction by lower soil moisture availability. We acknowledge that topographic variability at the grid scale is not new to JULES or other LSMs, as it is considered by the TOPMODEL scheme. However, we have found that for Great Britain regional integrations the surface runoff production by PDM allows for a better characterization of the topographical variability through the parameter. This finding within the JULES model and the Great Britain region framework can have significant impacts over other regions and applied to other models that need to account for subgrid variability in the runoff generation process, using a widely available parameter (from digital elevation model datasets) like the grid cell mean slope as the only input, whereas other physical characteristics might be more difficult to obtain or are simply unavailable."

Specific Comments
1. The introduction and methods section need to have a clearer rationale for the choice of tests and sensitivity analyses undertaken in the paper.
Authors: We chose not to include the sensitivity analysis rationale in the introduction section, but instead to write a simpler presentation of the science issue for the reader: the importance of land surface processes in the earth system and of coupled land-atmosphere models that produce river flow; the representation of runoff generation in LSMs, and particularly in the model that is the object of the study; the problem of heterogeneity in precipitation input and land characteristics in Great Britain, and how LSMs that are coupled to the atmosphere in operational systems need to work robustly over such heterogeneous regions; and finally a brief description of the rest of the paper, where we first mention the sensitivity analysis. Then in Section 2.2.2 we describe the sensitivity tests (see response to specific comment 2).

There needs to be a better description of your choice of parameter ranges and number of parameters throughout the manuscript. For example, why choose 25 variations of the b shape parameter?

Authors:
We noticed an error on page 5 (line 12) that might have led to confusion and we have corrected it as follows: "We choose four possible values for the parameter within the 0-1 range that it can take in the form of fraction of saturation (0.0, 0.25, 0.5, 0.75)". Figure 2 is intended to show how the parameter variability chosen for the PDM tests covers the complete spectrum of possible fsat values as a function of soil water content (i.e. differences tend to diminish as we go towards higher values of b).
For the TOPMODEL scheme tests, our choice of f range is consistent with other JULES studies referenced in the text and with other studies not using JULES (Chen and Kumar, 2001;Ducharne et al., 2000;Niu and Yang, 2003;Stieglitz et al., 1997;Warrach et al., 2002).
We have also noticed an error in the range of the anisotropy ratio parameter in the Figure 4 label that might have led to confusion, and have corrected it.
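The roles of the b shape parameter and the S0 saturation threshold discussed in this response can be illustrated with a short sketch. It assumes the commonly cited PDM form fsat = 1 - [1 - (S - S0)/(Smax - S0)]^(b/(b+1)), which may differ in detail from the JULES implementation. The slope ramp in `s0_fraction_from_slope` and its `slope_max_deg` argument are purely hypothetical placeholders meant to show the idea of a slope-dependent threshold; they are not the function or values used in the manuscript.

```python
def pdm_saturated_fraction(s_frac, b, s0_frac):
    """Saturated grid-cell fraction under an assumed PDM formulation.

    s_frac  : soil water storage as a fraction of maximum storage (S / Smax)
    b       : PDM shape parameter (>= 0)
    s0_frac : storage threshold below which no saturation occurs (S0 / Smax)
    """
    if not 0.0 <= s0_frac <= 1.0:
        raise ValueError("s0_frac must lie in [0, 1]")
    if b < 0:
        raise ValueError("b must be non-negative")
    s_frac = min(max(s_frac, 0.0), 1.0)
    if s_frac <= s0_frac:
        return 0.0  # soil too dry: no saturation-excess surface runoff
    relative = (s_frac - s0_frac) / (1.0 - s0_frac)
    return 1.0 - (1.0 - relative) ** (b / (b + 1.0))


def s0_fraction_from_slope(slope_deg, slope_max_deg):
    """Hypothetical linear ramp: high threshold on flat terrain, zero on steep terrain."""
    if slope_max_deg <= 0:
        raise ValueError("slope_max_deg must be positive")
    return max(0.0, 1.0 - slope_deg / slope_max_deg)
```

Under these assumptions, a steep cell gets a threshold near zero and saturates partially at almost any wetness, while a flat cell produces no surface runoff until the soil is close to saturation. This mirrors the qualitative behaviour described in the abstract.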

Why did you just use a naturalised flow record for Thames at Kingston? The River Severn at both the Bewdley and Haw Bridge gauges and the River Ock at Abingdon are both heavily influenced by human activities, which will certainly affect your performance metrics and results. Does the model include anthropogenic processes such as reservoirs and abstractions? If not, then why not limit your analyses to natural catchments?
Authors: No, the JULES model does not include such anthropogenic processes in its current state. We are certainly aware of the need to include them, and a UK National Capability NERC project has just been approved which, amongst other hydrological modelling capabilities, will look into the introduction of human influences in the JULES model. We agree that a naturalised flow record from other stations apart from the Kingston station would provide a more comparable river flow series with our model, but there is no availability over the studied period for such records from the mentioned heavily influenced stations (see the NRFA station info: http://nrfa.ceh.ac.uk/data/station/info/$stationcode$). We decided to use the naturalised record for the Thames given its availability and its importance in this study, as it is the flattest catchment and the one that made evident the need for a higher constraint using the S0 parameter. In addition, we were looking to follow the approach by Crooks et al. (2014) as we compare to their results in Figure 8.

It is really difficult to discern any differences between the observed and simulated flow in Figure 7. It would be better to plot a shorter time period so you can at least see how the model performs relative to the observed.

Authors:
We decided to include the whole period in Figure 7 as it gives the reader a better idea of the magnitude of the daily time series we are discussing in the paper, and of the interannual and catchment variability in terms of river flow. We agree that in some instances it is difficult to discern differences, and that is why we included the metrics on top of each panel, but we think that the overall model skill in simulating baseflow and flow peaks is visible. We already provide a closer look at the model performance in its different parametrizations as compared to observations for just 2 years in Figure 9 for the larger catchments.

Thanks for spotting this. This reflects a mistake we made when typing the table. We have corrected the table and the reference to the Ock area on page 9 (line 15).

5. Page 9 L15 The authors also state that the spatial resolution (1 km²) may be too coarse to represent small catchments like the Ock. The majority of NRFA catchments are smaller than this 'small' catchment in the UK, so in this context I don't think it is such a small catchment. Can the authors comment on the applicability of the model to represent runoff in the majority of catchments in the UK?
Authors: We expressed our idea incorrectly. We meant that this catchment is too small to integrate the behaviour of a wider area as represented in our distributed modelling. The reviewer is right as this is not an issue of the catchment being small, but rather for it being upstream (lowest mean flow of the studied catchments).
Finally, I imagine that the reason for the discrepancy is probably more to do with the significant human influences (groundwater abstraction and recharge) affecting flows at this site rather than the coarseness of the model.

Authors:
We agree that the human influences will have a role in the discrepancy. In addition, this is a catchment highly maintained by groundwater (Robinson and Stam, 1993). Our approach can successfully reproduce river flow by constraining surface runoff and producing sub-surface runoff via free drainage over larger groundwater catchments like the Thames, which present a flatter terrain overall. However, we believe that over smaller catchments like the Ock, where the slope is still high (see Figure 6), the model still needs the inclusion of a dynamic groundwater scheme, as discussed on page 12 (lines 5-9). We have modified the text on page 9 (line 15) as follows: "The Ock is the smaller catchment of the selection (234 km²), located upstream within the Thames catchment (mean observed river flow of 0.6 m³ s⁻¹), and this result indicates that for upstream small catchments the slope dependency alone does not necessarily solve the problem (the Ock does not present low mean slope as the Thames does, see Fig. 6). However we point out that for the Thames catchment our new parametrization provides the best result of all catchments (NS = 0.82)."

It would be useful to add the average annual rainfall totals to Table 1.
Authors: Yes, we agree. It has been added.

P11 L 23 'all flavours of JULES' does not make sense to me.

Authors:
By "all flavours of JULES represented" we meant the range of parameter configurations represented on the x axis of Fig. 11. We will change this to "all parametrizations of JULES represented", keeping consistency with the rest of the text.
8. The spectral analysis felt a little redundant given all the other tests and is barely mentioned in the discussion or conclusions. I recommend removing this from the paper or better incorporating these results into the conclusions.

Authors:
We have added text explaining how the current analysis of modelled discharge differs from Weedon et al. (2015) for the Thames catchment in Section 3.6: "Note that JULES discharge performance against observations was assessed with cross spectral by Weedon et al. (2015), but the model was run at daily time steps which caused numerical artefacts in discharge variability (excessive high-frequency attenuation). Here RFM routing was applied sub-daily thereby avoiding the artefacts." We have changed the last sentence in the "Conclusions" section to the following paragraph, which clarifies the significance of including cross-spectral analysis: "We have also shown that cross spectral analysis for evaluating model performance against observations quantifies the mismatches in variability, and separately mismatches in phase, at different time scales that are not otherwise apparent from global metrics such as NS and RMSE. Potentially the recognition of a specific time scale where a model is performing badly could help identification of the incorrect behaviour in terms of water transport and/or sub-surface storage. The cross-spectral analysis comparing the modelled river flow with observations has reinforced the choice of the new parametrization for surface runoff production."
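To make concrete what "mismatches in variability, and separately mismatches in phase" at a given time scale means, here is a minimal sketch that compares two flow series at a single period using a raw discrete Fourier coefficient. It is not the procedure used in the manuscript or in Weedon et al. (2015), which would normally involve spectral smoothing and coherence estimates across many frequencies. The function names are illustrative, and the estimate is exact only when the series length is a whole number of periods.

```python
import cmath
import math


def fourier_coefficient(series, period):
    """Complex amplitude of the mean-removed series at one period (in time steps)."""
    n = len(series)
    mean = sum(series) / n
    omega = 2.0 * math.pi / period
    return sum((x - mean) * cmath.exp(-1j * omega * t)
               for t, x in enumerate(series)) / n


def cross_spectrum_at_period(observed, modelled, period):
    """Return (amplitude_ratio, phase_lag) of modelled relative to observed.

    amplitude_ratio < 1 means the model damps variability at this period;
    phase_lag > 0 means the model lags the observations, in time steps.
    """
    if len(observed) != len(modelled) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    f_obs = fourier_coefficient(observed, period)
    f_mod = fourier_coefficient(modelled, period)
    if abs(f_obs) == 0:
        raise ValueError("observed series has no variability at this period")
    cross = f_mod * f_obs.conjugate()  # cross-spectral estimate at one frequency
    amplitude_ratio = abs(f_mod) / abs(f_obs)
    phase_lag = -cmath.phase(cross) * period / (2.0 * math.pi)
    return amplitude_ratio, phase_lag
```

A model with a good overall NS score can still show an amplitude ratio well below one or a multi-day lag at a particular period, which is the kind of time-scale-specific mismatch the quoted paragraph refers to.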

Table 1 - The authors list the catchment area of the Ock at Abingdon as 639 km²,