<p>Snow models are usually evaluated at sites providing high-quality meteorological data, so that the uncertainty in the meteorological input data can be neglected when assessing model performance. However, high-quality input data are rarely available in mountain areas and, in practical applications, the meteorological forcing used to drive snow models is typically derived from spatial interpolation of the available in-situ data or from reanalyses, whose accuracy can be considerably lower. To fully characterize the performance of a snow model, its sensitivity to errors in the input data should be quantified.</p> <p>In this study we test the ability of six snow models to reproduce snow water equivalent, snow density and snow depth when they are forced by meteorological input data of gradually lower accuracy. The SNOWPACK, GEOTOP, HTESSEL, UTOPIA, SMASH and S3M snow models are first forced with high-quality measurements from the experimental site of Torgnon, located at 2160 m a.s.l. in the Italian Alps (control run). The models are then forced by data at gradually lower temporal and/or spatial resolutions, obtained (i) by sampling the original Torgnon 30-minute time series at 3, 6 and 12 hours, (ii) by spatially interpolating measurements from neighboring in-situ stations and (iii) by extracting information from the GLDAS, ERA5 and ERA-Interim reanalyses at the gridpoint closest to the Torgnon station. 
Since the selected models span different degrees of complexity, from highly sophisticated multi-layer snow models to simple, empirical, single-layer snow schemes, we also discuss the results of these experiments in relation to model complexity.</p> <p>Results show that, when forced by accurate 30-minute weather station data, the single-layer, intermediate-complexity snow models HTESSEL and UTOPIA achieve skill similar to that of the more sophisticated multi-layer model SNOWPACK, and these three models show better agreement with observations and more robust performance across seasons than the lower-complexity models SMASH and S3M. All models forced by 3-hourly data achieve skill similar to that of the control run, while with 6- and 12-hourly forcings we generally observe a reduction in model performance, except for the SMASH model, which shows low sensitivity to the temporal degradation of the input data. Spatially interpolated data from neighboring stations and reanalyses prove to be adequate forcings, provided that the temperature and precipitation variables are not affected by large biases over the considered period. However, a simple bias-adjustment technique applied to ERA-Interim temperatures allowed all models to achieve performance similar to that of the control run. All models, irrespective of their complexity, show weaknesses in the representation of snow density.</p>
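<p>As an illustration only, the two data manipulations mentioned above (temporal sampling of the 30-minute series and a simple bias adjustment of reanalysis temperatures) could be sketched as follows. The abstract does not specify the procedures actually used, so this sketch assumes plain sub-sampling without averaging and a constant mean-bias shift; the function names <code>subsample</code> and <code>mean_bias_adjust</code> and the synthetic values are hypothetical and are not taken from the study.</p>

```python
from statistics import mean


def subsample(series, step_minutes, base_minutes=30):
    """Keep every n-th value of a regularly spaced series.

    Assumption: pure sampling of instantaneous values, with no averaging
    or accumulation over the longer time step.
    """
    if step_minutes % base_minutes != 0:
        raise ValueError("step_minutes must be a multiple of base_minutes")
    stride = step_minutes // base_minutes
    return series[::stride]


def mean_bias_adjust(reanalysis_temp, observed_temp):
    """Shift reanalysis temperatures by their mean difference from observations.

    Assumption: both series cover the same period at the same time steps,
    and a single constant offset is an adequate correction.
    """
    if len(reanalysis_temp) != len(observed_temp):
        raise ValueError("series must have the same length")
    bias = mean(reanalysis_temp) - mean(observed_temp)
    return [t - bias for t in reanalysis_temp]


# Synthetic one-day series at 30-minute resolution (48 values).
half_hourly = [float(i) for i in range(48)]
three_hourly = subsample(half_hourly, 180)   # every 6th value
twelve_hourly = subsample(half_hourly, 720)  # every 24th value

# Synthetic temperatures (degrees C): the reanalysis is uniformly too warm.
adjusted = mean_bias_adjust([2.0, 4.0, 6.0], [0.0, 2.0, 4.0])
```

<p>Under these assumptions the adjustment removes only the mean offset and leaves the variability of the reanalysis series unchanged. Accumulated variables such as precipitation would likely need summing over the longer time step rather than sampling.</p>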