Basis for a streamflow forecasting system to Rincón del Bonete and Salto Grande (Uruguay)



ORIGINAL PAPER

Basis for a streamflow forecasting system to Rincón del Bonete and Salto Grande (Uruguay)

Stefanie Talento & Rafael Terra

Received: 4 June 2012 / Accepted: 11 December 2012 / © Springer-Verlag Wien 2012

Abstract This paper presents the basis for the design of streamflow prediction systems for the hydroelectric dams of Rincón del Bonete (Uruguay) and Salto Grande (Uruguay–Argentina). The prediction is made, independently, for each reservoir and each month of the year with two methodologies: data-driven statistical models and hybrid downscaling that includes atmospheric predictors. We determine a set of potential predictors and then fit linear models coupled with variable selection techniques, under the hypothesis of perfectly known predictors. The predictive skill of the schemes outperforms the climatological forecast throughout the year in both reservoirs (except August in Rincón del Bonete). This remains the case even when the forecast lead does not allow for the use of preceding flows as predictors. While in Rincón del Bonete it is not possible to distinguish a period of high predictability, in Salto Grande, there is a robust signal in March–May and October–December.

1 Introduction

In spite of the chaotic nature of the atmosphere (Lorenz 1963), monthly or seasonal averages of some hydrometeorological variables are potentially predictable in many regions of the planet. The main source of this predictability relies on the slowly varying components of the climate system. For example, Westra and Sharma (2010) use sea surface temperature (SST) anomalies to predict precipitation, and Koster et al. (2011) analyse the contribution of soil moisture to the skill of precipitation and air temperature forecasts. Therefore, if the anomalies of the boundary conditions that force the atmosphere are predictable, then certain aspects of the climate could also be predictable. As a consequence, forecasts of monthly or seasonal streamflow averages may be possible, in a probabilistic sense.

The models to predict monthly or seasonal streamflow are categorized as statistical (also known as data-driven) or dynamical (process-driven) (Wang 2006).

Statistical models are built by analysing historical records and identifying relationships among certain predictor variables and the variable to predict (predictand). Within this category, we can find the time series models, in particular, the autoregressive models, which do not include any external climatic information (observed or forecasted) and, hence, are almost useless in cases where the streamflow persistence is weak and the climatic forcing is important. Typically, data-driven models in which climatic information is used include, at least, an indicator of the state of the SST (see, for example, Lima and Lall 2010; Soukup et al. 2009; Westra and Sharma 2009).

Dynamical models attempt to represent the chain of causalities in the climate system with a physical basis. These types of models conceive the streamflow as the output of a watershed system and mathematically approximate the internal physical processes based on some degree of understanding of such processes (Wang 2006). Currently, these schemes are conceived in two steps. In the first step, forecasts of the future atmospheric state are generated. In the second step, these forecasts are entered as input to a macroscale hydrological model which relates the atmospheric variables with streamflow. An example of this methodology can be found in Wood et al. (2002).

S. Talento (*)
Unidad de Ciencias de la Atmósfera, Facultad de Ciencias, Universidad de la República, Iguá 4225, Montevideo 11400, Uruguay
e-mail: [email protected]

S. Talento
Instituto de Matemática y Estadística Rafael Laguardia, Facultad de Ingeniería, Universidad de la República, Julio Herrera y Reissig 565, Montevideo 11300, Uruguay

R. Terra
Instituto de Mecánica de los Fluidos e Ingeniería Ambiental, Facultad de Ingeniería, Universidad de la República, Julio Herrera y Reissig 565, Montevideo 11300, Uruguay

Theor Appl Climatol, DOI 10.1007/s00704-012-0822-8


Another strategy for predicting monthly or seasonal streamflow is usually denominated hybrid downscaling (Goddard et al. 2001). In this case, the forecasting process consists of two stages where the statistical and the dynamical approaches are combined. In the first stage, a data-driven model is constructed, statistically relating the predictand (streamflow) with certain aspects of the climate (considering, among others, the future atmospheric state). In the second stage, the evolution of the atmosphere–ocean coupled system is predicted based on a physically based numerical model. Finally, the forecast of the future climate state is used as input for the statistical model to generate the streamflow prediction.

The statistical part of the forecasting scheme can be conceived either as index-based or not. On one hand, the index approach has the advantage of simplicity but implies only a one-dimensional representation of the variability and, in cases where several indices are used, it may pose an ill-conditioned problem because of high mutual correlations. Examples where a statistical relationship is found between streamflow and climate indices can be found in Landman et al. (2001). On the other hand, statistical techniques that avoid the usage of indices represent a multivariate approach which may be advantageous in several aspects but lacks the simplicity of the index approach. Some examples can be found in Westra and Sharma (2009), Westra et al. (2007), Richman (1986) and Barnston (1994).

The methodology selection is strongly conditioned by the time scales involved and the degree of understanding of the physical processes in which the predictability is based; thus, the optimal scheme varies in each case. On one hand, the statistical models can be misleading in cases where the climate used to train the model is not representative of the future variability. On the other hand, the dynamical models still have difficulties in simulating the observed climate, compromising their skill. Additionally, it is worth noting that the physically based numerical models are not fully exempt from the same problems of the statistical models, since parameters in subgrid scale processes are also empirically adjusted.

The aim of this study is the design of a prediction system of streamflow to Rincón del Bonete (Negro River, Uruguay) and Salto Grande (Uruguay River, Uruguay and Argentina) basins. A compromise has to be reached in the length of the averaging period for streamflow. Monthly values are more affected by the unpredictable component of atmospheric variability compared to longer (bi-monthly or seasonal) averages, but are preferred by the reservoir operators and were therefore chosen.

Prediction systems will be designed using the data-driven and hybrid downscaling approaches. Despite the fact that streamflows in Rincón del Bonete and Salto Grande are not independent, each reservoir is considered separately. Independence not related to the common predictors could increase the variance of the aggregated behaviour of both reservoirs, which may be relevant for decision making. In addition, each month is also treated independently, attempting to capture the strong seasonality of climate forcing in the selection of predictor variables. In summary, the prediction of monthly streamflows will be performed independently for each reservoir and each month of the year, following the methodologies of data-driven model and hybrid downscaling.

There are infinitely many variables, many of them related to each other, which could be relevant for the prediction of the streamflows. We determine an initial set of predictor variables from the analysis of the regional atmospheric circulation, the evolution of the global SST field and the persistence component represented by streamflows in previous months, the latter conditional on the forecast lead.

Despite the close relationship between precipitation and streamflow in this region, the former is not incorporated as a potential predictor because the aim of the work is to produce forecasts with lead times longer than the weather predictability threshold. In addition, it is expected that the seasonal climate predictability will be captured directly by the other predictors considered. It is known that soil moisture conditions may have important implications for regional hydroclimatic evolution (Grimm et al. 2007); hence, the inclusion of an index representing this condition would be desirable. Here, we consider that this state is, at least partially, represented by the antecedent flows, which are more readily available.

El Niño Southern Oscillation (ENSO) is a quasi-periodic interaction between the atmosphere and the tropical Pacific Ocean which is of fundamental importance to the global hydroclimate. ENSO events are defined when there are significant SST anomalies in the equatorial tropical Pacific; when positive (negative) anomalies occur, it is called El Niño (La Niña). The degree to which an ENSO event impacts the hydroclimate of a region depends on the time of year, amplitude and spatial distribution of the SST anomalies associated with it.

Several studies have documented, based on observations, hydroclimatic patterns associated with ENSO anomalies (Aceituno 1988, 1989; Ropelewski and Halpert 1987, 1989), in particular, in the southeastern region of South America (SESA), comprised of southern Brazil, Uruguay and part of northeastern Argentina.

Pisciottano et al. (1994) and Cazes-Boezio et al. (2003) found that ENSO has statistically significant effects on the climate of SESA during the austral spring of an event year and, albeit weaker, during the following autumn, with positive precipitation anomalies over SESA during El Niño and negative during La Niña. Also, the relationship between ENSO and river streamflow in SESA has been recognized. Mechoso and Pérez-Irribarren (1992) found a tendency towards negative streamflow anomalies from June to December of a La Niña year, and a slightly weaker positive anomaly from November of an El Niño year to the next February in the Uruguay and Negro rivers.


The set of predictor variables in the data-driven model will include an index representative of ENSO and the streamflows from the previous months. The index representative of ENSO can either be observed by forecast time or should be itself a prediction. The streamflow in the previous months will only be available if the forecast lead permits. However, to assess the streamflow predictability and the skill of the models, we will work under the hypothesis that the predictors are known. This procedure will generate an upper bound of the forecast skill, since in operational mode, some predictors may not be known, and it is not possible to obtain perfect forecasts for them.

For the hybrid downscaling prediction scheme, we will add indices of regional atmospheric circulation to the predictors considered in the data-driven model. In operational mode, forecasts of the atmospheric indices should be generated through the use of an atmospheric general circulation model (AGCM) forced with global SST forecasts. Again, to quantify the prediction skill of the scheme, we will work under the assumption of known predictors. Finally, we will assess the ability of a particular AGCM [the University of California at Los Angeles (UCLA)] to predict the regional atmospheric circulation indices.

It is worth noting that in this paper, we will follow a statistical scheme driven by indices and in which there is only one response variable to be predicted (as both reservoirs are treated separately). Other approaches not focused on indices or in which the streamflows at both reservoirs are considered jointly are also possible. Also, we would like to emphasize that although the use of an ensemble of AGCMs would have been the preferred option, the computational demand made it unfeasible and, therefore, many of the results of this paper are highly model dependent.

The paper is organized as follows: In Section 2, we present the data and the main features of the UCLA-AGCM. In Section 3, we discuss potential predictors of streamflows and select 12 of them for each month and reservoir. In Section 4, we describe how to estimate the prediction error and the statistical prediction technique used to relate the predictor variables with streamflow. In Section 5, we evaluate the skill of the UCLA-AGCM to simulate the atmospheric predictor indices. In Section 6, we highlight the most relevant results. Finally, in Section 7, we present a summary and conclusions.

2 Data and model

2.1 Observational data sets

The monthly streamflow time series of Rincón del Bonete corresponds to the Negro River flow in the current location of the Dr. Gabriel Terra dam and is available from January 1908 to December 2007. Meanwhile, the monthly streamflow time series of Salto Grande corresponds to the flow in the Uruguay River in the current location of the binational Salto Grande dam and covers the period January 1909–December 2008. Data were obtained through UTE (National Electrical Utility) and the Joint Technical Commission of Salto Grande. Figure 1 shows the geographical location of both dams.

Neither streamflow time series is naturalized, and both could, therefore, be impacted by changes in land use and water regulation upstream. In particular, a number of dams have been recently built in the Salto Grande watershed. Modification in runoff due to human activities could potentially distort the climatic signal, but we expect this effect to be weak in the cases studied.

For the analysis of atmospheric circulation, we use the NCEP-NCAR monthly reanalysis (Kalnay et al. 1996) of the zonal and meridional components of the wind and the geopotential height at 200 hPa. We selected only upper-level (200 hPa) fields because those fields are both highly correlated with precipitation (and, therefore, streamflow) and, at the same time, are usually better simulated by AGCMs than low-level fields.

Although the reanalysis has been available since 1948, satellite data were incorporated only in 1979. This change is particularly relevant for upper-level atmospheric variables in the Southern Hemisphere. Before 1979, upper-level observations were limited to meteorological radiosondes, which were extremely rare in the entire Southern Hemisphere, affecting the quality of the reanalysis of upper-level fields in the region.

We use the NOAA Niño 3.4 index, which is obtained by averaging the SST (from ERSST V3B) in the 5°S–5°N and 190°E–240°E region. This index is available on a monthly basis from January 1950 to the present through http://www.cpc.noaa.gov/data/indices/.
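As an illustration of how a box-averaged index of this kind can be computed, the following minimal sketch averages a gridded SST field over the 5°S–5°N, 190°E–240°E region. It is not the NOAA processing chain: the function name `box_mean_sst`, the array layout and the cos(latitude) area weighting are assumptions made for the example.

```python
import numpy as np

def box_mean_sst(sst, lats, lons, lat_range=(-5.0, 5.0), lon_range=(190.0, 240.0)):
    """Area-average an SST field over a latitude/longitude box.

    sst  : array (n_time, n_lat, n_lon)
    lats : array (n_lat,), degrees north
    lons : array (n_lon,), degrees east in [0, 360)
    """
    lat_mask = (lats >= lat_range[0]) & (lats <= lat_range[1])
    lon_mask = (lons >= lon_range[0]) & (lons <= lon_range[1])
    box = sst[:, lat_mask][:, :, lon_mask]
    # cos(latitude) weights approximate grid-cell area (assumption of this sketch)
    weights = np.cos(np.deg2rad(lats[lat_mask]))[None, :, None]
    weights = np.broadcast_to(weights, box.shape)
    return (box * weights).sum(axis=(1, 2)) / weights.sum(axis=(1, 2))

# Usage with a synthetic, spatially uniform field (hypothetical data)
lats = np.arange(-89.0, 90.0, 2.0)
lons = np.arange(0.0, 360.0, 2.0)
sst = np.full((3, lats.size, lons.size), 27.0)
index = box_mean_sst(sst, lats, lons)   # one value per time step
```

Anomalies with respect to a monthly climatology would still have to be computed from the resulting series to obtain an index comparable to the published one.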

In addition, for the numerical simulations, we use the global observed SST from Reynolds et al. (2002).

2.2 Selection of the period of study

All selected variables are available starting in 1950. However, as mentioned earlier, the quality of atmospheric reanalysis for upper-level variables in this region might not be appropriate before the incorporation of information from satellites. Indeed, differences in the relationship between atmospheric fields and flows in Rincón del Bonete and Salto Grande were found between the periods 1948–1978 and 1979–2007 (not shown). These differences may be due to the inclusion of satellite data in the reanalysis after 1979, but may also be associated with the climate shift of the late 1970s (Trenberth 1990; Miller et al. 1994), interdecadal natural variability or anthropogenic climate change. Considering these aspects, we selected January 1979 as the beginning of the period of study.


2.3 Atmospheric model

The simulations in Section 5 are performed with the UCLA-AGCM. The version used has a resolution of 2° in latitude, 2.5° in longitude and 29 levels in the vertical, which extend from the ground up to about 50 km above the mean sea level. The main features and parameterizations of subgrid-scale physical processes are described in Farrara et al. (2000) and Konor et al. (2009). We simulate the period January 1979–December 2008 forcing the model with observed monthly SST. The model deduces the daily variability of the SST field from the monthly values by the procedure described in Farrara et al. (2000). Six simulations are performed varying the initial conditions and then the ensemble mean is computed.

The simulated atmospheric circulation is then contrasted with the actual observed atmospheric circulation. This type of evaluation of the skill of a model, when extended for a long enough period, shows how the model responds to SST when SST is known. In an operational forecast scheme, the SST is not known but must also be predicted, hence introducing an additional source of error. In conclusion, estimates of the reliability of climatic forecasts obtained from “perfect forecast” SST experiments should be considered as an upper bound. More realistic estimates of the skill of a forecast can be obtained through an exercise of “retrospective forecast”, in which SST forecasts are used as boundary conditions for the model instead of observed SSTs.

3 Identification and preliminary analysis of predictors

In this section, we select an initial set of predictor variables for each reservoir and month of the year considering the regional atmospheric circulation, ENSO and previous flows.

3.1 Regional atmospheric circulation

Considering the delay between the occurrence of an atmospheric phenomenon and its manifestation in terms of outflow of a hydrological basin, 2-month averages of atmospheric fields were related to the observed flow during the second month.

Fig. 1 Rincón del Bonete and Salto Grande location map

We first select the region between 50°S–10°S and 280°E–330°E, which includes the portion of the South American continent located south of 10°S and name it SA. The atmospheric circulation in this region is expected to be highly correlated with the streamflows at both reservoirs, given that both basins are fully embedded inside it. Calculations were repeated varying the region, and SA was selected as a compromise between correlation with streamflows and predictive skill, i.e. the smaller the region, the higher the correlations with streamflow and the larger the region, the higher the predictive skill of the atmospheric circulation.

We then perform a principal component analysis (PCA) on the covariance matrix in the region SA of the three atmospheric fields considered: zonal wind, meridional wind and geopotential height at 200 hPa. This technique is expected to substantially reduce the number of variables without losing the most relevant information on interannual variability. We applied the PCA to departures from the annual cycle (using the 1979–2008 climatology) for each 2-month period of the year and each atmospheric field.
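The preprocessing step described above (2-month averages expressed as departures from the annual cycle) can be sketched as follows. This is only an illustrative sketch, not the code used in the study: the function name `bimonthly_anomalies`, the array layout and the convention that the bimester labelled by month m averages month m and the month before it are assumptions.

```python
import numpy as np

def bimonthly_anomalies(monthly):
    """Two-month means minus their climatology.

    monthly : array (n_years, 12, n_points), monthly means of one field.
    Returns an array of the same shape; entry [y, m] is the anomaly of the
    bimester ending in month m of year y. The first January is NaN because
    the preceding December lies outside the record.
    """
    monthly = np.asarray(monthly, dtype=float)
    n_years = monthly.shape[0]
    flat = monthly.reshape(n_years * 12, -1)        # continuous time axis
    pair_mean = np.full_like(flat, np.nan)
    pair_mean[1:] = 0.5 * (flat[1:] + flat[:-1])    # mean of month t and t-1
    bimester = pair_mean.reshape(monthly.shape)
    climatology = np.nanmean(bimester, axis=0)      # annual cycle per bimester
    return bimester - climatology

# Usage with synthetic data (hypothetical): 30 years, 12 months, 100 grid points
rng = np.random.default_rng(0)
anom = bimonthly_anomalies(rng.normal(size=(30, 12, 100)))
```

For a given bimester m, the slice `anom[:, m, :]` is then the years-by-gridpoints matrix on which a PCA could be run.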

Retaining at least 50 % of the total variance generally requires only the first two or three principal components (Pcs). For uniformity, we decided to retain, in all cases, the first three Pcs.

In summary, for each of the 12 bimesters of the year, we generated nine indices (the retained Pcs) that reflect the interannual variability in the region SA: three of them associated with the zonal wind, three to the meridional wind and another three to the geopotential height, all of them in the 200-hPa level. These nine indices will be used as predictors of flows in Rincón del Bonete and Salto Grande. The procedure for obtaining the indices does not guarantee in any way that they will be useful for streamflow prediction. However, it is expected that they collectively capture the known relationship between streamflow and atmospheric circulation.
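A PCA on the covariance matrix, retaining the first three Pcs, can be sketched with a singular value decomposition of the anomaly matrix. The sketch below is a generic illustration rather than the authors' implementation; the function name `leading_pcs` and the input layout (years in rows, grid points in columns, for one bimester and one field) are assumptions.

```python
import numpy as np

def leading_pcs(field, n_keep=3):
    """Leading principal components of a (n_years, n_gridpoints) matrix.

    Returns the PC time series (n_years, n_keep) and the fraction of total
    variance explained by each retained component.
    """
    anomalies = field - field.mean(axis=0)   # remove the climatological mean
    # The SVD of the anomaly matrix is equivalent to an eigen-decomposition
    # of its covariance matrix (no normalization by standard deviation).
    u, s, vt = np.linalg.svd(anomalies, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    pcs = u[:, :n_keep] * s[:n_keep]         # time series of each component
    return pcs, explained[:n_keep]

# Usage with synthetic data (hypothetical): 30 years, 200 grid points
rng = np.random.default_rng(0)
pcs, frac = leading_pcs(rng.normal(size=(30, 200)))
```

Applying such a function to each of the three fields would yield the nine indices per bimester. Dividing each grid-point series by its standard deviation before the decomposition would correspond to a PCA on the correlation matrix instead.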

Besides the PCA performed here, other variations were tested, e.g. we calculated the PCA over the correlation matrix, we considered the zonal and meridional wind components together, etc. However, as no important differences were detected in the results, we decided to show only the results obtained with the procedure described above.

3.2 ENSO

Given the ease with which Niño 3.4 index forecasts can be obtained, and in order to simplify the selection of predictor variables, we identify a potential predictor of streamflows based on the Niño 3.4 index. However, it should be noted that, for certain months of the year, SST in other regions of the Pacific Ocean could be better correlated with the flow.

Interannual variations of some atmospheric variables (such as tropospheric mean temperature in the tropics) tend to follow changes in the eastern equatorial Pacific Ocean SST, resulting in maximum responses after one or two seasons (Kumar and Hoerling 2003; Su et al. 2005 and references therein). Relationships between ENSO and SST in regions different from the tropical Pacific, where the former leads the latter, are also known (Su et al. 2005 and references therein). This prompts us to find the optimal lead of Niño 3.4 index as a predictor for streamflow.

Figures 2 and 3 show the lag correlation values between monthly flows in Rincón del Bonete and Salto Grande, respectively, and the bimonthly Niño 3.4 index up to a year in advance. Bimonthly averaging is performed only to smooth the time series. Statistical significance was calculated based on Student's test with 29 (30) degrees of freedom for Rincón del Bonete (Salto Grande); hence, correlation values above 0.32 (0.31) are statistically significant at the 95 % confidence level. For both reservoirs, all statistically significant correlations are positive.

For Rincón del Bonete, the streamflow–Niño 3.4 index relationship is highly seasonal, with significant correlations only during the months of November through February (summer). Even within the summer season, we find differences: while in November and December the maximum correlation is obtained with lags of 2 or 3 months, for January and February the correlations with small lags are relatively low, and even not significant, but increase substantially and become maximum with lags of 6 or 8 months, respectively.

For Salto Grande, there are significant correlations between streamflow and Niño 3.4 index in all months except September. There is no clear seasonal pattern in the relationship in this case, although correlations are milder for the flows between June and October.

We select the bi-monthly lagged Niño 3.4 index with maximum significant correlation (in the absolute value sense) with the streamflow as the predictor associated with ENSO. In those cases in which no lag shows significant correlation, the 1-month lag Niño 3.4 index is used.
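The selection rule just described can be written compactly. The following sketch assumes that the lagged bimonthly index series have already been aligned with the flow of the target month; the function name `select_enso_lag`, the dictionary layout and the use of a fixed critical correlation `r_crit` (for example the 0.32 or 0.31 quoted above) are choices made for this illustration.

```python
import numpy as np

def select_enso_lag(flow, nino_by_lag, r_crit, default_lag=1):
    """Pick the Niño 3.4 lag with the largest significant |correlation|.

    flow        : (n_years,) flow of one calendar month
    nino_by_lag : dict {lag in months: (n_years,) bimonthly index series
                  preceding the flow by that lag}; must contain default_lag
    r_crit      : correlation magnitude required for significance
    Returns (selected lag, correlation at that lag).
    """
    corr = {lag: np.corrcoef(flow, x)[0, 1] for lag, x in nino_by_lag.items()}
    significant = {lag: r for lag, r in corr.items() if abs(r) > r_crit}
    if not significant:
        # No lag is significant: fall back to the 1-month lag
        return default_lag, corr[default_lag]
    best = max(significant, key=lambda lag: abs(significant[lag]))
    return best, significant[best]

# Usage with synthetic data (hypothetical): lag 3 carries the signal
rng = np.random.default_rng(1)
nino = {lag: rng.normal(size=29) for lag in range(1, 13)}
flow = 2.0 * nino[3] + rng.normal(scale=0.5, size=29)
lag, r = select_enso_lag(flow, nino, r_crit=0.32)
```

Running the function once per reservoir and calendar month would give the N3.4 predictor for each of the 24 models.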

3.3 Antecedent flows

Figures 4 and 5 present the autocorrelation of the monthly flow time series at lags of 1 and 2 months in Rincón del Bonete and Salto Grande, respectively.

For Rincón del Bonete (Fig. 4), we observe that the 1-month lag correlation of flows is positive and significant in all months of the year, except January and August. In contrast, 2-month correlations are much lower and only significant (and positive) for 4 months in the year.

For Salto Grande (Fig. 5), in general, the correlation values are larger than in Rincón del Bonete, with the 1-month lag correlations positive and significant throughout the year and the 2-month lag correlations also positive and significant in summer, autumn and winter. It should be noted that, despite the 1-month lag correlations being significant, the values also fall strongly in September and October.

Based on the results shown in Figs. 4 and 5, we include the respective streamflows in the two previous months in the set of predictor variables.
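The month-by-month autocorrelations discussed in this subsection can be computed along the following lines. This is a sketch under the assumption that the flows are stored as a years-by-months array; the function name `monthly_lag_autocorr` is hypothetical.

```python
import numpy as np

def monthly_lag_autocorr(flows, lag):
    """Correlation between the flow of each calendar month and the flow
    `lag` months earlier.

    flows : array (n_years, 12), column 0 = January
    Returns an array of 12 correlation coefficients.
    """
    n_years = flows.shape[0]
    series = flows.reshape(-1)                 # continuous monthly series
    out = np.full(12, np.nan)
    for m in range(12):
        idx = np.arange(n_years) * 12 + m      # positions of calendar month m
        idx = idx[idx - lag >= 0]              # lagged month must be in the record
        out[m] = np.corrcoef(series[idx], series[idx - lag])[0, 1]
    return out

# Usage with synthetic data (hypothetical): 29 years of monthly flows
rng = np.random.default_rng(2)
flows = rng.gamma(shape=2.0, scale=300.0, size=(29, 12))
r1 = monthly_lag_autocorr(flows, lag=1)
r2 = monthly_lag_autocorr(flows, lag=2)
```

Because the correlation is computed separately for each calendar month, the seasonality of the persistence is preserved instead of being averaged over the year.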

3.4 Summary and notation

For each reservoir and each month, we have selected a total of 12 predictor variables: nine associated with the regional atmospheric circulation, one related to ENSO and two representing the flow persistence component.

To denote each of the nine predictor variables associated with regional atmospheric circulation, we use the following notation: Pc + number (1, 2 or 3 as the first, second or third Principal Component) + variable (u, v or hgt as appropriate for zonal wind, meridional wind or geopotential height). The optimal Niño 3.4 index will be denoted N3.4, and the flows with 1 and 2 months of antecedence will be denoted Q1 and Q2, respectively.

In general, A will denote the group of atmospheric variables (i.e. Pc 1, 2, 3, u, v and hgt), O will denote the group of oceanic variables (which, in this case, is reduced to N3.4 index) and Q will denote the group of previous flows (i.e. Q1 and Q2).

In summary, Figs. 6 and 7 show diagrams that indicate the absolute value of the correlation between the monthly flows and the 12 predictor variables for Rincón del Bonete and Salto Grande, respectively. As before, only correlations statistically significant at the 95 % level are shown. In general, we observe statistically significant correlations, a fact that indicates that a linear relationship between the flow and the predictors might be suitable. However, it is worth noting that for Rincón del Bonete, in August, none of the 12 predictors selected reaches statistically significant correlations with the flow.

4 Streamflow prediction models

We will address the problem of predicting monthly flows at Rincón del Bonete and Salto Grande as a regression exercise, in which the 12 variables that can be used as predictive variables are the ones in the groups A, O or Q. The procedure is performed for each month and each reservoir independently.

The 12 predictor variables selected are different in nature and in lead time with respect to the predictand. Those predictors that precede the target flow have the advantage that, if the forecast antecedence permits, they could be observed prior to the forecast release and, therefore, would be effectively known. By contrast, those predictors that are simultaneous with the target flow cannot be observed before the forecast release and, therefore, must themselves be predicted, adding the associated uncertainty.

In particular, if the antecedence of the forecast is greater than 2 months, neither of the two variables in group Q will be available, not even in the form of an imperfect forecast, because the flow prediction problem is precisely what we are addressing. Therefore, to generate predictions more than 2 months in advance, we will not include the predictors in group Q.

Fig. 2 Lag correlation between monthly flow in Rincón del Bonete and bimonthly Niño 3.4 index. Month of the flow is indicated in the abscissa, while the Niño 3.4 index antecedence is indicated, in months, in the ordinate. Those correlations that fall below the threshold of 95 % statistical significance are shown in white. The period considered for the calculation of the correlation is 1979–2007

Fig. 3 Same as Fig. 2 but for Salto Grande. The period considered for the calculation of the correlation is 1979–2008

Fig. 4 One-month and 2-month lag autocorrelation of monthly flow in Rincón del Bonete. The horizontal gray line indicates the 95 % level of statistical significance

Fig. 5 Same as Fig. 4 but for Salto Grande

Fig. 6 Absolute value of the correlation between the 12 predictors selected and monthly flows in Rincón del Bonete. Month of the flow is indicated in the abscissa, while the predictor variable is indicated in the ordinate. Those correlations that fall below the threshold of 95 % statistical significance are shown in white

As for N3.4, the antecedence depends on the month and the reservoir, but in some cases, it is simultaneous and therefore will not be known. However, the Niño 3.4 index is highly predictable on the scale of a few months, and there are several international centres that offer skilful forecasts more than 6 months in advance.

In Table 1, we summarize the time periods involved with each group of predictor variables as well as the lead time necessary to produce at least an imperfect forecast of them, which can then be used to generate the streamflow forecast.

It is worth noting that in this paper, lead time is not explicitly explored as a continuous variable, but rather as a discrete one in which different situations are considered: no lead time (streamflow of the previous month known), and the lead times associated with the predictability of the O (ENSO) and A variables (associated with other oceans). All of these lead times are of practical interest in decision making.

Given the above comments, we will develop flow prediction schemes in the following three situations:

1. All the predictor variables (groups A, O and Q) are available.

2. Variables in group Q are not available. Then, the predictors are reduced to groups A and O.

3. Variables in groups Q and O are available, and group A is not considered. In particular, as a simple benchmark for comparison, we also develop a streamflow forecast scheme that relies only on group O, namely the N3.4 index.

In order to facilitate subsequent comparisons, each of the models developed will have an indication (via a sub-index) of the group of predictor variables used (A, O or Q).
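The three situations and the sub-index convention can be illustrated with a small sketch. It is not the authors' code: the function name, the labels `Pc1` to `Pc9` and the `use_atmosphere` flag are hypothetical, and the 2-month threshold simply follows the rule stated above for group Q.

```python
# Hypothetical predictor labels; only the group letters A, O, Q come from the text.
GROUPS = {
    "A": [f"Pc{i}" for i in range(1, 10)],  # atmospheric Pcs (labels assumed)
    "O": ["N3.4"],                           # ENSO index at its chosen antecedence
    "Q": ["Q1", "Q2"],                       # flows 1 and 2 months before the target
}


def available_predictors(lead_months, use_atmosphere=True):
    """Return (sub_index, predictor_names) for a given forecast lead in months."""
    groups = []
    if use_atmosphere:
        groups.append("A")
    groups.append("O")
    # Antecedent flows are dropped when the forecast lead exceeds 2 months.
    if lead_months <= 2:
        groups.append("Q")
    names = [name for g in groups for name in GROUPS[g]]
    return "".join(groups), names
```

With these assumptions, a short lead yields the sub-index `AOQ` with 12 predictors, a lead longer than 2 months yields `AO`, and switching off the atmospheric group at long lead leaves only `O`, the N3.4 benchmark.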

Regression techniques are usually classified as linear or non-linear and as parametric or non-parametric. Linear (non-linear) techniques assume that the relationship between the predictor variables and the response variable is linear (non-linear). A scheme is parametric when the relationship between predictors and predictand is known except for a finite number of parameters, and non-parametric otherwise.

Although a number of techniques, including linear, non-linear, parametric and non-parametric, were tested in the cases of study of this paper, the best results were obtained using linear regression coupled with a variable selection technique. Therefore, this will be the only technique we will discuss in this paper.

Fig. 7 Idem Fig. 6 for Salto Grande

This section is organized as follows: First, we will discuss the concept of prediction error as well as ways to estimate it. Second, we will explain the basic concepts of linear regression and a technique for performing variable selection.

4.1 Estimation of prediction error of a regression model

In what follows, $X_1, \ldots, X_r$ denote the predictor variables, while $Y$ denotes the response variable.

A set of $m$ observations is given by $(r+1)$-tuples:

$$D = \left\{ \left( X_1^i, \ldots, X_r^i, Y^i \right),\ i = 1, \ldots, m \right\}$$

where $X_j^i$ denotes the $i$th observation of variable $X_j$, and $Y^i$ denotes the $i$th observation of variable $Y$.

As important as the techniques to build the models are the mechanisms for assessing their predictive skill. To define the prediction error of a model, it is necessary to consider observations independent from those used to develop the model.

Generally, if there is enough data, the most common procedure used to estimate the prediction error is to form two disjoint independent sets: a training set and a test set. The data contained in the training set are used to develop the prediction model, while those contained in the test set are used only to evaluate the predictive skill. However, in cases in which this division is not practicable because of restricted amounts of observations, alternative techniques are often used.

The simplest estimate of prediction error can be obtained by the so-called re-substitution error or apparent error (Izenman 2008). In this technique, the training set consists of all the available observations, i.e. all observations (m) are used to estimate the model. This estimate of the prediction error tends to be too optimistic, since the same data used to develop the model are later used to evaluate how well the model fits the same data, for which it was optimized.

The other major class of techniques for estimating the prediction error are the re-sampling techniques, with cross-validation (cv) (Stone 1974) being the most popular. In these methods, disjoint training and testing data sets are recursively selected. In a k-fold cross-validation, the data are randomly partitioned into k subsamples of equal size. In each step, k−1 of these subsamples are used as the training set, while the remaining one is used to test the model, and the procedure is repeated so that each subsample serves as the test set once. In particular, if k equals the sample size, in each step the learning set consists of all observations minus one. The observation left out is used to test the regression model obtained with the remaining (m−1) observations. This procedure is repeated alternating the observation left out of the learning set, and the measure of the error deduced this way is known as leave-one-out cv.
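The partitioning described above can be sketched as follows. This is an illustrative sketch, not the authors' implementation (which was done in R); the function names and the `seed` argument are assumptions.

```python
import random


def kfold_indices(m, k, seed=0):
    """Randomly partition range(m) into k folds whose sizes differ by at most one."""
    idx = list(range(m))
    random.Random(seed).shuffle(idx)
    # Taking every k-th shuffled index spreads the remainder over the first folds.
    return [sorted(idx[f::k]) for f in range(k)]


def train_test_splits(m, k, seed=0):
    """Yield (train, test) index lists; k == m reproduces leave-one-out."""
    for test in kfold_indices(m, k, seed):
        held_out = set(test)
        train = [i for i in range(m) if i not in held_out]
        yield train, test
```

For a record of 29 years (1979–2007) and k = 5, the folds contain six, six, six, six and five years; with k = 29 every test set holds a single year, which is the leave-one-out case.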

Given the limitation in the size of the set of available observations (m), we chose the leave-one-out cv procedure as the preferred way to estimate the prediction error and select the optimal model.

So that the prediction error is measured in the same units as the variable we are trying to predict (flow, in this case), we evaluate the root mean square cv error.

As a framework for comparison, we introduce the model ymean, which simply predicts the average of the observations in the learning set, i.e. ymean predicts the historical climatology. As we study each month separately, for month i, the streamflow forecast of the model ymean is obtained as the average of the observed streamflows in month i for the cases contained in the learning set. It is expected that the models to be developed here will have a prediction error lower than that of the ymean model, i.e. their predictive ability is expected to be higher than that of forecasting the historical average.
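A minimal sketch of the comparison just described, assuming NumPy is available: it computes the leave-one-out root mean square error of the ymean model for one calendar month and of a competing model, here an ordinary least-squares fit used purely as a stand-in. The function names are hypothetical.

```python
import numpy as np


def loo_rmse_climatology(y):
    """Leave-one-out RMSE of the ymean model for one calendar month."""
    y = np.asarray(y, dtype=float)
    m = y.size
    # Mean of the other m-1 years, evaluated for every left-out year at once.
    pred = (y.sum() - y) / (m - 1)
    return float(np.sqrt(np.mean((y - pred) ** 2)))


def loo_rmse_linear(X, y):
    """Leave-one-out RMSE of a least-squares fit with intercept."""
    y = np.asarray(y, dtype=float)
    A = np.column_stack([np.ones(len(y)), np.asarray(X, dtype=float)])
    errors = []
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        beta, *_ = np.linalg.lstsq(A[keep], y[keep], rcond=None)
        errors.append(y[i] - A[i] @ beta)
    return float(np.sqrt(np.mean(np.square(errors))))
```

The ratio `loo_rmse_linear(X, y) / loo_rmse_climatology(y)` is below 1 when the regression beats the historical average under this error measure.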

4.2 Linear regression and variable selection

We use multiple linear regression, a parametric technique that assumes that the response variable $Y$ is linearly related to the predictor variables $X_1, \ldots, X_r$ in the form:

$$Y = \beta_0 + \beta_1 X_1 + \ldots + \beta_r X_r + e$$

where $e$, the error term in the model, is an unobservable random variable (with mean 0 and variance $\sigma^2$), and $\beta_0, \ldots, \beta_r$ are the $(r+1)$ unknown parameters to be determined.

Table 1 Detail of the time periods involved in each of the predictor variable sets as well as the lead time necessary to produce, at least, an imperfect forecast of them; considering m as target month for the prediction of streamflow

| Predictor variables to predict streamflow in target month (m) | Time period involved | Lead time when, at least, an imperfect forecast of the predictors could be possible |
| --- | --- | --- |
| A | Months (m−1) and (m) | >2 months |
| O | From months (m−12) and (m−11) to months (m−1) and (m) | >2 months |
| Q | Q1: (m−1); Q2: (m−2) | <2 months |

If we include too many predictor variables, or if some of them are strongly correlated, the resulting problem can become ill conditioned, and small changes in the input data can lead to large changes in the coefficients. Consequently, although the fit in the learning set could be very good, the model will not perform well when faced with new data and will be of little value as a prediction tool.

The idea of variable selection meets the need of obtaining a simple regression model to ensure good predictive skill and avoid over-fitting. However, those techniques are criticized for using the same data to add or delete variables and, therefore, change the predictor variables assumed a priori. This could mean that if the data change slightly, the variables selected may also change, making the procedure unstable. The notion of which characteristics make a variable important is not yet clear, but one interpretation is that a variable becomes significant if its exclusion seriously degrades the predictive ability of the model (Izenman 2008 and references therein). There are several variable selection techniques; in this paper, we use backward elimination.

This technique begins with the entire set of variables. At each step, the model eliminates the variable whose F-index is smallest. The F-index is defined as:

$$F\text{-index} = \frac{(RSS_0 - RSS_1)/(df_1 - df_0)}{RSS_1/df_1}$$

where $RSS_0$ is the residual sum of squares (RSS) for the reduced model, and $RSS_1$ is the RSS for the model with more variables. $df_1 - df_0 = 1$ and $df_1 = m - k - 1$, where $k$ is the number of variables in the model with more variables. Then, we re-adjust the model by eliminating a variable and repeating the procedure. A variant of this method is called stepwise backward elimination: at each step, a variable earlier eliminated can be included again (Izenman 2008). This will be the procedure used in this work.
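The elimination step can be sketched as below, assuming NumPy. This is a sketch of plain backward elimination only: it does not implement the stepwise variant in which an eliminated variable may re-enter, and it is not the leaps routine used by the authors. Function names are hypothetical, and the code assumes m > k + 1 and a non-zero residual.

```python
import numpy as np


def rss(X, y, cols):
    """Residual sum of squares of a least-squares fit (with intercept) on cols."""
    A = np.column_stack([np.ones(len(y))] + [X[:, c] for c in cols])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return float(resid @ resid)


def backward_elimination(X, y):
    """Return nested column subsets, from all r predictors down to one.

    At each step the variable with the smallest F-index is dropped, with
    F-index = (RSS0 - RSS1) / (RSS1 / df1), df1 = m - k - 1, and k the
    number of variables in the larger model.
    """
    m, r = X.shape
    current = list(range(r))
    path = [list(current)]
    while len(current) > 1:
        k = len(current)
        rss1 = rss(X, y, current)
        df1 = m - k - 1
        f_index = {}
        for c in current:
            reduced = [v for v in current if v != c]
            f_index[c] = (rss(X, y, reduced) - rss1) / (rss1 / df1)
        drop = min(f_index, key=f_index.get)
        current = [v for v in current if v != drop]
        path.append(list(current))
    return path
```

The returned list holds one candidate model per number of variables, which is the input needed to compare models of different sizes.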

Combining multiple linear regression with variable selection techniques (in our case, stepwise backward elimination) yields a series of models with different numbers of predictor variables. We determine the optimal number of variables to retain in the model by ranking the leave-one-out cv errors that the different models generate. First, we use all available data (which will later be divided into learning and test sets) to implement the backward elimination technique. This generates a list of selected predictor variables to form models that use between 1 and r variables. Second, the leave-one-out cv error is computed for each of them. The model that presents the lowest leave-one-out cv error will be designated as optimal and will be named lm-optimal, with a sub-index indicating the set of predictor variables used for its development.
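The second step, ranking candidate models by their leave-one-out cv error, might look like the following sketch (NumPy assumed). The `candidates` argument stands for the list of variable subsets produced by the selection step; both function names are hypothetical and the code is not taken from the paper.

```python
import numpy as np


def loo_rmse(X, y, cols):
    """Leave-one-out RMSE of a least-squares fit (with intercept) on cols."""
    m = len(y)
    A = np.column_stack([np.ones(m)] + [X[:, c] for c in cols])
    squared = 0.0
    for i in range(m):
        keep = np.arange(m) != i
        beta, *_ = np.linalg.lstsq(A[keep], y[keep], rcond=None)
        squared += (y[i] - A[i] @ beta) ** 2
    return float(np.sqrt(squared / m))


def select_optimal(X, y, candidates):
    """Return the candidate subset with the lowest leave-one-out cv error."""
    scores = [loo_rmse(X, y, cols) for cols in candidates]
    best = int(np.argmin(scores))
    return candidates[best], scores
```

Dividing each score by the corresponding error of the ymean model gives ratios in which values below 1 indicate skill over climatology.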

The implementation was done in R; in particular, the leaps package was used (Lumley 2009).

Fig. 8 Correlations between observed Pc and Pc of the ensemble mean. Correlations not significant at 95 % or negative are shown in white. The abscissa indicates the month of flow to be predicted by using a Pc


5 Skill evaluation of the UCLA-AGCM

This section evaluates the ability of the UCLA-AGCM to forecast the regional atmospheric circulation, in particular the group A predictors in the AS region.

Fig. 9 Leave-one-out cross-validation errors (expressed as ratios to the error of model ymean) of the models lmAOQ-optimal, lmAUCLAOQ-optimal and lmOQ-optimal for Rincón del Bonete. The abscissa indicates the month of flow to be predicted

Fig. 10 Idem Fig. 9 for Salto Grande

We first compute the ensemble mean of all six simulations, its climatology for the 1979–2008 period and, subtracting the latter from the former, the inter-annual anomalies for every variable. We next compute the Pcs of the ensemble mean as the projection of these simulated anomalies on the respective observed empirical orthogonal functions (eofs) for each bimester.
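The projection just described can be sketched as follows for a single variable and bimester, assuming NumPy. The array layout, the function name and the assumption that the observed eofs are orthonormal rows with no area weighting are illustrative choices, not details given in the paper.

```python
import numpy as np


def ensemble_mean_pcs(sim, eofs):
    """Project ensemble-mean anomalies onto observed eofs.

    sim  : array (members, years, gridpoints) of simulated fields
    eofs : array (modes, gridpoints) of observed eofs, assumed orthonormal
    Returns an array (years, modes) with the Pcs of the ensemble mean.
    """
    ens_mean = sim.mean(axis=0)                    # average over the members
    anomalies = ens_mean - ens_mean.mean(axis=0)   # subtract the climatology
    return anomalies @ eofs.T
```

Each column of the result can then be correlated, year by year, with the corresponding observed Pc.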

In Fig. 8, we present the correlations between the various observed Pcs and the Pcs of the ensemble mean for every 2-month period and variable. For consistency with previous sections, the x-axis notation indicates the second month of the 2-month period, i.e. the target month of the flow prediction. As in Fig. 6, the correlations that are not significant (or significant, but negative) are indicated in white. Clearly, the season with the poorest correlations between observed and simulated Pcs is late spring–early summer: in October and November, only one of the variables significantly correlates and none in December. In contrast, the season with the best correlations runs from February to May, when there are several variables with high correlations.

Fig. 11 Leave-one-out cross-validation errors (expressed as ratios to the error of model ymean) of the models lmAO-optimal, lmAUCLAO-optimal and lmO for Rincón del Bonete. The abscissa indicates the month of flow to be predicted

Fig. 12 Idem Fig. 11 for Salto Grande

Finally, we define a Pc as potentially predictable by the model, and include it in what we will name the AUCLA group, when its correlation with the corresponding Pc of the ensemble mean is significant, as shown in Fig. 8. It is important to note that these results are model dependent and that other models may have higher or lower predictive ability over the variables of interest.

6 Results

We next show the leave-one-out cv error of the different prediction models, for each month of the year and each reservoir. Errors are presented as ratios with respect to the cv error of the model ymean, so that values higher (lower) than 1 indicate a performance worse (better) than simply predicting the climatological mean. It is recalled that the predictors are assumed known (or perfectly predictable). In operational mode, the introduction of imperfect forecasts of the predictor variables will possibly generate larger errors in the prediction system.

Fig. 13 Apparent errors (expressed as ratios to the error of model ymean) of the models lmAOQ-optimal, lmAUCLAOQ-optimal and lmOQ for Rincón del Bonete (above) and Salto Grande (below). The abscissa indicates the month of flow to be predicted

Results in which variables of group Q are available, which depends on forecast lead, are presented separately from those in which they are not. The same goes for the results with all the predictors of group A and those restricted to variables in group AUCLA.

Figures 9 and 10 show the results in situations when the forecast lead allows the use of group Q predictors, for Rincón del Bonete and Salto Grande, respectively. The difference between the cv errors of the models lmAOQ-optimal and lmAUCLAOQ-optimal represents the loss in predictive ability incurred by using only those atmospheric predictors which are potentially predictable, while the difference between lmAOQ-optimal and lmOQ-optimal is an indicator of the relative importance of including atmospheric predictors in the prediction scheme.

Fig. 14 Apparent errors (expressed as ratios to the error of model ymean) of the models lmAO-optimal, lmAUCLAO-optimal and lmO for Rincón del Bonete (above) and Salto Grande (below). The abscissa indicates the month of flow to be predicted

For Rincón del Bonete (Fig. 9), lmAOQ-optimal has predictive ability superior to ymean in all months except August. For this model and reservoir, there is no clear period of higher predictability (low cv error). The predictive ability of the lmAUCLAOQ-optimal model is, as expected, clearly inferior to that of the lmAOQ-optimal model, although non-trivial (beats climatology) in every month but August. The biggest difference between the lmAOQ-optimal and lmAUCLAOQ-optimal predictive ability occurs in December. If we do not consider any of the predictors of group A (lmOQ-optimal), the predictive ability drops further, but also outperforms ymean at all times except for August. The cv error curves for models lmAUCLAOQ-optimal and lmOQ-optimal overlap from August to January and in May. This behaviour indicates that, in those months, either none of the variables of group A is potentially predictable or those variables potentially predictable are not selected by the variable reduction procedure implemented.

Fig. 15 Fivefold cross-validation errors (expressed as ratios to the error of model ymean) of the models lmAOQ-optimal, lmAUCLAOQ-optimal and lmOQ for Rincón del Bonete (above) and Salto Grande (below). The abscissa indicates the month of flow to be predicted

For Salto Grande (Fig. 10), lmAOQ-optimal has predictive ability superior to ymean throughout the year. Two periods of high predictability are highlighted: March to May and October to December. The periods with least gain in predictability are January, February and September. For this reservoir, lmAUCLAOQ-optimal also has predictive ability higher than ymean in every month. The greatest loss of ability when one moves from A to AUCLA is noted in October. lmOQ-optimal also outperforms ymean through the year and only differentiates from lmAUCLAOQ-optimal between March and July, a period when the importance of including atmospheric predictors is highlighted.

Fig. 16 Fivefold cross-validation errors (expressed as ratios to the error of model ymean) of the models lmAO-optimal, lmAUCLAO-optimal and lmO for Rincón del Bonete (above) and Salto Grande (below). The abscissa indicates the month of flow to be predicted

Figures 11 and 12 present the main results for antecedence situations that do not allow either Q1 or Q2 to be used as predictors, for Rincón del Bonete and Salto Grande, respectively.

For Rincón del Bonete (Fig. 11), lmAO-optimal has a performance superior to ymean in every month, with the highest skill in April and June, while lmAUCLAO-optimal outperforms ymean in all months except August and October. The biggest difference between lmAO-optimal and lmAUCLAO-optimal cv errors is found in April. In contrast, lmO has ability superior to ymean only between November and April. The importance of the inclusion of atmospheric predictors is remarkable, especially in the autumn–winter season: March to July.

In Salto Grande (Fig. 12), lmAO-optimal outperforms ymean throughout the year except in February, with two periods where the performance improvement is substantial: March to June (autumn) and October to December (spring). Meanwhile, lmAUCLAO-optimal and lmO (whose skill coincides half of the year) do worse than ymean only on two occasions: in February and September. The biggest differences in the predictive abilities of the models lmAO-optimal and lmAUCLAO-optimal occur precisely in autumn and spring. The differences between the models lmAO-optimal and lmO are highest from March to June and from October to December, showing that, in these two periods, the inclusion of atmospheric predictors is extremely beneficial.

In the Appendix, we show the list of selected variables for the model lmAUCLAOQ-optimal.

Although the leave-one-out cv error was the measure of the prediction error selected to determine the optimal model for each month and reservoir, we next expand the analysis and show other measures of the prediction error for the selected models.

In Fig. 13, we display the results of apparent error for Rincón del Bonete and Salto Grande in situations when the forecast lead allows the use of group Q predictors. As was mentioned before, the apparent error can lead to overly optimistic results. In fact, we can see that the apparent errors are lower than their leave-one-out cv counterparts. With this measure of the predictive error, we find that in all months and in both reservoirs, the predictive ability of the selected models is higher than that of the model ymean. In general, also with this measure of the error, the models rank in decreasing order of predictive ability as follows: lmAOQ-optimal, lmAUCLAOQ-optimal and lmOQ-optimal.

Fig. 17 Variables selected for the model lmAUCLAOQ-optimal for Rincón del Bonete. Variables not available for selection (those not included in the AUCLA group) are indicated in white, variables available but not selected are indicated in grey and variables selected are indicated in black. The abscissa indicates the month of flow to be predicted

In Fig. 14, we display the results of apparent error for Rincón del Bonete and Salto Grande for situations of antecedence that do not allow the use of group Q predictors. With this measure, the three models (lmAO-optimal, lmAUCLAO-optimal and lmO) have higher or equal predictive ability than the climatological mean, in all the months for the two reservoirs.

Figure 15 shows the results for another measure of predictive error: the fivefold cross-validation (which for the cases of study means a leave-five or leave-six out), in the cases where the Q predictors are used. For Rincón del Bonete, it is worth highlighting that in August, the three models perform worse than ymean, and in October, the model lmOQ-optimal is also worse than using the climatological mean. Comparing the fivefold with the leave-one-out cross-validation results, we see that the general results are the same, although a strong decrease in the performance of lmAOQ-optimal in December is observed. For Salto Grande, the results are quite similar to those obtained with the leave-one-out procedure.

Finally, Fig. 16 shows the fivefold cross-validation errors for the models that do not consider the predictors in the Q set. For Rincón del Bonete, with this measure of the error, lmAO-optimal is worse than ymean in January, lmAUCLAO-optimal is worse than the climatological mean in August and October, and lmO performs worse than ymean from May to October. For Salto Grande, the performance of the three models is worse than ymean in February, while lmAUCLAO-optimal and lmO have worse results than ymean also in September.

7 Summary and conclusions

An initial set of predictor variables for river streamflow was selected based on an analysis of the relationship with regional atmospheric circulation, ENSO and antecedent flows. Twelve predictors were determined for each month and reservoir: nine Pcs of the 200-hPa circulation (group A), one associated with ENSO at an optimal lead (group O), and the 1- and 2-month antecedent flows Q1 and Q2 (group Q). The availability of the variables from group Q depends on forecast lead.

Fig. 18 Idem Fig. 17 for Salto Grande

Prediction systems that involve variables of groups A, Q and O belong to the category of hybrid downscaling schemes; on the other hand, those that do not involve predictors of group A belong to the category of data-driven prediction.

A linear regression model, coupled with the backward variable selection technique, was adjusted for each month and reservoir. The optimal number of variables was determined by minimizing the leave-one-out cv error.

Based on the performance of the model lmAOQ-optimal, and in the context of perfectly known predictors, we conclude that except for August in Rincón del Bonete, the predictive skill of the model outperforms the trivial climatological forecast throughout the year in both reservoirs (using the leave-one-out cv error). While for Rincón del Bonete we cannot clearly distinguish a period of high predictability, for Salto Grande, the March to May and the October to December seasons stand out as the most robust in this sense.

The previous result holds, in general, even when forecast lead does not allow for the use of Q1 and Q2 as predictor variables, i.e. for antecedences longer than 2 months.

In addition, we reassess the prediction skill of the linear models restricting the atmospheric predictors to those that might be, in turn, predictable, which is a situation closer to the one that must be faced in operational mode. The determination of the potentially predictable atmospheric predictors was done through a hindcast-type analysis with the UCLA-AGCM. Limiting the atmospheric predictors in such a way, the models developed still have predictive ability higher than forecasting the historical average in both reservoirs in most months, even under situations of antecedence of more than 2 months. Although these results should be considered upper bounds of the predictive skill that the models may have in operational mode, they are encouraging.

Acknowledgements Part of this work was performed while the first author was supported by a grant from Agencia Nacional de Investigación e Innovación (ANII). We thank Marco Scavino for insightful discussions and Gabriel Cazes for helping with the simulations.

Appendix

In Figs. 17 and 18, we show, for each month of the year, the variables selected for the model lmAUCLAOQ-optimal for Rincón del Bonete and Salto Grande, respectively. Variables not available for selection (those not included in the AUCLA group) are indicated in white, variables available but not selected are indicated in grey and variables selected are indicated in black.

For Rincón del Bonete (Fig. 17), the most selected variable is Q1; N3.4 is persistently selected from November to February, and the only season in which atmospheric variables were selected is from February to July.

For Salto Grande (Fig. 18), in every month of the year, at least one of the variables of precedent flow is selected. N3.4 is selected in a continuous manner from October to January and, similar to what happened for Rincón del Bonete, the selection of atmospheric variables is restricted to the period March–July.

References

Aceituno P (1988) On the functioning of the southern oscillation in the South American sector. Part I: surface climate. Mon Weather Rev 116:505–524

Aceituno P (1989) On the functioning of the southern oscillation in the South American sector. Part II. Upper-air circulation. J Clim 2:341–355

Barnston AG (1994) Linear statistical short-term climate predictive skill in the Northern Hemisphere. J Clim 7:1513–1564

Cazes-Boezio G, Robertson A, Mechoso R (2003) Seasonal dependence of ENSO teleconnections over South America and relationships with precipitation in Uruguay. J Clim 16:1159–1176

Farrara JD, Mechoso CR, Robertson AW (2000) Ensembles of AGCM two-tier predictions and simulations of the circulation anomalies during winter 1997–1998. Mon Weather Rev 128:3589–3604

Goddard L, Mason SJ, Zebiak SE, Ropelewski CF, Basher R, Cane MA (2001) Current approaches to seasonal-to-interannual climate predictions. Int J Clim 21:1111–1152

Grimm AM, Pal JS, Giorgi F (2007) Connection between spring conditions and peak summer monsoon rainfall in South America: role of soil moisture, surface temperature and topography in eastern Brazil. J Clim 20:5929–5945

Izenman AJ (2008) Modern multivariate statistical techniques, Springer texts in statistics. Springer, New York

Kalnay E et al (1996) The NCEP/NCAR 40-year reanalysis project. Bull Am Meteorol Soc 77:437–470

Konor CS, Cazes-Boezio G, Mechoso CR, Arakawa A (2009) Parameterization of PBL processes in an atmospheric general circulation model: description and preliminary assessment. Mon Weather Rev 137:1061–1082

Koster R et al (2011) The second phase of the global land–atmosphere coupling experiment: soil moisture contributions to subseasonal forecast skill. J Hydrometeorol 12:805–822

Kumar A, Hoerling MP (2003) The nature and causes for the delayed atmospheric response to El Niño. J Clim 16:1391–1403

Landman WA, Mason SJ, Tyson PD, Tennat WJ (2001) Statistical downscaling of GCM simulations to streamflow. J Hydrol 252:221–236

Lima CH, Lall U (2010) Climate informed monthly streamflow forecasts for the Brazilian hydropower network using a periodic ridge regression model. J Hydrol 380:438–449

Lorenz EN (1963) Deterministic nonperiodic flow. J Atmos Sci 20:130–141

Lumley T (2009) Package leaps. http://cran.r-project.org/

Mechoso CR, Pérez-Irribarren G (1992) Streamflow in southeastern South America and the southern oscillation. J Clim 5:1535–1539

Miller AJ, Cayan DR, Barnett TP, Graham EN, Oberhuber JM (1994) The 1976–1977 climate shift of the Pacific Ocean. J Oceanogr 7:21–26

Pisciottano G, Díaz A, Cazes G, Mechoso R (1994) El Niño–southern oscillation impact on rainfall in Uruguay. J Clim 7:1286–1302

Reynolds RW, Rayner NA, Smith TM, Stokes DC, Wang W (2002) An improved in situ and satellite SST analysis for climate. J Clim 15:1609–1625

Richman MB (1986) Rotation of principal components. J Climatol 6:293–335

Ropelewski CF, Halpert MS (1987) Global and regional scale precipitation patterns associated with the El Niño/southern oscillation. Mon Weather Rev 115:1606–1626

Ropelewski CF, Halpert MS (1989) Precipitation patterns associated with the high index phase of the southern oscillation. J Clim 2:268–284

Soukup TL, Aziz OA, Tootle GA, Piechota TC, Wulff SS (2009) Long lead-time streamflow forecasting of the North Platte River incorporating oceanic-atmospheric climate variability. J Hydrol 368:131–142

Stone M (1974) Cross-validatory choice and assessment of statistical predictions. J R Stat Soc Ser B (Methodol) 36:111–147

Su H, Neelin JD, Meyerson JE (2005) Mechanisms for lagged atmospheric response to ENSO forcing. J Clim 18:4195–4215

Trenberth KE (1990) Recent observed interdecadal climate changes in the Northern Hemisphere. Bull Am Meteorol Soc 71:988–993

Wang W (2006) Stochasticity, nonlinearity and forecasting of streamflow processes. IOS, Amsterdam

Westra S, Brown C, Lall U, Sharma A (2007) Modeling multivariable hydrological series: principal component analysis or independent component analysis? Water Resources Res 43:W06429

Westra S, Sharma A (2009) Probabilistic estimation of multivariate streamflow using independent component analysis and climate information. J Hydrometeorol 10:1479–1492

Westra S, Sharma A (2010) An upper limit to seasonal rainfall predictability? J Clim 7:3332–3351

Wood AW, Maurer EP, Kumar A, Lettenmaier DP (2002) Long-range experimental hydrologic forecasting for the eastern United States. J Geophys Res 107:4429. doi:10.1029/2001JD000659
