# Estimation of Long-term Trends and Loads with Low-frequency Water Quality Sampling in the Baoxiang River, One Tributary to Dianchi Lake

LI Na1,†, GUO Huaicheng2 1. Chinese Society for Urban Studies, Beijing 100835; 2. College of Environmental Sciences and Engineering, Peking University, Beijing 100871; † E-mail: lina2006@pku.edu.cn

Studies of water quality trends and pollutant loads in the Baoxiang River, a tributary to Dianchi Lake, have been limited by the lack of consistent data. This study evaluated long-term trends and loads using ESTREND and LOADEST, with water quality data collected by low-frequency sampling and continuous daily flow data calculated by the Muskingum method. Significantly increasing trends in nutrient (NH3-N, TN, and TP) concentrations were detected at the 0.05 probability level. TSS concentration showed a significant decreasing trend of 12.34 percent per year. The similar results for unadjusted and flow-adjusted concentrations indicated that these trends were caused by variation in pollutant emissions rather than in river discharge. The regression models within LOADEST performed very well. Most pollutants had greater loads in the wet season than in the dry and normal seasons, owing to increased transport of nonpoint source pollution. The results indicate that ESTREND and LOADEST are an effective way to evaluate trends and loads from low-frequency sampling data, and that the methodology can be used in other watersheds. Key words trend analysis; load estimation; seasonal Kendall; regression model; Baoxiang River

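Estimating Long-term Water Quality Trends and Pollutant Loads in the Baoxiang River, Dianchi Lake, Based on Low-frequency Water Quality Sampling LI Na1,† GUO Huaicheng2 1. Chinese Society for Urban Studies, Beijing 100835; 2. College of Environmental Sciences and Engineering, Peking University, Beijing 100871; † E-mail: lina2006@pku.edu.cn

Abstract Because the estimation of river pollutant loads and the analysis of water quality trends are limited by a lack of water quality and flow data, the ESTREND and LOADEST models were applied to discrete water quality data obtained by low-frequency sampling to analyze water quality trends and estimate pollutant loads in the Baoxiang River, a tributary of Dianchi Lake. The results show that: 1) nutrients (NH3-N, TN and TP) showed significant increasing trends at the 0.05 probability level, and nitrogen has become an important factor limiting the water quality of the Baoxiang River; 2) TSS concentration showed a significant decreasing trend, with an average annual rate of decline of 12.34%; 3) flow-adjusted and unadjusted concentrations showed the same trends, indicating that the changes in water quality were affected little by flow and were caused mainly by changes in pollutant emissions; 4) a series of tests on the equations showed that regression equations built from discrete water quality data and continuous daily flow data are valid and can be used to estimate pollutant loads entering the lake; 5) because of increased nonpoint source pollution, the loads of most pollutants entering the lake were higher in the rainy season than in the dry season; 6) the ESTREND and LOADEST models are an effective means of water quality trend analysis and load estimation with low-frequency, discrete water quality data and can be applied to other watersheds, and their results can provide a sound scientific basis for formulating and evaluating total load control plans for a watershed. Key words trend analysis; pollutant load; seasonal Kendall test; regression model; Baoxiang River CLC number X502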

Trend analysis of water quality can evaluate the actual achievements of pollutant reductions and provide scientific guidance to policy makers. However, water quality data do not usually follow convenient probability distributions, such as the well-known normal and lognormal distributions on which many classical statistical methods are based, and they often suffer from problems such as short records, large gaps, missing data, censored data, outliers, and serial correlation. Additionally, seasonality and streamflow can significantly affect trend analysis. Several methods are widely used, such as smoothing splines[1–3], regression[4–5], time series[6] and the seasonal Kendall test[7–8]. The smoothing spline is the simplest descriptive model, but it is inadequate here because it ignores the marked seasonal pattern[9]. Regression models are not often used because their assumptions (normality, constant variance, and uncorrelated errors) are considered too restrictive for typical water quality data. Time series models have some limitations, mainly that the data must be observed at equally spaced time intervals[10–11]. The seasonal Kendall test can separate anthropogenic trends from weather-driven fluctuations, such as those in streamflow, seasonality, water temperature or precipitation, and can also deal with the common problems in water quality series[12].

River load estimation provides a scientific basis for total load control, so it is a key tool in water quality management projects. The best approach to estimating long-term pollutant loads is high-frequency sampling, which provides adequate data to estimate river loads and evaluate management scenarios[13]. However, less intensive sampling programs are often adopted because high-frequency water quality sampling requires substantial financial and personnel resources. Pollutant concentrations are now mostly sampled at longer intervals (e.g., weekly, monthly, or seasonally), whereas river discharge is typically recorded at intervals of less than a day. How to use continuous daily flow data and discontinuous water quality data to predict loads has therefore become an important problem. Existing methods for load estimation can be split into three categories: averaging[14–18], ratio[19–20], and regression[21–23]. Averaging is generally considered the simplest of the available techniques, but it can over- or underestimate loads, especially if the sampling program does not collect data from the entire range of discharge and concentration variability[24]. Ratio estimators are well suited to cases in which a large number of flow data but only a few concentration data are available. However, Preston et al.[25] found that ratio estimators were more often less precise than the other approaches considered. Regression has come into widespread use because it develops a relation between pollutant concentration and streamflow, and so estimates loads with less data (and lower cost) than other methods[26].

The ecosystem of Dianchi Lake has been adversely affected by nutrient enrichment. Pollutants from the Baoxiang River account for a large proportion of the total amount of pollutants flowing into Dianchi Lake. Watershed-based pollutant controls rely on load estimation and trend analysis. However, studies of water quality trends and pollutant loads in the Baoxiang River have been limited by the lack of consistent data. The goal of this study was therefore to use the seasonal Kendall test and regression models to quantify long-term trends and loads in support of watershed management decisions. Specific objectives were to: 1) provide background information on the pollution problems in Dianchi Lake by identifying trends in major water quality parameters during the study period; 2) evaluate the performance of regression models that estimate pollutant loads from daily flow data and water quality data collected with low-frequency sampling; 3) use the regression models to estimate annual and seasonal loads to Dianchi Lake.

1 Method and Materials

1.1 Study area and data source

Dianchi Lake is a representative inland freshwater plateau lake, located in the middle part of the Yungui Plateau of southwest China. As its water quality has degraded, its blue-green algal blooms have changed greatly. The Baoxiang River, with a basin of 302 km2, is a main river that flows directly into Dianchi Lake (102º29'–103º01' E, 24º29'–25º28' N) (Fig. 1). The climate of the Baoxiang River district is categorized as humid subtropical monsoonal, characterized by warm, humid summers and cool, wet winters. Mean annual precipitation is about 953 mm. Mean annual temperature is 14.7℃. The distribution of precipitation is uneven, with most precipitation occurring December through May. The Baoxiang River typically exhibits fluctuations in streamflow, with low flows in winter and increased flows in summer. Rapid population growth, coupled with economic development and rapid urbanization, has resulted in a serious deterioration of water quality during the last several decades. The major pollution sources are domestic sewage, industrial wastewater and agricultural runoff.

The Baofengcun monitoring site is located at the mouth of the Baoxiang River (Fig. 1). Manual grab samples have been collected at monthly or submonthly intervals since 1998 by the Environmental Quality Monitoring Station of Kunming. The water quality data between 1999 and 2008 were used for this study. These data include nitrate nitrogen (NO3-N), ammonia nitrogen (NH3-N), total nitrogen (TN), total phosphorus (TP), total suspended solids (TSS) and chemical oxygen demand (COD).

Daily flow records at the Baofengcun site are not sufficient to estimate loads, which requires continuous daily flow data. The Muskingum routing method was therefore applied to predict the downstream outflow from daily streamflow records at the upstream gauging station of Ganhaizi (Fig. 1). The Muskingum channel routing equation is as follows:

$$Q_{j+1} = C_1 I_{j+1} + C_2 I_j + C_3 Q_j$$

where $Q_j$ and $Q_{j+1}$ are the downstream discharges at the jth and (j+1)th time intervals, respectively; $I_j$ and $I_{j+1}$ are the upstream discharges at the jth and (j+1)th time intervals; $C_1$, $C_2$ and $C_3$ are the Muskingum coefficients; and $\Delta t$ is the time interval. According to the channel, $\Delta t$ = 10 h. The Muskingum parameters were estimated by minimizing the sum of squares, using observed inflow-outflow hydrograph data from three flood events in 2008. The result was C1=0.262, C2=0.661, C3=0.077. The model-simulated hydrographs matched the observed hydrographs with a mean square error (MSE) of 10%, based on observed inflow-outflow hydrograph data from two flood events (Fig. 2). The fit was therefore considered good, and Fig. 3 shows the discharge hydrograph at the Baofengcun site obtained by the Muskingum method.
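The routing step can be illustrated with a minimal Python sketch. It assumes the common three-coefficient form Q(j+1) = C1·I(j+1) + C2·I(j) + C3·Q(j); the pairing of the reported coefficients with these terms, the steady-state starting condition, and the inflow values are assumptions made for illustration, not details taken from the study.

```python
def muskingum_route(inflow, c1=0.262, c2=0.661, c3=0.077, q0=None):
    """Route an upstream hydrograph downstream with fixed Muskingum coefficients.

    Assumed form: Q[j+1] = c1 * I[j+1] + c2 * I[j] + c3 * Q[j].
    """
    if abs(c1 + c2 + c3 - 1.0) > 1e-6:
        raise ValueError("Muskingum coefficients should sum to 1")
    if not inflow:
        return []
    # Assumption: the reach starts at steady state unless q0 is supplied.
    outflow = [inflow[0] if q0 is None else q0]
    for j in range(len(inflow) - 1):
        outflow.append(c1 * inflow[j + 1] + c2 * inflow[j] + c3 * outflow[j])
    return outflow


# Hypothetical upstream discharges (m3/s), one value per routing interval.
upstream = [2.0, 2.0, 6.0, 10.0, 7.0, 4.0, 2.5, 2.0]
downstream = muskingum_route(upstream)
```

Because the three coefficients sum to one, a constant inflow is passed through unchanged, which is a quick sanity check on any fitted coefficient set.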

1.2 Trend analysis

The Estimate Trend program (ESTREND) includes both parametric (Tobit regression) and non-parametric (seasonal Kendall test) methods to evaluate trends in water quality constituent data[27]. Tobit regression uses the maximum likelihood estimation (MLE) method to determine trends when more than 5% of the observations are censored. The seasonal Kendall test is suitable for parameters with less than 5% censored data. The rate of change in each water quality variable is quantified by the seasonal Kendall slope estimator[28].
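As a rough illustration of the slope estimator, the sketch below takes the median of all pairwise slopes computed between observations from the same season in different years. The function name and the sample values are hypothetical, and the sketch does not reproduce ESTREND's handling of censored values or multiple samples per season.

```python
from itertools import combinations
from statistics import median


def seasonal_kendall_slope(years, seasons, values):
    """Median of within-season pairwise slopes (value units per year)."""
    by_season = {}
    for yr, season, val in zip(years, seasons, values):
        by_season.setdefault(season, []).append((yr, val))
    slopes = []
    for obs in by_season.values():
        # Compare only observations that fall in the same season.
        for (y1, v1), (y2, v2) in combinations(sorted(obs), 2):
            if y2 != y1:
                slopes.append((v2 - v1) / (y2 - y1))
    if not slopes:
        raise ValueError("need at least two years of data in some season")
    return median(slopes)


# Hypothetical concentrations (mg/L) for two seasons over three years.
years = [1999, 1999, 2000, 2000, 2001, 2001]
seasons = [1, 2, 1, 2, 1, 2]
conc = [1.0, 3.0, 1.5, 3.5, 2.0, 4.0]
slope = seasonal_kendall_slope(years, seasons, conc)
```

Dividing such a slope by a reference concentration (for example the mean) gives a rate in percent per year; which reference the study used is not stated here.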

Trend analysis of water quality is more difficult when concentration is related to streamflow. To eliminate flow effects, ESTREND uses various regression models or locally weighted scatter plot smoothing (LOWESS) techniques[29] to find the concentration-flow relationship and compute the time series of flow-adjusted concentrations (FAC). The seasonal Kendall test for trend and the slope estimator are then applied to the time series of FAC values. To account for seasonality in trend analysis, the test makes pairwise comparisons of data values from the same seasons[7] and then combines the results into the seasonal Kendall test statistic. In addition, the test can deal with common problems in water quality series such as short records, missing data, outliers, irregularity in the measurement pattern and, in particular, serial correlation.
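The two steps described above can be sketched in Python under simplifying assumptions. Flow adjustment is done here with a single least-squares fit of concentration on ln(flow) rather than LOWESS or ESTREND's full set of regression forms, and the test statistic omits the corrections for ties and serial correlation. All function names and data values are hypothetical.

```python
import math


def flow_adjust(conc, flow):
    """Residuals from a least-squares fit of concentration on ln(flow)."""
    x = [math.log(q) for q in flow]
    n = len(x)
    mx, my = sum(x) / n, sum(conc) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (ci - my) for xi, ci in zip(x, conc)) / sxx
    return [ci - (my + slope * (xi - mx)) for xi, ci in zip(x, conc)]


def seasonal_kendall(seasons, values):
    """Return (S, Z, two-sided p) for values listed in time order.

    Simplified: no tie correction and no serial-correlation adjustment.
    """
    by_season = {}
    for season, val in zip(seasons, values):
        by_season.setdefault(season, []).append(val)
    s_total, var_total = 0, 0.0
    for vals in by_season.values():
        n = len(vals)
        # Sum the signs of all later-minus-earlier differences in this season.
        for i in range(n - 1):
            for j in range(i + 1, n):
                s_total += (vals[j] > vals[i]) - (vals[j] < vals[i])
        var_total += n * (n - 1) * (2 * n + 5) / 18.0
    if var_total == 0:
        raise ValueError("each season needs at least two observations")
    if s_total > 0:
        z = (s_total - 1) / math.sqrt(var_total)  # continuity correction
    elif s_total < 0:
        z = (s_total + 1) / math.sqrt(var_total)
    else:
        z = 0.0
    p = math.erfc(abs(z) / math.sqrt(2))
    return s_total, z, p


# Hypothetical record: two seasons per year over six years, in time order.
seasons = [1, 2] * 6
flow = [3.0, 8.0, 2.5, 9.0, 3.5, 7.0, 2.0, 10.0, 4.0, 6.5, 3.0, 8.5]
conc = [1.0 + 0.1 * k for k in range(12)]
fac = flow_adjust(conc, flow)
s_stat, z_stat, p_value = seasonal_kendall(seasons, fac)
```

Running the test on both `conc` and `fac` mirrors the comparison of raw and flow-adjusted trends: if the two agree, flow variation is unlikely to be driving the trend.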

1.3 Mass load calculations

Load Estimator (LOADEST)[30], a FORTRAN program, estimates daily, monthly or annual loads in rivers by developing regression models. LOADEST automatically selects one of eleven predefined regression models based on the Akaike information criterion (AIC)[31–32] (see Appendix).
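The idea of AIC-based model selection can be sketched as follows. This is not LOADEST itself: it fits only three illustrative candidate forms by ordinary least squares (ln load against ln Q, with an optional squared or time term), uses the Gaussian AIC up to an additive constant, and assumes the explanatory variables have already been centered. The helper names and the synthetic data are hypothetical.

```python
import math


def ols_fit(X, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    a = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [ar - f * ac for ar, ac in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]


def aic(X, y):
    """Gaussian AIC up to a constant: n*ln(RSS/n) + 2*(k + 1)."""
    b = ols_fit(X, y)
    rss = sum((yi - sum(bi * xi for bi, xi in zip(b, row))) ** 2
              for row, yi in zip(X, y))
    n = len(y)
    return n * math.log(rss / n) + 2 * (len(b) + 1)


def select_model(ln_q, dtime, ln_load):
    """Score each candidate design matrix and return the lowest-AIC name."""
    candidates = {
        "lnQ": [[1.0, q] for q in ln_q],
        "lnQ + lnQ^2": [[1.0, q, q * q] for q in ln_q],
        "lnQ + dtime": [[1.0, q, t] for q, t in zip(ln_q, dtime)],
    }
    scores = {name: aic(X, ln_load) for name, X in candidates.items()}
    return min(scores, key=scores.get), scores


# Synthetic calibration data with a built-in time trend (illustration only).
ln_q = [0.0, 1.0, 0.5, 1.5, 0.2, 1.2, 0.8, 0.3, 1.4, 0.6, 1.1, 0.1]
dtime = [(i - 5.5) * 0.5 for i in range(12)]
ln_load = [1.0 + 1.2 * q + 0.3 * t + 0.05 * (-1) ** i
           for i, (q, t) in enumerate(zip(ln_q, dtime))]
best, scores = select_model(ln_q, dtime, ln_load)
```

AIC rewards goodness of fit but charges two units per estimated parameter, so an extra term is kept only when it reduces the residual sum of squares enough to pay for itself.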

LOADEST includes three methods to estimate loads: maximum likelihood estimation (MLE)[33], adjusted maximum likelihood estimation (AMLE)[34], and least absolute deviation (LAD)[35]. MLE and AMLE both assume that model residuals are normally distributed. MLE is used if the calibration dataset is uncensored, and AMLE is used when the calibration dataset includes censored data. The LAD method estimates loads when the normality assumption is violated.

Regression model performance was assessed using two criteria: the coefficient of determination (R2) and the Nash-Sutcliffe coefficient (NSE). In addition, the residual distribution was evaluated using a goodness-of-fit test based on the probability plot correlation coefficient (PPCC)[36]. The serial correlation of residuals (SCR) and the residual data were also used to verify the validity of the model.
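For reference, the sketch below uses common textbook definitions of these diagnostics: R2 as the squared Pearson correlation between observed and estimated values, NSE as one minus the ratio of squared errors to the variance of the observations, and the lag-1 autocorrelation as a simple measure of serial correlation in residuals. Whether the study computed them in exactly this way is not stated, and the load values are hypothetical.

```python
def r_squared(obs, est):
    """Squared Pearson correlation between observed and estimated values."""
    n = len(obs)
    mo, me = sum(obs) / n, sum(est) / n
    sxy = sum((o - mo) * (e - me) for o, e in zip(obs, est))
    sxx = sum((o - mo) ** 2 for o in obs)
    syy = sum((e - me) ** 2 for e in est)
    return sxy ** 2 / (sxx * syy)


def nash_sutcliffe(obs, est):
    """1 minus (sum of squared errors / sum of squared deviations of obs)."""
    mo = sum(obs) / len(obs)
    sse = sum((o - e) ** 2 for o, e in zip(obs, est))
    return 1.0 - sse / sum((o - mo) ** 2 for o in obs)


def lag1_autocorrelation(residuals):
    """Lag-1 serial correlation of a residual series in time order."""
    n = len(residuals)
    m = sum(residuals) / n
    num = sum((residuals[i] - m) * (residuals[i + 1] - m) for i in range(n - 1))
    return num / sum((r - m) ** 2 for r in residuals)


# Hypothetical measured and estimated loads (kg/d).
measured = [10.0, 20.0, 30.0, 40.0]
estimated = [12.0, 18.0, 33.0, 37.0]
nse = nash_sutcliffe(measured, estimated)
r2 = r_squared(measured, estimated)
scr = lag1_autocorrelation([m - e for m, e in zip(measured, estimated)])
```

An NSE of 1 means the estimates match the measurements exactly, while a value of 0 means the model does no better than the mean of the measurements.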

2 Results and Discussion

2.1 Relationship of water-quality constituents to discharge

Plots were generated to depict the LOWESS lines of concentration as a function of discharge for NO3-N, NH3-N, TN, TP, TSS and COD at the monitoring site. LOWESS minimized the influence of outliers on the smoothed line. A smoothness factor (F) of 0.5 was used for the plots shown in Fig. 4.

The concentration of NH3-N showed no substantial change with increasing discharge, indicating that there was no significant dilution effect or increase in concentration due to washoff. The NO3-N plot showed only initial dilution at lower discharges. An initial increase in COD concentration at lower discharges, followed by a decrease in concentration with increasing discharge, might be directly related to the discharge of point source contaminants followed by dilution. The concentrations of TN and TP as a function of discharge indicated the effects of both dilution and washoff.

2.2 Temporal trends

Both raw concentration (RC) data and flow-adjusted concentration (FAC) data were analyzed for trends by the seasonal Kendall test using 12 seasons (Table 1).

Based on the seasonal Kendall test on FAC, significantly increasing trends in NH3-N, TN, TP and COD were detected at the 0.05 probability level over the study period, which warrants attention. The slopes of these trends ranged from 13.07 to 24.11 percent per year, with NH3-N exhibiting the steepest slope. In contrast, the concentration of TSS showed a significant decreasing trend of 12.34 percent per year during the study period. The concentration of NO3-N showed no significant trend over the study period.

Results were similar for unadjusted and flow-adjusted NH3-N, TN, TP, TSS and COD concentrations, indicating that these trends were not caused by variation in stream discharge. Factors that may be contributing to the gradually increasing concentrations include wastewater discharge, especially municipal wastewater discharge, and increasing fertilizer use upstream.

2.3 Regression models

The LOADEST models for NO3-N, NH3-N, TN, TP, COD and TSS under the predominant flow conditions performed well for the study period, with R2 values ranging from 0.68 to 0.95 (Table 2). TSS exhibited the highest R2 value. The relatively high R2 values indicated that loads, daily flow, and time were significantly correlated. The NO3-N regression equation indicated that loads were related to flow and that time was not important. When estimated loads were compared with measured loads, the NSE coefficients were 0.59 (NO3-N), 0.61 (NH3-N), 0.72 (TN), 0.74 (TSS), 0.63 (COD) and 0.70 (TP).

2.4 Estimation of constituent loads

During the study period 1999–2008, the average annual loads of NO3-N, NH3-N, TN, TP, TSS and COD transported from the Baoxiang River to Dianchi Lake were 44.3, 156.3, 239.2, 18.9, 5608.6 and 1374.0 tons, respectively. LOADEST also provided information on the error associated with each load estimate, including the upper and lower limits of the 95 percent confidence interval (CI). Annual average load estimates for TN were generally the least precise (Fig. 5).

Annual patterns for TSS and NO3-N were similar to the streamflow pattern, with the highest loads in 1999 and a decrease in loads in 2003. The estimated annual load of TP was largest in 2005, the year with the highest rate of fertilizer application. The ratio of dissolved inorganic nitrogen (DIN) to TN varied considerably throughout the study period. Peaks in the DIN:TN ratio occurred primarily when flows were at their lowest annual stages.

Loads exhibited seasonality, corresponding with variations in discharge and rainfall; that is, the magnitude of constituent loads changed between seasons. All constituents showed greater loads in the wet season than in the dry season (Fig. 6). Seasonal variation was very large in TSS load and, in contrast, very small in COD load. Greater nutrient loading in the wet season was expected because of increasing nonpoint pollutant input (e.g., fertilizer, pesticides). Estimated seasonal values of the DIN:TN ratio were highest in July.

3 Conclusions

Water quality data (1999–2008) from the monitoring site on the Baoxiang River were evaluated using ESTREND and LOADEST. The trend analysis indicated that NH3-N, TN and TP concentrations increased significantly, that NO3-N showed no significant trend, and that TSS concentration declined. The trend results can be used by the government to evaluate the effectiveness of erosion-control and land management practices. In this study, the regression models within LOADEST performed very well (i.e., they estimated loads accurately relative to measured loads). The approach is therefore an effective way to estimate pollutant loads from low-frequency water quality data and can be used with confidence to assess loads in other watersheds.

References

[1] Hastie T J, Tibshirani R J. Generalized additive models. London: Chapman and Hall, 1990
[2] Miller J D, Hirst D. Trends in concentrations of solutes in an upland catchment in Scotland. Science of the Total Environment, 1998, 216(1/2): 77–88
[3] Ferrier R C, Edwards A C, Hirst D, et al. Water quality of Scottish rivers: spatial and temporal trends. Science of the Total Environment, 2001, 265: 327–342
[4] Pastres R, Solidoro C, Ciavatta S, et al. Long-term changes of inorganic nutrients in the Lagoon of Venice (Italy). Journal of Marine Systems, 2004, 51(1): 179–189
[5] Abaurrea J, Asín J, Cebrián A C, et al. Trend analysis of water quality series based on regression models with correlated errors. Journal of Hydrology, 2011, 400: 341–352
[6] Johnson H O, Gupta S C, Vecchia A V, et al. Assessment of water quality trends in the Minnesota River using non-parametric and parametric methods. Journal of Environmental Quality, 2009, 38(3): 1018–1030
[7] Hirsch R M, Slack J R, Smith R A. Techniques of trend analysis for monthly water quality data. Water Resources Research, 1982, 18(1): 107–121
[8] Carey R O, Migliaccio K W, Brown M T. Nutrient discharges to Biscayne Bay, Florida: trends, loads, and a pollutant index. Science of the Total Environment, 2011, 409: 530–539
[9] Hirst D. Estimating trend in stream water quality with a time-varying flow relationship. Austrian Journal of Statistics, 1998, 27(1/2): 39–48
[10] Hipel K W, Mcleod A I. Time series modelling of water resources and environmental systems. Amsterdam: Elsevier Science, 1994
[11] Mattikalli N M. Time series analysis of historical surface water quality data of river Glen catchment UK. Journal of Environmental Management, 1996, 48: 149–172
[12] Esterby S R. Review of methods for the detection and estimation of trends with emphasis on water quality application. Hydrological Processes, 1996, 10: 127–149
[13] Harmel R D, King K W, Slade R. Automated storm water sampling on small watersheds. Appl Eng Agric, 2003, 19: 667–674
[14] Walling D E, Webb B W. Estimating the discharge of contaminants to coastal waters by rivers: some cautionary comments. Marine Pollution Bulletin, 1985, 16: 488–492
[15] Webb B W, Phillips J M, Walling D E, et al. Load estimation methodologies for British rivers and their relevance to the LOIS RACS(R) Programme. Science of the Total Environment, 1997, 194/195: 379–389
[16] Littlewood I G, Watts C D, Custance J M. Systematic application of United Kingdom river flow and quality databases for estimating annual river mass loads. Science of the Total Environment, 1998, 210/211: 21–40
[17] Phillips J M, Webb B W, Walling D E, et al. Estimating the suspended sediment loads of rivers in the LOIS study area using infrequent samples. Hydrological Processes, 1999, 13: 1035–1050
[18] Quilbe R, Rousseau A N, Duchemin M, et al. Selecting a calculation method to estimate sediment and nutrient loads in streams: application to the Beaurivage River (Quebec, Canada). Journal of Hydrology, 2006, 326: 295–310
[19] Dolan D M, Yui A K, Geist R D. Evaluation of river load estimation methods for total phosphorus. Journal of Great Lakes Research, 1981, 7(3): 207–214
[20] Rekolainen S, Posch M, Kämäri J, et al. Evaluation of the accuracy and precision of annual phosphorus load estimates from two agricultural basins in Finland. Journal of Hydrology, 1991, 128: 237–255
[21] Ferguson R I. Accuracy and precision of methods for estimating river loads. Earth Surface Processes and Landforms, 1987, 12: 95–104
[22] Thomas R, Meybeck M. The use of particulate matter // Chapman D. Water quality assessments. London: Chapman & Hall, 1992: 121–122
[23] Kronvang B, Bruhn A J. Choice of sampling strategy and estimation method for calculating nitrogen and phosphorus transport in small lowland streams. Hydrological Processes, 1996, 10: 1483–1501
[24] Harmel R D, King K W, Haggard B E, et al. Practical guidance for discharge and water quality data collection on small watersheds. Transactions of the ASABE, 2006, 49(4): 937–948
[25] Preston S D, Bierman Jr V J, Silliman S E. An evaluation of methods for the estimation of tributary mass load. Water Resources Research, 1989, 25(6): 1379–1389
[26] Letcher R A, Jakeman A J, Calfas M, et al. A comparison of catchment water quality models and direct estimation techniques. Environmental Modelling & Software, 2002, 17: 77–85
[27] Schertz T W, Alexander R B, Ohe D J. The computer program estimate trend (ESTREND), a system for the detection of trends in water-quality data. US Geological Survey Water Resources Investigations Report, 1991: 91-4040
[28] Sen P K. Estimates of the regression coefficient based on Kendall’s tau. Journal of the American Statistical Association, 1968, 63: 1379–1389
[29] Helsel D R, Hirsch R M. Statistical methods in water resources: US Geological Survey Techniques of Water-resources Investigations. (2002–03) [2014–07–12]. http://water.usgs.gov/pubs/twri/twri4a3/
[30] Runkel R L, Crawford C G, Cohn T A. Load estimator (LOADEST): a FORTRAN program for estimating constituent loads in streams and rivers. US Geological Survey Techniques and Methods book. (2004–03) [2014–07–12]. http://pubs.usgs.gov/tm/2005/tm4a5
[31] Akaike H. Information theory and an extension of the maximum likelihood principle // 2nd International Symposium on Information Theory. Budapest, 1973: 267–281
[32] Akaike H. A new look at the statistical model identification. IEEE Trans Automat Control, 1974, 19(6): 716–723
[33] Cohn T A, Delong L L, Gilroy E J. Estimating constituent loads. Water Resources Research, 1989, 25(5): 937–942
[34] Cohn T A, Gilroy E J, Baier W G. Estimating fluvial transport of trace constituents using a regression model with data subject to censoring // Proceedings of the Joint Statistical Meeting. Boston, 1992: 142–151
[35] Powell J L. Least absolute deviations estimation for the censored regression model. Journal of Econometrics, 1984, 25: 303–325
[36] Vogel R M. The probability plot correlation coefficient test for the normal, lognormal and Gumbel distributional hypotheses. Water Resources Research, 1986, 22(4): 587–590

Fig. 1 Location of the Baoxiang River watershed and gauging stations

Fig. 2 Comparison of simulated and measured hydrographs

Fig. 3 Daily flow at the Baofengcun site from 1999 to 2008

Fig. 4 LOWESS lines of concentration as a function of discharge for all constituents

Fig. 5 Estimated annual loads of constituents, 1999–2008. Symbols denote mean load; lines represent 95-percent confidence intervals

Fig. 6 Estimated seasonal loads of constituents, 1999–2008