Technische Universität München Fakultät für Bau, Geo, Umwelt Lehrstuhl für Hydrologie und Flussgebietsmanagement Univ. Prof. Dr.-Ing. Markus Disse
Implementation of Automated Validation Methods in FREEWAT and Their Testing with Remotely Received Data from a Multiparametric Probe
Study Project
Tobias Klöffel
Student number: 03636365
Study programme: Environmental Engineering
Supervisor: Dr. Gabriele Chiogna
2018
Table of Contents
List of Abbreviations
1. Introduction
1.1. Preliminary Data Analysis
1.2. Scope of Study Project
2. Multiparametric Probe
2.1. Location of installation
2.2. Installed equipment
2.2.1. Multi-Sensor-Modul MSM-S2
2.2.2. LogTrans-GPRS
2.3. SENSOweb
2.4. Installation
3. Theory of Implemented Methods
3.1. Outlier test
3.2. Trend analysis
3.3. Barometric correction
3.3.1. Confined aquifers
3.3.2. Unconfined aquifers
4. Implementation of Methods in FREEWAT
4.1. FREEWAT
4.2. The Observation Analysis Tool (OAT)
4.3. Programming
4.3.1. Pre-steps and used programming tools
4.3.2. Important OAT files
4.4. Implemented programming codes in method.py
4.4.1. class Method()
4.4.2. class OutlierTest()
4.4.3. class MannKendall()
4.4.4. class BaromtricCorrection()
5. Application of Implemented Methods
5.1. Analysis of measurement results
5.2. The processing GUI
5.2.1. The processing GUI – Outlier test
5.2.2. The processing GUI – Trend analysis
5.2.3. The processing GUI – Barometric correction
5.3. Testing of implemented methods
5.3.1. Application of outlier test
5.3.2. Application of Mann-Kendall test
5.3.3. Application of barometric correction to a confined aquifer
5.3.4. Application of barometric correction to an unconfined aquifer
6. Conclusion
6.1. Summary and discussion
6.2. Learning outcomes
List of Figures
List of Tables
References
Appendix A: method.py
Appendix B: processThread.py
Appendix C: processTs_dialog.py
List of Abbreviations
BE      barometric efficiency
Da      Pneumatic diffusivity of air through the vadose zone
DB      Database
EC      electrical conductivity
GUI     Graphical User Interface
L       Thickness of unsaturated/vadose zone
LDO     Luminescent Dissolved Oxygen
NTU     Nephelometric Turbidity Unit
OAT     Observation Analysis Tool
OS      Oxygen saturation
p       exceedance probability for Grubbs-test
S       Mann-Kendall test statistic
s       standard deviation
u(τ)    Barometric response to a step change
UIT     Umwelt- und Ingenieurtechnik GmbH (company based in Dresden, Germany)
USGS    United States Geological Survey
Vbatt   Battery voltage
WL      Water level
x̅       arithmetic mean
Z       standardised Mann-Kendall test statistic
ΔW      Water-level change in wells
ΔWb     Water-level change in wells due to barometric fluctuations
ΔWi     Water-level change in wells due to factors like recharge and pumping activities etc.
τ       Time delay until aquifer response due to atmospheric pressure changes for unconfined aquifer
1. Introduction
1.1. Preliminary Data Analysis
Regardless of the measuring device used to determine chemical or physical parameters,
measurements are subject to observational errors. These errors are either systematic or random:
a systematic error is predictable and constant relative to the true value, whereas a random error
is caused by unpredictable fluctuations without any regularity (Hartung et al., 2009). In
environmental sciences in particular, measuring devices are additionally exposed to occasionally
harsh natural conditions over long periods. Owing to the sensitivity of the devices, this can
lead to malfunctions or uncertain readings, as experienced at times in practice. Both aspects
make it necessary to analyse and validate measured time-series data before they are used
further. Moreover, such an analysis enables an immediate reaction to erroneous data recording
and thus the correction of errors.
When measured time-series data serve as input for hydrological models that support decision
making in water resources management, preliminary data analysis and validation become
indispensable. Owing to error propagation, erroneous input can cause the model output to deviate
substantially from the otherwise correct result and, consequently, lead to wrong decisions.
1.2. Scope of Study Project
In this Study Project, three automated validation methods were programmed and implemented
in FREEWAT (FREE and open source tools for WATer resource management), an EU
HORIZON 2020 project and composite plugin for the open-source GIS desktop software QGIS
(Borsi et al., 2017). The methods extend the original catalogue of time-series analysis methods
provided in the software and comprise an outlier test, a trend analysis and a barometric
correction for groundwater measurements. To verify the functionality of the implemented
methods, time-series data were collected with a multiparametric probe installed in a pump house
near the municipality of Neufahrn bei Freising. Eight different parameters were measured, and
the results were examined for their suitability for the implemented methods. Where a parameter
was suitable, the respective method was applied and the outcome examined. The overall result of
this work is thus a well-tested extension of the FREEWAT plugin aimed at preliminary analysis
and validation of input time-series data.
The following report describes the installation of the multiparametric probe and its components
(chapter 2), the theoretical background of the three implemented methods (chapter 3), their
realisation in FREEWAT including the relevant programming background (chapter 4), and the
application of the methods to time-series data (chapter 5). Finally, a conclusion is drawn and
learning outcomes are reflected upon. With respect to future extensions, ideas for improvements
and remaining challenges are pointed out.
2. Multiparametric Probe
2.1. Location of installation
The multiparametric probe was installed in a pump house around 500 m south of the municipality
of Neufahrn bei Freising in southern Bavaria, Germany (48°17’59’’ N latitude, 11°39’49’’ E
longitude, 469 m a.s.l.) (Figure 1). The pump house is owned by the Administration Union of
Water Supply Freising Süd, and its original purpose is to supply the TUM Campus Garching with
cooling water.
Figure 1. Location of pump house, where the multiparametric probe was installed.
For this purpose, water is extracted from the Quaternary aquifer (the upper of two aquifers) via
three wells, one located directly below the pump house and two located 100 m to the east, from a
depth of 18 to 20 m below ground (Vaas, 2018). It is then released into a large storage room
with a constant free water surface before being transferred to the target location (TUM Campus
Garching). The water is thus subject to constant mixing and continual exchange with the water
already present in the storage room and is homogenised. The multiparametric probe was installed
under these conditions to collect time-series data.
Figure 2 and Figure 3 show the interior of the pump house and the installed equipment.
Figure 2. Inside the pump house; (1) shows the installed data logger LogTrans-GPRS.
Figure 3. Inside the pump house; (2) shows the installed multiparametric probe Multi-Sensor-Modul MSM-S2.
2.2. Installed equipment
The two main components of the installed equipment, the data logger LogTrans-GPRS and the
multiparametric probe Multi-Sensor-Modul MSM-S2, are products of the German company
Umwelt- und Ingenieurtechnik GmbH (UIT) based in Dresden, Germany. Both were acquired by
the TUM Chair of Hydrology and River Basin Management and provided for this Study Project.
For operation, the components come with a probe cable, an interface (USB) cable and the control
software SENSOlog.
2.2.1. Multi-Sensor-Modul MSM-S2
The multiparametric probe Multi-Sensor-Modul MSM-S2 (Figure 4) is a device capable of
measuring various water quality parameters as well as the water level. For this purpose, several
sensors are mounted at its lower end (UIT, n.d.). In total, the incorporated sensors measure
eight parameters:
• water level [m H2O]
• water temperature [°C]
• electrical conductivity [mS/cm]
• turbidity [NTU]
• oxygen saturation as Luminescent Dissolved Oxygen (LDO) [mg/l]
• pH [-]
• nitrate [pNO3-]
• battery voltage [V]
For better interpretation, water level, EC and nitrate measurements are additionally converted
to the units m a.s.l., mS/cm at 25 °C and mg/l, respectively.
Figure 4. The installed Multi-Sensor-Modul MSM-S2.
For more detailed information on the individual sensors, the reader is referred to UIT (n.d.).
It is worth mentioning, however, that the nitrate sensor requires regular calibration. According
to results from past research at the chair and recommendations from UIT, calibration should be
carried out every two weeks (Friedrich, 2017).
2.2.2. LogTrans-GPRS
The data logger LogTrans-GPRS (Figure 5) controls the probe and stores the measured data. In
addition, the built-in GPRS (General Packet Radio Service) functionality enables remote
transmission of data, for which a SIM card with sufficient data volume is required. The logger
runs on a 12 V rechargeable battery with a capacity of 6.5 ampere-hours. To prevent a sudden
power shortage, the current battery voltage can be retrieved as a supplementary parameter in
addition to the parameters mentioned above. Finally, the logger has a USB 2.0 interface for
connection to a computer.
Via this connection and with the aid of the software SENSOlog, the probe can be fully controlled
and all relevant measurement settings can be adjusted.
Figure 5. The installed data logger LogTrans-GPRS.
It should be noted that past projects at the chair used the more compact data logger
LogTrans 6-GPRS by UIT. Owing to a defect that occurred after the installation at the pump
house (see chapter 2.4), that logger was replaced by the current, larger LogTrans-GPRS. The new
logger has the advantage of a rechargeable battery and longer measuring periods between
recharges.
2.3. SENSOweb
Because the option of remote transmission was used in this project, receiving software was
required. For this purpose, UIT provides a web server called SENSOweb, which can be accessed
from anywhere via a web browser. The TUM Chair of Hydrology and River Basin Management was
granted access with a username, password and domain. For possible future use of the probe, the
server link and access data are given in Table 1.
Table 1. Access data for the TUM Chair of Hydrology and River Basin Management to the web server SENSOweb.
link                 https://sensoweb.de/scripts/sensoweb16.dll
username             Skrobanek
password (default)   1234qwer
password (current)   Hydro_4446
domain               TU_Muenchen
After successful login, the measurement site can be chosen. Two sites are currently registered
owing to past projects with the probe; Messstelle 2 is the one used in this project. Finally,
the desired time period needs to be entered and the display mode selected. Two options are
provided: Diagramm (graph) or Tabelle (table). They are shown in Figure 6 and Figure 7,
respectively.
Figure 6. Generic graphical illustration of time-series data in SENSOweb.
Figure 7. Generic tabular illustration of time-series data in SENSOweb.
2.4. Installation
For the installation, the aforementioned software SENSOlog was installed on a computer. It is
used to access the probe once the computer is connected to the logger with the interface cable.
After a successful connection, all settings concerning measurement intervals and frequency can
be adjusted and the individual sensors calibrated. In addition, a SIM card was purchased for
remote transmission of the measured data. The German provider Congstar was chosen, and a data
volume of 200 megabytes was found to be sufficient for reliable data transfer (Friedrich, 2017).
During the probe installation, some technical challenges arose. The following timeline
summarises the important milestones of the process:
14th of March 2018: Test installation of the MSM-S2 and the former data logger LogTrans 6-GPRS in the pump house. Connection to the data logger was possible and test measurements were successful. Calibration of the nitrate sensor and activation of remote transmission were planned for the following month.
18th of April 2018: Calibration of the nitrate sensor and activation of remote transmission were scheduled. However, the connection between the computer and the data logger failed.
23rd of April 2018: Attempts to connect the data logger to two other computers remained unsuccessful. As a consequence, the installed equipment was dismantled and sent back to UIT for investigation on the following day.
15th of May 2018: UIT detected a defect in the board and noted the ageing of the nitrate sensor. Both problems were remedied by the company.
25th of June 2018: The MSM-S2 was sent back from UIT together with the new data logger LogTrans-GPRS. Both were tested successfully at the chair the following day.
5th of July 2018: The equipment was installed in the pump house. Connection to the logger was possible and test measurements were successful.
6th of July 2018: The nitrate sensor was calibrated and remote transmission established. The official measurement time series started at noon.
Since the 6th of July, the eight above-mentioned parameters (water level, pH, EC, nitrate,
oxygen, turbidity, temperature and battery voltage) have been measured every 3 hours and sent
to the web server once a day at 0:05 a.m. This time was chosen because network utilisation is
expected to be low at night, so that the data can be transmitted reliably.
3. Theory of Implemented Methods
Within the scope of this Study Project, three methods for preliminary data analysis were
implemented in FREEWAT: an outlier test, a trend analysis and a barometric correction of
groundwater level. This chapter gives the theoretical background of these methods.
3.1. Outlier test
An outlying observation (outlier) is defined as one that deviates markedly from the remaining
samples of the data series in which it appears. Such a value may result either from the natural
variability inherent in the observed data or from an error in calculating or recording the value
during data capture (Grubbs, 1974). In the latter case, detecting the value is desirable so that
appropriate action can be taken. So-called outlier tests are used for this purpose
(Hartung et al., 2009).
In this project, the common outlier test by Grubbs and Beck, 1972 (Grubbs-test) was
implemented. In hydrology, this test has been applied, for example, to detect low outliers in
flood series (Cohn et al., 2013). Like almost all outlier tests, the Grubbs-test assumes a
Gaussian (normal) distribution of the underlying time-series data (Grubbs, 1974). This is a
drawback, because this condition is rarely met by climatological or hydrological data series.
It is therefore stressed that values should not be classified as outliers on the basis of the
analytical result alone; the result should also be confirmed by the experimenter, for example
through visual inspection (Helsel and Hirsch, 2002).
In the Grubbs-test, the suspected observation (be it a high or a low value) is used to calculate
a numerical value Tmin (equation 1) or Tmax (equation 2). This value is subsequently compared
with a so-called critical value Tn; 1-p, which would be exceeded by chance with a certain
probability p (significance level). The critical values are determined for specific p and n and
are provided in the literature. Grubbs, 1974 offers an extensive list with numerous combinations
of both parameters, which was consulted for this project (Table 2).
In general, the null hypothesis

H0: $x_{\max}$ or $x_{\min}$ is no outlier

is rejected if

$$T_{\min} = \frac{\bar{x} - x_{\min}}{s} > T_{n;\,1-p} \tag{1}$$

or

$$T_{\max} = \frac{x_{\max} - \bar{x}}{s} > T_{n;\,1-p}, \tag{2}$$

where s represents the standard deviation, x̅ the arithmetic mean and n the number of samples
in the data series under consideration.
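The test statistics of equations 1 and 2 can be sketched in a few lines of Python. This is a minimal illustration and not the code of Appendix A: the function names, the sample data and the critical value `t_crit=1.9` are placeholders chosen for this example, and the sample standard deviation (divisor n - 1) is assumed for s. In practice the critical value has to be looked up for the given n and p, for example in the table consulted for this project.

```python
import statistics


def grubbs_statistics(values):
    """Return (T_min, T_max) as defined in equations 1 and 2."""
    mean = statistics.mean(values)
    s = statistics.stdev(values)  # sample standard deviation (assumption: divisor n - 1)
    if s == 0:
        raise ValueError("all values are identical; the test statistic is undefined")
    t_min = (mean - min(values)) / s
    t_max = (max(values) - mean) / s
    return t_min, t_max


def grubbs_outliers(values, t_crit):
    """Flag the lowest and highest value if their statistic exceeds t_crit.

    t_crit stands for the critical value T_{n;1-p}. It is NOT computed here
    and must be taken from a table for the given n and p.
    """
    t_min, t_max = grubbs_statistics(values)
    return {"low": t_min > t_crit, "high": t_max > t_crit}


# Illustrative data and placeholder critical value (not project data).
sample = [7.1, 7.2, 7.0, 7.3, 7.1, 9.8]
flags = grubbs_outliers(sample, t_crit=1.9)
```

With these placeholder inputs only the highest value is flagged. Whether such a value is really an outlier should still be confirmed by visual inspection, as stressed above.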
3.2. Trend analysis
As mentioned in the previous chapter, Gaussian (normal) distributions are rarely present in
climatological and hydrological time-series data. For this reason, a non-parametric trend test
was implemented rather than a parametric one.
One such test is the rank-based trend test according to Mann-Kendall (Mann, 1945; Kendall,
1975), which was chosen for implementation. It is well suited to data series that include
outliers and can detect linear as well as non-linear trends. One disadvantage, however, is that
it is sensitive to autocorrelation within a time series (Helsel and Hirsch, 2002).
In the case of a two-sided trend test, the null hypothesis

H0: the observations $\{x_t^{(1)}, x_{t+1}^{(1)}, \ldots, x_n^{(1)}\}$ are independent and identically distributed, i.e. no trend is present

is rejected if the standardised test statistic satisfies $|Z| > Z_{\alpha/2}$, where
$Z_{\alpha/2}$ is the value of the standard normal distribution with an exceedance probability
of α/2.
Before Z can be determined, the Mann-Kendall test statistic S needs to be calculated with

$$S = \sum_{k=1}^{n-1} \sum_{j=k+1}^{n} \operatorname{sgn}(x_j - x_k) \tag{3}$$

where

$$\operatorname{sgn}(x_j - x_k) = \begin{cases} 1, & \text{if } x_j - x_k > 0 \\ 0, & \text{if } x_j - x_k = 0 \\ -1, & \text{if } x_j - x_k < 0 \end{cases} \tag{4}$$

In equation 3, every value is compared with all subsequent values in the data series. Now, Z can
be determined by

$$Z = \begin{cases} \dfrac{S - 1}{\sigma_s}, & \text{if } S > 0 \\ 0, & \text{if } S = 0 \\ \dfrac{S + 1}{\sigma_s}, & \text{if } S < 0 \end{cases} \tag{5}$$

where $\sigma_s$ is the standard deviation of S, obtained as the square root of the corrected
variance

$$\sigma_s^2 = \frac{n(n - 1)(2n + 5) - \sum_{i=1}^{m} t_i (t_i - 1)(2 t_i + 5)}{18}. \tag{6}$$

The correction in equation 6 is optional but desirable: because the Mann-Kendall test is
rank-based, the correction accounts for identical (tied) values. Here, m is the number of groups
of tied values and $t_i$ is the number of identical values in the i-th group.
As an additional parameter, a so-called p-value is usually determined. In the case of a
two-sided test, it is twice the probability of obtaining a value of the standard normal
distribution equal to or larger than |Z|. The larger the p-value, the weaker the evidence
against the null hypothesis of no trend. For this step, the cumulative distribution function of
the standard normal distribution (normalcdf), which is available in most calculation programmes
and programming languages, was used in the programming code (see Appendix A).
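The calculation steps of the Mann-Kendall test can be illustrated with the following short Python sketch. It is not the code of Appendix A: the function name is hypothetical, the double sum is assumed to run over all pairs with j > k, the tie correction sums over groups of identical values, and the normal distribution is evaluated with `math.erfc` from the standard library instead of a normalcdf function.

```python
import math
from collections import Counter


def mann_kendall(values):
    """Return (S, Z, p) for a two-sided Mann-Kendall trend test."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two observations are needed")

    # Test statistic S: compare every value with all later values.
    s = 0
    for k in range(n - 1):
        for j in range(k + 1, n):
            diff = values[j] - values[k]
            s += (diff > 0) - (diff < 0)  # sign function

    # Variance of S, corrected for groups of tied values.
    ties = sum(t * (t - 1) * (2 * t + 5) for t in Counter(values).values())
    var_s = (n * (n - 1) * (2 * n + 5) - ties) / 18.0

    # Standardised test statistic Z with continuity correction.
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0

    # Two-sided p-value: 2 * (1 - Phi(|Z|)) equals erfc(|Z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return s, z, p


s_stat, z_stat, p_value = mann_kendall([1, 2, 3, 4, 5])
```

For the strictly increasing example series, S equals the number of pairs (10) and the p-value falls below 0.05, so the null hypothesis of no trend would be rejected at the 5 % level.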
3.3. Barometric correction
The total piezometric head in a well usually does not correspond to that of the surrounding
aquifer. This makes it difficult to identify the hydraulic response to natural (e.g. rainfall)
or artificial (e.g. pumping) perturbations (Rasmussen and Crawford, 1997). Among other effects,
external stresses imposed by barometric fluctuations can have an appreciable influence on the
water level in open wells. This can lead to wrong conclusions about the speed and direction of
groundwater flow, especially in low-gradient aquifers (Spane, 2002).
To address this issue, techniques for barometric correction have been developed. Two of them
were implemented in this project, one for confined and one for unconfined aquifers.
3.3.1. Confined aquifers
Water-level changes in open wells and barometric pressure fluctuations are inversely related:
an increase in atmospheric pressure leads to a decrease in the well water level (Rasmussen and
Crawford, 1997). As long as borehole storage and well skin effects are neglected, the barometric
response in the well is instantaneous (Spane and Mercer, 1985). For confined aquifers, the same
is true for the response in the aquifer itself (Spane, 2002).
A helpful tool to describe barometric responses in wells was introduced by Jacob, 1940 and
is known as the barometric efficiency (BE). It is a dimensionless parameter that ranges between
0 (no response) and 1 (full response). BE can be defined as the water-level change during an
arbitrary unit of time (ΔW) caused by atmospheric pressure change, divided by the atmospheric
pressure change in the same time period (ΔB) (Clark, 1967):
$$BE = -\frac{\Delta W}{\Delta B} \tag{7}$$

Here, ΔW represents a lumped parameter accounting for water-level changes due to atmospheric
pressure fluctuations (ΔWb) as well as for water-level changes due to any other factors such as
pumping and recharge activities or earth tides (ΔWi):

$$\Delta W = \Delta W_b + \Delta W_i \tag{8}$$
The following correction therefore requires the assumption that all influencing factors other
than atmospheric pressure fluctuations are negligible or have been removed from the time series
in advance. This is an essential prerequisite for obtaining usable values of BE (Clark, 1967;
Davis and Rasmussen, 1993). Gonthier, 2007 (p. 9 ff.) gives an overview of techniques for
removing long-term and daily water-level changes that are independent of atmospheric pressure.
In most cases, several water-level and barometric pressure observations are available. BE is
not expected to change with time, although natural fluctuations cannot be avoided. BE can then
be obtained from continuous data with a graphical method: a linear regression between ΔW and ΔB
is carried out with the ordinary least squares method, and BE corresponds to the slope of the
regression line, as shown by way of example in Figure 8 (Gonthier, 2007).
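The regression step can be sketched as follows. This is a minimal illustration under stated assumptions and not the code of Appendix A: the function name is hypothetical, ΔW and ΔB are taken as differences between consecutive observations, both series are assumed to be expressed in the same unit (e.g. pressure converted to metres of water column), and BE is taken as the negative of the fitted slope so that it is positive under the sign convention of equation 7.

```python
def barometric_efficiency(water_levels, pressures):
    """Estimate BE from an ordinary least squares fit of dW against dB."""
    if len(water_levels) != len(pressures) or len(water_levels) < 3:
        raise ValueError("need two equally long series with at least 3 values")

    # Changes between consecutive observations.
    d_w = [b - a for a, b in zip(water_levels, water_levels[1:])]
    d_b = [b - a for a, b in zip(pressures, pressures[1:])]

    mean_w = sum(d_w) / len(d_w)
    mean_b = sum(d_b) / len(d_b)
    s_bb = sum((x - mean_b) ** 2 for x in d_b)
    if s_bb == 0:
        raise ValueError("pressure changes are constant; the slope is undefined")
    s_bw = sum((x - mean_b) * (y - mean_w) for x, y in zip(d_b, d_w))

    slope = s_bw / s_bb  # OLS slope of dW versus dB
    return -slope        # sign convention of equation 7


# Synthetic example: the water level falls by half of every pressure rise.
pressure = [10.30, 10.40, 10.20, 10.35]
level = [5.00, 4.95, 5.05, 4.975]
be = barometric_efficiency(level, pressure)
```

In the synthetic example the water level changes by exactly half of each pressure change in the opposite direction, so the estimate is close to 0.5. Real data would first need the pre-processing described above.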
Once BE is determined, the corrected head $R_{t,\mathrm{corr}}$ in the well can be calculated as

$$R_{t,\mathrm{corr}} = W_t + BE \cdot (B_0 - B_t) \tag{9}$$

where $W_t$ and $B_t$ are the water level and atmospheric pressure at time t, respectively, and
$B_0$ is an arbitrary air pressure datum (usually mean air pressure at sea level: 1013.25 hPa).
Figure 8. Exemplary linear regression between ΔW and ΔB to determine BE (Gonthier, 2007).
Further methods for barometric correction in confined aquifers, which could not be implemented
because they require pre-processing of the data (e.g. trend removal, separation of ΔWb and ΔWi),
can be found in Spane, 2002 and Rasmussen and Crawford, 1997. Among them, the method according
to Clark, 1967 stood out and proved to give good results.
3.3.2. Unconfined aquifers
Barometric corrections for unconfined aquifers are more complex. Here, an instantaneous
response to atmospheric pressure changes can likewise be observed in the monitoring well
(Spane, 2002). The water table in the rest of the aquifer, however, responds with a time lag,
because the barometric pressure wave needs time to propagate through the unsaturated zone before
it reaches the groundwater (Rasmussen and Crawford, 1997). Once the pressure wave starts acting
on the water table, the water level in the well moves back towards its original state until
this is finally reached. The time delay until the response in the aquifer starts (τ) is
therefore a critical parameter. Several correction techniques have been developed for open-well
systems that determine τ with frequency-based methods or multiple-regression deconvolution
techniques (Spane, 2002). The literature research showed that these techniques were too complex
to implement within this project. Instead, an alternative method that avoids τ was chosen.
This method developed by Weeks, 1979 is regarded as an effective correction method for a
delayed total head response in unconfined aquifers by Rasmussen and Crawford, 1997. It is
mainly based on two key factors which need to be determined or estimated before the method
can be applied: the pneumatic diffusivity of air through the unsaturated zone (Da), and the
thickness of the unsaturated zone (L). The former is thereby treated as a lumped parameter that
accounts for the properties of the unsaturated soil materials and soil gas. Influencing factors are
vertical permeability of the vadose zone, the compressibility of the soil gas, and the moisture
content (Rasmussen and Crawford, 1997). Given Da and L, the corrected head Rt, corr can now
be determined using the convolution summation according to Carslaw and Jaeger, 1959. At first
the barometric response to a step change (u(τ)) is calculated as
$$u(\tau) = 1 - \frac{4}{\pi} \sum_{j=0}^{\infty} \frac{(-1)^j}{k} \exp\left(-v \pi^2 k^2\right) \tag{10}$$

with

$$k = 2j + 1 \tag{11}$$

and

$$v = \frac{\tau D_a}{4 L^2}. \tag{12}$$
To avoid the critical determination of τ, this term can be used in equation 13 to determine the
correcting head H(t) (Rasmussen and Crawford, 1997):

$$H(t) = \sum_{\tau=0}^{n} u(\tau)\, \Delta B(t - \tau) \tag{13}$$
Finally, $R_{t,\mathrm{corr}}$ can be calculated as

$$R_{t,\mathrm{corr}} = W_t - \sum_{t=0}^{n} H(t) \tag{14}$$

where n is the number of water-level measurements in the data series under consideration and
$W_t$ is the water-level measurement at time t.
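The step response and the convolution summation of equations 10 to 13 can be sketched as follows. This is an illustration under stated assumptions and not the code of Appendix A: the function names are hypothetical, the infinite series is truncated after a fixed number of terms, the odd index is taken as k = 2j + 1 for j starting at 0 (the form of the Carslaw and Jaeger slab solution), the pressure changes ΔB are assumed to be given per time step, and a uniform sampling interval `dt` is assumed.

```python
import math


def step_response(tau, d_a, thickness, terms=200):
    """Truncated series for the barometric response u(tau) to a step change."""
    v = tau * d_a / (4.0 * thickness ** 2)
    total = 0.0
    for j in range(terms):
        k = 2 * j + 1  # odd index (assumed form)
        total += (-1) ** j / k * math.exp(-v * math.pi ** 2 * k ** 2)
    return 1.0 - 4.0 / math.pi * total


def correcting_head(d_b, dt, d_a, thickness, terms=200):
    """Discrete convolution of u(tau) with the pressure changes d_b.

    d_b[i] is the pressure change in time step i, dt the sampling interval.
    d_a (pneumatic diffusivity) and thickness (vadose zone) must use units
    consistent with dt.
    """
    u = [step_response(i * dt, d_a, thickness, terms) for i in range(len(d_b))]
    heads = []
    for t in range(len(d_b)):
        heads.append(sum(u[i] * d_b[t - i] for i in range(t + 1)))
    return heads


# Illustrative values only: the response builds up from about 0 towards 1.
u_early = step_response(0.1, 1.0, 1.0)
u_late = step_response(100.0, 1.0, 1.0)
```

The truncation matters mainly for very small time lags, where the alternating series converges slowly. For larger lags the exponential terms decay quickly and a few terms are sufficient.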
For information on other methods of barometric correction, the reader is again referred to
Spane, 2002 and Rasmussen and Crawford, 1997.
4. Implementation of Methods in FREEWAT
After the theoretical background of the implemented methods has been outlined, this chapter
shows how they were realised in FREEWAT. Before that, some background information on FREEWAT
and the programming language used, Python, is given.
4.1. FREEWAT
Within the open-source GIS desktop software QGIS (formerly Quantum GIS), FREEWAT is a composite
plugin for water resources management whose strengths lie in groundwater flow modelling and
related processes. For this purpose, FREEWAT draws on the well-known 3D finite-difference
groundwater flow code MODFLOW (McDonald and Harbaugh, 1988), in particular on the version by
Harbaugh, 2005 (MODFLOW-2005), and on other codes. MODFLOW was developed by the USGS (United
States Geological Survey), is physically based and enables the simulation of groundwater flow
dynamics in the saturated and unsaturated zone, in confined and unconfined aquifers with
constant or variable thickness and transmissivity, and under steady-state or transient
conditions. Hydrological simulations can be performed as soon as the relevant input data
(climate, rainfall etc.) are provided (Borsi et al., 2017).
FREEWAT consists of different modules representing distinct tools, each of which fulfils its
own task (Figure 9). The FREEWAT tools are interconnected with the help of the programming
language Python or, more precisely, Python 2.7, which is the integral language of QGIS.
Figure 9. Tools and modules in FREEWAT and their interconnection (Borsi et al., 2017).
According to Borsi et al., 2017, the features of FREEWAT can be divided into tools
responsible for
• the analysis, interpretation and visualization of hydrogeological and hydrochemical data
and quality issues,
• the simulation of models related to the hydrological cycle and water resources
management,
• the performance of model calibration, sensitivity analysis and uncertainty quantification,
• the preparation of input data and post-processing.
The most important tool and Python package for this project was the so-called Observation
Analysis Tool (OAT), which is introduced in more detail in the following chapter.
4.2. The Observation Analysis Tool (OAT)
The OAT, which is integrated in the FREEWAT environment through an interface, provides many
time-series related operations from the preparation of input data to the export of data series as
well as their (post-)processing (Figure 10). Several ways are provided to load time-series data
into QGIS: directly as CSV file, from the Open Geospatial Consortium (OGC) Sensor Observation
Service (SOS) server implementation istSOS (Cannata and Antonovic, 2010), as a Raw file, or
as MODFLOW-related files (list file, hob file, gage file) (Borsi et al., 2017).
Figure 10. General structure and usage of the OAT in FREEWAT (Borsi et al., 2017).
From a programming point of view, the OAT is mainly structured in two classes. The first one is
the Sensor class, which is designed to handle the metadata (attributes) of the sensors as well as
the time-series data itself. The other one is the Method class, which represents a processing
method. The library applies the behavioural design pattern known as the visitor pattern. Its main
advantage is that operations can be performed on the data series without changing the Sensor
object itself. In addition, the Method class can easily be extended with further processing
methods (Borsi et al., 2017), which was also done in this Study Project.
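The idea behind the visitor pattern can be illustrated with a minimal sketch. It is not the OAT code: the class Offset, the method name process() and the use of a plain list instead of a pandas time-series are assumptions made purely for illustration.

```python
import copy


class Sensor(object):
    """Holds metadata and a time-series (here simply a list of floats)."""

    def __init__(self, name, values):
        self.name = name
        self.values = list(values)

    def process(self, method):
        # The sensor accepts the visitor and hands itself over to it.
        return method.execute(self)


class Method(object):
    """Base class for all processing methods (the visitors)."""

    def execute(self, sensor):
        raise NotImplementedError("inheriting classes implement execute()")


class Offset(Method):
    """Hypothetical example method: adds a constant to every value."""

    def __init__(self, delta):
        self.delta = delta

    def execute(self, sensor):
        # Work on a copy so that the visited Sensor stays unchanged.
        result = copy.deepcopy(sensor)
        result.values = [v + self.delta for v in result.values]
        return result


sensor = Sensor("Observed_head_1", [1.0, 2.0, 3.0])
shifted = sensor.process(Offset(0.5))
print(shifted.values)  # [1.5, 2.5, 3.5]
print(sensor.values)   # [1.0, 2.0, 3.0]
```

A new processing method only requires a further class derived from Method; the Sensor class does not need to be touched.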
The OAT takes advantage of three essential Python libraries: NumPy, pandas and SciPy.
NumPy (Numerical Python) is the basis for scientific programming in Python. It enables Python
to work with arrays and stores and manipulates numerical data much more efficiently than
Python's built-in data structures (Ernesti and Kaiser, 2012). pandas offers extensive
data structures and functions which were developed for working with structured data in an easy
and efficient way. It combines the high array performance known from NumPy with the
flexible data manipulation known from spreadsheets or relational (SQL) databases. It is
an essential tool for powerful and productive data analysis (McKinney, 2012).
Finally, SciPy is a collection of packages addressing various classical problems in
scientific computing. In combination with NumPy, it is a fairly complete substitute for
MATLAB and its extension toolboxes (McKinney, 2012).
In the OAT, every Sensor object has a single time-series consisting of a pandas time-series
and the following metadata: name, description, location (latitude, longitude, elevation), unit of
measure, observed property, coordinate system, time-zone, frequency of measurement, weight
statistic and data availability (time interval). Further, each Sensor object can be stored in a
SpatiaLite database (DB) and re-loaded with all its time-series data and metadata (Borsi et al.,
2017). The time-series data itself is stored in the form of a data frame. For this purpose, the
pandas object DataFrame is used, which represents a two-dimensional data structure in tabular
form with labels for rows and columns (McKinney, 2012). The DataFrame contains time, data,
quality index, and a tag marking whether an individual observation is used or not. An exemplary
time-series of a Sensor object visualized on an IPython console (see next chapter) is depicted in
the following:
Before the Method object was extended with the three previously introduced methods, 14 others were
already implemented in the OAT. For an overview of their capabilities, the reader is referred to Borsi
et al., 2017 (p. 43 f.).
4.3. Programming
4.3.1. Pre-steps and used programming tools
As mentioned before, the OAT is an integrated package in the FREEWAT environment and, like
all other tools, written in the programming language Python. Hence, the first step was to become
familiar with this language, or more precisely with Python 2 (latest version: Python 2.7), whose
syntax differs in places from that of the newer Python 3. In addition, Python 3 includes some
changes that are incompatible with the older versions (McKinney, 2012). After an initial training
with Python, the original FREEWAT source code was requested from the OAT developers at
Scuola Universitaria Professionale della Svizzera Italiana (SUPSI), Switzerland. It included the
OAT code, which could then be analysed.
For the programming done in this project, the interactive Python development
environment Spyder (Scientific PYthon Development EnviRonment) was used. It comprises the
extended Python shell IPython for interactive and exploratory programming and comes with an
enhanced interpreter¹. Further, it provides all important Python libraries (including pandas, NumPy
and SciPy). Another advantage compared to basic development environments (e.g. IDLE) is
the visualization of data within the IPython shell using the Python library matplotlib (McKinney,
2012). Spyder can be freely downloaded from https://pypi.org/project/spyder/.
The Graphical User Interface (GUI) for the individual methods was designed with
the Qt Designer, with which the respective .ui file (see chapter 4.3.2) could be opened and edited.
In FREEWAT, the connection between GUI and Python code is established via PyQt4, a Python toolkit
for creating comprehensive GUIs for software based on the Qt framework.
4.3.2. Important OAT files
The oat folder, which is included in the FREEWAT source code, contains three subfolders (Figure
11), comprising Python files (with the ending .py) of greater or lesser importance for the
implementation of the three methods. The three files __init__.py, config.py and oatInit.py
are required for initialisation and configuration purposes and are run automatically as soon
as the OAT is used.
Figure 11. Structure of the OAT folder.
¹ Python is an “interpreted” programming language. This means that the source code cannot be executed directly by the machine but is translated by an interpreter at run time. Thus, it stands in contrast to compiled programming languages such as C++. The advantage of an interpreted language is its better readability. A disadvantage might be its longer run time, which is, however, negligible for simple programmes (Ernesti and Kaiser, 2012).
During the code analysis, several files were identified which had to be altered or
extended for the implementation of the methods. Table 2 lists them together with their path
in the oat folder structure and a short description of their functions.
Table 2. Python files in OAT which were altered or extended.
| name | path | description |
| --- | --- | --- |
| sensor.py | oat/oatlib | Contains the Sensor class introduced in chapter 4.2. Here, the registered sensors, including their metadata and time-series data, are read and managed. |
| method.py | oat/oatlib | Contains the Method class introduced in chapter 4.2. Here, all functions for applying the methods to the time-series data of the registered sensors are implemented. This file is called by processThread.py and sends its calculated results back to it. The code of method.py is attached in Appendix A. |
| process.ui | oat/ui | Contains the GUI for processing time-series data. It can be accessed and edited with the Qt Designer introduced in chapter 4.3.1. |
| processTs_dialog.py | oat/plugin/process | Represents the connection between the interactive elements (e.g. buttons) of the GUI and the actions performed in the OAT. Further, processThread.py is called if a method is executed, and the processes necessary to perform the respective method are started. After the method has run, the results are received from processThread.py and further processed here (e.g. graphical representation of results) depending on the performed method. The code of processTs_dialog.py is attached in Appendix C. |
| processThread.py | oat/plugin/process | Is called by processTs_dialog.py and contains all threads², which in the end run the function of the respective method in method.py. Specifically, the execution methods of the respective classes in method.py (see next chapter) are called. Further, the inputs of the user in the GUI are read and passed as attributes to the function. This file receives the results from method.py and forwards them to processTs_dialog.py, where they are further processed. The code of processThread.py is attached in Appendix B. |

² In computer science, a thread denotes a “light-weight” process. Generally, it represents a determined order in the execution of a programme (Ernesti and Kaiser, 2012).
4.4. Implemented programming codes in method.py
Having introduced the important Python files of the OAT for this project, this chapter
focuses on the file method.py, where the actual execution of the methods takes place. As mentioned
before, this chapter requires a fundamental understanding of Python. Yet, certain peculiarities
and terms will be briefly outlined. For detailed explanations, the reader is referred to the online
Python documentation (https://docs.python.org/2/tutorial/).
In the following, the structure and function of the Python code of the three implemented
methods are described. For a better understanding, it is recommended to read along in the original
Python code in method.py attached in Appendix A.
4.4.1. class Method()
Each method implemented in method.py is an inheriting class³ of the base class
Method()⁴ located in the same file. Inheriting classes can be extended by additional
attributes and methods, or can overwrite inherited ones. Generally, the base class ensures
comparability and consistency among the inheriting classes. In this case, this applies to the
manner in which the methods are executed (def execute()⁵) as well as how the calculated output
is returned (def returnResults()). In the OAT, def execute() is always overwritten by the
inheriting classes. The output is handled by creating a dictionary⁶ comprising
the type and time-series data of the obtained results. Depending on the type, the results are
displayed differently in the processing GUI later.
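The interplay of base class, overwritten execute() and a typed result dictionary can be sketched as follows. This is a simplified illustration, not the code from Appendix A: the classes Mean and Doubled, the dictionary keys "type" and "data", the type labels and the function display() are assumptions.

```python
class Method(object):
    """Base class: fixes how methods run and how results are returned."""

    def __init__(self):
        self.result = {"type": None, "data": None}

    def execute(self, values):
        raise NotImplementedError("overwritten by every inheriting class")

    def returnResults(self):
        return self.result


class Mean(Method):
    """Hypothetical inheriting class returning a single number."""

    def execute(self, values):
        self.result["type"] = "value"
        self.result["data"] = sum(values) / float(len(values))
        return self.returnResults()


class Doubled(Method):
    """Hypothetical inheriting class returning a new series."""

    def execute(self, values):
        self.result["type"] = "series"
        self.result["data"] = [2 * v for v in values]
        return self.returnResults()


def display(result):
    # The caller decides how to present the output based on "type".
    if result["type"] == "series":
        return "plot %d points" % len(result["data"])
    return "show value %.2f" % result["data"]


print(display(Mean().execute([1.0, 2.0, 3.0])))     # show value 2.00
print(display(Doubled().execute([1.0, 2.0, 3.0])))  # plot 3 points
```

Because every inheriting class returns the same kind of dictionary, the calling code only needs to inspect the type entry to choose a graphical or non-graphical output.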
4.4.2. class OutlierTest()
For the outlier test, the class OutlierTest() was created and extended by two attributes:
self.crit and self.index. The former represents the critical value T_{n; 1-p} for a certain
exceedance probability p (compare chapter 3.1). For p, the user is given four
options to choose from:
• p = 0.05
• p = 0.01
• p = 0.005
• p = 0.001
³ Such classes inherit the attributes and methods of their base class and can be extended arbitrarily by their own ones.
⁴ In Python, classes are defined with the syntax class className(baseClass1, baseClass2, …). Here, the base class is not shown for the sake of clarity.
⁵ In Python, functions (also called methods if they are part of a class) are defined with the syntax def functionName(argument1, argument2, …). Here, arguments are not shown for the sake of clarity.
⁶ A dictionary in Python is a data structure consisting of an unordered set of key : value pairs.
The chosen value is read in by processThread.py and passed to method.py. The second
attribute is an index required for the calculations and is set to 0 in the beginning.
Apart from the attributes, two additional methods were implemented in the class: def
remove_max() and def remove_min(). They are responsible for the identification, removal and
substitution of a maximum (def remove_max()) or a minimum (def remove_min()) outlier, and
are called consecutively by the method def execute(). If an outlier is identified and removed,
it is substituted by the arithmetic mean of the previous and the subsequent value. In the end, the
removed outlier and its substitute are saved in a dictionary with their respective time stamp.
This way, they can later be shown in the output.
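The logic of testing and substituting a maximum can be sketched as below. The sketch is an illustration under assumptions and not the code from Appendix A: the test statistic (x_max - mean) / s is assumed to correspond to the one in chapter 3.1, the critical value of 2.0 is a placeholder rather than a tabulated value, and the treatment of values at the edges of the series is a guess.

```python
import statistics


def remove_max(values, crit):
    """Test the largest value and replace it if it is an outlier.

    Returns (index, removed, substitute), or None if nothing was replaced.
    The list values is modified in place.
    """
    mean = statistics.mean(values)
    std = statistics.stdev(values)  # sample standard deviation
    if std == 0:
        return None
    idx = max(range(len(values)), key=lambda i: values[i])
    if (values[idx] - mean) / std <= crit:
        return None
    # Assumption: at the edges only the one existing neighbour is used.
    neighbours = [values[i] for i in (idx - 1, idx + 1) if 0 <= i < len(values)]
    removed = values[idx]
    values[idx] = sum(neighbours) / float(len(neighbours))
    return idx, removed, values[idx]


# Made-up readings with one obvious peak at position 4
series = [0.65, 0.66, 0.64, 0.65, 2.50, 0.66, 0.65, 0.64]
print(remove_max(series, crit=2.0))
print(series)
```

A minimum would be handled in the same way with the sign of the deviation reversed.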
Before the methods def remove_max() and def remove_min() are executed, the method
def execute() starts by identifying the number of samples contained in the time-series (obs).
If obs is lower than 30, an Exception⁷ is raised because a proper outlier test requires
at least 30 values. In this case, no outlier
test is conducted. If obs is larger than or equal to 30, the time-series is tested for outliers: first, it is
checked whether obs is larger than or equal to 30, 40, 50, 60, 80 or 100. The largest of these
numbers reached by obs (numb) is chosen as the number of samples considered for the outlier test. This way,
the maximum quantity of values is ensured. Now, def remove_max() and def remove_min()
are executed consecutively for the first numb samples and repeated until all identified outliers are
removed and replaced. Subsequently, the last numb samples are tested in the same way. If
obs is larger than 200, 100 samples are considered at a time until fewer than 100 are left.
Finally, the last 100 samples are tested. At the end of every loop (in case obs > 200) or at the
end of the code, a dictionary for output (dict_outl) is updated with the most recently removed
and replaced values. This dictionary is eventually returned to processThread.py along with the
revised time-series data.
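How the blocks of samples might be selected is sketched below. This is one reading of the description above and not the original code: the function names, the use of a ValueError and the removal of a repeated block are assumptions.

```python
TABULATED_SIZES = (30, 40, 50, 60, 80, 100)


def window_size(obs):
    """Largest tabulated sample size that does not exceed obs."""
    if obs < 30:
        raise ValueError("at least 30 values are needed for the outlier test")
    return max(size for size in TABULATED_SIZES if size <= obs)


def windows(obs):
    """(start, stop) index pairs of the sample blocks that are tested."""
    numb = window_size(obs)
    if obs <= 200:
        # First numb samples, then the last numb samples
        spans = [(0, numb), (obs - numb, obs)]
    else:
        # Consecutive blocks of 100 samples, then the last 100 samples
        spans = [(start, start + 100) for start in range(0, obs - 99, 100)]
        spans.append((obs - 100, obs))
    # Drop a repeated block, e.g. when obs is exactly 30 or a multiple of 100
    unique = []
    for span in spans:
        if span not in unique:
            unique.append(span)
    return unique


print(window_size(45))  # 40
print(windows(476))     # [(0, 100), (100, 200), (200, 300), (300, 400), (376, 476)]
```

For the 476 readings of this project, the sketch yields four consecutive blocks of 100 samples followed by the last 100 samples, which overlap the fourth block.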
4.4.3. class MannKendall()
The class created for the Mann-Kendall Test is called class MannKendall(). In this case, only
one additional attribute had to be assigned: self.alpha. It represents the α described in chapter
3.3.2. and is read in by processThread.py and passed to the class. In the processing
GUI, the user is given two alpha values to choose from:
• alpha = 0.05
• alpha = 0.01
⁷ Exceptions are errors which occur during the execution of a programme. They can be raised intentionally in case a certain condition is fulfilled (here N < 30).
Before assigning the selected value to self.alpha, the method def execute() starts by
importing the norm functions from the SciPy module stats, which are used later in the code
for the calculations. Then, the value for self.alpha is assigned and halved because the two-sided
Mann-Kendall Test was implemented. After the introduction of further variables for the
calculations and a dictionary for the output of the results (result), S is calculated according to
the procedure in chapter 3.3.2. In the next step, the time-series is checked for repeated values
(also called ties). If no ties are detected, the variance (var) is calculated with the uncorrected
procedure. Otherwise, a correction is applied. From the variance, the standard deviation
is determined and S is normalized to obtain Z. Next, the p-value is calculated by applying
the norm.cdf function. In addition, the norm.ppf function is used to check whether a trend is
present or not, or in other words, whether h is True or False. If h is True and Z is negative, a
decreasing trend is present; if Z is positive, an increasing trend is present. If h is
False, no trend was detected. In the end, three variables are added to the dictionary result:
• trend: including the type of trend present
• S: including the calculated S-value
• p_value: including the calculated p-value
This dictionary is eventually returned to processThread.py.
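The sequence of steps can be sketched in a compact form. The sketch follows the common textbook formulation of the two-sided Mann-Kendall test, including the continuity correction of S, which may differ in detail from the procedure of chapter 3; it also uses NormalDist from the Python 3 standard library in place of scipy.stats.norm, so it is an illustration rather than the code from Appendix A.

```python
from collections import Counter
from math import sqrt
from statistics import NormalDist


def mann_kendall(values, alpha=0.05):
    n = len(values)
    # S: number of increasing pairs minus number of decreasing pairs
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = values[j] - values[i]
            s += (diff > 0) - (diff < 0)
    # Variance of S; the tie term vanishes when all values are distinct
    ties = sum(t * (t - 1) * (2 * t + 5) for t in Counter(values).values())
    var = (n * (n - 1) * (2 * n + 5) - ties) / 18.0
    # Normalisation of S to Z with continuity correction
    if s > 0:
        z = (s - 1) / sqrt(var)
    elif s < 0:
        z = (s + 1) / sqrt(var)
    else:
        z = 0.0
    norm = NormalDist()
    p_value = 2 * (1 - norm.cdf(abs(z)))
    # Two-sided test: alpha is halved before the quantile is looked up
    h = abs(z) > norm.inv_cdf(1 - alpha / 2.0)
    if h and z < 0:
        trend = "decreasing"
    elif h and z > 0:
        trend = "increasing"
    else:
        trend = "no trend"
    return {"trend": trend, "S": s, "p_value": p_value}


print(mann_kendall([10, 9, 8, 7, 6, 5, 4, 3, 2, 1], alpha=0.05)["trend"])  # decreasing
```

The nested loop over all pairs makes the calculation of S quadratic in the number of samples, which is unproblematic for a few hundred readings.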
4.4.4. class BarometricCorrection()
The third class added to method.py is class BarometricCorrection(), responsible for the
execution of the barometric correction. For this purpose, several attributes needed to be assigned to it,
which are shown and explained in Table 3. All of them are read in by processThread.py and
passed to the class.
Table 3. Additional attributes for class BarometricCorrection().
| name | description |
| --- | --- |
| self.press | Represents the atmospheric pressure sensor registered in the OAT, which is used for the barometric correction. The user can choose the desired sensor from a sensor list implemented in the processing GUI in the form of a combo box. |
| self.thickness | Represents the thickness of the vadose zone. This attribute is only considered if an unconfined aquifer is regarded. The user can enter the value in the respective text box in the processing GUI. |
| self.pneumat | Represents the pneumatic diffusivity of the vadose zone. This attribute is only considered if an unconfined aquifer is regarded. The user can enter the value in the respective text box in the processing GUI. |
| self.bar_eff | Represents the barometric efficiency entered by the user or includes the information that it should be estimated. This attribute is only considered if a confined aquifer is regarded. If the user knows the barometric efficiency in advance, it can be entered in the respective text box in the GUI. Alternatively, the barometric efficiency is estimated. |
| self.unit | Represents the unit of the air-pressure sensor. The user can choose the unit of the air-pressure sensor data from a list implemented in the processing GUI in the form of a combo box. The following units can be selected: Pa, hPa, kPa, bar, pressure head [m H2O]. |
At the beginning of def execute(), the linregress function from the SciPy module stats
is imported for later calculations. Then, the attribute self.unit is checked for the selected unit
and, depending on that, the values of the atmospheric pressure time-series data are converted
to the unit [m H2O]. In the next step, a DataFrame is created (oat_press) with two
columns: one for the atmospheric pressure head (air_pressure) and the other for the water
level measurements (head). Next, it is checked whether a confined or an unconfined aquifer is
present, which is chosen by the user in the processing GUI. In the former case, the attributes
self.thickness and self.pneumat are both given the value None⁸, which is detected by the
code. As a consequence, the code follows the path for a confined aquifer.
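The unit conversion to a pressure head can be sketched as follows. The constants (a water density of 1000 kg/m³ and standard gravity), the dictionary TO_PASCAL and the function name are assumptions; the values used in the original code are not stated here.

```python
RHO_WATER = 1000.0  # kg/m^3, assumed density of water
GRAVITY = 9.80665   # m/s^2, standard gravity

# Factor that turns one unit of pressure into pascal
TO_PASCAL = {"Pa": 1.0, "hPa": 100.0, "kPa": 1000.0, "bar": 100000.0}


def to_pressure_head(value, unit):
    """Convert an air-pressure reading to metres of water column."""
    if unit == "m H2O":
        return value  # already a pressure head
    return value * TO_PASCAL[unit] / (RHO_WATER * GRAVITY)


print(round(to_pressure_head(1013.25, "hPa"), 3))  # 10.332
```

Under these assumptions, the mean air pressure at sea level of 1013.25 hPa corresponds to a water column of about 10.33 m.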
In the first step, the air pressure time-series is expressed relative to an air pressure datum, in this
case the mean air pressure at sea level (1013.25 hPa). After that, it is checked whether the
barometric efficiency should be estimated or not. If this is the case (if self.bar_eff ==
'Estimated'), another DataFrame is created (d_oat_press) with two columns: one for the
differences of the barometric pressure head (air_pressure_d) and the other for the differences
of the water level measurements (head_d). Afterwards, NaN⁹ values are removed from the
DataFrame. Next, a linear regression between air_pressure_d and head_d is conducted with the
linregress function to estimate the barometric efficiency (compare chapter 3.3.1). This value
is then assigned to the respective variable (be). In addition, the coefficient of determination
(rsqu) is determined to check the correlation between both parameters. If the barometric
efficiency is given by the user, this value is assigned to be and rsqu is disregarded.
Finally, the corrected water level time-series is calculated. For the special case that the
barometric efficiency is 0.0 (no response at all in the aquifer), no correction is required. Before
the results are further processed, NaN values are again removed from the corrected DataFrame to
ensure a clean illustration of the output. In the end, it is tested whether the barometric efficiency lies
within its designated range. If not, a ValueError is raised.
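The estimation step can be illustrated with a small sketch that regresses the differences of the water level on the differences of the barometric pressure head. It uses a hand-written least-squares fit in place of scipy.stats.linregress, plain lists instead of a DataFrame, and made-up numbers; how the slope is turned into the barometric efficiency (sign convention) depends on the definition in chapter 3.3.1 and is not settled by this sketch.

```python
def differences(series):
    """First differences between consecutive readings."""
    return [b - a for a, b in zip(series, series[1:])]


def linear_fit(x, y):
    """Least-squares slope and coefficient of determination of y on x."""
    n = float(len(x))
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    rsqu = (sxy * sxy) / (sxx * syy)
    return slope, rsqu


# Made-up readings: the head falls by half of every rise in air pressure head
air_pressure = [10.30, 10.34, 10.31, 10.36, 10.33]  # m H2O
head = [5.020, 5.000, 5.015, 4.990, 5.005]          # m

slope, rsqu = linear_fit(differences(air_pressure), differences(head))
print(round(slope, 3), round(rsqu, 3))
```

With these numbers the slope is close to -0.5 and the coefficient of determination close to 1. Under the common convention that the barometric efficiency is the negative ratio of water level change to pressure head change, this would correspond to a barometric efficiency of about 0.5.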
If the option for an unconfined aquifer was chosen, the other path starts by creating the
same DataFrame as for the confined case (d2_oat_press), but without first expressing the water
level measurements relative to a datum. Then, the corrected water level time-series is calculated
according to the procedures explained in chapter 3.3.2. For that purpose, three nested for-loops
were required, one of which is executed 400 times to approximate the
value for an infinite number of iterations. This was found to have an appreciable effect on the
runtime of the programme. Afterwards, NaN values are removed from the corrected water level
time-series. The two variables be and rsqu are disregarded.
In the end, the corrected and uncorrected water level time-series are renamed for the
eventual display of the results. Finally, both of them as well as the variables be and rsqu are
returned to processThread.py.
⁸ The None-type in Python is frequently used to represent the absence of a value. It is the counterpart of the null value in other programming languages (e.g. C++).
⁹ NaN stands for “Not a Number”. It is a special floating-point value which pandas uses to mark missing data; it is automatically assigned to positions in a DataFrame which are not filled.
5. Application of Implemented Methods
Having introduced the implemented methods (outlier test, trend analysis and barometric
correction) from their theoretical (chapter 3) as well as their programming point of view (chapter
4), this chapter deals with their application in FREEWAT. The QGIS version used in this
Study Project was version 2.18, which was regarded as the most reliable at the time of writing.
Newer versions are already available. All versions can be obtained from the QGIS homepage
(http://qgis.org). In order to apply the methods, the creation of new sensors is an essential pre-
step. Knowledge about is given in
As mentioned in chapter 1, the implemented methods were to be tested with data
collected by the multiparametric probe installed in the pump house. Therefore, it was necessary
to analyse the measurement results first.
5.1. Analysis of measurement results
As mentioned in chapter 2.4., measurements of the eight parameters started on the 6th of July
2018 at noon. They continued until the 4th of September 2018 at midnight with measurements
every three hours, resulting in a total number of 476 measurements. The parameters included in
the analysis were water level (WL), pH, electrical conductivity (EC), nitrate, oxygen saturation
(OS), turbidity, temperature, and battery voltage (Vbatt). The following graphs illustrate the
measurement results of the individual parameters; when reading them, attention should be paid
to the scale of the y-axis of each graph:
Looking at the graphs, six of the eight parameters do not show any anomalies or considerable
variations. This is true for WL, pH, EC, OS, temperature and Vbatt. One might argue that EC and
temperature show an upward trend, whereas OS and Vbatt show a downward trend. This
argument, however, is questionable given the scales of the respective graphs. Yet, a
downward trend for Vbatt is reasonable to assume, as the voltage of the battery is
expected to decrease with time. The lack of variations can be explained by the fact that the water
in the storage room of the pump house is subject to constant mixing with the surrounding waters.
For that reason, natural differences in the chemical properties are immediately dampened. Apart
from that, the water level is actively kept constant, so that fluctuations were not
expected in this regard, as already anticipated in chapter 2.
Nevertheless, two of the eight parameters show considerable variations: nitrate and turbidity.
The graph for the nitrate measurements indicates a clear downward trend starting from 25.85
mg/l nitrate for the first measurement and decreasing to 3.18 mg/l for the last four
measurements. The rate of decrease is relatively small in the first week of measuring before it
increases and reaches its maximum in mid-July. Afterwards, it declines again and eventually the
value becomes almost stable. As mentioned in chapter 2.2.1., a downward trend is common for the
nitrate sensor of this probe, which is why it requires regular calibration. According to Friedrich,
2017, however, such a trend should become visible at the earliest two weeks after the start of the
measurements. In combination with the observations of the Administration Union of Water Supply
Freising Süd, which confirmed that the nitrate value is stable at around 25 mg/l throughout the year,
a defect of the nitrate sensor can be assumed.
The second parameter showing noticeable variations is the turbidity. Here, the graph is
mostly constant at a value between 0.64 and 0.67 NTU except for two peaks occurring on the 25th
of July and the 27th of August. An increased turbidity may be the result of precipitation events.
However, the rainfall dataset of the German Weather Service (DWD) did not confirm this
assumption for either of the two days. This means that both peaks might represent outliers.
With respect to testing the implemented methods, two time-series are eventually suitable for
application: the nitrate data for the trend analysis (Mann-Kendall) and the turbidity data
for the outlier test. For the barometric correction, external time-series data were necessary, which
are introduced in more detail in chapter 5.3. Before the methods can be applied successfully, though, a
short explanation of the processing GUI of the OAT is required.
5.2. The processing GUI
The processing GUI for the OAT, including the extensions for the implemented methods, was
created with the help of the Qt Designer. This is very convenient because the graphical setup of
the Qt Designer allows GUIs to be assembled easily by dragging and dropping their features
(Figure 12). These can then be controlled via the code or vice versa.
Figure 12. Main window of the Qt Designer with the implementable features in the left column, a created GUI in
centre and editing options in the right column.
As explained in chapter 4.3.2., the file comprising the GUI for the processing of time-series data
in the OAT is process.ui, which is located in the ui folder. This GUI has the same structure
independent of the method chosen and generally consists of 6 sections shown in Figure 13. They
are explained in the following:
1) The combo box in the uppermost part comprises a list with all registered sensors (sensor
list) from which one needs to be selected. In Figure 13, the sensor
Observed_head_1 is chosen as an example. As soon as the Preview-button next to the
sensor list is clicked, the time-series data of the sensor is loaded and a preview graph
appears in a small window on the right (section 2). Now the remaining window becomes
active and a method can be applied.
2) The small graph in the upper corner on the right-hand side shows a preview graph of
the time-series data of the loaded sensor.
3) The combo box on the left-hand side contains the methods which can be applied to the
time-series data of the selected sensor (methods list). The Execute and Save button
below are used for executing the method and saving the results. The Overwrite tick box
can be checked, if the original time-series data should be overwritten with the new.
4) In the central upper part of the processing GUI, the options and features for the selected
method are shown. This part differs with every method and changes automatically as
soon as different methods are selected from the methods list. In Figure 13, the originally
implemented method hydro separation is pictured.
5) In the centre of the GUI, the results obtained after a method was executed are shown.
This is done either graphically, non-graphically or as a combination of both and depends
on the method itself. In Figure 13, the results of the executed method hydro separation
applied to the sensor Observed_head_1 are illustrated (only graphical illustration).
6) At the bottom of the GUI, a box shows the history of the preceding actions conducted.
Figure 13. Structure and features of the GUI for processing time-series data in the OAT with sections 1 to 6. Here,
the sensor Observed_head_1 and the originally implemented method hydro separation are selected.
Having introduced the general setup of the processing GUI, the specifics in section 4 with respect
to the three implemented methods are shown now.
5.2.1. The processing GUI – Outlier test
When outlier test is selected from the methods list, the processing GUI shows a text asking the
user to select the desired exceedance probability (p-value) with which the Grubbs-test is
conducted. For that, the combo box in the middle comprises four options to choose from with a
maximum of p = 0.05 and a minimum of p = 0.001. Figure 14 shows section 4 of the processing
GUI when outlier test is selected:
Figure 14. Section 4 of the processing GUI if the method outlier test is selected.
5.2.2. The processing GUI – Trend analysis
When mann-kendall is selected, the user is asked to choose between the two alpha values alpha
= 0.05 and alpha = 0.01 from a combo box. The Mann-Kendall test will then be conducted with
the respective value. Figure 15 shows section 4 of the processing GUI when mann-kendall is
selected:
Figure 15. Section 4 of the processing GUI if the method mann-kendall is selected.
5.2.3. The processing GUI – Barometric correction
When barometric correction is selected, section 4 shows much more content compared to the
previous two methods. First of all, the user needs to select a radio button to choose between a
confined and an unconfined aquifer. Depending on the selection, different areas in the GUI
become active or inactive. If the radio button Confined aquifer is chosen, the user needs to enter
the barometric efficiency or alternatively check the tick box below so that the barometric efficiency
is estimated. On the other hand, the fields for entering the thickness of the vadose zone and the
pneumatic diffusivity become inactive. In case the radio button Unconfined aquifer is chosen, the
opposite is the case and the field for entering the barometric efficiency as well as the check box
become inactive.
Regardless of what aquifer is chosen, the user needs to select the atmospheric pressure
sensor from a combo box, which comprises a list of all registered sensors. Finally, a second
combo box provides different options for the unit of the atmospheric pressure readings. Figures
16 and 17 show section 4 of the processing GUI when barometric correction is selected for a
confined and an unconfined aquifer, respectively.
Figure 16. Section 4 of the processing GUI if the method barometric correction is selected and Confined aquifer
chosen.
Figure 17. Section 4 of the processing GUI if the method barometric correction is selected and Unconfined aquifer
chosen.
5.3. Testing of implemented methods
Once the methods were programmed and implemented in the original OAT code, they were
tested with the aforementioned input data. For the barometric correction, external input data was
used. In order to process registered sensors in FREEWAT with the OAT, one needs to select
FREEWAT in the upper menu bar in QGIS, which will be included automatically as soon as
FREEWAT is successfully installed. Afterwards, the menu item OAT can be chosen. When
clicking Process time series from the emerging item list, the processing GUI appears (Figure 18).
Figure 18. QGIS path for opening the processing GUI of the OAT in FREEWAT.
5.3.1. Application of outlier test
For the application of the outlier test to the turbidity readings, a new sensor with the corresponding
data was created first. Subsequently, the processing GUI was opened as described above and
the turbidity sensor was selected from the sensor list. After clicking the Preview-button, the
time-series data of the turbidity sensor was loaded and the preview graph appeared. Then, the method
outlier test could be chosen from the methods list. Concerning the exceedance probability, the
highest value p = 0.05 was chosen first. Afterwards, the results were compared with those for the
lowest possible exceedance probability p = 0.001. An overview of the settings is shown in Figure 19:
Figure 19. Settings for applying the outlier test to the turbidity readings. Here, an exceedance probability of p = 0.05
is chosen.
By clicking the Execute button, the outlier test was run.
For the outlier test, the results in Figure 20 appear in the form of a graph showing the
corrected time-series and a box below. This box lists the values of all detected
outliers sorted by date as well as the values by which they were replaced. For a meaningful visual
comparison of the resulting graph with the preview graph, it was necessary to adapt the scale of
the y-axis. This can be done by clicking the Configure subplots button.
Figure 20. Processing GUI after the outlier test was applied to the turbidity readings (p = 0.05).
For an exceedance probability of p = 0.05, 11 outliers were detected and replaced. These include
the two extreme values which had been expected to be outliers during the analysis of the
measurements. The reason for the high number is that, apart from the two extreme values, the
time-series varies very little. The standard deviation is therefore small, so that even a slight
variation is sufficient to be detected as an outlier. In contrast, an exceedance probability of
p = 0.001 led to only 8 detected outliers because the critical value is higher and values need a
larger deviation to be detected. The resulting graph, however, was very similar to the one for
p = 0.05.
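The influence of the standard deviation can be illustrated with a minimal sketch in plain Python. It is not the OAT implementation: the function name replace_max_outliers and the sample readings are hypothetical, and only maxima are tested. The test statistic (maximum minus mean, divided by the standard deviation) and the critical value 2.745 for 30 samples and p = 0.05 are taken from the code in Appendix A. Because the statistic is recomputed after every replacement, a smaller peak that was masked by an extreme value can be detected once the extreme value has been replaced.

```python
import statistics

CRIT_N30_P05 = 2.745  # critical value for n = 30, p = 0.05 (see Appendix A)


def replace_max_outliers(values, crit=CRIT_N30_P05):
    """Replace the largest value as long as it exceeds the critical value.

    Returns the cleaned list and a dict {position: (old value, new value)}.
    """
    data = list(values)
    replaced = {}
    while True:
        mean = statistics.mean(data)
        std = statistics.stdev(data)  # sample standard deviation
        if std == 0:
            break
        i = max(range(len(data)), key=data.__getitem__)
        if (data[i] - mean) / std <= crit:
            break
        # replace by the mean of both neighbours, or by the single
        # neighbour at the ends of the series
        if i == 0:
            new = data[1]
        elif i == len(data) - 1:
            new = data[-2]
        else:
            new = (data[i - 1] + data[i + 1]) / 2
        replaced[i] = (data[i], new)
        data[i] = new
    return data, replaced


# hypothetical readings: 30 nearly constant values, one large and one
# small peak
readings = [1.0, 1.1] * 15
readings[12] = 9.0
readings[20] = 1.6
cleaned, replaced = replace_max_outliers(readings)
print(replaced)
```

In this example the small peak of 1.6 is far below the critical value as long as the value 9.0 is part of the series. It is only flagged in the second pass, after the standard deviation has dropped.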
It is noted here that the appearance of the resulting graph represents the common design
which is used for all other methods in the OAT (including the other two implemented ones). This
way, visual uniformity for all methods is guaranteed.
5.3.2. Application of Mann-Kendall test
As previously mentioned, the Mann-Kendall test was applied to the nitrate readings. For this
purpose, a new sensor was created and the time-series data added. As for the outlier test, the
processing GUI was opened and afterwards the nitrate sensor was chosen from the sensor list. By
clicking the Preview button, the time-series data was loaded and the preview graph appeared.
Next, the method mann-kendall was selected from the methods list. As an alpha value,
alpha = 0.05 was chosen for the first run; the results were afterwards compared to those for
alpha = 0.01. An overview of the settings is shown in Figure 21:
Figure 21. Settings for applying the Mann-Kendall test to the nitrate readings. Here, an alpha value of alpha = 0.05
is chosen.
By clicking the Execute button, the Mann-Kendall test was run.
As opposed to the outlier test, the results for the Mann-Kendall test in Figure 22 appear only
in the form of a box. This box includes values for three parameters: trend, p-value and S. For
alpha = 0.05, they were decreasing trend, 0.0 and -112141, respectively. A p-value of 0.0
means that the null hypothesis of no trend is rejected at the chosen significance level and a
trend is present (compare chapter 3.2.). The very large negative value of S underlines that a
strong decreasing trend prevails, as stated by the trend parameter.
With an alpha value of alpha = 0.01, the three parameters take the same values. This was
expected because the alpha value is only the significance level against which the p-value is
compared; it does not enter the calculation of S or of the p-value. In this example, the negative
trend is very clear.
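Why the alpha value does not change S or the p-value can be seen from a minimal sketch of the test in plain Python. It is an illustration under simplifying assumptions, not the code implemented in the OAT: the function name mann_kendall and the sample series are hypothetical, and the variance of S is not corrected for tied values.

```python
import math


def mann_kendall(values, alpha=0.05):
    """Return (trend, p_value, S) of a two-sided Mann-Kendall test.

    Simplification: the variance of S is not corrected for ties.
    """
    n = len(values)
    # S: number of increasing pairs minus number of decreasing pairs
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = values[j] - values[i]
            s += (diff > 0) - (diff < 0)
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    # standard normal test statistic with continuity correction
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value
    # alpha is used only here, to decide whether the trend is significant
    if p_value < alpha:
        trend = 'increasing' if s > 0 else 'decreasing'
    else:
        trend = 'no trend'
    return trend, p_value, s


# hypothetical series with a clear downward trend and a small oscillation
series = [50 - 0.5 * k + (k % 3) for k in range(40)]
for alpha in (0.05, 0.01):
    print(alpha, mann_kendall(series, alpha))
```

Both calls return the same S and the same p-value; only the final comparison with alpha could lead to a different trend statement.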
Figure 22. Processing GUI after the Mann-Kendall test was applied to the nitrate readings with an alpha = 0.05.
5.3.3. Application of barometric correction to a confined aquifer
In order to apply and test the implemented barometric correction, external input data had to be
used. The reason is that atmospheric pressure fluctuations do not cause any water level
changes in the pump house. As mentioned before, the water level there is actively kept constant
and, apart from that, the free surface water in the storage room is not connected to
the surrounding aquifer.
Figure 23. External water level input data for a confined aquifer with artificial air pressure time-series.
For the case of a confined aquifer, water level measurements from a FREEWAT Remote Training
Course conducted between the 5th of March and the 3rd of June 2018 were used. They comprise
daily measurements for a period of 30 days. Based on these, an artificial atmospheric
pressure time-series was created in such a way that atmospheric pressure variations result in an
immediate water level response, which is common for confined aquifers (Figure 23). For this, an
arbitrary factor of 0.833 was chosen, which is equal to the barometric efficiency (compare chapter
3.3.1.). As mentioned in chapter 3.3., it needs to be assumed that all other factors which might
influence the water level in the well were either removed or can be neglected.
In order to conduct the barometric correction, a new sensor was created for each of the two
time-series and the processing GUI was opened. Afterwards, the water level sensor was selected
from the sensor list and, by clicking Preview, the time-series data was loaded and the preview
graph appeared. Next, the method barometric correction was chosen from the methods list. In the
options that appeared, the radio button Confined aquifer was selected and the fields for the
barometric efficiency became active. In order to test the estimation of the barometric efficiency,
the tick box was checked for the first run, so that the text field for manual input is ignored
(although it stays active). Afterwards, the results were compared with those for an arbitrary
input of 0.5 without estimation. Finally, the atmospheric pressure sensor was selected from the
other sensor list and the corresponding measuring unit was chosen, in this case hPa. An overview
of the settings is shown in Figure 24:
Figure 24. Settings for applying the barometric correction to external water level measurements with an artificial
atmospheric pressure time-series in hPa. Here for a confined aquifer where the barometric efficiency was estimated.
By clicking the Execute-button, the correction was conducted.
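The selected unit matters because the pressure readings are first converted to barometric head, as can be seen in the code in Appendix A. The following minimal sketch shows this conversion. The function name pressure_to_head and the dictionary PA_PER_UNIT are hypothetical; the density of water (1000 kg/m³) and the gravitational acceleration (9.81 m/s²) are the constants used in the appendix code.

```python
RHO_WATER = 1000.0  # density of water [kg/m^3]
G = 9.81            # gravitational acceleration [m/s^2]

# conversion factors from the selectable units to Pa
PA_PER_UNIT = {'Pa': 1.0, 'hPa': 100.0, 'kPa': 1000.0, 'bar': 100000.0}


def pressure_to_head(value, unit):
    """Convert a pressure reading to metres of water column."""
    if unit == 'pressure head [m]':
        # reading is already given as a head, nothing to convert
        return value
    return value * PA_PER_UNIT[unit] / (RHO_WATER * G)


# standard atmospheric pressure at sea level, given in hPa
print(round(pressure_to_head(1013.25, 'hPa'), 3))
```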
The results of the correction shown in Figure 25 appear in the same form as for the outlier
test, with a graph and a box below. The graph shows the original and the corrected water
level time-series. It can be seen that the latter is a straight line because all water level
variations due to atmospheric pressure fluctuations were removed. The box shows the results
for two parameters: the barometric efficiency (in this case estimated) and the R² of the
correlation between the change in water level and the change in barometric head (see
chapter 3.3.1.). As expected, the barometric efficiency has a value of 0.833 and R² a value
of 1.0, corresponding to a perfect correlation. It is strongly emphasized that such a value would
not appear in reality and that the corrected water level would usually not be a perfectly straight
line. The artificial input data was only used for demonstration purposes.
If the tick box for estimating the barometric efficiency is unchecked, the value in the input
field is used for the correction. After entering a barometric efficiency of 0.5, which could for
example have been determined through experiments, the resulting graph in Figure 26 showed that
a smaller share of the water level changes was attributed to atmospheric pressure fluctuations
and removed than with a barometric efficiency of 0.833. In a real application, the remaining
variations might be the result of other factors such as pumping or recharge activities. In this
case, R² is not calculated because the correlation between the change in water level and the
change in barometric head is not of interest. The value shown for the barometric efficiency is
the entered 0.5.
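Both runs can be reproduced with a minimal sketch in plain Python. It follows the sign convention and the reference pressure of the code in Appendix A (the barometric efficiency is the negative slope of the regression of water level changes on barometric head changes), but it is not the OAT code itself: the helper functions linear_fit and diff as well as the artificial data are hypothetical.

```python
def linear_fit(x, y):
    """Least-squares slope and R^2 of the regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sxx, sxy ** 2 / (sxx * syy)


def diff(series):
    """Differences between consecutive values."""
    return [b - a for a, b in zip(series, series[1:])]


# standard atmospheric pressure at sea level as head of water [m]
B_REF = 101325 / (1000 * 9.81)

# artificial data: barometric head [m] and a water level [m] that
# responds immediately with a barometric efficiency of 0.833
baro = [B_REF + d for d in (0.00, 0.05, -0.03, 0.08, 0.02, -0.06, 0.04)]
head = [12.0 - 0.833 * (b - B_REF) for b in baro]

# run 1: estimate the barometric efficiency from the changes
slope, r_squared = linear_fit(diff(baro), diff(head))
be = -slope
corrected = [h + be * (b - B_REF) for h, b in zip(head, baro)]

# run 2: barometric efficiency given as 0.5, no estimation
partly_corrected = [h + 0.5 * (b - B_REF) for h, b in zip(head, baro)]

print(round(be, 3), round(r_squared, 3))
print(round(max(corrected) - min(corrected), 6))
print(round(max(partly_corrected) - min(partly_corrected), 6))
```

With the estimated value the corrected series is constant, whereas the given value of 0.5 leaves part of the pressure-induced variation in the series.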
Figure 25. Processing GUI after the barometric correction for confined aquifers was applied to external water level
readings and artificial atmospheric pressure time-series. The barometric efficiency was estimated.
Figure 26. Processing GUI after the barometric correction for confined aquifers was applied to external water level
readings and artificial atmospheric pressure time-series. The barometric efficiency was not estimated.
5.3.4. Application of barometric correction to an unconfined aquifer
The input data for the application of the barometric correction to an unconfined aquifer was taken
from Spane (2002). With the help of a graph data extractor, a water level time-series and the
corresponding atmospheric pressure time-series were obtained from a figure representing hourly
measurements within a period of six months. The respective graphs are shown in Figure 27, where
the upper graph represents the atmospheric pressure, the lower graph the water level and the one
in between the corrected water level.
Figure 27. Water level and atmospheric pressure input data used for application of the barometric correction to an
unconfined aquifer (Spane, 2002).
Once both time-series were extracted, two new sensors were created and the processing GUI
was opened. The water level sensor was selected from the sensor list and the data loaded by
clicking the Preview button. The preview graph appeared and the method
barometric correction could be selected from the methods list. Next, the radio button Unconfined
aquifer was selected from the options that appeared, and the two fields for entering the thickness
of the vadose zone and the pneumatic diffusivity became active. In turn, the fields for the
barometric efficiency were deactivated. As noted in chapter 3.3.2., the values for the thickness of
the vadose zone as well as the pneumatic diffusivity need to be determined and known before
the correction can be applied. In this case, both parameters were taken from the paper
(68 m and 0.10 m²/s, respectively). After entering them, the atmospheric pressure sensor was
selected from the other sensor list and the measuring unit was chosen (pressure head [m]). An
overview of the settings is shown in Figure 28:
Figure 28. Settings for applying the barometric correction to external water level and atmospheric pressure time-
series in hPa. Here for an unconfined aquifer.
By clicking the Execute-button, the correction was conducted.
The results in Figure 29 have the same layout as for confined aquifers, consisting
of a graph and a box below with the same two parameters. The barometric efficiency and
R², however, are not calculated for unconfined aquifers because neither is relevant for
performing the correction. The resulting graph again shows the original in comparison to the
corrected water level time-series. It can be seen that the upper and lower peaks were mostly
dampened by the correction and that the graph becomes smoother, as expected. Yet, the
result does not correspond perfectly to the middle graph in Figure 27. The reason is that the
graph data extractor allowed a maximum resolution of one value per day for the water level,
although the original figure represents hourly measurements. This inevitably leads to
deviations. However, the general course of the graph is well matched.
Figure 29. Processing GUI after the barometric correction for unconfined aquifers was applied to external water level
and atmospheric pressure time-series.
Conclusion
36
6. Conclusion
6.1. Summary and discussion
In this Study Project, three methods – outlier test according to Grubbs, trend analysis according
to Mann-Kendall and barometric correction – were implemented in the QGIS composite plugin
FREEWAT. For this purpose, its tool for time-series management and processing (the
Observation Analysis Tool) was extended. Apart from that, time-series data was collected with a
multiparametric probe in a pump house near the municipality of Neufahrn bei Freising for almost
two months with the aim of obtaining a data basis to test those methods. This was possible for the
outlier and Mann-Kendall test by taking the measurements of the two parameters turbidity and
nitrate as input data, respectively. For testing the barometric correction, external input data had
to be used because the water level measurements of the multiparametric probe were not suitable:
atmospheric pressure variations have no impact on the water level in the pump
house, and there is no connection between the water there and the surrounding aquifer.
It is worth mentioning that the battery of the data logger lasted for the whole measurement
period without any recharging. This period lasted almost two
months, measurements were conducted every three hours and the option for remote
transmission was used. Compared to the former data logger used at the chair, which had to be
recharged after two weeks while performing comparable activities (Friedrich, 2017),
this represents a great benefit.
The application of the tests showed that all implemented methods work well in
combination with the available input data: for the turbidity readings, the two expected outliers
were removed and replaced for both exceedance probabilities, p = 0.05 and p = 0.001. Apart
from those, further outliers were detected, with a higher number for p = 0.05 (11 outliers)
than for p = 0.001 (8 outliers). The reason for the high number of outliers is not an
error in the programming but the lack of variation in the time-series itself, which
influences the results of the calculations in equations 1 and 2. Hence, it may be legitimately
doubted that all of the replaced values really represent outliers. In the end, it is the user who
decides whether the analytical results are meaningful or not. Helpful in this regard may be the
examination of the resulting graph or of influencing factors (for example precipitation events in
the case of turbidity measurements). For the trend analysis, the expected downward trend of the
nitrate time-series was confirmed with both alpha values of 0.05 and 0.01. The large negative S
obtained in the calculations emphasized how pronounced the trend is. Concerning the barometric
corrections, the expected results were obtained as well for both the confined and the unconfined
aquifer. It would have been desirable to confirm the functionality of the correction with additional
input data. Yet, due to the complexity of the process (outlined below) and the lack of suitable
time-series data of water level and atmospheric pressure measurements, this was not feasible
within the scope of this Study Project.
The barometric correction was the most challenging of the three methods. In order
to realize its implementation, some simplifications or pre-processing steps are required: first of
all, it needs to be assumed that no fluctuations of the water level due to any other factors are
present in the input time-series. Such fluctuations might be the result of tidal or river stage
cycles, earth tides, tectonic events or pumping and recharge activities (Spane, 2002). The latter
might be detected by a trend analysis and removed subsequently. This could be a subject for further
improvements of the implemented method. Apart from that, two other factors are not considered
here: borehole storage and well skin effects. Both can result in a delayed response to
atmospheric pressure changes (Rasmussen and Crawford, 1997). A well skin whose clogging
increases with time can in this regard also lead to a change in the barometric efficiency (Gonthier,
2007). This in turn reduces the reliability of the outcomes obtained by the barometric correction
for confined aquifers. Spane (2002) gives an overview of methods concerning this issue, which
could be a subject for further improvements as well. Finally, another extension could be the option
to choose between open and closed well systems. In this Study Project, the former case was
assumed.
6.2. Learning outcomes
Many beneficial learning outcomes were drawn from this Study Project with respect to future
scientific or practical work: first of all, the work with the multiparametric probe raised awareness
of the sensitivity of complex measuring devices used for creating input time-series data. In this
regard, regular analysis of the measurements is an essential routine which should not be
underestimated. Only in this way can a correct data basis be obtained for further usage (for
example in hydrological models) and a reliable output be generated.
Further, the examination of different statistical methods for outlier tests and trend analysis
considerably deepened the personal knowledge of them. Such methods are frequently used in
various disciplines and fields of environmental sciences.
Concerning the barometric correction, the complexity of this issue was recognised and
understood. In the future, water level measurements of open well systems will always be
considered with respect to this influencing factor.
In the end, the biggest learning outcome was certainly drawn from the engagement
with the programming language Python. This open-source language with an enormous user
community is often used in environmental sciences, for instance for the interaction with or the
writing of programmes (ArcGIS and QGIS), hydrological models (FREEWAT) and in many other
contexts. Apart from that, its strengths in data analysis with pandas, the creation of graphs
with matplotlib and scientific calculations with NumPy and SciPy became apparent. This enables
independence from programmes which require the purchase of expensive licenses, like MATLAB
or SPSS, while being able to produce similar results. This is especially beneficial when working
in companies or institutions which cannot afford such licenses.
List of Figures
Figure 1. Location of pump house, where the multiparametric probe was installed.
Figure 2. Inside the pump house; (1) shows the installed data logger LogTrans-GPRS.
Figure 3. Inside the pump house; (2) shows the installed multiparametric probe Multi-Sensor-Modul MSM-S2.
Figure 4. The installed Multi-Sensor-Modul MSM-S2.
Figure 5. The installed data logger LogTrans-GPRS.
Figure 6. Generic graphical illustration of time-series data in SENSOweb.
Figure 7. Generic tabular illustration of time-series data in SENSOweb.
Figure 8. Exemplary linear regression between ΔW and ΔB to determine BE (Gonthier, 2007).
Figure 9. Tools and modules in FREEWAT and their interconnection (Borsi et al., 2017).
Figure 10. General structure and usage of the OAT in FREEWAT (Borsi et al., 2017).
Figure 11. Structure of the OAT folder.
Figure 12. Main window of the Qt Designer with the implementable features in the left column, a created GUI in centre and editing options in the right column.
Figure 13. Structure and features of the GUI for processing time-series data in the OAT with sections 1 to 6. Here, the sensor Observered_head_1 and the originally implemented method hydro separation are selected.
Figure 14. Section 4 of the processing GUI if the method outlier test is selected.
Figure 15. Section 4 of the processing GUI if the method mann-kendall is selected.
Figure 16. Section 4 of the processing GUI if the method barometric correction is selected and Confined aquifer chosen.
Figure 17. Section 4 of the processing GUI if the method barometric correction is selected and Unconfined aquifer chosen.
Figure 18. QGIS path for opening the processing GUI of the OAT in FREEWAT.
Figure 19. Settings for applying the outlier test to the turbidity readings. Here, an exceedance probability of p = 0.05 is chosen.
Figure 20. Processing GUI after the outlier test was applied to the turbidity readings (p = 0.05).
Figure 21. Settings for applying the Mann-Kendall test to the nitrate readings. Here, an alpha value of alpha = 0.05 is chosen.
Figure 22. Processing GUI after the Mann-Kendall test was applied to the nitrate readings with an alpha = 0.05.
Figure 23. External water level input data for a confined aquifer with artificial air pressure time-series.
Figure 24. Settings for applying the barometric correction to external water level measurements with an artificial atmospheric pressure time-series in hPa. Here for a confined aquifer where the barometric efficiency was estimated.
Figure 25. Processing GUI after the barometric correction for confined aquifers was applied to external water level readings and artificial atmospheric pressure time-series. The barometric efficiency was estimated.
Figure 26. Processing GUI after the barometric correction for confined aquifers was applied to external water level readings and artificial atmospheric pressure time-series. The barometric efficiency was not estimated.
Figure 27. Water level and atmospheric pressure input data used for application of the barometric correction to an unconfined aquifer (Spane, 2002).
Figure 28. Settings for applying the barometric correction to external water level and atmospheric pressure time-series in hPa. Here for an unconfined aquifer.
Figure 29. Processing GUI after the barometric correction for unconfined aquifers was applied to external water level and atmospheric pressure time-series.
List of Tables
Table 1. Access data for TUM Chair of Hydrology and River Basin Management to the web server SENSOweb.
Table 2. Python files in OAT which were altered or extended.
Table 3. Additional attributes for class BarometricCorrection().
References
Borsi, I., Foglia, L., Cannata, M., Vázquez-Suñé, E., Mehl, S., De Filippis, G., Criollo, R., Ghetta, M., Cardoso, M., Velasco, V., Neumann, J., Toegl, A., Serrano, A., Riera, C., Rossetto, R., 2017. FREEWAT User Manual, Volume 0 – Reference Manual, version 1.0, September 30th.
Cannata, M., Antonovic, M., 2010. istSOS: investigation of the sensor observation service. In: WebMGS 1st international workshop on pervasive web mapping, geoprocessing and services, Como, Italy, pp. 26–27.
Cannata, M., Neumann, J., Cardoso, M., 2017. FREEWAT User Manual, Volume 5 – Observation Analysis Tool, version 1.0, September 30th.
Carslaw, H.S., Jaeger, J.C., 1959. Conduction of Heat in Solids. Oxford University Press, New York.
Clark, W.E., 1967. Computing the Barometric Efficiency of a Well. J. Hydraul. Div. 93, 93–98.
Cohn, T.A., England, J.F., Berenbrock, C.E., Mason, R.R., Stedinger, J.R., Lamontagne, J.R., 2013. A generalized Grubbs-Beck test statistic for detecting multiple potentially influential low outliers in flood series. Water Resour. Res. 49, 5047–5058. https://doi.org/10.1002/wrcr.20392
Davis, D.R., Rasmussen, T.C., 1993. A comparison of linear regression with Clark’s method for estimating barometric efficiency of confined aquifers. Water Resour. Res. 29, 1849–1854.
Ernesti, J., Kaiser, P., 2012. Python 3: Das umfassende Handbuch: Sprachgrundlagen, Objektorientierung, Modularisierung, 3rd ed. Galileo Computing, Bonn.
Friedrich, S., 2017. Konfiguration einer multiparametrischen Sonde und Auswertung der Daten (Bachelor Thesis). Technical University of Munich, Munich.
Gonthier, G.J., 2007. A Graphical Method for Estimation of Barometric Efficiency from Continuous Data— Concepts and Application to a Site in the Piedmont, Air Force Plant 6, Marietta, Georgia (Scientific Investigation Report No. 2007–5111). USGS.
Grubbs, F.E., 1974. Procedures for Detecting Outlying Observations in Samples. Defense Technical Information Center, Maryland.
Grubbs, F.E., Beck, G., 1972. Extension of Sample Sizes and Percentage Points for Significance Tests of Outlying Observations. Technometrics 14, 847–854. https://doi.org/10.1080/00401706.1972.10488981
Hartung, J., Elpelt, B., Klösener, K.-H., 2009. Statistik: Lehr- und Handbuch der angewandten Statistik, 15., überarb. und wesentlich erw. ed. De Gruyter Oldenbourg, München.
Helsel, D.R., Hirsch, R.M., 2002. Statistical Methods in Water Resources (No. A3), Hydrologic Analysis and Interpretation. United States Geological Survey.
McKinney, W., 2012. Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython, 1st ed. O’Reilly and Associates, Beijing.
Rasmussen, T.C., Crawford, L.A., 1997. Identifying and Removing Barometric Pressure Effects in Confined and Unconfined Aquifers. Ground Water 35, 502–511.
Spane, F.A., 2002. Considering barometric pressure in groundwater flow investigations. Water Resour. Res. 38, 1–18.
Spane, F.A.J., Mercer, R.B., 1985. HEADCO: a program for converting observed water levels and pressure measurements to formation pressure and standard hydraulic head (No. RHO-BW-ST--71-P). Rockwell International Corp.
UIT, n.d. Bedienungsanleitung MSM-S2.
Vaas, M., 2018. Interview on Groundwater Situation in and around Neufahrn and Well Installations at the Pump House.
Weeks, E.P., 1979. Barometric fluctuations in wells tapping deep unconfined aquifers. Water Resour. Res. 15, 1167–1176.
Appendix A: method.py
In Appendix A, the implemented parts of the programming code in the file method.py are attached.
In addition, the class Method() is shown, from which all other classes inherit.
# Module-level imports used by the classes below (assumed to sit at the top
# of method.py; they are not part of the original listing).
import numpy as np
import pandas as pd


class Method(object):
    """ base class for processing """

    def __init__(self):
        """
        Initialize the class

        Arguments:
            self (obj): the class
        """
        self.result = {
            "op": self.__class__.__name__,
            "type": None,
            "data": None
        }

    def execute(self, oat, detailedresult=False):
        """ base method for processing """
        raise Exception("base method to be overwritten in child objects")

    def returnResult(self, detailedresult):
        """ determines the type of process result:
            result or dict of info and result
        """
        if detailedresult is True:
            return self.result
        else:
            return self.result['data']


class OutlierTest(Method):
    """
    Testing for outliers using 'Grubbs Test', removes and replaces them
    """

    def __init__(self, crit):
        super(OutlierTest, self).__init__()
        self.crit = crit
        self.index = 0

    def remove_max(self, obs, ser, oat, exc_val, numb):
        """
        Function to remove maximum outlier and to replace value by linear
        interpolation
        """
        # create dict to store index and value of removed outliers and
        # new value
        d_1 = {}
        while ((ser.max() - ser.mean()) / ser.std()) > exc_val[numb]:
            # label and position of the current maximum
            idx = ser.idxmax()
            val_id = oat.ts.index.get_loc(idx)
            if val_id == 0:
                # first value of the series: use the following value
                val = oat.ts['data'][val_id + 1]
            elif val_id == obs - 1:
                # last value of the series: use the preceding value
                val = oat.ts['data'][obs - 2]
            else:
                # mean of both neighbouring values
                val = ((oat.ts['data'][val_id - 1] +
                        oat.ts['data'][val_id + 1]) / 2)
            d_1[idx] = oat.ts['data'][val_id], val
            oat.ts['data'][val_id], ser[idx] = val, val
        return d_1

    def remove_min(self, obs, ser, oat, exc_val, numb):
        """
        Function to remove minimum outlier and to replace value by linear
        interpolation
        """
        # create dict to store index and value of removed outliers and
        # new value
        d_2 = {}
        while ((ser.mean() - ser.min()) / ser.std()) > exc_val[numb]:
            # label and position of the current minimum
            idx = ser.idxmin()
            val_id = oat.ts.index.get_loc(idx)
            if val_id == 0:
                # first value of the series: use the following value
                val = oat.ts['data'][val_id + 1]
            elif val_id == obs - 1:
                # last value of the series: use the preceding value
                val = oat.ts['data'][obs - 2]
            else:
                # mean of both neighbouring values
                val = ((oat.ts['data'][val_id - 1] +
                        oat.ts['data'][val_id + 1]) / 2)
            d_2[idx] = oat.ts['data'][val_id], val
            oat.ts['data'][val_id], ser[idx] = val, val
        return d_2

    def execute(self, oat, detailedresult=False):
        """
        Testing, if outlier is present and calling functions for removal
        and replacement.
        Critical values for exceedance probabilities 0.1%, 0.5%, 1% and 5%
        taken from
        'Grubbs and Beck (1972):
        Extension of Sample Sizes and Percentage Points for Significance
        Tests of Outlying Observations.
        Technometrics, vol. 14, p. 847-854.'
        """
        if self.crit == 'p = 0.05':
            exc_val = {30: 2.745, 40: 2.866, 50: 2.964, 60: 3.025, 80: 3.130,
                       100: 3.207}
        elif self.crit == 'p = 0.01':
            exc_val = {30: 3.103, 40: 3.240, 50: 3.345, 60: 3.411, 80: 3.521,
                       100: 3.600}
        elif self.crit == 'p = 0.005':
            exc_val = {30: 3.236, 40: 3.381, 50: 3.483, 60: 3.560, 80: 3.673,
                       100: 3.754}
        elif self.crit == 'p = 0.001':
            exc_val = {30: 3.507, 40: 3.673, 50: 3.789, 60: 3.874, 80: 3.998,
                       100: 4.084}
        else:
            raise ValueError('Unknown exceedance probability: %s' % self.crit)

        dict_outl = {}
        obs = len(oat.ts['data'])
        obs_sub = obs

        # test if number of samples is at least 30
        if obs < 30:
            raise Exception('Number of samples should be larger than or '
                            'equal to 30.')
        else:
            # adapt critical value to number of samples
            if obs >= 100:
                numb = 100
            elif obs >= 80:
                numb = 80
            elif obs >= 60:
                numb = 60
            elif obs >= 50:
                numb = 50
            elif obs >= 40:
                numb = 40
            else:
                numb = 30

            # move numb steps forward and call functions to remove min and
            # max outliers
            while obs_sub > numb:
                self.index += numb
                ser = oat.ts['data'][(self.index-numb):self.index]
                d_1 = self.remove_max(obs, ser, oat, exc_val, numb)
                dict_outl.update(d_1)
                d_2 = self.remove_min(obs, ser, oat, exc_val, numb)
                dict_outl.update(d_2)
                obs_sub -= numb

            # test the last numb values (may overlap the previous window)
            ser = oat.ts['data'][-numb:]
            d_1 = self.remove_max(obs, ser, oat, exc_val, numb)
            dict_outl.update(d_1)
            d_2 = self.remove_min(obs, ser, oat, exc_val, numb)
            dict_outl.update(d_2)

        temp_oat = oat.copy()
        self.result['type'] = "outliers"
        self.result['data'] = temp_oat
        self.result['outliers'] = dict_outl
        return self.returnResult(detailedresult)


class BarometricCorrection(Method):
    """
    Conduct atmospheric correction for confined and unconfined aquifers
    Barometric efficiency can be estimated or given
    """

    def __init__(self, press=None, thickness=None, pneumat=None, bar_eff=None,
                 unit=None, quantity=None):
        super(BarometricCorrection, self).__init__()
        self.press = press
        self.thickness = thickness
        self.pneumat = pneumat
        self.bar_eff = bar_eff
        self.unit = unit
        self.quantity = quantity
def execute(self, oat, detailedresult=False):
# import lingregress from scipy.stats for slope and r-square
try:
from scipy.stats import linregress
except:
raise ImportError("scipy module is required for this method.")
# convert barometric pressure to barometric head
if self.unit == 'Pa':
self.press.ts['data'] = (self.press.ts['data']) / (1000 *
Appendix A: method.py
48
9.81)
elif self.unit == 'hPa':
self.press.ts['data'] = (self.press.ts['data'] * 100) / (1000 *
9.81)
elif self.unit == 'kPa':
self.press.ts['data'] = (self.press.ts['data'] * 1000) / (1000 *
9.81)
elif self.unit == 'bar':
self.press.ts['data'] = (self.press.ts['data'] * 10000) / (1000 *
9.81)
else:
self.press.ts['data'] = self.press.ts['data']
        # copy oat to enable comparison of corrected and uncorrected head
        uncorr = oat.copy()
        # create DataFrame with water head and barometric pressure head and
        # clean it
        oat_press = pd.DataFrame({'air_pressure': self.press.ts['data'],
                                  'head': oat.ts['data']},
                                 index=oat.ts.index).dropna()
        # correction for confined aquifer
        if self.pneumat is None and self.thickness is None:
            # relativise to average atmospheric pressure at sea level
            oat_press['air_pressure'] = (oat_press['air_pressure'] -
                                         101325 / (1000 * 9.81))
            if self.bar_eff:
                # estimate barometric efficiency
                if self.bar_eff == 'Estimated':
                    d_oat_press = pd.DataFrame(
                        {'air_pressure_d': oat_press['air_pressure'].diff(),
                         'head_d': oat_press['head'].diff()},
                        index=oat_press.index).dropna()
                    # get slope (barometric efficiency) and correlation of
                    # change in water level and barometric pressure
                    be, _, rval = linregress(
                        x=d_oat_press['air_pressure_d'],
                        y=d_oat_press['head_d'])[0:3]
                    be = -be
                    rsqu = np.square(rval)
                # barometric efficiency is given
                else:
                    be = self.bar_eff
                    rsqu = '-'
                oat.ts['data'] = (oat_press['head'] +
                                  be * oat_press['air_pressure'])
            # barometric efficiency = 0 (no response)
            else:
                be = 0.0
                rsqu = '-'
                # convert water level head to total head
                if self.quantity == 'Water level (vented)':
                    oat_press['head'] = (oat_press['head'] +
                                         oat_press['air_pressure'])
                else:
                    oat.ts['data'] = oat_press['head']
            # clean time series for plot (dropna returns a new object)
            oat.ts = oat.ts.dropna(subset=['data'])
            if be > 1 or be < 0:
                raise ValueError('Time series must be cleaned before applying '
                                 'barometric correction.\n'
                                 'There might be a trend or influences of '
                                 'pumping, recharge, earth tides etc.')
        # correction for unconfined aquifer
        else:
            # create DataFrame with total head and barometric pressure
            # difference
            d2_oat_press = pd.DataFrame(
                {'pressure_d': oat_press['air_pressure'].diff(),
                 'total_head': oat_press['head']},
                index=oat_press.index).dropna()
            # calculate corrected head
            for z in range(len(d2_oat_press)):
                sum_press = 0
                for j in np.arange(z, -1, -1):
                    u = 0
                    x = 0
                    for i in range(400):
                        # float base avoids integer division in Python 2
                        x = x + ((((-1.0) ** i) / ((2 * i) - 1)) *
                                 np.exp(-((j * self.pneumat) /
                                          (4 * (self.thickness ** 2))) *
                                        (np.pi ** 2) * ((2 * i) - 1) ** 2))
                    u = u + ((1 - (4 / np.pi)) * x)
                    sum_press = sum_press + u * (
                        -d2_oat_press['pressure_d'][z-j])
                d2_oat_press['total_head'][z] = (
                    d2_oat_press['total_head'][z] - sum_press)
            # write corrected head to sensor
            oat.ts['data'] = d2_oat_press['total_head']
            # clean time series for plot (dropna returns a new object)
            oat.ts = oat.ts.dropna(subset=['data'])
            be = '-'
            rsqu = '-'
        # declare names for plot
        uncorr.name = "{}_original".format(uncorr.name)
        oat.name = "{}_corrected".format(oat.name)
        temp_oat = oat.copy()
        self.result['type'] = "barometric correction"
        self.result['data'] = [temp_oat, uncorr]
        self.result['parameter'] = [be, rsqu]
        return self.returnResult(detailedresult)
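The estimation of the barometric efficiency in the confined case above regresses the change in water level on the change in barometric head and takes the negative slope. The following is a minimal sketch of this step without pandas or scipy. It is not part of method.py; the function name barometric_efficiency and the sample values are illustrative assumptions.

```python
def barometric_efficiency(head, baro_head):
    """Estimate barometric efficiency from two equally long series [m].

    Returns (be, r_squared), where be is the negative slope of the
    least-squares line through the first differences of both series.
    """
    if len(head) != len(baro_head) or len(head) < 3:
        raise ValueError('Two series of equal length (>= 3) are required.')
    # first differences of water level and barometric head
    d_head = [b - a for a, b in zip(head[:-1], head[1:])]
    d_baro = [b - a for a, b in zip(baro_head[:-1], baro_head[1:])]
    n = float(len(d_head))
    mean_x = sum(d_baro) / n
    mean_y = sum(d_head) / n
    sxx = sum((x - mean_x) ** 2 for x in d_baro)
    syy = sum((y - mean_y) ** 2 for y in d_head)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(d_baro, d_head))
    if sxx == 0 or syy == 0:
        raise ValueError('Series without variation cannot be regressed.')
    slope = sxy / sxx
    r_squared = sxy ** 2 / (sxx * syy)
    return -slope, r_squared


# illustrative values: the water level falls by 0.6 m per metre of
# rise in barometric head
baro = [10.30, 10.32, 10.29, 10.35, 10.31, 10.33]
level = [5.0 - 0.6 * (b - 10.30) for b in baro]
be, r_squared = barometric_efficiency(level, baro)
```

With these synthetic values, be is approximately 0.6 and r_squared is approximately 1, because the water level responds to the barometric head without any noise.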
class MannKendall(Method):
    """
    Conduct Mann-Kendall trend test with alpha = 0.05 or alpha = 0.01
    """

    def __init__(self, alpha):
        super(MannKendall, self).__init__()
        self.alpha = alpha

    def execute(self, oat, detailedresult=False):
        # import norm from scipy.stats
        try:
            from scipy.stats import norm
        except ImportError:
            raise ImportError("scipy module is required for this method")
        if self.alpha == 'alpha = 0.05':
            self.alpha = 0.05
        elif self.alpha == 'alpha = 0.01':
            self.alpha = 0.01
        # two-sided test
        alpha = self.alpha/2
        s = 0
        x = oat.ts['data']
        n = len(oat.ts)
        result = {}
        for k in range(n-1):
            for j in range(k+1, n):
                s += np.sign(x[j]-x[k])
        # considering ties
        unique_x = np.unique(x)
        g = len(unique_x)
        # there is no tie in data
        if n == g:
            var = (n*(n-1)*(2*n+5))/18.0
        # there are ties in data
        else:
            tp = np.zeros(unique_x.shape)
            for i in range(len(unique_x)):
                tp[i] = sum(x == unique_x[i])
            var = (n*(n-1)*(2*n+5) - np.sum(tp*(tp-1)*(2*tp+5)))/18.0
        std = np.sqrt(var)
        # normalize S to get Z
        if s > 0:
            z = (s-1)/std
        elif s == 0:
            z = 0
        else:
            z = (s+1)/std
        # calculate p-value and if trend is present
        p = np.around((2*(1-norm.cdf(abs(z)))), decimals=4)
        h = abs(z) > norm.ppf(1-alpha)
        # evaluate trend
        if (z < 0) and h:
            trend = 'decreasing trend'
        elif (z > 0) and h:
            trend = 'increasing trend'
        else:
            trend = 'no trend detected'
        result['trend'] = trend
        result['S'] = int(s)
        result['p_value'] = p
        self.result['type'] = "dict"
        self.result['data'] = result
        return self.returnResult(detailedresult)
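The individual steps of the Mann-Kendall test above (statistic S, variance with tie correction, continuity-corrected Z and two-sided p-value) can be reproduced with the Python standard library alone. The following is a minimal sketch and not part of method.py. The function name mann_kendall is hypothetical, and the normal distribution is evaluated with math.erfc instead of scipy.stats.norm.

```python
import math
from collections import Counter


def mann_kendall(x, alpha=0.05):
    """Two-sided Mann-Kendall trend test for a sequence of numbers."""
    n = len(x)
    # S: number of increasing pairs minus number of decreasing pairs
    s = 0
    for k in range(n - 1):
        for j in range(k + 1, n):
            diff = x[j] - x[k]
            s += (diff > 0) - (diff < 0)
    # variance of S; groups of size 1 contribute nothing to the tie term
    ties = Counter(x).values()
    var = (n * (n - 1) * (2 * n + 5) -
           sum(t * (t - 1) * (2 * t + 5) for t in ties)) / 18.0
    # continuity-corrected normalisation of S
    if s > 0:
        z = (s - 1) / math.sqrt(var)
    elif s < 0:
        z = (s + 1) / math.sqrt(var)
    else:
        z = 0.0
    # two-sided p-value of the standard normal distribution
    p = math.erfc(abs(z) / math.sqrt(2.0))
    if p < alpha and z > 0:
        trend = 'increasing trend'
    elif p < alpha and z < 0:
        trend = 'decreasing trend'
    else:
        trend = 'no trend detected'
    return {'trend': trend, 'S': s, 'z': z, 'p_value': p}


rising = mann_kendall(list(range(1, 11)))
constant = mann_kendall([3, 3, 3, 3])
```

For the strictly increasing series of ten values, all 45 pairs are increasing, so S = 45 and an increasing trend is reported. For the constant series, S = 0 and no trend is detected.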
Appendix B: processThread.py
Appendix B lists the parts of the programming code that were implemented in the file processThread.py.
In addition, the class BaseThread(), from which all other classes inherit, is included.
class BaseThread(QObject):
    """
    Base thread for method execution
    All the methods are executed in a background thread to prevent the
    GUI from freezing
    """

    def __init__(self, gui):
        super(BaseThread, self).__init__()
        self.result = None
        self.gui = gui
        self.history = {
            "process": "",
            "params": {}
        }

    end = pyqtSignal(dict, dict)
    exception = pyqtSignal(Exception)
    debug = pyqtSignal(str)


class OutlierThread(BaseThread):
    def __init__(self, gui):
        super(OutlierThread, self).__init__(gui)
        self.history['process'] = "Outlier test"

    def run(self):
        try:
            prob = self.gui.probCombo.currentText()
            # store the parameter before the result is emitted
            self.history['params']['p-value'] = prob.split('=')[1].strip()
            self.result = self.gui.oat.process(method.OutlierTest(prob),
                                               detailedresult=True)
            self.end.emit(self.result, self.history)
        except Exception as e:
            self.exception.emit(e)


class MannKendallThread(BaseThread):
    def __init__(self, gui):
        super(MannKendallThread, self).__init__(gui)
        self.history['process'] = "Mann-Kendall"

    def run(self):
        try:
            alpha = self.gui.trendCombo.currentText()
            self.history['params']['alpha'] = alpha.split('=')[1].strip()
            self.result = self.gui.oat.process(method.MannKendall(alpha=alpha),
                                               detailedresult=True)
            self.end.emit(self.result, self.history)
        except Exception as e:
            self.exception.emit(e)
class BarometricCorrectionThread(BaseThread):
    def __init__(self, gui):
        super(BarometricCorrectionThread, self).__init__(gui)
        self.history['process'] = "Barometric correction"

    def run(self):
        try:
            sensor_name = self.gui.pressSensor.currentText()
            press = self.gui.get_sensor_from_db(sensor_name)
            unit = self.gui.correctionUnit.currentText()
            quantity = None
            # read the text field only if the efficiency is not estimated
            if self.gui.correctionEstimation.isChecked():
                bar_eff = 'Estimated'
            else:
                bar_eff = float(self.gui.correctionBe.text())
            if self.gui.unconfinedradioButton.isChecked():
                try:
                    thickness = float(self.gui.correctionVadose.text())
                except ValueError:
                    raise Exception('Please insert thickness of vadose zone.')
                try:
                    pneumat = float(self.gui.correctionPneumat.text())
                except ValueError:
                    raise Exception('Please insert pneumatic diffusivity.')
            else:
                thickness = None
                pneumat = None
            self.history['params']['Pressure sensor'] = sensor_name
            self.history['params']['Unit'] = unit
            self.history['params']['Measured quantity'] = quantity
            self.history['params']['Barometric efficiency'] = bar_eff
            self.history['params']['Thickness of vadose zone'] = thickness
            self.history['params']['Pneumatic diffusivity'] = pneumat
            self.result = self.gui.oat.process(
                method.BarometricCorrection(
                    press=press, thickness=thickness, pneumat=pneumat,
                    bar_eff=bar_eff, unit=unit, quantity=quantity),
                detailedresult=True)
            self.end.emit(self.result, self.history)
        except Exception as e:
            self.exception.emit(e)
Appendix C: processTs_dialog.py
Appendix C lists the parts of the programming code that were implemented in the file processTs_dialog.py.
This file contains a single class (class ProcessTs()) with several functions. In most cases, existing
functions were only extended by a few lines. In these cases, the whole function is shown, with the
original lines greyed out.
def manage_gui(self):
    """
    Connect all required signals
    """
    # Connect all signals
    self.loadButton.clicked.connect(self.load_sensor)
    self.executeFilter.clicked.connect(self.run_filter)
    self.saveFilter.clicked.connect(self.save_result)
    self.comboFilter.currentIndexChanged.connect(self.change_stacked_filter)
    self.sensorList.currentIndexChanged.connect(self.hide_filter)
    self.dvalueTime.stateChanged.connect(self.enable_dval_frame)
    self.qualityTime.stateChanged.connect(self.enable_quality_frame)
    self.qualityUseval.stateChanged.connect(self.enable_quality_val_frame)
    self.dvalueUse.stateChanged.connect(self.enable_value_val_frame)
    self.intTime.stateChanged.connect(self.enable_int_frame)
    self.statisticsTime.stateChanged.connect(self.enable_statistics_frame)
    self.unconfinedradioButton.toggled.connect(
        lambda: self.enable_confined(self.unconfinedradioButton))
    self.hEvtTime.stateChanged.connect(self.h_evt_toggle)
    self.hIndexTime.stateChanged.connect(self.h_index_toggle)
    self.stackedWidget.setCurrentIndex(0)
    try:
        self.load_sensor_list()
    except Exception as _:
        QMessageBox.warning(self, "Warning", 'No sensor found!!!')
        return


def load_sensor_list(self):
    """
    Add sensor name to ComboBox
    """
    query = "SELECT * FROM freewat_sensors"
    res = self.db.execute_query(query)
    self.clear_ui()
    self.db.close()
    self.sensorList.clear()
    self.compareSensor.clear()
    self.subSensor.clear()
    self.pressSensor.clear()
    for elem in res:
        self.sensorList.addItem(elem[1])
        self.compareSensor.addItem(elem[1])
        self.subSensor.addItem(elem[1])
        self.pressSensor.addItem(elem[1])


def run_filter(self):
    """
    execute button listener, execute the selected filter
    """
    self.add_loading()
    sel_filter = self.comboFilter.currentText()
    self.base_thread = processThread.Thread()
    print sel_filter
    if sel_filter == "digital filter":
        self.thread = processThread.DigitalThread(self)
    elif sel_filter == "exceedance":
        self.thread = processThread.ExceedanceThread(self)
    elif sel_filter == "hydro events":
        self.thread = processThread.HydroEventsThread(self)
    elif sel_filter == "hydro indices":
        self.thread = processThread.HydroIndicesThread(self)
    elif sel_filter == "quality":
        self.thread = processThread.QualityStatThread(self)
    elif sel_filter == "resample":
        self.thread = processThread.ResampleThread(self)
    elif sel_filter == "data values":
        self.thread = processThread.DataValuesThread(self)
    elif sel_filter == "hydro separation":
        self.thread = processThread.HydroGraphSepThread(self)
    elif sel_filter == "integrate":
        self.thread = processThread.IntegrateThread(self)
    elif sel_filter == "compare":
        self.thread = processThread.CompareThread(self)
    elif sel_filter == "subtract":
        self.thread = processThread.SubtractThread(self)
    elif sel_filter == "fill":
        self.thread = processThread.FillThread(self)
    elif sel_filter == "statistics":
        self.thread = processThread.StatisticsThread(self)
    elif sel_filter == "hargreaves":
        self.thread = processThread.HargreavesEToThread(self)
    elif sel_filter == "outlier test":
        self.thread = processThread.OutlierThread(self)
    elif sel_filter == "barometric correction":
        self.thread = processThread.BarometricCorrectionThread(self)
    elif sel_filter == "mann-kendall":
        self.thread = processThread.MannKendallThread(self)
    else:
        print "no method found"
        self.remove_loading()
        return
    # run processing thread on background
    self.thread.moveToThread(self.base_thread)
    self.thread.end.connect(self.print_result)
    self.thread.exception.connect(self.print_exception)
    self.thread.debug.connect(self.print_debug)
    self.base_thread.started.connect(self.thread.run)
    self.base_thread.start()
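The mapping from the selected filter name to the corresponding thread class in run_filter() above could alternatively be expressed as a dictionary lookup. The following is a minimal sketch of this idea and not part of processTs_dialog.py. The three classes are empty stand-ins for the classes in processThread.py, and the names THREADS and make_thread are hypothetical.

```python
class _StandInThread(object):
    """Stand-in for processThread.BaseThread (no Qt dependency)."""

    def __init__(self, gui):
        self.gui = gui


class OutlierThread(_StandInThread):
    pass


class BarometricCorrectionThread(_StandInThread):
    pass


class MannKendallThread(_StandInThread):
    pass


# filter name shown in the combo box -> thread class
THREADS = {
    "outlier test": OutlierThread,
    "barometric correction": BarometricCorrectionThread,
    "mann-kendall": MannKendallThread,
}


def make_thread(sel_filter, gui):
    """Return a thread object for sel_filter, or None if it is unknown."""
    thread_cls = THREADS.get(sel_filter)
    if thread_cls is None:
        return None
    return thread_cls(gui)
```

A new method would then only require one additional dictionary entry instead of a further elif branch.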
def save_result(self):
    """
    Save elaboration
    overwrite previous sensor or create new one
    """
    res_type = self.result['type']
    flag = False
    if res_type == "sensor":
        self.oat = self.result['data']
        flag = self.save_sensor()
    elif res_type == "sensor list":
        self.save_sensor_list()
        return
    elif res_type == "dict":
        flag = self.save_dict()
    elif res_type == "dict list":
        flag = self.save_dict_list()
    elif res_type == "outliers":
        self.oat = self.result['data']
        flag = self.save_sensor()
    elif res_type == "barometric correction":
        self.oat = self.result['data'][0]
        flag = self.save_sensor()
    else:
        print "Undef type"
    if not flag:
        return
    self.historyBox.clear()
    self.load_sensor_list()


def enable_confined(self, value):
    if value.isChecked():
        self.unconfinedFrame.setEnabled(True)
        self.confinedFrame.setEnabled(False)
    else:
        self.unconfinedFrame.setEnabled(False)
        self.confinedFrame.setEnabled(True)


def print_result(self, result, history):
    """
    Print processing result
    """
    self.result = result
    self.history = history
    message = "Filter " + self.history['process'] + " params: "
    for param in self.history['params'].keys():
        message += (" " + param + " : " +
                    str(self.history['params'][param]) + " |")
    self.historyBox.appendPlainText(message)
    if result['type'] == "sensor":
        self.draw_chart(self.result['data'])
    elif result['type'] == "sensor list":
        self.draw_multiple_chart(self.result['data'])
    elif result['type'] == "dict list":
        field = QPlainTextEdit()
        field.setReadOnly(True)
        for elem in self.result['data']:
            tmp = ""
            for key in elem.keys():
                tmp += key + " : " + str(elem[key]) + " "
            field.appendPlainText(tmp)
        self.clear_layout(self.resultFrame.layout())
        self.resultFrame.layout().addWidget(field)
    elif result['type'] == "dict":
        field = QPlainTextEdit()
        field.setReadOnly(True)
        data = self.result['data']
        for key in data.keys():
            field.appendPlainText(key + " : " + str(data[key]))
        self.clear_layout(self.resultFrame.layout())
        self.resultFrame.layout().addWidget(field)
    elif result['type'] == "outliers":
        self.draw_chart(self.result['data'])
        field = QPlainTextEdit()
        field.setReadOnly(True)
        outl = self.result['outliers']
        if outl:
            # sort output by date
            for key in sorted(outl.keys()):
                field.appendPlainText("Outlier removed:\t" +
                                      key.strftime("%Y-%m-%d %H:%M:%S") +
                                      " : " + str(outl[key][0]) +
                                      " replaced by " + str(outl[key][1]))
        else:
            field.appendPlainText("No outliers detected.")
        self.resultFrame.layout().addWidget(field)
    elif result['type'] == "barometric correction":
        self.draw_multiple_chart(self.result['data'])
        field = QPlainTextEdit()
        field.setReadOnly(True)
        params = self.result['parameter']
        field.appendPlainText("Barometric efficiency: " + str(params[0]) +
                              "\n" +
                              "R-square (change in water level to change "
                              "in barometric head): " + str(params[1]))
        self.resultFrame.layout().addWidget(field)
    else:
        print "undefined result"
    self.base_thread.quit()