Post on 28-Jun-2020
High-Frequency Observation and Characterization of
the Marine Environment: Completion and Spectral
Clustering of Multivariate Time Series
Grassi K.1,2,3, Dezecache C.2, Phan T. T. H.2,4, Poisson-Caillault E.2, Bigand A.2, Lefebvre A.1
1 Ifremer, Laboratoire Environnement et Ressources, 62321 Boulogne sur Mer, France
2 LISIC, EA 4491, Université du Littoral Côte d’Opale, 62228 Calais, France
3 WeatherForce, 31000 Toulouse, France
4 VNUA - Vietnam National University of Agriculture
Nested scales
- Low frequency (sampled weekly or less often): REPHY/SRN
- High frequency (sampled weekly or more often): MAREL-Carnot, FerryBox
REPHY: RÉseau de surveillance du PHYtoplancton et des PHYcotoxines (phytoplankton and phycotoxin monitoring network)
SRN: Suivi Régional des Nutriments (regional nutrient monitoring)
Source: [Dickey, 2003]
Multisource and multiparameter data
- Satellite: satellite images
- Mobile stations: FerryBox, SMATCH
- Fixed stations: MAREL-Carnot, MESURHO
- Low-frequency sampling: REPHY / SRN
Data gap problem
[Figure: Suivi Régional des Nutriments (SRN), 28/03]
Problem of missing values (Phan, 2018)
MAREL Carnot station dataset:
- 19 time series, sampled every 20 minutes
- Missing values:
  • 62.2% for phosphate
  • 59.9% for nitrate
  • 27.2% for pH
  • 12.3% for fluorescence
- Gaps range from small to large (up to 7 months for pH)
- Moving average or linear regression is inappropriate for such gaps
Dynamic Time Warping (DTW) based imputation
[Figure: DTW-based imputation, with labels Q and G]
References:
- DTWUMI: https://cran.r-project.org/web/packages/DTWUMI/index.html
- DTWBI: https://cran.r-project.org/web/packages/DTWBI/index.html
- Thi-Thu-Hong Phan, Emilie Poisson Caillault, Alain Lefebvre, André Bigand, "Dynamic time warping based imputation for univariate time series data," Pattern Recognition Letters, 2017, ISSN 0167-8655, https://doi.org/10.1016/j.patrec.2017.08.019.
- T. T. H. Phan, E. P. Caillault, A. Bigand and A. Lefebvre, "DTW-Approach for uncorrelated multivariate time series imputation," 2017 IEEE 27th International Workshop on Machine Learning for Signal Processing (MLSP), Tokyo, 2017, pp. 1-6, https://doi.org/10.1109/MLSP.2017.8168165.
Methods: Dynamic Time Warping (DTW) based imputation
DTWBI => univariate time series / DTWUMI => multivariate time series
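As a rough illustration of the idea behind DTW-based gap filling, the Python sketch below searches the series for the complete window most similar (in DTW cost) to a query Q taken just before the gap, then copies the values that follow that window into the gap. This is a simplified sketch under stated assumptions, not the DTWBI or DTWUMI implementation: the names `dtw_cost` and `fill_gap`, the placement of the query before the gap, and the absolute-difference local cost are all illustrative choices.

```python
import numpy as np

def dtw_cost(a, b):
    """DTW cost between two 1-D sequences, using |a_i - b_j| as local cost."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            D[i, j] = local + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

def fill_gap(series, start, length):
    """Fill the NaN gap series[start:start+length] (hypothetical helper).

    Q is the window of `length` values just before the gap. Every complete
    window of the same size is compared with Q, and the values that follow
    the best match are copied into the gap.
    """
    x = np.asarray(series, dtype=float).copy()
    query = x[start - length:start]
    best_cost, best_donor = np.inf, None
    for s in range(len(x) - 2 * length + 1):
        window = x[s:s + length]
        donor = x[s + length:s + 2 * length]
        # Skip anything touching the gap; this also excludes Q itself,
        # because the values following Q are the gap.
        if np.isnan(window).any() or np.isnan(donor).any():
            continue
        cost = dtw_cost(query, window)
        if cost < best_cost:
            best_cost, best_donor = cost, donor.copy()
    if best_donor is None:
        raise ValueError("no complete window available to fill the gap")
    x[start:start + length] = best_donor
    return x

# Usage on a synthetic periodic signal with a 10-sample gap.
t = np.arange(200)
truth = np.sin(2 * np.pi * t / 20)
observed = truth.copy()
observed[120:130] = np.nan
filled = fill_gap(observed, start=120, length=10)
```

Unlike a moving average or a linear interpolation, this kind of approach reuses the shape of a similar past or future episode, which is why it can remain plausible on large gaps.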
Detection of environmental states
Recursive method: Multi-level Spectral Clustering
- Raw HF data: size 92,968 × 9, NAs: 320,401 (38%)
- Pre-processing: DTW completion, then data normalization (centering, scaling)
- Scaled data: size 105,192 × 9, NAs: 0 (0%)
- Classification in the spectral space, repeated at three levels (1st, 2nd and 3rd spectral clustering); the level of detail increases with each level
[Figure: tree of states (state1, state2, ...) obtained at each clustering level]
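The recursive principle can be sketched in Python as follows: the data are centered and scaled, then each state found at one level is clustered again at the next level. This is only a minimal sketch under assumptions that do not come from the slides: every state is split in two with a spectral bisection (sign of the second eigenvector of the normalized graph Laplacian), the affinity is Gaussian with a median-distance bandwidth, and the names `standardize`, `spectral_bisect` and `multilevel_states` are hypothetical. The authors' method may use a different affinity and a different number of groups at each level.

```python
import numpy as np

def standardize(X):
    """Center and scale each column (zero mean, unit standard deviation)."""
    return (X - X.mean(axis=0)) / X.std(axis=0)

def spectral_bisect(X):
    """Split the rows of X in two groups (labels 0/1) with a spectral bisection."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    # Assumed bandwidth: median of the pairwise squared distances.
    sigma2 = np.median(d2[np.triu_indices(len(X), k=1)])
    W = np.exp(-d2 / (2.0 * sigma2))
    deg = W.sum(axis=1)
    L = np.eye(len(X)) - W / np.sqrt(np.outer(deg, deg))
    _, vecs = np.linalg.eigh(L)          # eigenvalues in ascending order
    fiedler = vecs[:, 1] / np.sqrt(deg)  # second eigenvector, degree-corrected
    return (fiedler > 0).astype(int)

def multilevel_states(X, levels=3):
    """Return one label array per level; level n has at most 2**n states."""
    labels = np.zeros(len(X), dtype=int)
    history = []
    for _ in range(levels):
        new = np.empty_like(labels)
        for s in np.unique(labels):
            idx = np.where(labels == s)[0]
            if len(idx) < 2:             # nothing left to split
                new[idx] = 2 * s
                continue
            new[idx] = 2 * s + spectral_bisect(X[idx])
        labels = new
        history.append(labels.copy())
    return history

# Usage: four small groups of points, nested two by two.
rng = np.random.default_rng(0)
centers = np.array([0.0, 1.0, 10.0, 11.0])
points = np.vstack([c + 0.05 * rng.standard_normal((10, 2)) for c in centers])
level1, level2 = multilevel_states(standardize(points), levels=2)
```

With this labelling convention, the parent of a state `s` at one level is simply `s // 2` at the previous level, which keeps the tree of states easy to trace.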
Results: detection of environmental states
[Figure: states from the 1st, 2nd and 3rd spectral clustering (s1 to s2, s1 to s4, s1 to s8) against time index, with the frequency of states by month]
Detection of environmental states (3rd spectral clustering)
Events: pressure and response
- Succession of events over time: a pressure followed by responses
- Classification is independent of time and of the fluorescence signal
[Figure: nitrate, fluorescence and dissolved oxygen against time index, with pressure and response events]
Detection of environmental states (3rd spectral clustering)
Intermittent events: rare/extreme
- Rare/extreme events in the phosphate signal
- Phosphate correlation, state 7: 0.62
- Correlation between phosphate and turbidity
- Phosphate desorption: a new phosphate stock becomes available for phytoplankton
[Figure: phosphate against time index, with the states]
CONCLUSIONS and PERSPECTIVES
The protocol made it possible to:
- Optimize HF data processing
- Define states in multi-parameter time series
- Detect, identify and characterize these states
- Characterize events and extract labels for frequent, rare or extreme events
Perspective: adding new data sources
[Diagram: HF databases, DTWBI, Spectral-C, labels, states S1 to S8]
CONCLUSIONS and PERSPECTIVES
Multi-agent learning and deep learning
[Diagram: HF databases, satellite (Sat), in-situ and model data are processed with DTWBI and Spectral-C to give labelled states (S1 to S8) and a label/data correspondence; these form the training dataset for machine learning / deep learning (ML/DL) classifiers; a label classification system then predicts the states (S1, S2, ..., Sx) of new data by majority vote]
Thank you for your attention
10/10/2018
The authors also wish to thank H2020 JERICO-Next for its financial contribution, as well as the organizers.
This work has been partly funded by the French government and the region Hauts-de-France in the framework of the project CPER 2014-2020 MARCO.
Kelly Grassi's PhD is funded by WeatherForce as part of its R&D program "Building an Initial State of the Atmosphere by Unconventional Data Aggregation".
[Figure: comparison of K-means, spectral clustering (Spectral-C) and hierarchical clustering (Hierarchical-C)]
Spectral Approach
Data segmentation
• Multi-sensor base
• No prior information
Educational dataset:
- A circle and a ball
- 2000 points each
[Figure: the educational dataset and the states found by K-means, plotted on Dimension 1 and Dimension 2]
Spectral clustering algorithm
- The data are sampled
- In the spectral space the data become linearly separable, so K-means is applied there, with the number of groups N chosen according to the gap
- The classification is projected back onto the initial space: the K-nearest-neighbours algorithm is used to classify the initial base
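Two steps of this pipeline can be sketched in Python: choosing the number of groups from the largest gap between consecutive Laplacian eigenvalues, and projecting the labels of the sampled points back onto the full dataset with a nearest-neighbour vote. This is a sketch under assumptions that are not taken from the slides: a Gaussian affinity with a user-chosen `sigma`, the normalized Laplacian, and the hypothetical names `spectral_embedding` and `knn_project`. K-means would be run on the rows of `U` between the two steps.

```python
import numpy as np

def spectral_embedding(X, sigma=1.0, max_groups=10):
    """Return (n_groups, U): the eigengap choice and the spectral coordinates."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    deg = W.sum(axis=1)
    L = np.eye(len(X)) - W / np.sqrt(np.outer(deg, deg))
    vals, vecs = np.linalg.eigh(L)               # ascending eigenvalues
    gaps = np.diff(vals[:max_groups + 1])
    n_groups = int(np.argmax(gaps)) + 1          # largest gap follows the n-th eigenvalue
    U = vecs[:, :n_groups]
    U = U / np.linalg.norm(U, axis=1, keepdims=True)  # row normalization
    return n_groups, U

def knn_project(sample, sample_labels, X_full, n_neighbors=5):
    """Give each row of X_full the majority label of its nearest sampled points."""
    d2 = ((X_full[:, None, :] - sample[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d2, axis=1)[:, :n_neighbors]
    votes = sample_labels[nearest]
    return np.array([np.bincount(v).argmax() for v in votes])

# Usage: two compact groups of sampled points, then two new points to label.
rng = np.random.default_rng(0)
sample = np.vstack([rng.normal(0.0, 0.1, (10, 2)), rng.normal(10.0, 0.1, (10, 2))])
n_groups, U = spectral_embedding(sample, sigma=1.0)
sample_labels = np.array([0] * 10 + [1] * 10)    # stand-in for the K-means output
new_points = np.array([[0.2, 0.1], [9.8, 10.1]])
new_labels = knn_project(sample, sample_labels, new_points)
```

Working on a sample keeps the eigen-decomposition tractable, since the affinity matrix grows with the square of the number of points.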
K-means algorithm
Criterion for minimizing intra-group distances:
$\min J, \quad J = \sum_{k=1}^{K} \sum_{x_i \in C_k} \lVert x_i - \mu_k \rVert^2$
with $X = \{x_i\}$ the data, $K$ the number of groups, $C_k$ the groups (labels) and $\mu_k$ the barycentre of group $k$.
Inputs: X, K. Outputs: µk, labels.
1) Initialization of K centers
2) Assignment of each point to its nearest center
3) New estimation of the centers
4) Calculation of the criterion J; return to 2) if the criterion is not satisfied
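The four steps above can be sketched in Python as follows. This is a minimal illustration, not the code used by the authors: the function name `kmeans`, the optional `init` argument, the random initialization and the stopping rule (J no longer decreases by more than `tol`) are assumptions.

```python
import numpy as np

def kmeans(X, K, init=None, n_iter=100, tol=1e-9, seed=0):
    """Plain K-means; returns (labels, centers, J)."""
    X = np.asarray(X, dtype=float)
    # 1) Initialization of K centers (random data points unless `init` is given).
    if init is None:
        rng = np.random.default_rng(seed)
        centers = X[rng.choice(len(X), size=K, replace=False)].copy()
    else:
        centers = np.array(init, dtype=float)
    previous_J = np.inf
    for _ in range(n_iter):
        # 2) Assign each point to its nearest center.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # 3) Re-estimate each center as the barycentre of its group.
        for k in range(K):
            if np.any(labels == k):
                centers[k] = X[labels == k].mean(axis=0)
        # 4) Criterion J: sum of squared distances to the group barycentres.
        J = ((X - centers[labels]) ** 2).sum()
        if previous_J - J < tol:
            break
        previous_J = J
    return labels, centers, J

# Usage: two squares of four points each, one center started in each square.
square = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
data = np.vstack([square, square + 10.0])
labels, centers, J = kmeans(data, K=2, init=data[[0, 7]])
```

Because J only measures distances to barycentres, K-means separates groups well when they are compact and linearly separable, which is the situation the spectral space is meant to create.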
RESULTS
Rare and extreme events
Correlation: 0.62
[Figure: PCA of state 7 (dim1/dim2) and PCA of the regular series (dim1/dim2)]
STATES DYNAMIC AND MAIN CONTRIBUTING PARAMETERS
1st spectral clustering
- Classification is independent of time, but the states show seasonal dynamics
- Main contributing parameter: temperature
[Figure: states (s1, s2) against time index, and frequency of states by month]
Correlation of each parameter for a given cluster:
State  Salinity  Turbidity  Temperature  Dissolved Oxygen  Nitrate  Phosphate  Silicate  PAR    Sea Level
S1     -0.35     0.30       -0.73        0.52              0.38     0.21       0.38      -0.21  0.014
S2     0.35      -0.30      0.73         -0.52             -0.38    -0.21      -0.38     0.21   -0.014
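One plausible way to obtain a table of this kind is to correlate each parameter with a 0/1 indicator of membership in each state. The Python sketch below illustrates that reading; it is an assumption, since the slides do not say how the correlations were computed, and the function name `state_parameter_correlation` is hypothetical. With only two states the two indicators are complementary, so the two rows are exact opposites, as in the table above.

```python
import numpy as np

def state_parameter_correlation(X, labels):
    """Pearson correlation between state membership (0/1) and each column of X.

    Returns (states, C) where C[r, c] is the correlation between membership
    in states[r] and parameter c.
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    states = np.unique(labels)
    C = np.empty((len(states), X.shape[1]))
    for r, s in enumerate(states):
        member = (labels == s).astype(float)
        for c in range(X.shape[1]):
            C[r, c] = np.corrcoef(member, X[:, c])[0, 1]
    return states, C

# Usage: four observations of two parameters, split into two states.
values = np.array([[1.0, 5.0], [2.0, 3.0], [8.0, 4.0], [9.0, 1.0]])
states, C = state_parameter_correlation(values, labels=[0, 0, 1, 1])
```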
2nd spectral clustering
- New structuring variables: dissolved oxygen, nitrate, silicate
- These variables are actively involved in the development of production processes
[Figure: states (s1 to s4) against time index, and frequency of states by month]
Correlation of each parameter for a given cluster:
State  Salinity  Turbidity  Temperature  Dissolved Oxygen  Nitrate  Phosphate  Silicate  PAR    Sea Level
S1     0.04      -0.08      -0.48        0.62              -0.16    -0.14      -0.06     -0.09  0.02
S2     -0.41     0.40       -0.39        0.05              0.53     0.34       0.47      -0.15  -0.002
S3     0.30      -0.11      0.30         -0.46             0.11     -0.02      0.02      -0.05  0.009
S4     0.13      -0.23      0.53         -0.19             -0.48    -0.19      -0.42     0.26   -0.02
3rd spectral clustering
- 8 states, some with different dynamics
[Figure: states (s1 to s8) against time index, and frequency of states by month]
EXAMPLE OF STATES LABELLING
3rd spectral clustering
[Figure: salinity against time index, with the states over time; one state is labelled as a sensor failure]
Identification of similar sub-sequences
- Pre-selection of sequences based on global features over all available time series
- Minimization of the matching path based on the DTW cost matrix
Fig. DTW cost matrix showing the optimal matching path
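The cost matrix and its optimal matching path can be illustrated with the short Python sketch below. It uses the classic DTW recurrence with an absolute-difference local cost and a backtracking pass from the last cell; the function name `dtw_path` is hypothetical, and the pre-selection on global features is not shown.

```python
import numpy as np

def dtw_path(a, b):
    """Return (D, path): accumulated DTW cost matrix and optimal matching path.

    D[i, j] is the minimal accumulated cost of matching a[:i+1] with b[:j+1];
    path is a list of (i, j) index pairs from (0, 0) to (len(a)-1, len(b)-1).
    """
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)   # padded with an infinite border
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            D[i, j] = local + min(D[i - 1, j - 1], D[i - 1, j], D[i, j - 1])
    # Backtrack from the last cell, always moving to the cheapest predecessor.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        candidates = [
            (D[i - 1, j - 1], i - 1, j - 1),  # diagonal step
            (D[i - 1, j], i - 1, j),          # step along a
            (D[i, j - 1], i, j - 1),          # step along b
        ]
        _, i, j = min(candidates)
        path.append((i - 1, j - 1))
    return D[1:, 1:], path[::-1]

# Usage: the second sequence repeats one value, so the path holds one step.
D, path = dtw_path([1.0, 2.0, 3.0], [1.0, 2.0, 2.0, 3.0])
```

The bottom-right entry of `D` is the total matching cost, which is the quantity minimized when looking for the sub-sequence most similar to a query.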
Data gap problem: Dynamic Time Warping (DTW) based imputation
Additional features within package DTWUMI
- Other methods are provided, which aim to overcome the limitations of DTW:
  • Derivative Dynamic Time Warping (DDTW)
  • Adaptive Feature Based Dynamic Time Warping (AFBDTW)
- Several functions are included to assess the similarity between time series:
  • Similarity
  • Root Mean Square Error (RMSE)
  • Normalized Mean Absolute Error (NMAE)
  • Fraction of Standard Deviation (FSD)
  • Fractional Bias (FB)
  • Fraction of data that satisfied smoothing amplitude cover (FA2)
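To make some of these indicators concrete, the Python sketch below computes RMSE, NMAE, FSD and FB between a reference series and an imputed one. The formulas are the commonly used definitions and are assumptions here: the package's own functions may differ in detail (for example in how the normalization range is chosen), and the Python function names are hypothetical rather than the package's API.

```python
import numpy as np

def rmse(reference, imputed):
    """Root Mean Square Error."""
    x, y = np.asarray(reference, float), np.asarray(imputed, float)
    return np.sqrt(np.mean((y - x) ** 2))

def nmae(reference, imputed):
    """Mean absolute error, normalized by the range of the reference values."""
    x, y = np.asarray(reference, float), np.asarray(imputed, float)
    return np.mean(np.abs(y - x)) / (x.max() - x.min())

def fsd(reference, imputed):
    """Fraction of Standard Deviation: relative gap between the two spreads."""
    x, y = np.asarray(reference, float), np.asarray(imputed, float)
    return 2.0 * abs(y.std() - x.std()) / (y.std() + x.std())

def fb(reference, imputed):
    """Fractional Bias: relative gap between the two means."""
    x, y = np.asarray(reference, float), np.asarray(imputed, float)
    return 2.0 * abs(y.mean() - x.mean()) / (y.mean() + x.mean())

# Usage: compare imputed values with the true values over a gap.
true_values = [1.0, 2.0, 3.0]
imputed_values = [1.0, 2.0, 5.0]
scores = {
    "RMSE": rmse(true_values, imputed_values),
    "NMAE": nmae(true_values, imputed_values),
    "FSD": fsd(true_values, imputed_values),
    "FB": fb(true_values, imputed_values),
}
```

For all four indicators a value close to 0 indicates that the imputed values are close to the reference, in amplitude (RMSE, NMAE), in variability (FSD) or on average (FB).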