
Issues Multiresolution Random Projections Robust Estimation Anomaly Detection Future Appendix

Seven years and one day in the life of Internet:
Multiresolution and random projections for robust monitoring

Patrice ABRY 1,
in collab. with Pierre BORGNAT 1, Guillaume DEWAELE 1, Kensuke FUKUDA 2, Kenjiro CHO 3

1 CNRS – ENS Lyon Physics Lab., Ecole Normale Supérieure de Lyon, France – 2 NII, Tokyo, Japan – 3 IIJ Internet Initiative Japan.

Does Fractal Scaling at the IP Level Depend on TCP Flow Arrival Processes ?

Nicolas Hohn, Darryl Veitch: CUBIN, Department of Electrical & Electronic Engineering, University of Melbourne, Australia
Patrice Abry: CNRS, UMR 5672, Laboratoire de Physique, Ecole Normale Supérieure de Lyon, France

ACM/SIGCOMM Internet Measurement Workshop, November 6-8 2002, Marseille, France

Introduction Classical Analysis Robust Analysis w/ Sketches Longitudinal Study Conclusion

Seven Years and One Day: Sketching the Evolution of Internet Traffic

Pierre BORGNAT 1, Guillaume DEWAELE 1, Kensuke FUKUDA 2, Patrice ABRY 1, Kenjiro CHO 3

1 CNRS – ENS Lyon Physics Lab., Université de Lyon, France – 2 NII, Tokyo, Japan – 3 IIJ Internet Initiative Japan.

INFOCOM 2009


Internet Traffic: A Longitudinal Analysis
[Context: Passive Monitoring of TCP/IP traffic on a link]

What are the evolutions of traffic over the years?
• Topics in statistical analysis of traffic
  - Distributions of protocols, of packet sizes, of IAT (inter-arrival times), of flows, ...
  - Aggregated traffic: marginal laws
  - LRD (Long Range Dependence)
  - ...
• Diversity of expected traffic: http, P2P, mail, DNS, ...
• Variety of conditions: used bandwidth, congestion, ...
• Frequent anomalies: scans, viruses & worms, DDoS, ...
• ...
• Intuition: One trace is not enough! (for longitudinal, empirical data analysis)
• MAWI dataset: more than 7 years of daily traces


Multiresolution and Random Projections. for Robust Estimation - P. Abry - TMA Eu. Cost Action - Barcelona, Spain - Oct. 2009 - 1 / 48


Goals
⇒ Discuss issues in Internet robust monitoring

• Internet monitoring :
  - parameter estimation, anomaly detection, traffic classification, . . .
• Issues :
  - database ?
  - data ? Information actually available ?
  - level of description (Pkt, Flow, Session, . . . ) ?
  - aggregation level ?
  - robust monitoring ?
  - objective (scientific) assessment and comparisons ?
• Solutions :
  - Multiresolution Analysis,
  - Random projections (sketches).
• Illustrations on MAWI database :
  - parameter estimation,
  - anomaly detection.


Issue 1 : Data Set
• What Data Set ?
  - What network ? What link ?
  - What type of traffic ? What usage (academic, companies, commercial, ...) ?
  - What size ? What duration ? How many traces ?
  - Publicly available ? What documentation ?
• Answer : MAWI database
  - WIDE network (AS2500). TransPacific (Japan-US) backbone.
  - Sample Point B : 18 Mbps CAR (100 Mbps link)
  - Sample Point F : 100 Mbps, 150 Mbps CAR (1 Gbps link), after 2007
  - ≈ 1.5 TB of (compressed and anonymized) packet traces
  - 7 years (2001-2008), each day, 15 min long traces,
  - A few 24h traces (One Day in the Life of Internet),
  - Publicly available and (partially) documented at : http://mawi.wide.ad.jp/


Issue 2 : Data

• What Data ?
  - What information are you actually allowed/able to use ?
  - What are the rules of the game ?
  - What are the goals of the game ?
  - IP 5-tuple ? Time stamp ? Payload ? Netflow ?
• Answer : MAWI data
  - IP Pkt time stamps and 5-tuple : IPProtocol, IPSrc, IPDst, PtSrc, PtDst,
  - no payload !
  - no bi-directionality !


Issue 3 : Level of Description

• What description level ?
  - Pkt ? Flow (connection) ? Session ?
• Issues :
  - Ability to collect ? Network load ?
  - Sampling ? What sampling ?
  - Storage ? Real-Time ? On-Line ?
• Answer : IP Pkt level
  - all Pkt, no sampling,
  ⇒ Aggregated Pkt number time series,
  ⇒ Aggregated Byte number (volume) time series.


Issue 4 : Aggregation Level

• What aggregation level ∆ ?
  - Typical Pkt InterArrival Time ≈ 0.1 ms,
  - Typical data collection (or stationarity) ≈ 1 hr,
  - $10^{-3}$ s ≤ ∆ ≤ $10^{3}$ s, choice within 6 orders of magnitude !
• Issues :
  - What goals ?
  - e.g., Anomaly detection : anomaly duration ? volume ?
  - Real-Time ? On-Line ?
• Answer :
  - do not single out an arbitrary ∆,
  - do the analysis for all ∆ jointly !
  ⇒ Multiresolution Analysis and modeling.
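
The "all ∆ jointly" answer can be sketched as follows. This is a minimal illustration, assuming packet timestamps in seconds are available as a NumPy array; the function names `aggregate_counts` and `multiresolution_counts`, the finest bin width of 2^-10 s and the synthetic uniform arrivals are assumptions made for the example, not taken from the talk. The finest count series is built once, and each coarser series follows by summing adjacent bins.

```python
import numpy as np

def aggregate_counts(timestamps, delta, duration):
    """Count packets in consecutive bins of width delta (seconds)."""
    n_bins = int(round(duration / delta))
    edges = np.arange(n_bins + 1) * delta
    counts, _ = np.histogram(timestamps, bins=edges)
    return counts

def multiresolution_counts(timestamps, delta0, duration, n_levels):
    """Return {delta: X_delta} for delta = delta0, 2*delta0, 4*delta0, ..."""
    x = aggregate_counts(timestamps, delta0, duration)
    delta = delta0
    series = {delta: x}
    for _ in range(n_levels - 1):
        n = len(x) // 2 * 2            # drop a trailing odd bin, if any
        x = x[0:n:2] + x[1:n:2]        # sum adjacent pairs of bins
        delta *= 2
        series[delta] = x
    return series

# Synthetic stand-in for a trace: 200,000 arrivals spread over 64 s.
rng = np.random.default_rng(0)
timestamps = rng.uniform(0.0, 64.0, size=200_000)

# 17 dyadic levels, from 2^-10 s (about 1 ms) up to 64 s.
series = multiresolution_counts(timestamps, delta0=2.0**-10,
                                duration=64.0, n_levels=17)
for delta, x in series.items():
    print(f"delta = {delta:10.5f} s   bins = {len(x):6d}   mean = {x.mean():.2f}")
```

Every statistic computed later (Gamma fit, wavelet spectrum) can then be evaluated on each entry of `series` instead of on a single, arbitrarily chosen ∆.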


Issue 5 : Robust Analysis
• Normal is wild :
  - Heterogeneity : different traffics, usages, applications, requests, constraints, hardware, software,
  - Superimposition : all mixed in a single trace,
  - Difficult statistics : long memory, heavy tail, non stationarity,
  ⇒ Intrinsic (or natural) large variability,
  ⇒ Normal ? Does it exist ? Anomalies constantly occurring ?
• Issues :
  - What is the confidence interval of a parameter estimate ?
  - How much should you trust an analysis performed on a particular day ? on a particular traffic ?
  - How general ? What credit ?
  ⇒ A single time series is not enough ! Illustrations !
• Answer :
  - Textbook solution : Average,
  - On what ? different traces ? different days same hour ? different hours same day ?
  ⇒ Random Projections (Sketches).


Outline
• Issues
• Multiresolution
  - Marginals
  - Covariance
• Random Projections
• Robust Estimation
  - Principle
  - Seven years . . . and One Day
• Anomaly Detection
  - Principle
  - Seven years
• Future
• Appendix


Gaussian or not Gaussian ?
• Aggregated traffic $X_\Delta(t)$ : # of packets counted during ∆ (alternatively : # of bytes during ∆)
• Marginal : Poisson ? Exponential ? Gaussian ? Depends on ∆ !

[Figure: empirical marginal distributions of $X_\Delta$ for ∆ = 4 ms, ∆ = 32 ms, ∆ = 256 ms]

• Fit/Model : Gamma, $\Gamma_{\alpha,\beta}(x) = \frac{1}{\beta\,\Gamma(\alpha)} \left(\frac{x}{\beta}\right)^{\alpha-1} \exp\left(-\frac{x}{\beta}\right)$.
  Neither exponential, $p(x) = e^{-x/\beta}/\beta$, nor Gaussian, $p(x) = \frac{e^{-(x-\mu)^2/2\sigma^2}}{\sqrt{2\pi}\,\sigma}$.

[Scherrer et al. IEEE TDSC'07]


Gamma Distributions

$\Gamma_{\alpha,\beta}(x) = \frac{1}{\beta\,\Gamma(\alpha)} \left(\frac{x}{\beta}\right)^{\alpha-1} \exp\left(-\frac{x}{\beta}\right)$.

[Figure, left: Gamma(α,1) densities for α = 1, 2, 3, 4, 6, 8, 10]
[Figure, right: Gamma(3,β) densities for β = 1, 2, 3, 4, 5, 6]

• Shape parameter α : From Gaussian to exponential, 1/α ≈ distance from Gaussian,
• Scale parameter β : Multiplicative factor.
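
As a reminder of why 1/α can be read as a distance from Gaussianity, the standard moments of the Gamma law with shape α and scale β (textbook results, not stated on the slide) are:

```latex
\mathbb{E}[X] = \alpha\beta, \qquad
\mathrm{Var}[X] = \alpha\beta^{2}, \qquad
\text{skewness} = \frac{2}{\sqrt{\alpha}}, \qquad
\text{excess kurtosis} = \frac{6}{\alpha}.
```

Skewness and excess kurtosis, both zero for a Gaussian, vanish as α grows, while α = 1 gives back the exponential law; β only rescales x.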


Gamma Fits

• Empirical PDFs and Gamma Fits, LBL-TCP-3

[Figure: empirical PDFs and Gamma fits for ∆ = 4 ms, ∆ = 32 ms, ∆ = 256 ms]

• Accurately fits data for all aggregation levels ∆,
• Stability under addition : $X_1 : \Gamma_{\alpha_1,\beta}$, $X_2 : \Gamma_{\alpha_2,\beta}$, $(X_1, X_2)$ indep. $\Longrightarrow X_1 + X_2 : \Gamma_{\alpha_1+\alpha_2,\beta}$,
• Aggregation : $X_{2\Delta}(k) = X_\Delta(k) + X_\Delta(k+1)$.
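
Stability under addition can be checked numerically. The sketch below assumes a method-of-moments fit (α = mean²/variance, β = variance/mean), which is one common choice and not necessarily the fitting procedure used in the talk; `gamma_moment_fit` and the synthetic i.i.d. Gamma samples are illustrative assumptions. For independent samples, each doubling of ∆ should double α and leave β unchanged.

```python
import numpy as np

def gamma_moment_fit(x):
    """Method-of-moments estimates (alpha, beta) of a Gamma law."""
    x = np.asarray(x, dtype=float)
    mean, var = x.mean(), x.var()
    return mean**2 / var, var / mean

# Synthetic, independent Gamma(alpha0 = 1.5, beta0 = 2.0) samples
# standing in for the finest-scale aggregated series X_delta.
rng = np.random.default_rng(1)
x = rng.gamma(shape=1.5, scale=2.0, size=2**18)

fits = []
for j in range(5):
    fits.append(gamma_moment_fit(x))
    x = x[0::2] + x[1::2]              # aggregate adjacent pairs of bins

for j, (alpha, beta) in enumerate(fits):
    print(f"delta = 2^{j} * delta0   alpha = {alpha:6.2f}   beta = {beta:5.2f}")
```

On real traffic the samples are correlated, so the fitted α and β need not follow this ideal doubling.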


Parameter Estimation : $\alpha_\Delta$, $\beta_\Delta$

• Stability under addition and independence ⇒ $\alpha(\Delta) = \alpha_0 \Delta$, $\beta(\Delta) = \beta_0$.

[Figure: estimated α and β curves]

• $\alpha_\Delta$, $\beta_\Delta$ accommodate correlations !


Covariance : the wavelet point of view

• $X_\Delta$ stationary stochastic process, with spectrum $f_{X_\Delta}(\nu)$,
• Wavelet coefficients : $d_X(j,k)$,
• Wavelet spectrum : $S(j) = \frac{1}{n_j} \sum_{k=1}^{n_j} |d_{X_\Delta}(j,k)|^2$, with $\mathbb{E}\,S(j) = \int f_X(\nu)\, 2^j\, |\Psi_0(2^j \nu)|^2 \, d\nu$,
• Spectral estimation : $f_X(\nu = 2^{-j}\nu_0) = S(j)$.
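
The wavelet spectrum S(j) can be sketched in a few lines. The example below assumes the orthonormal Haar wavelet (the slide does not say which mother wavelet Ψ0 is used) and a synthetic white-noise input; `haar_wavelet_spectrum` and `logscale_slope` are hypothetical helper names. For white noise S(j) is flat, so the fitted slope of log2 S(j) against j should be close to 0.

```python
import numpy as np

def haar_wavelet_spectrum(x, n_scales):
    """S(j) = mean_k |d(j,k)|^2 for j = 1..n_scales, orthonormal Haar wavelet."""
    approx = np.asarray(x, dtype=float)
    spectrum = []
    for _ in range(n_scales):
        n = len(approx) // 2 * 2
        even, odd = approx[0:n:2], approx[1:n:2]
        detail = (even - odd) / np.sqrt(2.0)   # wavelet coefficients d(j, k)
        approx = (even + odd) / np.sqrt(2.0)   # approximation passed to scale j+1
        spectrum.append(np.mean(detail**2))
    return np.array(spectrum)

def logscale_slope(spectrum, j_min, j_max):
    """Least-squares slope of log2 S(j) versus j over scales j_min..j_max."""
    j = np.arange(j_min, j_max + 1)
    slope, _ = np.polyfit(j, np.log2(spectrum[j_min - 1:j_max]), 1)
    return slope

# Synthetic white Gaussian noise: no memory, flat spectrum.
rng = np.random.default_rng(2)
white = rng.normal(size=2**18)

S = haar_wavelet_spectrum(white, n_scales=10)
slope = logscale_slope(S, j_min=3, j_max=10)
print("S(j):", np.round(S, 3))
print("slope of log2 S(j) vs j:", round(float(slope), 3))
```

Under the usual convention for long range dependent series, a coarse-scale slope γ corresponds to H = (1 + γ)/2, so a flat spectrum gives H = 0.5; this unweighted fit is a simplification of a proper weighted regression.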


Both Short and Long Range Dependencies

• Log-scale Diagram (LD) : $\log_2 S(j)$ vs. $\log_2 2^j = j$.

[Figure: LD of $X_\Delta$, LBL-TCP-3, ∆ = 1 ms; horizontal axis j (1 to 15), vertical axis $\log_2 S(j)$]

• Power law at coarse scales (low frequencies) : ⇒ Long range dependence, LRD
• Short range dependence at fine scales (high frequencies),
• ⇒ Use a FARIMA(P,d,Q) covariance form.


FARIMA(P,d,Q) covariance

FARIMA = fractionally integrated ARMA.

1. fractional integration with parameter d,
2. ARMA(P,Q) → ARMA(1,1), params. θ, φ.

$f_{X_\Delta}(\nu) = \sigma_\varepsilon^2 \, \left|1 - e^{-i2\pi\nu}\right|^{-2d} \, \frac{|1 - \theta e^{-i2\pi\nu}|^2}{|1 - \phi e^{-i2\pi\nu}|^2}$,

• d controls Long Range Dep., with γ = 2d,
• P, Q control Short Range Dep.
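
The FARIMA(1,d,1) spectrum above is easy to evaluate numerically. In this minimal sketch, `farima_spectrum` is a hypothetical helper and the parameter values (d = 0.3, θ = 0.2, φ = 0.5) are arbitrary illustrations, not fitted values from the talk. At low frequencies the ARMA factor is nearly constant, so the log-log slope of the spectrum should approach -2d.

```python
import numpy as np

def farima_spectrum(nu, d, theta, phi, sigma2=1.0):
    """Spectrum of a FARIMA(1,d,1) process at normalized frequencies nu."""
    z = np.exp(-2j * np.pi * np.asarray(nu, dtype=float))
    long_range = np.abs(1 - z) ** (-2 * d)                          # fractional integration
    short_range = np.abs(1 - theta * z) ** 2 / np.abs(1 - phi * z) ** 2  # ARMA(1,1)
    return sigma2 * long_range * short_range

# Low-frequency behaviour: power law with exponent -2d.
nu = np.logspace(-5, -3, 50)
f = farima_spectrum(nu, d=0.3, theta=0.2, phi=0.5)
slope, _ = np.polyfit(np.log(nu), np.log(f), 1)
print("low-frequency log-log slope:", round(float(slope), 4))

# With d = 0 and theta = phi = 0 the process is white noise: flat spectrum.
flat = farima_spectrum([0.1, 0.2, 0.4], d=0.0, theta=0.0, phi=0.0)
print("white-noise spectrum:", flat)
```

Setting θ = φ makes the ARMA factor cancel, leaving the pure fractional-integration term alone.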


Empirical LDs and FARIMA(P,d,Q) Fits, LBL-TCP-3

[Figure, top row: LDs ($\log_2 S(j)$ vs. j) for ∆ = 4 ms, ∆ = 32 ms, ∆ = 256 ms]
[Figure, bottom row: d vs. $\log_2(\Delta)$; θ and φ vs. $\log_2(\Delta)$]

• Accurately fits data for all aggregation levels ∆,
• LRD is persistent, SRD are cancelled out.


Gamma-Farima Modeling

• Numerical synthesis procedures for bivariate Gamma-Farima processes.


Random Projections or sketches

Sketches = ensemble of outputs of random hash table
[Muthukrishnan'03, Krishnamurthy'03, ...] [Abry+ SAINT'07, Dewaele+ Sigcomm LSAD'07]

• Random hash functions : $h_n$
  - y = h(x),
  - M outputs : y ∈ [1, . . . , M],
  - k-universal hash functions.
• Hash the traffic :
  - Packet : i-th packet, n-tuple : $t_i$, PTsrc$_i$, PTdst$_i$, IPsrc$_i$, IPdst$_i$
  - Choose one specific key : e.g., Destination Address
  - Hash according to this key : $m_i = h(\mathrm{IPdst}_i) \in [1, \ldots, M]$,
  - All packets with same $m_i$ = one sub-trace, sampled by random projection.
  - Aggregate traffic $\{t_i, m_i\}_{i \in I}$ into M series $X^m_\Delta(t)$, bins of ∆ s.
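
The hashing steps above can be sketched as follows. This illustration assumes a simple 2-universal family h(x) = ((a·x + b) mod p) mod M with p = 2^61 - 1, which is one standard construction and not necessarily the k-universal family used by the authors; `make_hash`, `sketch_traffic` and the synthetic packets are hypothetical. Rows are indexed 0..M-1 rather than 1..M.

```python
import ipaddress
import random

import numpy as np

P = 2**61 - 1  # Mersenne prime, larger than any 32-bit IPv4 key

def make_hash(M, rng):
    """Draw one hash function h: int -> {0, ..., M-1} from a 2-universal family."""
    a = rng.randrange(1, P)
    b = rng.randrange(0, P)
    return lambda key: ((a * key + b) % P) % M

def sketch_traffic(packets, M, delta, duration, h):
    """Split (timestamp, ip_dst) packets into M count series X^m_delta."""
    n_bins = int(round(duration / delta))
    X = np.zeros((M, n_bins), dtype=np.int64)
    for t, ip_dst in packets:
        m = h(int(ipaddress.IPv4Address(ip_dst)))   # key = destination address
        k = min(int(t / delta), n_bins - 1)          # time bin of width delta
        X[m, k] += 1
    return X

# Synthetic stand-in for a trace: 5000 packets towards 200 destinations in 10 s.
rng = random.Random(0)
pool = [f"10.0.{rng.randrange(256)}.{rng.randrange(256)}" for _ in range(200)]
packets = [(rng.uniform(0.0, 10.0), rng.choice(pool)) for _ in range(5000)]

h = make_hash(M=8, rng=rng)
X = sketch_traffic(packets, M=8, delta=0.5, duration=10.0, h=h)
print("packets per sketch output:", X.sum(axis=1))
```

Because the key is the destination address, all packets sent to one host fall into the same output, and summing the M rows recovers the full aggregated trace.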


Sketched Traffic

• Sketches = M sub-traces representing the total traffic
• Total of outputs = total trace (constrained sampling)
• Each sketched output = random flow-sampling


Robust Estimation : Sketches + Multiresolution

• On each sketch output, for each ∆ :
  - $\Gamma_{\alpha,\beta}(X_\Delta)$ fit and estimation of $\alpha^m_\Delta$, $\beta^m_\Delta$.
  - Compute LDs and estimate $H^m_\Delta$ (or FARIMA params)
  - Combine estimates over m and ∆
⇒ Adaptivity : the reference is given by the data themselves and not a priori !
⇒ Robustness : the median is a robust average !
⇒ Impact of outliers (anomalies) decreased !
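
The robustness argument can be illustrated with a toy combination step. The per-sketch values below are made up for the example (they are not MAWI estimates), and `combine_sketch_estimates` is a hypothetical helper: one sketch output is contaminated by an anomaly, and the median barely moves while the mean is pulled towards the outlier.

```python
import numpy as np

def combine_sketch_estimates(estimates):
    """Return (median, mean) of per-sketch parameter estimates."""
    estimates = np.asarray(estimates, dtype=float)
    return float(np.median(estimates)), float(np.mean(estimates))

# Illustrative (made-up) Hurst estimates, one per sketch output;
# the last output is assumed to carry an anomaly.
h_per_sketch = [0.88, 0.90, 0.87, 0.91, 0.89, 0.90, 0.86, 0.45]

h_median, h_mean = combine_sketch_estimates(h_per_sketch)
print("median over sketches:", round(h_median, 4))
print("mean over sketches:  ", round(h_mean, 4))
```

The same combination applies to any per-sketch parameter (α, β, or FARIMA parameters) at each ∆.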


MAWI data : B-US2Jp, 2005/07/11

[Figure, top: byte rate (MiB/s) and packet rate ($10^3$ Pkt/s) over the 900 s trace]
[Figure, bottom: LD for Byte count, $H_g$ = 0.94, $H_m$ = 0.88; LD for Pkt count, $H_g$ = 0.92, $H_m$ = 0.90; scales from 2 ms to 64 s]

($H_g$ : global estimate on the full trace ; $H_m$ : median over sketch outputs.)

• All $H_m$ are consistent ! $H_m$ and $H_g$ are consistent !
• LRDs on Bytes or Pkts are consistent !
• Normal traffic : no congestion (no anomaly ?)


MAWI data : B-US2Jp, 2003/06/03, Congestion

[Figure, top: byte rate (MiB/s) and packet rate ($10^3$ Pkt/s) over the 900 s trace]
[Figure, bottom: LD for Byte count, $H_g$ = 0.41, $H_m$ = 0.80; LD for Pkt count, $H_g$ = 0.89, $H_m$ = 0.83; scales from 2 ms to 64 s]

• $H^{Byte}_g \simeq 0.4$ : no variability, no LRD, $H^{Byte}_g \neq H^{Pkt}_g$
• $H^{Byte}_m \simeq 0.9$, flow variability, significant LRD, $H^{Byte}_m \simeq H^{Pkt}_m$


MAWI data : B-Jp2US, 2004/09/21, Anomalies

[Figure, top: Sketched Traffic, Bytes Counts / 1 s and Packet Counts / 1 s, versus time (0 to 900 seconds)]
[Figure, bottom: Sketched Traffic, Bytes Counts, LD, H = 0.777; Sketched Traffic, Packet Counts, LD, H = 0.905; versus scales]

• $H^{Byte}_g \simeq 0.7$ : LD ? ? ?, $H^{Pkt}_g \simeq 1$, ? ? ?
• $H^{Byte}_m \simeq 0.8$, LDs ok, significant LRD, $H^{Byte}_m \simeq H^{Pkt}_m$


Longitudinal study of MAWI backbone dataset

[Figure: packet rate ($10^3$ packets/s) and byte rate (MiBytes/s) from 2001 to 2008, for Jp2US and US2Jp]


Robust Estimation with Sketches : H

[Figure: H (packets) and H (bytes) from 2001 to 2008, for Jp2US and US2Jp; vertical axis from 0.4 to 1.2]

• Congestion = global traffic goes to H ≈ 0.5
• However : flows still see relevant LRD : median on sketch's output ∼ usual traffic, H ≈ 0.8 to 0.9


LRD in Pkt or Bytes Time series ? (Non Robust Global Estimation)

Scatter plots of H(B) (byte) vs. H(P) (packet)

[Figure: scatter plots of $H_g(B)$ vs. $H_g(P)$, both axes from 0.4 to 1.6. Left : Jp2US (legend : B normal, B congested, B restricted, F) ; Right : US2Jp (legend : B normal, B congested, B sasser, F).]

o : B without congestion ; • : B with congestion ; + : B anomaly (US2Jp) or restricted traffic (Jp2US) ; F shown with a separate marker.


LRD in Pkt or Bytes Time series ? (Robust Median-sketch Estimation)

Scatter plots of H(B) (byte) vs. H(P) (packet)

[Figure: scatter plots of $H_m(B)$ vs. $H_m(P)$, both axes from 0.4 to 1.6. Left : Jp2US (legend : B normal, B congested, B restricted, F) ; Right : US2Jp (legend : B normal, B congested, B sasser, F).]

o : B without congestion ; • : B with congestion ; + : B anomaly (US2Jp) or restricted traffic (Jp2US) ; F shown with a separate marker.


More Gaussian or not ?

[Figure: four panels over 2001 to 2008. Top : indices $\alpha_j$ as a function of time, j = 2, 4, 6, 8, 9. Bottom : normalized $\alpha'_j = \alpha_j / \alpha_J$ (J = 9). Left : Jp2US ; Right : US2Jp.]


and One Day, 2008/03/19 : Global vs. Median

[Figure: $H^{Pkt}$ and $H^{Byte}$ over 24 hours (0h to 0h), vertical axis from 0.4 to 1.2]


and One Day, 2008/03/19 : LRD ?

[Figure: LD for Pkt count, $LD_m$ (6h) and $LD_m$ (15min), scales from 2 ms to 512 s]


Robust Estimation

⇒ [Borgnat et al., "Seven Years and One Day : Sketching the Evolution of Internet Traffic", Infocom 2009]

⇒ Find outliers→ Anomaly Detection !


Anomaly Detection ?

• Issue(s) 6 :
- What traffic ? What data ? What information available ? (e.g., Netflow vs. Pkt, single link vs. network-wide monitoring, . . . )
- What are the rules and goals ? (Computational load, memory, sampling, how precise a location, IP identification, nature of the anomaly ?)
- Signatures (deterministic) vs. Profiles (statistical) ?

• Answers :
- MAWI database, no anomaly documentation, single link
- Pkt level (5-tuple), no payload, no bidirectionality
- Statistical detection : no a priori list of known anomalies, low signal-to-noise ratio, anomalous in a statistical way (possibly legitimate)

• Example : DDoS

• References : Biblio.


Anomaly Detection : Key Steps of our Contribution
[Sketch based Anomaly Detection, Identification, ... Abry, Borgnat, Dewaele. SAINT’07]
[Extracting Hidden Anomalies using Sketch and Non Gaussian Multiresolution Statistical Detection Procedures. Dewaele, Fukuda, Borgnat, Abry & Cho. LSAD Sigcomm’07]

- Step 1 : Sketches (for adaptive reference, no model, no prediction, no a priori, no learning)
- Step 2 : Multiresolution (to avoid a priori aggregation level choice)
- Step 3a : Gamma parameters (path to Gaussianity instead of Gaussianity itself)
- Step 3b : Farima (Long vs. Short dependencies)
- Step 4 : Detection : Comparison across aggregation levels and across sketches


Step 1 : Sketches (random projection/sampling)

[Figure: sketch procedure, repeated ×N]

• Sketch of M outputs × N different choices of hash tables
• Hashing key : IPSource, IPDestination, ...
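
As a minimal sketch of this step, the snippet below splits a packet trace into M outputs for each of N hash functions. The salted SHA-256 hash, the function names `bucket` and `sketch`, and the `(timestamp, key)` packet format are illustrative assumptions, not the hash family or data layout used in the cited work.

```python
import hashlib
from collections import defaultdict

def bucket(key: str, seed: int, m: int) -> int:
    """Map a hashing key (e.g. a source IP) to one of m sketch outputs.

    A seed-salted SHA-256 digest stands in for the hash family (assumption).
    """
    digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % m

def sketch(packets, n_hashes: int, m: int):
    """Split a trace into n_hashes x m sub-traces of timestamps.

    packets: iterable of (timestamp, key) pairs.
    Returns a dict mapping (seed, output) to the list of timestamps it received.
    """
    out = defaultdict(list)
    for ts, key in packets:
        for seed in range(n_hashes):
            out[(seed, bucket(key, seed, m))].append(ts)
    return out

# Hypothetical three-packet trace keyed on source IP.
packets = [(0.001, "10.0.0.1"), (0.002, "10.0.0.2"), (0.004, "10.0.0.1")]
traces = sketch(packets, n_hashes=3, m=32)
```

All packets sharing a key land in the same output for a given hash function, so an anomalous source stays concentrated in one output while the remaining outputs provide the reference.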


Step 2 : Multiresolution or Multi-Scale Aggregation Analysis

• Aggregated traffic with scales : 5ms, 10ms, ..., 1s
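
The aggregation can be illustrated with a short sketch that counts packets in windows of increasing length. Integer microsecond timestamps, doubling of the window from 5ms, and the names `aggregate` and `multiscale` are assumptions made for the example; the slide only states that scales range from 5ms to about 1s.

```python
def aggregate(timestamps_us, delta_us, duration_us):
    """Count packets in consecutive windows of length delta_us (microseconds)."""
    n_bins = duration_us // delta_us
    counts = [0] * n_bins
    for ts in timestamps_us:
        k = ts // delta_us
        if 0 <= k < n_bins:
            counts[k] += 1
    return counts

def multiscale(timestamps_us, delta0_us=5_000, n_scales=8, duration_us=1_280_000):
    """Aggregate at delta0, 2*delta0, 4*delta0, ... (doubling is an assumption)."""
    return {
        delta0_us << j: aggregate(timestamps_us, delta0_us << j, duration_us)
        for j in range(n_scales)
    }

# Hypothetical packet arrival times in microseconds.
ts = [1_000, 4_000, 6_000, 12_000, 640_000]
series = multiscale(ts)
```

Each entry of `series` is one aggregated time series, so later statistics can be computed per scale instead of at a single, a priori chosen aggregation level.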


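Step 3 : Modeling with non-Gaussian statistics

• Gamma laws : parameters $\alpha(\Delta)$ and $\beta(\Delta)$

$$ \Gamma_{\alpha,\beta}(x) = \frac{1}{\beta\,\Gamma(\alpha)} \left(\frac{x}{\beta}\right)^{\alpha-1} \exp\left(-\frac{x}{\beta}\right). $$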
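
One simple way to obtain $\alpha$ and $\beta$ at a given scale is the method of moments, sketched below. The slides do not say which estimator is used, so the moment estimator and the name `fit_gamma_moments` are assumptions made for illustration.

```python
import random
import statistics

def fit_gamma_moments(counts):
    """Return (alpha, beta) from the sample mean and variance.

    For a Gamma law with shape alpha and scale beta,
    mean = alpha * beta and variance = alpha * beta**2.
    """
    mean = statistics.fmean(counts)
    var = statistics.pvariance(counts, mu=mean)
    if mean <= 0 or var <= 0:
        raise ValueError("need a non-constant series with positive mean")
    return mean * mean / var, var / mean

# Synthetic check: draw from a known Gamma law and recover its parameters.
random.seed(0)
sample = [random.gammavariate(4.0, 2.5) for _ in range(50_000)]
alpha_hat, beta_hat = fit_gamma_moments(sample)
```

Applying the fit to each aggregated series gives the pair $(\alpha(\Delta), \beta(\Delta))$ as a function of the aggregation level.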


Step 4 : Comparisons across Scales and Sketches

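• Compute median and standard deviation across outputs.
• Anomaly : one output is too far from the average.
• Too far : Mahalanobis distance :

$$ D_\alpha = \left( \frac{1}{J} \sum_{j=1}^{J} \frac{\left|\alpha^{n}_{\Delta_j} - \alpha^{Ref}_{\Delta_j}\right|^2}{\sigma^2_{\alpha,\Delta_j}} \right)^{1/2} > \text{threshold}. $$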
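
The distance can be computed as in the sketch below, assuming the reference at each scale is the median across outputs and the variance is taken across outputs as well. The function name `distances`, the toy values and the threshold of 1.5 are hypothetical.

```python
import statistics

def distances(alpha):
    """alpha[n][j]: estimate for sketch output n at scale j.

    Returns D[n] = sqrt((1/J) * sum_j (alpha[n][j] - ref[j])**2 / var[j]),
    with ref[j] the median and var[j] the variance across outputs.
    """
    n_scales = len(alpha[0])
    cols = [[row[j] for row in alpha] for j in range(n_scales)]
    ref = [statistics.median(c) for c in cols]
    # Guard against a zero variance when all outputs agree at some scale.
    var = [max(statistics.pvariance(c), 1e-12) for c in cols]
    return [
        (sum((row[j] - ref[j]) ** 2 / var[j] for j in range(n_scales)) / n_scales) ** 0.5
        for row in alpha
    ]

# Five outputs, two scales; the last output deviates at both scales.
alpha = [[2.0, 4.0], [2.2, 4.1], [1.9, 3.9], [2.1, 4.0], [9.0, 12.0]]
threshold = 1.5  # hypothetical value
flagged = [n for n, d in enumerate(distances(alpha)) if d > threshold]
```

Because the distance averages over all scales, an output is flagged only when it departs from the reference across aggregation levels, not at one isolated scale.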


Algo. : Sketches + Multiresolution + Gamma statistics

• Enhanced contrast of anomaly wrt. background
• Adaptive reference (extracted from traffic, not a priori)
• No a priori on typical time scales
• Identification of the IP addresses associated with the anomaly


Identification of IP involved
Use N different sketches (or hash tables) on the same key

• IP that are not always in anomalous outputs = normal
• IP that are always in anomalous outputs = anomalies
• Collisions : $\#C = N_{IP} M^{-2N} \ll 1 \Rightarrow N > 5$ (with $M = 32$).
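
The identification rule amounts to an intersection across hash functions, as in the sketch below. The salted SHA-256 hash, the names `bucket` and `identify`, and the 200 synthetic addresses are assumptions for illustration only.

```python
import hashlib

def bucket(key: str, seed: int, m: int) -> int:
    """Map a key to one of m outputs with a seed-salted hash (assumption)."""
    digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % m

def identify(keys, anomalous, m):
    """Keep the keys that fall in an anomalous output for every hash function.

    anomalous[seed] is the set of outputs flagged for hash function `seed`.
    """
    return [
        k for k in keys
        if all(bucket(k, seed, m) in flagged for seed, flagged in enumerate(anomalous))
    ]

# Hypothetical setting: 200 addresses, one attacker, N = 6 hash functions, M = 32.
keys = [f"10.0.0.{i}" for i in range(200)]
attacker = "10.0.0.66"
anomalous = [{bucket(attacker, seed, 32)} for seed in range(6)]
suspects = identify(keys, anomalous, 32)
```

A normal address shares the attacker's output under one hash function with probability 1/M, and under all N of them with probability about M^-N, which is why a handful of hash functions is enough to isolate the anomalous addresses.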


Anomaly Detection : Longitudinal Study

[Figure: detected anomalies, 2001 to 2008; panels Jp2US and US2Jp; vertical axis from 0 to 14; annotated events: Ping (both panels), Sasser (US2Jp).]

Red : Definitely attacks : Ping/SYN floods, spoofed, ...
Yellow : Potentially attacks : various mechanisms.
Green : Suspicious traffic : WWW, P2P, GRE, DNS.

• Numerous anomalies, each day (more than 12 large ones), large variety (in nature, time scales, goals, impacts, ...)
• Normal traffic barely exists ⇒ Need for Sketches
• No ground truth ⇒ Human inspection
• A heuristic, rule-based (port #) classifier


Anomaly Detection versus Traffic Classification

• Rule-based traffic classification (port #, heuristic rules, ...) + Anomaly Detection

• Host-based traffic classification to cross-validate/help Anomaly Detection (in progress)
- Random projections
- Minimum Spanning Tree


Conclusions and Perspectives

• Conclusions :
- Gaussian vs. non Gaussian, Long vs. short memory,
- Multiresolution Analysis and Modeling,
- Random projections (Sketches) ⇒ Robustness, Adaptivity

• Perspectives :
- Host-based traffic classification, for anomaly detection
- Scientific comparisons and assessments :
  What (public) data ?
  What rules ? What information is allowed to be used ?
  What goals ?
  How to compare ? Methodology ? Framework ?
  Willingness ?
- Care for history ! POP : Colosseum Example !


References

• Non Gaussian and Long Memory Statistical Modeling of Internet Traffic, Scherrer et al., IEEE TDSC’07.

• Sketch based Anomaly Detection, Identification, ..., Abry, Borgnat, Dewaele, SAINT’07.

• Extracting Hidden Anomalies using Sketch and Non Gaussian Multiresolution Statistical Detection Procedures, Dewaele, Fukuda, Borgnat, Abry & Cho, LSAD Sigcomm’07.

• Seven Years and One Day : Sketching the Evolution of Internet Traffic, Borgnat et al., Infocom 2009.

perso.ens-lyon.fr/patrice.abry


MAWI data : B-US2Jp, 2005/07/11

[Figure, byte count: time series in MiB/s (0 to 900s); PDF of #MiB per Δj for Δj = 8ms, 32ms, 64ms, 128ms; LD for Byte count, H = 0.94 (scales 2ms to 64s).]

[Figure, packet count: time series in 10^3 Pkt/s (0 to 900s); PDF of #Pkt per Δj for Δj = 8ms, 32ms, 64ms, 128ms; LD for Pkt count, H = 0.92 (scales 2ms to 64s).]

• Compares well with current knowledge and theory/models


MAWI data : B-US2Jp, 2003/06/03

[Figure, byte count: time series in MiB/s (0 to 900s); PDF of #MiB per Δj for Δj = 8ms, 32ms, 64ms, 128ms; LD for Byte count, H = 0.41 (scales 2ms to 64s).]

[Figure, packet count: time series in 10^3 Pkt/s (0 to 900s); PDF of #Pkt per Δj for Δj = 8ms, 32ms, 64ms, 128ms; LD for Pkt count, H = 0.89 (scales 2ms to 64s).]

• Congestion.


MAWI data : B-Jp2US, 2004/09/21

[Figure, byte count: time series in MiB/s (0 to 900s); PDF of #MiB per Δj for Δj = 8ms, 32ms, 64ms, 128ms; LD for Byte count, H = 0.73 (scales 2ms to 64s).]

[Figure, packet count: time series in 10^3 Pkt/s (0 to 900s); PDF of #Pkt per Δj for Δj = 8ms, 32ms, 64ms, 128ms; LD for Pkt count, H = 1.00 (scales 2ms to 64s).]

• Anomalies : network scan, spoofed flooding, attack on a Real server


Longitudinal study : Pkt Size

[Figure: share of small, medium and large packets (0% to 100%), 2001 to 2008; panels Jp2US and US2Jp.]


Long Range Dependence

Definition of Long Range Dependence

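Covariance is a non-summable power law → spectrum $f_{X_\Delta}(\nu)$ :

$$ f_{X_\Delta}(\nu) \sim C|\nu|^{-\gamma}, \quad |\nu| \to 0, \quad \text{with } 0 < \gamma < 1. $$

Long Range Dependence and Wavelets

$$ IES(j) = \int f_X(\nu)\, 2^j \left|\Psi_0(2^j\nu)\right|^2 d\nu \simeq f_X(\nu = 2^{-j}\nu_0). $$

$$ \text{LRD} \implies IES(j) \sim C\, 2^{j(\gamma-1)}, \quad 2^j \to +\infty. $$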


Wavelet Transform

• Let $\psi_0$ denote an elementary mother wavelet,
• Shifted and dilated templates of $\psi_0$ : $\psi_{j,k}(t) = 2^{-j/2}\psi_0(2^{-j}t - k)$,
• Wavelet coefficients : $d_{X_\Delta}(j,k) = \langle \psi_{j,k}, X_\Delta \rangle$.
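
For a concrete instance of these coefficients, the sketch below computes them with the Haar wavelet through the usual pyramid of pairwise sums and differences, then summarises each scale by the log2 of its mean squared coefficient. The choice of the Haar wavelet and the names `haar_details` and `log_energy` are assumptions; the slides do not fix a mother wavelet here.

```python
import math

def haar_details(x, n_levels):
    """Return ({j: [d(j, k)]}, final approximation) for j = 1..n_levels."""
    approx = list(x)
    details = {}
    for j in range(1, n_levels + 1):
        pairs = range(0, len(approx) - 1, 2)
        # Differences give the detail coefficients, sums the coarser approximation.
        details[j] = [(approx[i] - approx[i + 1]) / math.sqrt(2) for i in pairs]
        approx = [(approx[i] + approx[i + 1]) / math.sqrt(2) for i in pairs]
    return details, approx

def log_energy(details):
    """log2 of the mean squared coefficient at each scale j (non-zero scales only)."""
    return {
        j: math.log2(sum(d * d for d in ds) / len(ds))
        for j, ds in details.items()
        if ds and any(ds)
    }

# Toy series of length 8, decomposed over 3 levels.
x = [4, 2, 5, 7, 1, 3, 6, 8]
details, approx = haar_details(x, 3)
energies = log_energy(details)
```

With the $2^{-j/2}$ normalisation the transform is orthonormal, so the energy of the series equals the energy of all detail coefficients plus that of the final approximation.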


Anomaly Detection : Some references

• Dimension reduction :
- PCA, subspaces [Lakhina ’04] → “normality” in time
- Sketches [Muthukrishnan ’03], [Krishnamurthy ’03]

• Model + prediction in time :
- Anomaly = observation is different from prediction
- [Brutlag ’00], [Barford ’02], [Zhang ’05] ...

• Our contribution ?


Anomalies in Internet Traffic – Detection ?
Overview of strategies for anomaly detection

• Methods based on signatures
- recognition of packets
- advantage : robust
- drawbacks : limited to known anomalies, with specific signatures; scalability with increasing number of anomalies ?

• Methods based on anomalies or statistical profile
- use statistical properties of traffic : normal vs. abnormal
- advantage : versatile, indifferent to number of signatures
- drawbacks : variability of traffic
- statistics → false alarm vs. detection prob. trade-off


Anomaly Detection : DDoS
Schematic scenario of DDoS

• Attack with packets without specific signatures
• Objective : detection in low SNR = close to the source


Results : Longitudinal analysis of traffic + anomalies
MAWI dataset : 15 min per day, trans-Pacific backbone

[Figure: traffic breakdown (0% to 100%), 2001 to 2008; panels Jp2US and US2Jp. Labelled regions: HTTP, Peer to peer, Ping flood, suspected P2P, and Sasser worm (US2Jp only).]

• Bottom to top : Ping, DNS, common services, MS vulnerabilities, Sasser, HTTP, broadcast, suspected P2P, identified P2P, other TCP/UDP, INLSP (left) / GRE (right).

• Large proportion of hidden P2P, and of anomalies !


Synthesis of a Γ-farima process
Procedure.

• Mapping – 1st order stat. : if $Y_j(k)$ is a Gaussian r.v. with variance $\beta/2$, then

$$ X(k) = \sum_{j=1}^{2\alpha} Y_j(k)^2 \qquad (1) $$

is a $\Gamma_{\alpha,\beta}$ r.v.
• Mapping – 2nd order stat. : as a consequence,

$$ \gamma_Y(k) = \sqrt{\gamma_X(k)/(4\alpha)}. \qquad (2) $$

• Procedure : generate $2\alpha$ Gaussian processes with covariance $\gamma_Y$ derived with (2) from the farima covariance, then obtain X from (1).
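
The two mappings can be checked numerically with the sketch below. It covers only the independent case (no farima covariance is imposed on the Gaussian processes), takes the Gaussians as zero-mean, and uses the hypothetical names `gamma_from_gaussians` and `gaussian_covariance`.

```python
import math
import random

def gamma_from_gaussians(two_alpha, beta, n, rng):
    """Draw n independent Gamma(alpha, beta) values via mapping (1):
    each value is a sum of 2*alpha squared zero-mean Gaussians of variance beta/2."""
    sigma = math.sqrt(beta / 2)
    return [
        sum(rng.gauss(0.0, sigma) ** 2 for _ in range(two_alpha))
        for _ in range(n)
    ]

def gaussian_covariance(gamma_x, alpha):
    """Mapping (2): covariance required of each Gaussian process
    for a target covariance gamma_x of X."""
    return math.sqrt(gamma_x / (4 * alpha))

# alpha = 2 (so 2*alpha = 4 Gaussian processes), beta = 3: expected mean alpha*beta = 6.
rng = random.Random(1)
x = gamma_from_gaussians(4, 3.0, 20_000, rng)
sample_mean = sum(x) / len(x)
```

The construction needs $2\alpha$ to be an integer; a full synthesis would replace the independent Gaussian draws by Gaussian processes whose covariance is obtained from (2).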



POP : Colosseum Example
