Download - Kolmogorov-Smirnov class for Java
8/15/2019 Kolmogorov-Smirnov class for Java
/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements.  See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License.  You may obtain a copy of the License at
 *
 *      http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
package org.apache.commons.math3.stat.inference;

import java.math.BigDecimal;
import java.util.Arrays;
import java.util.HashSet;

import org.apache.commons.math3.distribution.EnumeratedRealDistribution;
import org.apache.commons.math3.distribution.RealDistribution;
import org.apache.commons.math3.distribution.UniformRealDistribution;
import org.apache.commons.math3.exception.InsufficientDataException;
import org.apache.commons.math3.exception.MathArithmeticException;
import org.apache.commons.math3.exception.MathInternalError;
import org.apache.commons.math3.exception.NullArgumentException;
import org.apache.commons.math3.exception.NumberIsTooLargeException;
import org.apache.commons.math3.exception.OutOfRangeException;
import org.apache.commons.math3.exception.TooManyIterationsException;
import org.apache.commons.math3.exception.util.LocalizedFormats;
import org.apache.commons.math3.fraction.BigFraction;
import org.apache.commons.math3.fraction.BigFractionField;
import org.apache.commons.math3.fraction.FractionConversionException;
import org.apache.commons.math3.linear.Array2DRowFieldMatrix;
import org.apache.commons.math3.linear.FieldMatrix;
import org.apache.commons.math3.linear.MatrixUtils;
import org.apache.commons.math3.linear.RealMatrix;
import org.apache.commons.math3.random.JDKRandomGenerator;
import org.apache.commons.math3.random.RandomGenerator;
import org.apache.commons.math3.random.Well19937c;
import org.apache.commons.math3.util.CombinatoricsUtils;
import org.apache.commons.math3.util.FastMath;
import org.apache.commons.math3.util.MathArrays;
import org.apache.commons.math3.util.MathUtils;

/**
 * Implementation of the <a href="http://en.wikipedia.org/wiki/Kolmogorov-Smirnov_test">
 * Kolmogorov-Smirnov (K-S) test</a> for equality of continuous distributions.
 * <p>
 * The K-S test uses a statistic based on the maximum deviation of the empirical distribution of
 * sample data points from the distribution expected under the null hypothesis. For one-sample tests
 * evaluating the null hypothesis that a set of sample data points follow a given distribution, the
 * test statistic is \(D_n=\sup_x |F_n(x)-F(x)|\), where \(F\) is the expected distribution and
 * \(F_n\) is the empirical distribution of the \(n\) sample data points. The distribution of
 * \(D_n\) is estimated using a method based on [1] with certain quick decisions for extreme values
 * given in [2].
 * </p>
 * <p>
 * Two-sample tests are also supported, evaluating the null hypothesis that the two samples
 * {@code x} and {@code y} come from the same underlying distribution. In this case, the test
 * statistic is \(D_{n,m}=\sup_t | F_n(t)-F_m(t)|\) where \(n\) is the length of {@code x}, \(m\) is
 * the length of {@code y}, \(F_n\) is the empirical distribution that puts mass \(1/n\) at each of
 * the values in {@code x} and \(F_m\) is the empirical distribution of the {@code y} values. The
 * default 2-sample test method, {@link #kolmogorovSmirnovTest(double[], double[])} works as
 * follows:
 * <ul>
 * <li>For small samples (where the product of the sample sizes is less than
 * {@value #LARGE_SAMPLE_PRODUCT}), the method presented in [4] is used to compute the
 * exact p-value for the 2-sample test.</li>
 * <li>When the product of the sample sizes exceeds {@value #LARGE_SAMPLE_PRODUCT}, the asymptotic
 * distribution of \(D_{n,m}\) is used. See {@link #approximate(double, int, int)} for details on
 * the approximation.</li>
 * </ul></p><p>
 * If the product of the sample sizes is less than {@value #LARGE_SAMPLE_PRODUCT} and the sample
 * data contains ties, random jitter is added to the sample data to break ties before applying
 * the algorithm above. Alternatively, the {@link #bootstrap(double[], double[], int, boolean)}
 * method, modeled after <a href="http://sekhon.berkeley.edu/matching/ks.boot.html">ks.boot</a>
 * in the R Matching package [3], can be used if ties are known to be present in the data.
 * </p>
 * <p>
 * In the two-sample case, \(D_{n,m}\) has a discrete distribution. This makes the p-value
 * associated with the null hypothesis \(H_0 : D_{n,m} \ge d \) differ from \(H_0 : D_{n,m} > d \)
 * by the mass of the observed value \(d\). To distinguish these, the two-sample tests use a boolean
 * {@code strict} parameter. This parameter is ignored for large samples.
 * </p>
 * <p>
 * The methods used by the 2-sample default implementation are also exposed directly:
 * <ul>
 * <li>{@link #exact(double, int, int, boolean)} computes exact 2-sample p-values</li>
 * <li>{@link #approximate(double, int, int)} uses the asymptotic distribution. The {@code boolean}
 * arguments in the first two methods allow the probability used to estimate the p-value to be
 * expressed using strict or non-strict inequality. See
 * {@link #kolmogorovSmirnovTest(double[], double[], boolean)}.</li>
 * </ul>
 * </p>
 * <p>
 * References:
 * <ul>
 * <li>[1] <a href="http://www.jstatsoft.org/v08/i18/"> Evaluating Kolmogorov's Distribution</a> by
 * George Marsaglia, Wai Wan Tsang, and Jingbo Wang</li>
 * <li>[2] <a href="http://www.jstatsoft.org/v39/i11/"> Computing the Two-Sided Kolmogorov-Smirnov
 * Distribution</a> by Richard Simard and Pierre L'Ecuyer</li>
 * <li>[3] Jasjeet S. Sekhon. 2011. <a href="http://www.jstatsoft.org/article/view/v042i07">
 * Multivariate and Propensity Score Matching Software with Automated Balance Optimization:
 * The Matching package for R</a> Journal of Statistical Software, 42(7): 1-52.</li>
 * <li>[4] Wilcox, Rand. 2012. Introduction to Robust Estimation and Hypothesis Testing,
 * Chapter 5, 3rd Ed. Academic Press.</li>
 * </ul>
 * <br/>
 * Note that [1] contains an error in computing h, refer to <a
 * href="https://issues.apache.org/jira/browse/MATH-437">MATH-437</a> for details.
 * </p>
 *
 * @since 3.3
 */
public class KolmogorovSmirnovTest {

    /**
     * Bound on the number of partial sums in {@link #ksSum(double, double, int)}
     */
    protected static final int MAXIMUM_PARTIAL_SUM_COUNT = 100000;

    /** Convergence criterion for {@link #ksSum(double, double, int)} */
    protected static final double KS_SUM_CAUCHY_CRITERION = 1E-20;

    /** Convergence criterion for the sums in #pelzGood(double, double, int)} */
    protected static final double PG_SUM_RELATIVE_ERROR = 1.0e-10;

    /** No longer used. */
    @Deprecated
    protected static final int SMALL_SAMPLE_PRODUCT = 200;

    /**
     * When product of sample si
    RandomGenerator rng;

    /**
     * Construct a KolmogorovSmirnovTest instance with a default random data generator.
     */
    public KolmogorovSmirnovTest() {
        rng = new Well19937c();
    }

    /**
     * Construct a KolmogorovSmirnovTest with the provided random data generator.
     * The #monteCarlo(double, int, int, boolean, int) that uses the generator supplied to this
     * constructor is deprecated as of version 3.6.
     *
     * @param rng random data generator used by {@link #monteCarlo(double, int, int, boolean, int)}
     */
    @Deprecated
    public KolmogorovSmirnovTest(RandomGenerator rng) {
        this.rng = rng;
    }

    /**
     * Computes the <i>p-value</i>, or <i>observed significance level</i>, of a one-sample <a
     * href="http://en.wikipedia.org/wiki/Kolmogorov-Smirnov_test"> Kolmogorov-Smirnov test</a>
     * evaluating the null hypothesis that {@code data} conforms to {@code distribution}. If
     * {@code exact} is true, the distribution used to compute the p-value is computed using
     * extended precision. See {@link #cdfExact(double, int)}.
     *
     * @param distribution reference distribution
     * @param data sample being evaluated
     * @param exact whether or not to force exact computation of the p-value
     * @return the p-value associated with the null hypothesis that {@code data} is a sample from
     *         {@code distribution}
     * @throws InsufficientDataException if {@code data} does not have length at least 2
     * @throws NullArgumentException if {@code data} is null
     */
    public double kolmogorovSmirnovTest(RealDistribution distribution, double[] data, boolean exact) {
        return 1d - cdf(kolmogorovSmirnovStatistic(distribution, data), data.length, exact);
    }

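The p-value above is one minus the distribution function of \(D_n\) evaluated at the observed one-sample statistic. As a rough illustration of that statistic, here is a minimal standalone sketch that does not use Commons Math. The class name `OneSampleKsSketch` and the use of `DoubleUnaryOperator` in place of `RealDistribution` are assumptions made for this example only; it skips the argument checks the library performs.

```java
import java.util.Arrays;
import java.util.function.DoubleUnaryOperator;

// Hypothetical helper, not part of Commons Math.
final class OneSampleKsSketch {

    /** Returns D_n = sup_x |F_n(x) - F(x)| for the given reference cdf F. */
    static double statistic(DoubleUnaryOperator cdf, double[] data) {
        final double[] sorted = data.clone();
        Arrays.sort(sorted);
        final double n = sorted.length;
        double d = 0d;
        for (int i = 1; i <= sorted.length; i++) {
            final double yi = cdf.applyAsDouble(sorted[i - 1]);
            // The empirical cdf jumps from (i - 1) / n to i / n at the i-th order statistic,
            // so both sides of the jump are compared with F.
            final double currD = Math.max(yi - (i - 1) / n, i / n - yi);
            d = Math.max(d, currD);
        }
        return d;
    }

    public static void main(String[] args) {
        // Assumed reference distribution: Uniform(0, 1), whose cdf is x clamped to [0, 1].
        final DoubleUnaryOperator uniformCdf = x -> Math.min(1.0, Math.max(0.0, x));
        System.out.println(statistic(uniformCdf, new double[] {0.7, 0.1, 0.4}));
    }
}
```

For the three points in `main`, the largest gap occurs at the last order statistic, where the empirical cdf reaches 1 while the uniform cdf is 0.7.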
    /**
     * Computes the one-sample Kolmogorov-Smirnov test statistic, \(D_n=\sup_x |F_n(x)-F(x)|\) where
     * \(F\) is the distribution (cdf) function associated with {@code distribution}, \(n\) is the
     * length of {@code data} and \(F_n\) is the empirical distribution that puts mass \(1/n\) at
     * each of the values in {@code data}.
     *
     * @param distribution reference distribution
     * @param data sample being evaluated
     * @return Kolmogorov-Smirnov statistic \(D_n\)
     * @throws InsufficientDataException if {@code data} does not have length at least 2
     * @throws NullArgumentException if {@code data} is null
     */
    public double kolmogorovSmirnovStatistic(RealDistribution distribution, double[] data) {
        checkArray(data);
        final int n = data.length;
        final double nd = n;
        final double[] dataCopy = new double[n];
        System.arraycopy(data, 0, dataCopy, 0, n);
        Arrays.sort(dataCopy);
        double d = 0d;
        for (int i = 1; i <= n; i++) {
            final double yi = distribution.cumulativeProbability(dataCopy[i - 1]);
            final double currD = FastMath.max(yi - (i - 1) / nd, i / nd - yi);
            if (currD > d) {
                d = currD;
            }
        }
        return d;
    }

    /**
     * Computes the <i>p-value</i>, or <i>observed significance level</i>, of a two-sample <a
     * href="http://en.wikipedia.org/wiki/Kolmogorov-Smirnov_test"> Kolmogorov-Smirnov test</a>
     * evaluating the null hypothesis that {@code x} and {@code y} are samples drawn from the same
     * probability distribution. Specifically, what is returned is an estimate of the probability
     * that the {@link #kolmogorovSmirnovStatistic(double[], double[])} associated with a randomly
     * selected partition of the combined sample into subsamples of si
     * in [4], implemented in {@link #exact(double, int, int, boolean)}. </li>
     * <li>When the product of the sample sizes exceeds {@value #LARGE_SAMPLE_PRODUCT}, the
     * asymptotic distribution of \(D_{n,m}\) is used. See {@link #approximate(double, int, int)}
     * for details on the approximation.</li>
     * </ul><p>
     * If {@code x.length * y.length} < {@value #LARGE_SAMPLE_PRODUCT} and the combined set of values in
     * {@code x} and {@code y} contains ties, random jitter is added to {@code x} and {@code y} to
     * break ties before computing \(D_{n,m}\) and the p-value. The jitter is uniformly distributed
     * on (-minDelta / 2, minDelta / 2) where minDelta is the smallest pairwise difference between
     * values in the combined sample.</p>
     * <p>
     * If ties are known to be present in the data, {@link #bootstrap(double[], double[], int, boolean)}
     * may be used as an alternative method for estimating the p-value.</p>
     *
     * @param x first sample dataset
     * @param y second sample dataset
     * @param strict whether or not the probability to compute is expressed as a strict inequality
     *        (ignored for large samples)
     * @return p-value associated with the null hypothesis that {@code x} and {@code y} represent
     *         samples from the same distribution
     * @throws InsufficientDataException if either {@code x} or {@code y} does not have length at
     *         least 2
     * @throws NullArgumentException if either {@code x} or {@code y} is null
     * @see #bootstrap(double[], double[], int, boolean)
     */
    public double kolmogorovSmirnovTest(double[] x, double[] y, boolean strict) {
        final long lengthProduct = (long) x.length * y.length;
        double[] xa = null;
        double[] ya = null;
        if (lengthProduct < LARGE_SAMPLE_PRODUCT && hasTies(x,y)) {
            xa = MathArrays.copyOf(x);
            ya = MathArrays.copyOf(y);
            fixTies(xa, ya);
        } else {
            xa = x;
            ya = y;
        }
        if (lengthProduct < LARGE_SAMPLE_PRODUCT) {
            return exact(kolmogorovSmirnovStatistic(xa, ya), x.length, y.length, strict);
        }
        return approximate(kolmogorovSmirnovStatistic(x, y), x.length, y.length);
    }

    /**
     * Computes the <i>p-value</i>, or <i>observed significance level</i>, of a two-sample <a
     * href="http://en.wikipedia.org/wiki/Kolmogorov-Smirnov_test"> Kolmogorov-Smirnov test</a>
     * evaluating the null hypothesis that {@code x} and {@code y} are samples drawn from the same
     * probability distribution. Assumes the strict form of the inequality used to compute the
     * p-value. See {@link #kolmogorovSmirnovTest(RealDistribution, double[], boolean)}.
     *
     * @param x first sample dataset
     * @param y second sample dataset
     * @return p-value associated with the null hypothesis that {@code x} and {@code y} represent
     *         samples from the same distribution
     * @throws InsufficientDataException if either {@code x} or {@code y} does not have length at
     *         least 2
     * @throws NullArgumentException if either {@code x} or {@code y} is null
     */
    public double kolmogorovSmirnovTest(double[] x, double[] y) {
        return kolmogorovSmirnovTest(x, y, true);
    }

    /**
     * Computes the two-sample Kolmogorov-Smirnov test statistic, \(D_{n,m}=\sup_x |F_n(x)-F_m(x)|\)
     * where \(n\) is the length of {@code x}, \(m\) is the length of {@code y}, \(F_n\) is the
     * empirical distribution that puts mass \(1/n\) at each of the values in {@code x} and \(F_m\)
     * is the empirical distribution of the {@code y} values.
     *
     * @param x first sample
     * @param y second sample
     * @return test statistic \(D_{n,m}\) used to evaluate the null hypothesis that {@code x} and
     *         {@code y} represent samples from the same underlying distribution
     * @throws InsufficientDataException if either {@code x} or {@code y} does not have length at
     *         least 2
     * @throws NullArgumentException if either {@code x} or {@code y} is null
     */
    public double kolmogorovSmirnovStatistic(double[] x, double[] y) {
        return integralKolmogorovSmirnovStatistic(x, y)/((double)(x.length * (long)y.length));
    }

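The division by `n * m` above works because the two empirical distributions only change in steps of \(1/n\) and \(1/m\), so \(n m (F_n - F_m)\) can be tracked exactly in integer arithmetic. The following is a minimal standalone sketch of that idea, not the library's private method: the class name `TwoSampleKsSketch` and its method names are hypothetical, and it omits the argument validation the library performs.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Commons Math.
final class TwoSampleKsSketch {

    /** Returns n * m * D_{n,m} as a long, walking both sorted samples in merge order. */
    static long integralStatistic(double[] x, double[] y) {
        final double[] sx = x.clone();
        final double[] sy = y.clone();
        Arrays.sort(sx);
        Arrays.sort(sy);
        final int n = sx.length;
        final int m = sy.length;
        int i = 0;
        int j = 0;
        long cur = 0L; // n * m * (F_n(z) - F_m(z)) at the current point z
        long sup = 0L;
        while (i < n && j < m) {
            final double z = sx[i] <= sy[j] ? sx[i] : sy[j];
            // Each x value at z raises F_n by 1/n, i.e. raises cur by m.
            while (i < n && sx[i] == z) {
                i++;
                cur += m;
            }
            // Each y value at z raises F_m by 1/m, i.e. lowers cur by n.
            while (j < m && sy[j] == z) {
                j++;
                cur -= n;
            }
            sup = Math.max(sup, Math.abs(cur));
        }
        // Once one sample is exhausted, cur only moves back toward zero.
        return sup;
    }

    /** Returns D_{n,m} as a double. */
    static double statistic(double[] x, double[] y) {
        return integralStatistic(x, y) / ((double) (x.length * (long) y.length));
    }

    public static void main(String[] args) {
        System.out.println(statistic(new double[] {1, 3}, new double[] {2, 4}));
    }
}
```

Keeping the statistic as a `long` lets callers compare two statistics for equality without floating-point rounding, which matters when counting exact matches in resampling schemes.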
    /**
     * Computes the two-sample Kolmogorov-Smirnov test statistic, \(D_{n,m}=\sup_x |F_n(x)-F_m(x)|\)
     * where \(n\) is the length of {@code x}, \(m\) is the length of {@code y}, \(F_n\) is the
     * empirical distribution that puts mass \(1/n\) at each of the values in {@code x} and \(F_m\)
     * is the empirical distribution of the {@code y} values. Finally \(n m D_{n,m}\) is returned
     * as long value.
     *
     * @param x first sample
     * @param y second sample
     * @return test statistic \(n m D_{n,m}\) used to evaluate the null hypothesis that {@code x} and
     *         {@code y} represent samples from the same underlying distribution
     * @throws InsufficientDataException if either {@code x} or {@code y} does not have length at
     *         least 2
     * @throws NullArgumentException if either {@code x} or {@code y} is null
     */
    private long integralKolmogorovSmirnovStatistic(double[] x, double[] y) {
        checkArray(x);
        checkArray(y);
        // Copy and sort the sample arrays
        final double[] sx = MathArrays.copyOf(x);
        final double[] sy = MathArrays.copyOf(y);
        Arrays.sort(sx);
        Arrays.sort(sy);
        final int n = sx.length;
        final int m = sy.length;

        int rankX = 0;
        int rankY = 0;
        long curD = 0l;

        // Find the max difference between cdf_x and cdf_y
        long supD = 0l;
        do {
            double z = Double.compare(sx[rankX], sy[rankY]) <= 0 ? sx[rankX] : sy[rankY];
            while(rankX < n && Double.compare(sx[rankX],
                rankX += 1;
                curD += m;
            }
            while(rankY < m && Double.compare(sy[rankY],
    public double kolmogorovSmirnovTest(RealDistribution distribution, double[] data) {
        return kolmogorovSmirnovTest(distribution, data, false);
    }

    /**
     * Performs a <a href="http://en.wikipedia.org/wiki/Kolmogorov-Smirnov_test"> Kolmogorov-Smirnov
     * test</a> evaluating the null hypothesis that {@code data} conforms to {@code distribution}.
     *
     * @param distribution reference distribution
     * @param data sample being evaluated
     * @param alpha significance level of the test
     * @return true iff the null hypothesis that {@code data} is a sample from {@code distribution}
     *         can be rejected with confidence 1 - {@code alpha}
     * @throws InsufficientDataException if {@code data} does not have length at least 2
     * @throws NullArgumentException if {@code data} is null
     */
    public boolean kolmogorovSmirnovTest(RealDistribution distribution, double[] data, double alpha) {
        if ((alpha <= 0) || (alpha > 0.5)) {
            throw new OutOfRangeException(LocalizedFormats.OUT_OF_BOUND_SIGNIFICANCE_LEVEL, alpha, 0, 0.5);
        }
        return kolmogorovSmirnovTest(distribution, data) < alpha;
    }

    /**
     * Estimates the <i>p-value</i> of a two-sample
     * <a href="http://en.wikipedia.org/wiki/Kolmogorov-Smirnov_test"> Kolmogorov-Smirnov test</a>
     * evaluating the null hypothesis that {@code x} and {@code y} are samples drawn from the same
     * probability distribution. This method estimates the p-value by repeatedly sampling
     * sets of si
        final int yLength = y.length;
        final double[] combined = new double[xLength + yLength];
        System.arraycopy(x, 0, combined, 0, xLength);
        System.arraycopy(y, 0, combined, xLength, yLength);
        final EnumeratedRealDistribution dist = new EnumeratedRealDistribution(rng, combined);
        final long d = integralKolmogorovSmirnovStatistic(x, y);
        int greaterCount = 0;
        int equalCount = 0;
        double[] curX;
        double[] curY;
        long curD;
        for (int i = 0; i < iterations; i++) {
            curX = dist.sample(xLength);
            curY = dist.sample(yLength);
            curD = integralKolmogorovSmirnovStatistic(curX, curY);
            if (curD > d) {
                greaterCount++;
            } else if (curD == d) {
                equalCount++;
            }
        }
        return strict ? greaterCount / (double) iterations :
            (greaterCount + equalCount) / (double) iterations;
    }

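The resampling loop above can be tried outside the library. The sketch below is a simplified stand-in under stated assumptions: `java.util.Random` with uniform index draws replaces `EnumeratedRealDistribution`, the integral statistic is reimplemented locally, and the names `BootstrapKsSketch`, `resample`, and `bootstrapPValue` are hypothetical.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical helper, not part of Commons Math.
final class BootstrapKsSketch {

    /** Returns n * m * D_{n,m} as a long (ties within or across samples are allowed). */
    static long integralStatistic(double[] x, double[] y) {
        final double[] sx = x.clone();
        final double[] sy = y.clone();
        Arrays.sort(sx);
        Arrays.sort(sy);
        final int n = sx.length;
        final int m = sy.length;
        int i = 0;
        int j = 0;
        long cur = 0L;
        long sup = 0L;
        while (i < n && j < m) {
            final double z = sx[i] <= sy[j] ? sx[i] : sy[j];
            while (i < n && sx[i] == z) {
                i++;
                cur += m;
            }
            while (j < m && sy[j] == z) {
                j++;
                cur -= n;
            }
            sup = Math.max(sup, Math.abs(cur));
        }
        return sup;
    }

    /** Draws {@code size} values from {@code pool} with replacement. */
    static double[] resample(double[] pool, int size, Random rng) {
        final double[] out = new double[size];
        for (int i = 0; i < size; i++) {
            out[i] = pool[rng.nextInt(pool.length)];
        }
        return out;
    }

    /** Proportion of resampled statistics that exceed (or, if not strict, reach) the observed one. */
    static double bootstrapPValue(double[] x, double[] y, int iterations, boolean strict, Random rng) {
        final double[] combined = new double[x.length + y.length];
        System.arraycopy(x, 0, combined, 0, x.length);
        System.arraycopy(y, 0, combined, x.length, y.length);
        final long d = integralStatistic(x, y);
        int greaterCount = 0;
        int equalCount = 0;
        for (int i = 0; i < iterations; i++) {
            final long curD = integralStatistic(resample(combined, x.length, rng),
                                                resample(combined, y.length, rng));
            if (curD > d) {
                greaterCount++;
            } else if (curD == d) {
                equalCount++;
            }
        }
        return strict ? greaterCount / (double) iterations
                      : (greaterCount + equalCount) / (double) iterations;
    }

    public static void main(String[] args) {
        final double[] x = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
        final double[] y = {101, 102, 103, 104, 105, 106, 107, 108, 109, 110};
        System.out.println(bootstrapPValue(x, y, 1000, false, new Random(42L)));
    }
}
```

When the two samples do not overlap at all, the observed integral statistic already equals its maximum `n * m`, so no resample can exceed it and the strict estimate is exactly zero; only the non-strict form can be positive in that case.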
    /**
     * Computes {@code bootstrap(x, y, iterations, true)}.
     * This is equivalent to ks.boot(x,y, nboots=iterations) using the R Matching
     * package function. See #bootstrap(double[], double[], int, boolean).
     *
     * @param x first sample
     * @param y second sample
     * @param iterations number of bootstrap resampling iterations
     * @return estimated p-value
     */
    public double bootstrap(double[] x, double[] y, int iterations) {
        return bootstrap(x, y, iterations, true);
    }

    /**
     * Calculates \(P(D_n < d)\) using the method described in [1] with quick decisions for extreme
     * values given in [2] (see above). The result is not exact as with
     * {@link #cdfExact(double, int)} because calculations are based on
     * {@code double} rather than {@link org.apache.commons.math3.fraction.BigFraction}.
     *
     * @param d statistic
     * @param n sample si
        return cdf(d, n, false);
    }

    /**
     * Calculates {@code P(D_n < d)}. The result is exact in the sense that BigFraction/BigReal is
     * used everywhere at the expense of very slow execution time. Almost never choose this in real
     * applications unless you are very sure; this is almost solely for verification purposes.
     * Normally, you would choose {@link #cdf(double, int)}. See the class
     * javadoc for definitions and algorithm description.
     *
     * @param d statistic
     * @param n sample si

     * @param d statistic
     * @param n sample si

            return res;
        } else if (1 - ninv <= d && d < 1) {
            return 1 - 2 * Math.pow(1 - d, n);
        } else if (1 <= d) {
            return 1;
        }
        if (exact) {
            return exactK(d, n);
        }
        if (n <= 140) {
            return roundedK(d, n);
        }
        return pelzGood(d, n);
    }

    /**
     * Calculates the exact value of {@code P(D_n < d)} using the method described in [1] (reference
     * in class javadoc above) and {@link org.apache.commons.math3.fraction.BigFraction} (see
     * above).
     *
     * @param d statistic
     * @param n sample si

        final int k = (int) Math.ceil(n * d);

        final FieldMatrix<BigFraction> H = this.createExactH(d, n);
        final FieldMatrix<BigFraction> Hpower = H.power(n);

        BigFraction pFrac = Hpower.getEntry(k - 1, k - 1);

        for (int i = 1; i <= n; ++i) {
            pFrac = pFrac.multiply(i).divide(n);
        }

        /*
         * BigFraction.doubleValue converts numerator to double and the denominator to double and
         * divides afterwards. That gives NaN quite easy. This does not (scale is the number of
         * digits):
         */
        return pFrac.bigDecimalValue(20, BigDecimal.ROUND_HALF_UP).doubleValue();
    }

    /**
     * Calculates {@code P(D_n < d)} using method described in [1] and doubles (see above).
     *
     * @param d statistic
     * @param n sample si

        final RealMatrix Hpower = H.power(n);

        double pFrac = Hpower.getEntry(k - 1, k - 1);
        for (int i = 1; i <= n; ++i) {
            pFrac *= (double) i / (double) n;
        }

        return pFrac;
    }

    /**
     * Computes the Pelz-Good approximation for \(P(D_n < d)\) as described in [2] in the
     * class javadoc.
     *
     * @param d value of d-statistic (x in [2])
     * @param n sample si

        int k = 1;
        for (; k < MAXIMUM_PARTIAL_SUM_COUNT; k++) {
            kTerm = 2 * k - 1;
            increment = FastMath.exp(
            PG_SUM_RELATIVE_ERROR * sum) {
                break;
            }
        }
        if (k == MAXIMUM_PARTIAL_SUM_COUNT) {
            throw new TooManyIterationsException(MAXIMUM_PARTIAL_SUM_COUNT);
        }
        ret = sum * FastMath.sqrt(2 * FastMath.PI) /

        if (k == MAXIMUM_PARTIAL_SUM_COUNT) {
            throw new TooManyIterationsException(MAXIMUM_PARTIAL_SUM_COUNT);
        }
        final double sqrtHalfPi = FastMath.sqrt(FastMath.PI / 2);
        // Instead of doubling sum, divide by 3 instead of 6
        ret += sum * sqrtHalfPi / (3 *

        for (k = 0; k < MAXIMUM_PARTIAL_SUM_COUNT; k++) {
            kTerm = k + 0.5;
            kTerm2 = kTerm * kTerm;
            kTerm4 = kTerm2 * kTerm2;
            kTerm6 = kTerm4 * kTerm2;
            increment = (pi6 * kTerm6 * (5 - 30 *

            throw new TooManyIterationsException(MAXIMUM_PARTIAL_SUM_COUNT);
        }
        return ret + (sqrtHalfPi / (sqrtN * n)) * (sum / (3240 *
        BigFraction h = null;
        try {
            h = new BigFraction(hDouble, 1.0e-20, 10000);
        } catch (final FractionConversionException e1) {
            try {
                h = new BigFraction(hDouble, 1.0e-10, 10000);
            } catch (final FractionConversionException e2) {
                h = new BigFraction(hDouble, 1.0e-5, 10000);
            }
        }
        final BigFraction[][] Hdata = new BigFraction[m][m];

        /*
         * Start by filling everything with either 0 or 1.
         */
        for (int i = 0; i < m; ++i) {
            for (int j = 0; j < m; ++j) {
                if (i - j + 1 < 0) {
                    Hdata[i][j] = BigFraction.ZERO;
                } else {
                    Hdata[i][j] = BigFraction.ONE;
                }
            }
        }

        /*
         * Setting up power-array to avoid calculating the same value twice: hPowers[0] = h^1 ...
         * hPowers[m-1] = h^m
         */
        final BigFraction[] hPowers = new BigFraction[m];
        hPowers[0] = h;
        for (int i = 1; i < m; ++i) {
            hPowers[i] = h.multiply(hPowers[i - 1]);
        }

        /*
         * First column and last row has special values (each other reversed).
         */
        for (int i = 0; i < m; ++i) {
            Hdata[i][0] = Hdata[i][0].subtract(hPowers[i]);
            Hdata[m - 1][i] = Hdata[m - 1][i].subtract(hPowers[m - i - 1]);
        }

        /*
         * [1] states: "For 1/2 < h < 1 the bottom left element of the matrix should be (1 - 2*h^m +
         * (2h - 1)^m )/m!" Since 0 <= h < 1, then if h > 1/2 is sufficient to check:
         */
        if (h.compareTo(BigFraction.ONE_HALF) == 1) {
            Hdata[m - 1][0] = Hdata[m - 1][0].add(h.multiply(2).subtract(1).pow(m));
        }

        /*
         * Aside from the first column and last row, the (i, j)-th element is 1/(i - j + 1)! if i -
         * j + 1 >= 0, else 0. 1's and 0's are already put, so only division with (i - j + 1)! is
         * needed in the elements that have 1's. There is no need to calculate (i - j + 1)! and then
         * divide - small steps avoid overflows. Note that i - j + 1 > 0 <=> i + 1 > j instead of
         * j'ing all the way to m. Also note that it is started at g = 2 because dividing by 1 isn't
         * really necessary.
         */
        for (int i = 0; i < m; ++i) {
            for (int j = 0; j < i + 1; ++j) {
                if (i - j + 1 > 0) {
                    for (int g = 2; g <= i - j + 1; ++g) {
                        Hdata[i][j] = Hdata[i][j].divide(g);
                    }
                }
            }
        }
        return new Array2DRowFieldMatrix<BigFraction>(BigFractionField.getInstance(), Hdata);
    }

    /***
     * Creates {@code H} of si

        /*
         * Start by filling everything with either 0 or 1.
         */
        for (int i = 0; i < m; ++i) {
            for (int j = 0; j < m; ++j) {
                if (i - j + 1 < 0) {
                    Hdata[i][j] = 0;
                } else {
                    Hdata[i][j] = 1;
                }
            }
        }

        /*
         * Setting up power-array to avoid calculating the same value twice: hPowers[0] = h^1 ...
         * hPowers[m-1] = h^m
         */
        final double[] hPowers = new double[m];
        hPowers[0] = h;
        for (int i = 1; i < m; ++i) {
            hPowers[i] = h * hPowers[i - 1];
        }

        /*
         * First column and last row has special values (each other reversed).
         */
        for (int i = 0; i < m; ++i) {
            Hdata[i][0] = Hdata[i][0] - hPowers[i];
            Hdata[m - 1][i] -= hPowers[m - i - 1];
        }

        /*
         * [1] states: "For 1/2 < h < 1 the bottom left element of the matrix should be (1 - 2*h^m +
         * (2h - 1)^m )/m!" Since 0 <= h < 1, then if h > 1/2 is sufficient to check:
         */
        if (Double.compare(h, 0.5) > 0) {
            Hdata[m - 1][0] += FastMath.pow(2 * h - 1, m);
        }

        /*
         * Aside from the first column and last row, the (i, j)-th element is 1/(i - j + 1)! if i -
         * j + 1 >= 0, else 0. 1's and 0's are already put, so only division with (i - j + 1)! is
         * needed in the elements that have 1's. There is no need to calculate (i - j + 1)! and then
         * divide - small steps avoid overflows. Note that i - j + 1 > 0 <=> i + 1 > j instead of
         * j'ing all the way to m. Also note that it is started at g = 2 because dividing by 1 isn't
         * really necessary.
         */
        for (int i = 0; i < m; ++i) {
            for (int j = 0; j < i + 1; ++j) {
                if (i - j + 1 > 0) {
                    for (int g = 2; g <= i - j + 1; ++g) {
                        Hdata[i][j] /= g;
                    }
                }
            }
        }
        return MatrixUtils.createRealMatrix(Hdata);
    }

    /**
     * Verifies that {@code array} has length at least 2.
     *
     * @param array array to test
     * @throws NullArgumentException if array is null
     * @throws InsufficientDataException if array is too short
     */
    private void checkArray(double[] array) {
        if (array == null) {
            throw new NullArgumentException(Locali
     * @return Kolmogorov sum evaluated at t
     * @throws TooManyIterationsException if the series does not converge
     */
    public double ksSum(double t, double tolerance, int maxIterations) {
        if (t == 0.0) {
            return 0.0;
        }

        // TODO: for small t (say less than 1), the alternative expansion in part 3 of [1]
        // from class javadoc should be used.

        final double x = -2 * t * t;
        int sign = -1;
        long i = 1;
        double partialSum = 0.5d;
        double delta = 1;
        while (delta > tolerance && i < maxIterations) {
            delta = FastMath.exp(x * i * i);
            partialSum += sign * delta;
            sign *= -1;
            i++;
        }
        if (i == maxIterations) {
            throw new TooManyIterationsException(maxIterations);
        }
        return partialSum * 2;
    }

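The series summed by `ksSum` is \(k(t) = 1 + 2 \sum_{i=1}^\infty (-1)^i e^{-2 i^2 t^2}\), and the asymptotic two-sample p-value is \(1 - k(d \sqrt{mn / (m + n)})\). The sketch below restates both with only `java.lang.Math`, as an illustration rather than the library code. The class name `KsSumSketch`, the method `asymptoticPValue`, the use of `IllegalStateException` in place of `TooManyIterationsException`, and the tolerance and iteration bound passed in `asymptoticPValue` are all assumptions for this example.

```java
// Hypothetical helper, not part of Commons Math.
final class KsSumSketch {

    /** Evaluates k(t) = 1 + 2 * sum_{i >= 1} (-1)^i exp(-2 i^2 t^2) by direct summation. */
    static double ksSum(double t, double tolerance, int maxIterations) {
        if (t == 0.0) {
            return 0.0;
        }
        final double x = -2 * t * t;
        int sign = -1;
        long i = 1;
        // Start from 1/2 so that doubling at the end yields the leading 1 of the series.
        double partialSum = 0.5;
        double delta = 1;
        while (delta > tolerance && i < maxIterations) {
            delta = Math.exp(x * i * i);
            partialSum += sign * delta;
            sign *= -1;
            i++;
        }
        if (i == maxIterations) {
            throw new IllegalStateException("series did not converge in " + maxIterations + " terms");
        }
        return partialSum * 2;
    }

    /** Asymptotic estimate of P(D_{n,m} > d); tolerance and bound are assumed values. */
    static double asymptoticPValue(double d, int n, int m) {
        final double dn = n;
        final double dm = m;
        return 1 - ksSum(d * Math.sqrt((dm * dn) / (dm + dn)), 1e-20, 100000);
    }

    public static void main(String[] args) {
        System.out.println(ksSum(1.0, 1e-20, 100000));
        System.out.println(asymptoticPValue(0.2, 100, 120));
    }
}
```

Because the terms decay like \(e^{-2 i^2 t^2}\), only a handful of terms are needed for moderate `t`, while very small `t` needs many more, which is the motivation for the TODO about the alternative expansion.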
    /**
     * Given a d-statistic in the range [0, 1] and the two sample si
     * comparison with other integral d-statistics. Depending whether {@code strict} is
     * {@code true} or not, the returned value divided by (n*m) is greater than
     * (resp greater than or equal to) the given d value (allowing some tolerance).
     *
     * @param d a d-statistic in the range [0, 1]
     * @param n first sample si
     * {@link #kolmogorovSmirnovStatistic(double[], double[])} for the definition of \(D_{n,m}\).
     * <p>
     * The returned probability is exact, implemented by unwinding the recursive function
     * definitions presented in [4] (class javadoc).
     * </p>
     *
     * @param d D-statistic value
     * @param n first sample si

     * {@link #kolmogorovSmirnovStatistic(double[], double[])} for the definition of \(D_{n,m}\).
     * <p>
     * Specifically, what is returned is \(1 - k(d \sqrt{mn / (m + n)})\) where \(k(t) = 1 + 2
     * \sum_{i=1}^\infty (-1)^i e^{-2 i^2 t^2}\). See {@link #ksSum(double, double, int)} for
     * details on how convergence of the sum is determined. This implementation passes {@code ksSum}
     * {@value #KS_SUM_CAUCHY_CRITERION} as {@code tolerance} and
     * {@value #MAXIMUM_PARTIAL_SUM_COUNT} as {@code maxIterations}.
     * </p>
     *
     * @param d D-statistic value
     * @param n first sample si
     * The method uses a simplified version of the Fisher-Yates shuffle algorithm.
     * By processing first the {@code true} values followed by the remaining {@code false} values
     * less random numbers need to be generated. The method is optimi
     RandomGenerator rng) {
        Arrays.fill(b, true);
        for (int k = numberOfTrueValues; k < b.length; k++) {
            final int r = rng.nextInt(k + 1);
            b[(b[r]) ? r : k] = false;
        }
    }

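The loop above always leaves exactly `numberOfTrueValues` entries set to `true`: at step `k` the entry `b[k]` has not been touched yet, so either `b[r]` or `b[k]` is a `true` value that gets cleared, and exactly one `true` is removed per step. A small standalone check of that invariant follows, using `java.util.Random` in place of `RandomGenerator`. The class name `FixedTrueCountSketch` and the helper `countTrue` are hypothetical.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical helper, not part of Commons Math.
final class FixedTrueCountSketch {

    /** Fills b so that exactly numberOfTrueValues entries are true, at random positions. */
    static void fillRandomly(boolean[] b, int numberOfTrueValues, Random rng) {
        Arrays.fill(b, true);
        for (int k = numberOfTrueValues; k < b.length; k++) {
            final int r = rng.nextInt(k + 1);
            // If position r was already cleared, clear the still-untouched position k instead.
            b[b[r] ? r : k] = false;
        }
    }

    /** Counts the true entries of b. */
    static int countTrue(boolean[] b) {
        int count = 0;
        for (boolean value : b) {
            if (value) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        final boolean[] b = new boolean[10];
        fillRandomly(b, 3, new Random(1L));
        System.out.println(Arrays.toString(b) + " -> " + countTrue(b) + " true values");
    }
}
```

Only `b.length - numberOfTrueValues` random draws are needed, which is the saving the Javadoc refers to.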
    /**
     * Uses Monte Carlo simulation to approximate \(P(D_{n,m} > d)\) where \(D_{n,m}\) is the
     * 2-sample Kolmogorov-Smirnov statistic. See
     * {@link #kolmogorovSmirnovStatistic(double[], double[])} for the definition of \(D_{n,m}\).
     * <p>
     * The simulation generates {@code iterations} random partitions of {@code m + n} into an
     * {@code n} set and an {@code m} set, computing \(D_{n,m}\) for each partition and returning
     * the proportion of values that are greater than {@code d}, or greater than or equal to
     * {@code d} if {@code strict} is {@code false}.
     * </p>
     *
     * @param d D-statistic value
     * @param n first sample si
     * <p>
     * Here d is the D-statistic represented as long value.
     * The real D-statistic is obtained by dividing d by n*m.
     * See also {@link #monteCarlo(double, int, int, boolean, int)}.
     *
     * @param d integral D-statistic
     * @param n first sample si
        final double[] values = MathArrays.unique(MathArrays.concatenate(x,y));
        if (values.length == x.length + y.length) {
            return;  // There are no ties
        }

        // Find the smallest difference between values, or 1 if all values are the same
        double minDelta = 1;
        double prev = values[0];
        double delta = 1;
        for (int i = 1; i < values.length; i++) {
            delta = prev - values[i];
            if (delta < minDelta) {
                minDelta = delta;
            }
            prev = values[i];
        }
        minDelta /= 2;

        // Add jitter using a fixed seed (so same arguments always give same results),
        // low-initiali
        JDKRandomGenerator(100), -minDelta, minDelta);

        // It is theoretically possible that jitter does not break ties, so repeat
        // until all ties are gone.  Bound the loop and throw MIE if bound is exceeded.
        int ct = 0;
        boolean ties = true;
        do {
            jitter(x, dist);
            jitter(y, dist);
            ties = hasTies(x, y);
            ct++;
        } while (ties && ct < 1000);
        if (ties) {
            throw new MathInternalError(); // Should never happen
        }
    }

    /**
     * Returns true iff there are ties in the combined sample
     * formed from x and y.
     *
     * @param x first sample
     * @param y second sample
     * @return true if x and y together contain ties
     */
    private static boolean hasTies(double[] x, double[] y) {
        final HashSet<Double> values = new HashSet<Double>();
        for (int i = 0; i < x.length; i++) {
            if (!values.add(x[i])) {
                return true;
            }
        }
        for (int i = 0; i < y.length; i++) {
            if (!values.add(y[i])) {
                return true;
            }
        }
        return false;
    }

    /**
     * Adds random jitter to {@code data} using deviates sampled from {@code dist}.
     * <p>
     * Note that jitter is applied in-place - i.e., the array
     * values are overwritten with the result of applying jitter.</p>
     *
     * @param data input/output data array - entries overwritten by the method
     * @param dist probability distribution to sample for jitter values
     * @throws NullPointerException if either of the parameters is null
     */
    private static void jitter(double[] data, RealDistribution dist) {
        for (int i = 0; i < data.length; i++) {
            data[i] += dist.sample();
        }
    }

    /**
     * The function C(i, j) defined in [4] (class javadoc), formula (5.5).
     * defined to return 1 if |i/n - j/m| <= c; 0 otherwise. Here c is scaled up
     * and recoded as a long to avoid rounding errors in comparison tests, so what
     * is actually tested is |im - jn| <= cmn.
     *
     * @param i first path parameter
     * @param j second path parameter
     * @param m first sample si
         * Compute n(1,1), n(1,2)...n(2,1), n(2,2)... up to n(i,j), one row at a time.
         * When n(i,*) are being computed, lag[] holds the values of n(i - 1, *).
         */
        final double[] lag = new double[n];
        double last = 0;
        for (int k = 0; k < n; k++) {
            lag[k] = c(0, k + 1, m, n, cnm, strict);
        }
        for (int k = 1; k <= i; k++) {
            last = c(k, 0, m, n, cnm, strict);
            for (int l = 1; l <= j; l++) {
                lag[l - 1] = c(k, l, m, n, cnm, strict) * (last + lag[l - 1]);
                last = lag[l - 1];
            }
        }
        return last;
    }