Probability and Statistics -- Using R

2013/3

2. -- R 1. 1. 2. 2. 1. 2. 3. 4. 5. R 6. 3. 1. 2. 3. 4. 5. 6. 7. 3. 8. 9. 10. 4. 1. 2. 3. (Probabilistic Density Function) 4. (Cumulative Distribution Function) 5. 6. 5. 1. 2. (Bernoulli trial) 3. (Binomial distribution) 4. (Geometric distribution) 5. 6. (Poisson distribution) 7. (Uniform distribution) 8. (Normal Distribution) 9. 10. 4. 6. 1. 2. 3. 4. k (Kth Ordinary Moment) 5. 6. 7. 1. 2. 3. 4. (Covariance, ) 5. (Correlation) 6. 7. 8. 1. 2. 3. 4. 9. 5. 1. 2. 3. 4. 5. R 6. 7. 10. 1. 2. 3. 4. T 5. 6. 7. 8. 11. 1. 2. 3. p 4. M 6. 5. 6. 7. 8. 9. 10. 11. 12. (ANOVA) 1. 2. 3. (Analysis of Variance, ANOVA) 4. 5. 6. 13. 1. 2. R lm() 3. 4. 5. 6. 7. 7. 8. 9. 14. 1. 2. 1 (Rank=2) 3. 2 (Rank=3) 4. 3 (Rank=3 ) 5. 6. 7. 15. A 1. (Binomial distribution) 2. (Netative binomial distribution) 3. (Geometric distribution) 4. (Hypergeometric distribution) 5. (Poisson distribution) 6. (Uniform distribution) 7. (Normal Distribution) 8. SPSS SAS R SPSS SAS R R 9. =>=>=>... 2012/10/23 1. 2. 10. 1 6 1/36 1/32 11. 12. 13. 1/2 508 492 0.508 0.492 (Poisson) 14. R 95% 3.6146570 4.1440593 2.5726955 5.2325581 2.0635500 2.6294660 2.8541827 2.4816312 1.5836851 3.2193062 2.8205306 3.5037204 2.6107131 4.1870588 2.4506509 2.4849244 4.5343839 0.7606934 3.5219675 1.7019120 15. (Hidden Markov Model) EM (Expectation-Maximization Algorithm) R R R SPSS, SAS, MINITAB, S-PLUS 16. R R R S-PLUS Rick Becker, Allan Wilks, John Chambers S R GNU S R S S4 R http://www.r-project.org/ CRAN (Comprehensive R Archive Network) http://cran.r-project.org/ R Windows R http://cran.r-project.org/bin/windows/base/ 17. R Download R 2.15.2 for Windows http://cran.r-project.org/bin/windows/base/R-2.15.2-win.exe Youtube R http://www.youtube.com/watch?v=AipnE4s8sKk R R R S R R R 18. R ?rnorm R rnorm R rnorm (The Normal Distribution) 19. 
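A small sketch that is not part of the original slides: besides opening the help page with ?rnorm, the argument list and default values of rnorm can be read directly with the base function formals(). The variable names arg.names and defaults are illustrative only.

```r
# Read the argument names and default values of rnorm without opening help.
arg.names <- names(formals(rnorm))   # argument names, in declaration order
defaults  <- formals(rnorm)          # list holding the default values

print(arg.names)
print(defaults$mean)   # default mean
print(defaults$sd)     # default standard deviation
```

This is only a convenience; the help page remains the place where the meaning of each argument is described.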
R R xxxx d, p, q, r Normal Distribution ( norm) dnorm, pnorm, qnorm, rnorm dnormdnorm(x, mean = 0, sd = 1, log = FALSE)pnormpnorm(q, mean = 0, sd = 1, lower.tail = TRUE, log.p = FALSE)qnormqnorm(p, mean = 0, sd = 1, lower.tail = TRUE, log.p = FALSE)rnormrnorm(n, mean = 0, sd = 1) mean sd Standard Deviation ()n x q p n = ( mean=0, sd=1, log=FALSE, ....) ( x, q, p, n) rnorm(100) rnorm(100, mean = 0, sd = 1) mean=0 20. sd=1 100 ddnorm(1.96)P(X=x)ppnorm(1.96)=0.975P(Xx)qnorm(0.975)=1.96q p ; qnorm(pnorm(1.96)) =(CDF) q1.96rrnorm(100) 100 ?rnorm 21. x = r n o r m (10000, mean=5, sd=4) h i s t (x) x = rnorm(10000, mean=5, sd=4) 5, 4 10000 x hist(x) (Histogram) r n o r m (10, 3, 2) > x [1] 2.5810213 0.5399127 5.0005020 5.3402693 2.7900723 3.9638088 5.2119685 [8] 2.2209882 2.9935943 7.0308419 > a=d n o r m (1.96) > a [1] 0.05844094 > b=p n o r m (1.96) > b [1] 0.9750021 > c=q n o r m (b) 22. > c [1] 1.96 > d=r n o r m (10) > d [1] -0.329136770.77788306 -1.808624960.16694598 -0.65656254 -1.76305925[7]0.19651748 -0.078986850.739709331.18237502> : Wikipedia:Probability Theory PDF () -- http://en.wikipedia.org/wiki/Probability_density_function PMF () -- http://en.wikipedia.org/wiki/Probability_mass_function http://en.wikipedia.org/wiki/Statistics http://en.wikipedia.org/wiki/Descriptive_statistics http://en.wikipedia.org/wiki/Inferential_statistics http://en.wikipedia.org/wiki/Bayesian_Inference http://en.wikipedia.org/wiki/Correlation http://en.wikipedia.org/wiki/Analysis_of_variance http://en.wikipedia.org/wiki/Design_of_experiments 23. http://en.wikipedia.org/wiki/Regression_analysis http://en.wikipedia.org/wiki/Student%27s_t-test 24. S U {} {1,2,3,4,5,6} A {1, 3, 5} P(A) = P({1,3,5}) () 1. (Personal Approach) : () 25. 2. (Relative Frequency Approach) f A n 3. (Classical Approach) n(A) A N(S) (outcome) (equaly likely) 26. (1). (2). (3). (1) S 1 (2) A (3) A1, A2 A1 A2 1. 2. 27. 3. A, B 28. A,B A, B, C A, B, C 29. 1 : ; (3) ; ; 1 ; (3) 2 : P(A') = 1-P(A) ; A' A 30. 
; A' ; 3 ; 3 1 3 : ; () ; (); 3 31. ; 1 S={, } X P() = 0.5 P() = 0.5 2 32. S={1,2,3,4,5,6} P(1) = P(2) = ... = P(6) = 1/6 5 6 5 6 R R R sample() ? R ?sample R > ?sample 33. starting httpd help server ... done 34. sample Help sample sample(x, size, replace = FALSE, prob = NULL) > sample(1:6, 10) sample(1:6, 10) : cannot take a sample larger than the population when 'replace = FALSE' > sample(1:6, 10, replace=TRUE) [1] 3 2 4 4 4 2 6 3 3 3 > sample(1:6, 10, replace=TRUE) 3 2 4 4 4 2 6 3 3 3 replace=T (TRUE) k sample(1:6, k) > sample(1:6, 6) [1] 2 6 4 1 5 3 35. sample () > face = c ("", "") > s a m p l e (face, 10, replace=TRUE) [1] "" "" "" "" "" "" "" "" "" "" face = c("", "") [ , ] sample(face, 10, replace=TRUE) 10 () sample(x, size, replace = FALSE, prob = NULL) prob 0.6 0.4 > s a m p l e (face, 10, replace=TRUE, c (0.6, 0.4)) [1] "" "" "" "" "" "" "" "" "" "" 1 : () 36. 59 1 59 1 > x=s a m p l e (1:59, 10000, TRUE) > h i s t (x, breaks=0.5:60) 37. 2 : > x=s a m p l e (1:6, 10000, T) 38. > y=s a m p l e (1:6, 10000, T) > z=s a m p l e (1:6, 10000, T) > h i s t (x, breaks=0.5:7) > h i s t (y, breaks=0.5:7) > h i s t (z, breaks=0.5:7) > h i s t (x+y, breaks=1.5:13) > h i s t (x+y+z, breaks=2.5:19) 39. x+y+z 40. A B P(B|A) 1 ( A=) 3 ( B=3) P(B|A) = P(3|) 2 41. B A B 3 P(B|A) = P(3|) : A B A, B A, B A, B A,B 42. 1. A=, B=3 P(A) = 3/6 = 1/2 P(B) = 1/6 P(A B) = 0 P(A) P(B) = 1/2 * 1/6 =1/12 P(A B) P(A) P(B) 2. A=, B= 4 43. P(A) = 3/6 = 1/2 P(B) = 4/6 = 2/3 P(A B) = P({2, 4}) = 2/6 = 1/3 P(A) P(B) = 1/2 * 2/3 = 1/3 P(A B) = P(A) P(B) 1. A , B 2. 3. : 44. 1. A=, B=3 2. A=, B= 4 45. A B C ; Artificial Intelligence: A Modern Approach475 46. X:() (Cavity) Y:() (Toothache) Z:() (Catch) (Y=1) (Y=0) (Z=1) (Z=0) (Z=1) (Z=0)(X=1)0.1080.0120.0720.008 (X=0)0.0160.0640.1440.576 1 P() = ? 2 P( | ) = ? 3 4 P( | ) = ? 5 P(, ) = ? 6 P( | ), P(), P(), P( | ) 47. 
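The joint probability table for X (Cavity), Y (Toothache) and Z (Catch) above can be entered in R as a 2x2x2 array. The following is a minimal sketch under stated assumptions: the English labels "no"/"yes" and the variable names are chosen here for illustration and are not the labels used in the original session. The eight values are the table entries listed with X varying fastest, then Y, then Z.

```r
# Joint distribution P(X, Y, Z); values are filled with X varying fastest.
# The labels "no"/"yes" are illustrative, not the original dimnames.
p <- array(c(0.576, 0.008, 0.064, 0.012, 0.144, 0.072, 0.016, 0.108),
           dim = c(2, 2, 2),
           dimnames = list(cavity    = c("no", "yes"),
                           toothache = c("no", "yes"),
                           catch     = c("no", "yes")))

p.toothache <- sum(p[, "yes", ])                       # P(toothache)
p.catch.given.toothache <- sum(p[, "yes", "yes"]) / p.toothache

pa  <- sum(p["yes", , ])                               # P(cavity)
pb  <- sum(p[, , "yes"])                               # P(catch)
pab <- sum(p["yes", , "yes"]) / pb                     # P(cavity | catch)
pba <- sum(p["yes", , "yes"]) / pa                     # P(catch | cavity)

print(c(p.toothache, p.catch.given.toothache, pab, pba))
print(pab - pba * pa / pb)   # Bayes' rule: difference should be (numerically) zero
```

Summing over an index left empty (for example p[, "yes", ]) marginalises over that variable, which is why conditional probabilities reduce to a ratio of two sums.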
P(|) = P(|) P()/P() R (Column Major Order) http://cran.r-project.org/doc/manuals/R-lang.pdf2.2.2 The dim attribute is used to implement arrays. The content of the array is stored in a vector in columnmajor order and the dim attribute is a vector of integers specifying the respective extents of the array. R ensures that the length of the vector is the product of the lengths of the dimensions. The length of one or more dimensions may be zero. (Column Major Order) XYZP(X,Y,Z)0000.5761000.008 48. 0100.0641100.0120010.1441010.0720110.0161110.108 R 1 C 0 XYZP(X,Y,Z)1110.5762110.0081210.0642210.012 49. 1120.1442120.0721220.0162220.108> p p , , 1 [,1][,2][1,] 0.576 0.064 [2,] 0.008 0.012 , , 2 [,1][,2][1,] 0.144 0.016 50. [2,] 0.072 0.108 > p[1,1,1] [1] 0.576 > p[2,1,1] [1] 0.008 > p[1,2,1] [1] 0.064 > p[2,2,1] [1] 0.012 > p[1,1,2] [1] 0.144 > p[2,1,2] [1] 0.072 > p[1,2,2] [1] 0.016 > p[2,2,2] [1] 0.108 > d i m n a m e s (p)[[1]] = c ("", "") > d i m n a m e s (p)[[2]] = c ("", "") > d i m n a m e s (p)[[3]] = c ("", "") 51. > p , , 0.576 0.064 0.008 0.012, , 0.144 0.016 0.072 0.1081P() = 0.8 > p[,"",] 0.576 0.144 0.008 0.072> s u m (p[,"",]) [1] 0.8 52. 2P( | ) = 0.62 > p[,,""] 0.144 0.016 0.072 0.108> s u m (p[,,""]) [1] 0.34 > s u m (p[,"",""]) [1] 0.124 > s u m (p[,"",""])/s u m (s u m (p[,"",])) [1] 0.62 3 ( 1 0 1) > s u m (p) [1] 1 > 0 s u m (p[,"",""]) [1] 0.124 6 P( | ), P(), P(), P( | ) 54. P( | ) = p(|) * p()/p() P( | ) = 0.5294118, P()=0.2, P()=0.34, P( | )=0.9 P( | ) = 0.5294118 = 0.9 * 0.2/0.34 = = p(|) * p()/p() > pab = s u m (p["",,""])/s u m (p[,,""]) # pab = P( | ) > pba = s u m (p["",,""])/s u m (p["",,]) # pba = P( | ) > pa = s u m (p["",,]) # pa = P() > pb = s u m (p[,,""]) # pb = P() > pab [1] 0.5294118 > pba [1] 0.9 > pa [1] 0.2 > pb [1] 0.34 55. 
> pba*pa/pb [1] 0.5294118 > pab-pba*pa/pb [1] 0 p (|) = s u m (p["",,""])/s u m (p[,,""]) = pab = pba * pa / pb = p (| ) * p ()/p () = s u m (p["",,""])/s u m (p[,,""])* s u m (p[,,""])/ s u m (p["",,]) > p p , , 1 [,1][,2][1,] 0.576 0.064 56. [2,] 0.008 0.012 , , 2 [,1][,2][1,] 0.144 0.016 [2,] 0.072 0.108 > p[1,1,1] [1] 0.576 > p[2,1,1] [1] 0.008 > p[1,2,1] [1] 0.064 > p[2,2,1] [1] 0.012 > p[1,1,2] [1] 0.144 > p[2,1,2] [1] 0.072 > p[1,2,2] 57. [1] 0.016 > p[2,2,2] [1] 0.108 > d i m n a m e s (p)[[1]] = c ("", "") > d i m n a m e s (p)[[2]] = c ("", "") > d i m n a m e s (p)[[3]] = c ("", "") > p , , 0.576 0.064 0.008 0.012, , 0.144 0.016 0.072 0.108> p[,"",] 58. 0.576 0.144 0.008 0.072> p[,,""] 0.144 0.016 0.072 0.108> s u m (p[,,""]) [1] 0.34 > s u m (p[,"",""]) [1] 0.124 > s u m (p[,"",""])/s u m (s u m (p[,"",])) [1] 0.62 > s u m (p) [1] 1 > 0 s u m (p["",,""])/s u m (p["",,]) [1] 0.9 > s u m (p[,"",""])/s u m (p[,"",]) [1] 0.62 > s u m (p[,"",""]) [1] 0.124 > pab = s u m (p["",,""])/s u m (p[,,""]) # pab = P( | ) > pba = s u m (p["",,""])/s u m (p["",,]) # pba = P( | ) > pa = s u m (p["",,]) # pa = P() > pb = s u m (p[,,""]) # pb = P() > pab [1] 0.5294118 > pba 60. [1] 0.9 > pa [1] 0.2 > pb [1] 0.34 > pba*pa/pb [1] 0.5294118 > pab-pba*pa/pb [1] 0 > 61. X S e X(e) r X (Random Variable) S X(s) X (probability space) S R 62. S = {} 1/2 P() = 1/2 P() = 1/2 X X {} {1,0} )=1X() = 0 S={} X( 63. X2 X2 () S2={} X2 S2 {2, 1, 0} X2() = 2 X2() = 1 X2() = 1 X2() = 0 P[X2=2] P[X2=1] P[X2=0] 36 64. X ( ) Y ( ) X 11 Y 6 ; ; 1 () a. 1 0 b. {} c. { } 65. X({}) = 0 X({}) = 1 X({}) = 1 X({}) = 0 0.6 0.4 P({}) = P() * P() = 0.6*0.6 = 0.36 P({}) = P() * P() = 0.6*0.4 = 0.24 P({}) = P() * P() = 0.4*0.6 = 0.24 P({}) = P() * P() = 0.4*0.4 = 0.16 P(X=1) = P({, }) = P({}) + P({}) = 0.24 + 0.24 = 0.48 66. P(X=0) = P({, }) = P({}) + P({}) = 0.36 + 0.16 = 0.52 2 1. X S = = {w1,w2,......,wn} 2. 1/1000 1X(A) = |A| A S A = {w1, w5, w9} |A| A 67. 
A = {w1, w5, w9} |A| = 3 B = {w2, w8} |B| = 2 C = {} |C| = 0 D = S |D| = n 2P(X=3) = P({A | X(A) = 3}) = P({{w1, w2, w3}) + P({w1, w2, w4}) + ...... 1/1000 1/1000 1/1000 P(w1) = P(w2) = .... P(wn) = 1/1000 P(w1w3) = P(w1) * P(w3) = 1/1000 * 1/1000 P(w1w3) 68. P(X=3) P(X=k) n () 69. X X X X (Probabilistic Density Function) (Probabilistic Density Function, PDF) P P[X=x] x 70. S P[X=2] 1 S={, } X X() =1, X() = 0 P[X=1] = P({}) = 0.5 P[X=0] = P({}) = 0.5 2 71. S={, , , } X X() =2, X() = X() = 1, X() = 0 P[X=2] = P({}) = 0.25 P[X=1] = P({,}) = 0.5 P[X=0] = P({}) = 0.25 3 S={1,2,3,4,5,6} X X(1) =1, X(2) = 2, ... X(6) = 6 P[X=1] = P[X=2] = ... = P[X=6] = 1/6 4 S={1,2,3,4,5,6} 72. Y Y(1) =0, Y(2) = 1, Y(3) = 0, Y(4) = 1, Y(5) = 0, Y(6) = 1 P[Y=1] = P[Y=0]= 1/2 (Cumulative Distribution Function) (Cumulative Distribution Function, CDF) F(x) x 73. P[X=1] P(1) f(1)P[X=x] P(x) f(x) P(x) P[X=x] f(x) P[X=x] S X, Y, ... S R 1. X(s) 2. Y(s) 3. ... +, -, * X, Y X+Y, X-Y, X*Y 3X +, -, * 74. 3X Z=3X Z s X 3 Z(s) = 3 * X(s) X X(k)=k (k=1..6) 3X Z(k )=3*X(k)=3k Z = 3X 1. P[Z=3] = ?, (1/6) 2. P[Z=1] = ?, (0) 3. P[Z=18] = ?, (1/6) 4. P[Z=5] = ?, (0) X X()=1, X()=0 Z=3X 75. Z()=3Z()=0 X+Y X+Y Z=X+Y Z s X + Y Z(s) = X(s)+Y(s) X, Y X(k)=Y(k)=k (k=1..6) X+Y Z(k)=2k X Y X+Y X Y SX, SY 76. X+Y SX = {1 ,...., 6} , SY = {, } X+Y {1,...., 6} {, } 12 S S== { (1, ), (1,), (2, ), (2,), ....(6, ), (6,)}Z = X+Y Z X Y R Z(s) = Z(x, y) = X(x)+Y(y)P(Z=2) P(X+Y = 2) P({(1, ), (2,)}) P(Z=2) 2/12 = 1/6X Y X Y Z=X Y Z s X Y Z(s) = X(s) Y(s) 77. X Y X Y X Y = { (1, ), (1,), (2, ), (2,), ....(6, ), (6,)}S=Z = X Y Z X Y R P(Z=2) P(X Y = 2) P({(2, )}) P(Z=2) 1/12X^k Z s X(s) k 78. X 1 P(Z=4) P(Z=4) P(X=2) = P({2}) = 1/6 Z ({1,...., 6}) 1,4,9,16,25,36 X, Y, Z, ... S X s x 79. (Bernoulli trial) YES or NO (1 or 0) 80. 
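The probabilities worked out above for the sum, product and power of random variables (a die X and a coin Y) can be checked by enumerating the product sample space. This is a sketch only; it assumes the coin variable Y takes the value 1 for one face and 0 for the other, and the names omega, p.sum2, p.prod2 and p.sq4 are introduced here for illustration.

```r
# All 12 equally likely (die, coin) outcomes of the product sample space.
omega <- expand.grid(x = 1:6, y = 0:1)
p <- rep(1 / nrow(omega), nrow(omega))       # each outcome has probability 1/12

p.sum2  <- sum(p[omega$x + omega$y == 2])    # P(X + Y = 2)
p.prod2 <- sum(p[omega$x * omega$y == 2])    # P(X * Y = 2)

# For Z = X^2 only the die matters: P(Z = 4) = P(X = 2).
die <- 1:6
p.sq4 <- sum(rep(1 / 6, 6)[die^2 == 4])

print(c(p.sum2, p.prod2, p.sq4))
```

Enumerating the outcomes makes the rule explicit: the probability of an event for Z is the total probability of all sample points that Z maps to that value.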
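Example 1: a trial with two equally likely outcomes. Let the random variable X take the value 1 for one outcome and 0 for the other, so that P[X=1]=0.5 and P[X=0]=0.5.

In R, the sample function can simulate such trials. The call sample(0:1, 10, replace=T, prob=c(0.5,0.5)) draws 10 values from {0, 1}, each value having probability 0.5.

> sample(0:1, 10, replace=TRUE, prob=c(0.5,0.5))
[1] 1 0 1 1 0 1 0 1 0 1
> sample(0:1, 10, replace=T)
[1] 0 1 1 1 0 0 1 1 1 0

Because X maps the two outcomes to 0 and 1, 0:1 stands for the set of values {0, 1}. The replace argument must be replace=TRUE, since 10 draws are taken from only two values. When prob is omitted, every value is equally likely, so each has probability 0.5.

Example 2: a trial whose two outcomes are not equally likely. X maps one outcome to 1 and the other to 0, with probabilities 0.53 and 0.47, so P[X=1]=0.53 and P[X=0]=0.47.

> sample(0:1, 10, replace=T, prob=c(0.47, 0.53))
[1] 0 1 1 0 1 1 0 1 1 0

Binomial distribution

Carry out n Bernoulli trials with P(ti=1) = p and P(ti=0) = 1-p, giving the sequence {t1, t2, ..., tn}. If the n trials are independent, then

P(t1 t2 ... tn) = P(t1) P(t2) ... P(tn)

Let X be the number of trials in (t1 t2 ... tn) whose outcome is 1 (Yes). The probability that exactly k of the n trials give 1 is

P[X=k] = C(n,k) p^k (1-p)^(n-k)

(for example, 3 successes in 5 trials with p=0.5).

Example: the two outcomes have probabilities 0.53 and 0.47, and 3 independent trials are made. With X mapping the outcomes to 1 and 0, p = 0.53 and (1-p) = 0.47. The probability that the outcome mapped to 1 occurs at least once in the 3 trials is computed in R as follows.

> dbinom(1, 3, 0.53)+dbinom(2, 3, 0.53)+dbinom(3, 3, 0.53)
[1] 0.896177
> sum(dbinom(c(1,2,3), 3, 0.53))
[1] 0.896177
> x=c(1,2,3)
> x
[1] 1 2 3
> p=dbinom(x, 3, 0.53)
> p
[1] 0.351231 0.396069 0.148877
> sum(p)
[1] 0.896177

Plots of the binomial distribution for n=5 and several values of p:

> par(mfrow=c(2,2))
> x = 0:5
> b5 = dbinom(x, 5, 0.5)
> plot(x, b5, type="h")
> b3 = dbinom(x, 5, 0.3)
> plot(x, b3, type="h")
> b7 = dbinom(x, 5, 0.7)
> plot(x, b7, type="h")
> b1 = dbinom(x, 5, 0.1)
> plot(x, b1, type="h")

Geometric distribution

In R, dgeom(x, prob) gives the probability of x failures before the first success.

> dgeom(0, 0.47)
[1] 0.47
> dgeom(1, 0.47)
[1] 0.2491
> sum(dgeom(c(0,1,2), 0.47))
[1] 0.851123

Negative binomial distribution

The R documentation for dnbinom writes the negative binomial probability as

Γ(x+n)/(Γ(n) x!) p^n (1-p)^x

with n=r and x=k-r-1, where x is the number of failures before the n-th success. When n is an integer, Γ(n) = (n-1)!, so the coefficient Γ(x+n)/(Γ(n) x!) equals (x+n-1)!/((n-1)! x!).

> dnbinom(0, 3, 0.5)
[1] 0.125
> dnbinom(1, 3, 0.5)
[1] 0.1875
> dnbinom(0:10, 3, 0.5)
 [1] 0.125000000 0.187500000 0.187500000 0.156250000 0.117187500 0.082031250
 [7] 0.054687500 0.035156250 0.021972656 0.013427734 0.008056641
> n=3
> x=1
> p=0.5
> gamma(x+n)/(gamma(n)*prod(1:x)) * p^n * (1-p)^x
[1] 0.1875
> choose(x+n, n) * p^n * (1-p)^x
[1] 0.25
> choose(x+n-1, x) * p^n * (1-p)^x
[1] 0.1875

Note that choose(x+n, n) gives 0.25, which does not match dnbinom(1, 3, 0.5); the correct coefficient is choose(x+n-1, x).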
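The agreement between dnbinom and the formula above was checked for a single value of x. The sketch below repeats the check for x = 0..10 in one step. It uses factorial(x) rather than prod(1:x), because prod(1:0) evaluates to 0 and would give the wrong answer at x = 0. The names by.choose, by.gamma and by.dnbinom are illustrative.

```r
# Compare three ways of computing the negative binomial probabilities.
n <- 3        # required number of successes
p <- 0.5      # success probability of each trial
x <- 0:10     # number of failures before the n-th success

by.choose  <- choose(x + n - 1, x) * p^n * (1 - p)^x
by.gamma   <- gamma(x + n) / (gamma(n) * factorial(x)) * p^n * (1 - p)^x
by.dnbinom <- dnbinom(x, size = n, prob = p)

print(round(cbind(x, by.choose, by.gamma, by.dnbinom), 6))
```

All three columns should agree, which confirms that the coefficient is choose(x+n-1, x) and not choose(x+n, n).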
Example: the two outcomes have probabilities 0.53 and 0.47. With X mapping the outcomes to 1 and 0, take p=0.47 as the success probability and r=3. The probabilities of 3, 4 and 5 failures before the 3rd success are computed in R as follows.

> dnbinom(3, 3, 0.47)
[1] 0.1545686
> dnbinom(4, 3, 0.47)
[1] 0.122882
> p=dnbinom(c(3,4,5), 3, 0.47)
> p
[1] 0.15456857 0.12288201 0.09117845
> sum(p)
[1] 0.368629

Plots of the negative binomial distribution for size 5 and several values of p:

> par(mfrow=c(2,2))
> nb5 = dnbinom(x, 5, 0.5)
> plot(nb5, type="h")
> nb7 = dnbinom(x, 5, 0.7)
> plot(nb7, type="h")
> nb2 = dnbinom(x, 5, 0.2)
> plot(nb2, type="h")
> nb9 = dnbinom(x, 5, 0.9)
> plot(nb9, type="h")

Poisson distribution

Figure source: http://en.wikipedia.org/wiki/Poisson_distribution

The Poisson distribution is the limit of the binomial distribution as n becomes large.

Figure source: http://en.wikipedia.org/wiki/File:Binomial_versus_poisson.svg

Example:

1. Let X be a random variable on the sample space S = {w1, w2, ..., wn}.
2. Each element occurs with probability 1/1000.

Question 1: X(A) = |A|, where A is a subset of S and |A| is the number of elements in A. For example:

A = {w1, w5, w9}   |A| = 3
B = {w2, w8}       |B| = 2
C = {}             |C| = 0
D = S              |D| = n

Question 2: P(X=3) = P({A | X(A) = 3}) = P({w1, w2, w3}) + P({w1, w2, w4}) + ......

Each element has probability 1/1000, so P(w1) = P(w2) = .... = P(wn) = 1/1000, and for independent occurrences P(w1w3) = P(w1) * P(w3) = 1/1000 * 1/1000. The same reasoning gives P(X=3) and, in general, P(X=k) as n grows.

Example: suppose the average count in 1 cc is 10. What is the probability that 1 cc contains exactly 8? Here lambda = 10, and R gives:

> ?dpois
> dpois(8, 10)
[1] 0.112599
> 10^8*exp(-10)/prod(1:8)
[1] 0.112599

The second expression evaluates the Poisson formula lambda^x e^(-lambda)/x! directly with x=8.

> par(mfrow=c(2,2))
> x = 0:10
> p3 = dpois(x, lambda=3)
> plot(p3, type="h")
> p7 = dpois(x, lambda=7)
> plot(p7, type="h")
> p1 = dpois(x, lambda=1)
> plot(p1, type="h")
> p5 = dpois(x, lambda=5)
> plot(p5, type="h")

Uniform distribution

> dunif(0.5)
[1] 1
> dunif(0.9)
[1] 1
> dunif(2)
[1] 0
> dunif(-1)
[1] 0
> par(mfrow=c(2,2))
> x=0:10
> curve(dunif(x, min=0, max=1), from=-1, to=11)
> curve(dunif(x, min=0, max=10), from=-1, to=11)
> curve(dunif(x, min=3, max=6), from=-1, to=11)
> curve(dunif(x, min=2, max=9), from=-1, to=11)
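The dunif values shown above (1 inside the interval, 0 outside) are densities, not probabilities. As a hedged illustration, not taken from the slides, the sketch below uses the uniform distribution on (2, 9) from the last curve: the density is 1/(max-min) inside the interval, the probability of a sub-interval comes from punif, and the density integrates to 1. The variable names are illustrative.

```r
# Uniform distribution on (2, 9): density, interval probability, total area.
lo <- 2
hi <- 9

d <- dunif(5, min = lo, max = hi)                               # density at x = 5
prob <- punif(6, min = lo, max = hi) - punif(3, min = lo, max = hi)  # P(3 < X < 6)
area <- integrate(dunif, lower = lo, upper = hi, min = lo, max = hi)$value

print(c(density = d, prob = prob, area = area))
```

For a continuous variable P(X = x) is 0 for every x, so probabilities are always obtained from the cumulative distribution function (or by integrating the density).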
(Normal Distribution) > d n o r m (0) [1] 0.3989423 > d n o r m (0.5) [1] 0.3520653 > d n o r m (2.5) [1] 0.0175283 > p a r (mfrow=c (2,2)) > c u r v e (d n o r m (x, mean=0, sd=1), from=-10, to=10) > c u r v e (d n o r m (x, mean=0, sd=5), from=-10, to=10) > c u r v e (d n o r m (x, mean=5, sd=1), from=-10, to=10) > c u r v e (d n o r m (x, mean=-3, sd=3), from=-10, to=10) 101. R binom(n:size, p:prob)n:, p:, n x multinom(n:size,n:, p[1..n]:p(1..k):prob) nbinom(size, prob)x:, , p:, r geom(p:prob)p: , 102. hyper(N:m,n:n,r:k)m:, n:, k: , pois(lambda)k:,, s k R unif(a:min, b:max)a:, b: (Uniform) (Normal)norm(mean, sd)x1+x2+...+xk; k 103. gamma(shape,(Gamma)rate = 1, scale = 1/rate) Gamma exp(rate)(Exponential)() Wchisq(df, ncp)(Chi-Square)( ) cauchy(b:location,(Cauchy)a:scale)weibull(a:shape,(Weibull)b:scale)f(x) , 104. R(t) ,T (T)t(df, ncp)F (F)f(df1, df2, ncp) F beta(a:shape1,(Beta)b:shape2, ncp)lnorm(meanlog,(Log Normal)sdlog)logis(location, scale)Signranksignrank(n)wilcox(m, n)a,b 105. E(X) , (); 1.,; 106. 2.;3.; 1: E[c] = c; ; ; P(x) 2: E[c X] = c E[X]; 107. ; ; 3 : E[X + Y] = E[X] + E[Y] X, Y ; ; ; 108. Var(X) X Var(X) 1. Var(X) X 2. X ( X ) x 3. +, -, * > ; 109. ; ; ; . E[g(X)] X g(X) E[g(X)] X 110. 1, 2, 3 ; ; . X 1 X 2 X 3 ....k (Kth Ordinary Moment) X k (Kth ordinary moment) k 111. X (Moment Generating Function, m.g.f) (-h, h) 112. k (-h, h) P(X) X, Y 113. 1 P(X) P(Y) 1f(x) 0 () () 2 114. f(x) (Characteristic function) 3X , X+Y , X-2Y ,X*Y, X*X*X*X S X, Y +, - * 115. 116. 1.;2.; 2 () 117. 1.2.3. 1. X 2. Y ;;; 118. 1. X 2. Y E[H(X,Y)]1. 2. 119. 1. 2. 3. 4. (Covariance, ) Cov(X,Y) 120. X, Y E[X Y] = E[X] E[Y] (Correlation) 121. R > x = s a m p l e (1:10, 10) > x [1]18 105> c o r (x, x+1) [1] 1 > c o r (x, -x) [1] -1 > c o r (x, 0.5*x)379426 122. [1] 1 > c o r (x, 0.5*x+1) [1] 1 > c o r (x, -0.5*x+1) [1] -1 > y=s a m p l e (1:100, 10) > y [1]4 53 20 68 29 74 17 49 78 62> c o r (x,y) [1] -0.06586336 > X,Y X, Y 123. 1.;2.;3.; A , B A B C 124. 
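The cor() session above shows that a linear transformation of x has correlation +1 or -1 with x. The sketch below, using the x and y values printed in that session, also checks that cor(x, y) equals cov(x, y) divided by the product of the two standard deviations. The names r.manual and r.builtin are illustrative.

```r
# Data as printed in the session above.
x <- c(1, 8, 10, 5, 3, 7, 9, 4, 2, 6)
y <- c(4, 53, 20, 68, 29, 74, 17, 49, 78, 62)

r.manual  <- cov(x, y) / (sd(x) * sd(y))   # definition of the correlation
r.builtin <- cor(x, y)

print(c(r.manual, r.builtin))
print(cor(x, 0.5 * x + 1))    # increasing linear function of x
print(cor(x, -0.5 * x + 1))   # decreasing linear function of x
```

Dividing by the standard deviations removes the units of x and y, which is why the correlation is unchanged by shifting or rescaling either variable (apart from a sign flip for a negative scale).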
S SS k S SS...S X, Y 125. 126. () () x1, x2, ...., xn n => () X1, X2, .... , Xn n => x1, x2, ...., xn n R sample(1:100, 10) 1 100 10 > x = s a m p l e (1:100, 10) > x [1] 12 17 50 33 98 77 39 797 26sample sample(x, size, replace = FALSE, prob = NULL) replace FALSE 127. R R binom(n:size,n:, p:,p:prob)n x pois(lambda)unif(a:min,a:, b: (Uniform)b:max)norm(mean,x1+x2+...+xk; k (Normal)sd)exp(rate)((Exponential)) W r 128. > r b i n o m (20, 5, 0.5) [1] 4 3 3 4 2 4 3 1 2 3 4 3 2 2 2 4 2 3 1 1 > r p o i s (20, 3.5) [1] 2 1 4 2 1 6 3 6 1 3 3 6 6 0 4 2 6 4 6 2 > r u n i f (20, min = 3, max = 8) [1] 3.933526 3.201883 7.592147 5.207603 4.897806 3.848298 4.521461 4.437873 [9] 3.655640 5.633540 6.557995 5.430671 6.502675 5.637283 7.713699 5.841052 [17] 6.859493 5.987991 3.752924 7.480678 > r n o r m (20, mean = 5.0, sd = 2.0) [1] 6.150209 4.743013 3.328734 5.096294 4.922795 6.272768 4.862825 8.036376 [9] 4.198432 5.467984 2.046450 6.452511 2.088256 5.349187 3.074408 3.628072 [17] 3.421388 7.242598 3.125895 9.865341 > r e x p (20, rate=2.0) [1] 0.17667426 0.49729383 0.12786107 0.13983412 0.44683515 1.30482842 [7] 0.28512544 1.61472266 0.23220649 0.39089780 0.05947224 1.42892610 [13] 0.02555552 0.69409186 0.68228242 0.22542362 0.33590791 0.14684937 [19] 0.34995146 0.80595369 100,000 129. hist() > x = r b i n o m (100000, 5, 0.5) > h i s t (x) 130. rbinom(100000, 5, 0.5) > y = r p o i s (100000, 3.5) 131. > h i s t (y)rpois(100000, 3.5) 132. > z = r u n i f (100000, min=3, max=8) > h i s t (z) 133. runif(100000, min=3, max=8) > w = r n o r m (100000, mean=5.0, sd=2.0) > h i s t (w) 134. rnorm(100000, mean=5.0, sd=2.0) > v = r e x p (100000, rate=2.0) 135. > h i s t (v)rexp(100000, rate=2.0) 136. MeanMedianSample VarianceSample Standard / S Deviation Range 137. Outlier WildInterQuartile Range, IQR 3 1 n-1 n ( n-1 ) (7 4 6 8 9 4 5 6 2 8) 1. (Mean) 2. (Sample Variance) 3. (Sample Standard Deviation) 4. (Median) 5. (Range) 6. (q1) 7. (q3) 138. 8. (iqr) 1. 
Mean: mean(x) = (7+4+6+8+9+4+5+6+2+8)/10 = 5.9
2. Sample Variance: s^2 = sum((xi - mean(x))^2)/(n-1) = 42.9/9, about 4.77
3. Sample Standard Deviation: s = sqrt(s^2), about 2.18
4. Median: the sorted sample is (2 4 4 5 6 6 7 8 8 9), so M = (6+6)/2 = 6
5. Range: range(x) = 9-2 = 7
6. First quartile (q1): number the sorted sample from 0 to 9.

   index  0 1 2 3 4 5 6 7 8 9
   value  2 4 4 5 6 6 7 8 8 9

   The position of q1 is 0.25 * 9 = 2.25, which lies between index 2 (value 4) and index 3 (value 5), so q1 = 4+0.25 * (5-4) = 4.25
7. Third quartile (q3): the position of q3 is 0.75 * 9 = 6.75, which lies between index 6 (value 7) and index 7 (value 8), so q3 = 7+0.75 * (8-7) = 7.75
8. Interquartile range (iqr): iqr(x) = q3-q1 = 7.75-4.25 = 3.5

The same statistics in R:

> x = sample(1:100, 10)
> x
 [1] 12 17 50 33 98 77 39 79  7 26
> mean(x)
[1] 43.8
> median(x)
[1] 36
> var(x)
[1] 984.1778
> sd(x)
[1] 31.37161
> range(x)
[1]  7 98
> max(x)
[1] 98
> min(x)
[1] 7
> max(x)-min(x)
[1] 91
> q1 = quantile(x, 0.25)
> q1
  25%
19.25
> q3 = quantile(x, 0.75)
> q3
  75%
70.25
> q3-q1
75%
 51
> iqr(x)
Error: could not find function "iqr"
> IQR(x)
[1] 51
> fivenum(x)
[1]  7 17 36 77 98
> summary(x)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   7.00   19.25   36.00   43.80   70.25   98.00

Exercise: use R to compute the following statistics for the sample (8.9 , 4.5 , 3.7 , 10.0 , 11.5 , 8.9 , 5.6 , 15.4 , 16.6 , 1.0).

1. Mean
2. Sample Variance
3. Sample Standard Deviation
4. Median
5. Range
6. First quartile (q1)
7. Third quartile (q3)
8. Interquartile range (iqr)

> x = c(8.9, 4.5, 3.7, 10.0, 11.5, 8.9, 5.6, 15.4, 16.6, 1.0)
> var(x)
[1] 25.37433
> sd(x)
[1] 5.037294
> median(x)
[1] 8.9
> max(x)-min(x)
[1] 15.6
> q1 = quantile(0.25, x)
Error in quantile.default(0.25, x) : 'probs' outside [0,1]
> q1 = quantile(x, 0.25)
> q1
  25%
4.775
> q3 = quantile(x, 0.75)
> q3
   75%
11.125
> q3-q1
 75%
6.35
> IQR(x)
[1] 6.35
>

Statistical plots in R:

Plot                                    R function
Histogram                               hist(x)
Stem-and-Leaf Diagram                   stem(x)
Boxplots                                boxplot(x)
(Relative Cumulative) Frequency Ogive   plot(ecdf(x))

A box plot is built from q1 and q3, the inner fences f1 and f3, the adjacent values a1 and a3, and the outer fences F1 and F3:

f1 = q1 - 1.5 iqr; f3 = q3 + 1.5 iqr;
F1 = q1 - 3.0 iqr; F3 = q3 + 3.0 iqr;

a1 is the data point closest to f1 without lying below it; a3 is the data point closest to f3 without lying above it.
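The hand calculation and the fence formulas above can be checked in R on the sample (7 4 6 8 9 4 5 6 2 8). This is a sketch under one assumption worth stating: quantile() is used with its default interpolation, which matches the hand calculation, whereas boxplot() itself uses hinges from fivenum() and may draw slightly different quartiles. The variable names mirror the symbols in the text.

```r
x <- c(7, 4, 6, 8, 9, 4, 5, 6, 2, 8)

q1  <- unname(quantile(x, 0.25))   # first quartile
q3  <- unname(quantile(x, 0.75))   # third quartile
iqr <- q3 - q1                     # interquartile range

f1 <- q1 - 1.5 * iqr               # inner fences
f3 <- q3 + 1.5 * iqr
F1 <- q1 - 3.0 * iqr               # outer fences
F3 <- q3 + 3.0 * iqr

a1 <- min(x[x >= f1])              # adjacent values: extreme points inside the inner fences
a3 <- max(x[x <= f3])
outliers <- x[x < f1 | x > f3]     # points beyond the inner fences

print(c(mean = mean(x), var = var(x), sd = sd(x), median = median(x)))
print(c(q1 = q1, q3 = q3, iqr = iqr, f1 = f1, f3 = f3, a1 = a1, a3 = a3))
```

For this sample every point lies inside the inner fences, so the adjacent values are simply the minimum and maximum and no point is flagged as an outlier.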
R : > x = r n o r m (100) > x [1]0.389381081 -0.274522826[6]0.736573742[11] -1.3565903511.492670583 -1.5632286090.7664051080.297407135 -1.324130406 -1.3765982311.6617271751.309122339 -1.1938210850.365801091 -0.952034088[16] -0.277610568 -0.599980091 -0.124105876 -1.1077131620.560637570[21]0.7144491380.1119690570.505171739 -2.4182975990.318797182[26]2.7166465160.3452894220.0194346151.0877589510.033917165[31] -0.356786424 -1.2848090661.5804113270.552931291 -0.615928762[36] -0.087069820 -0.814632197 -0.570882510 -0.107731447 -1.453838416 [41] -0.2571152091.1668661201.072692716 -0.0225948520.441221144[46]1.053900960 -1.025193547 -1.1192005870.2646682031.409504515[51]0.241644132 -0.9554078000.4462973810.2318876490.769308731[56]0.2696245790.822638573 -0.904380789 -0.4295274040.496109294[61] -2.050582772 -0.586973281 -1.1927534031.158321933 -0.151319360[66]0.558858868 -0.656174351 -2.8589644030.366785049[71]0.369315063 -0.953560954 -0.762608370 -1.017449547 -0.1277385620.896958092 147. [76] -1.922030980 -0.8398979301.332972530 -0.0011511040.104336360[81] -0.2089078131.4013357980.019330593 -0.6875592890.445371885[86]0.5045326892.168626000 -1.742886230[91]1.6760595941.132849957 -1.047073217 -0.912548540 -2.235854777[96] -1.1941041280.121106118 -1.1784152240.831058071 0.214196778> s t e m (x) The decimal point is at the | -2 | 9421 -1 | 97654433222211000000 -0 | 998887766664433322111100 0 | 00011122233333344444445556667788889 1 | 11112233445677 2 | 027 > h i s t (x, main="Frequency Histogram of x") > h i s t (x, main="Probability Histogram of x", freq=F) > Fx = e c d f (x)2.011604088 0.280714044 148. > p l o t (x) > p l o t (Fx) > b o x p l o t (x) 149. 150. (covariance) (correlation) X, Y -1.0 1.0 R covariancecov(x,y)correlation / cor(x,y) R cov() cor() r u n i f (10, 1, 5) > x [1] 1.375135 1.863417 2.403693 2.639902 1.694610 4.419406 4.032262 2.147783 [9] 1.501733 1.497732 > c o v (x, x) [1] 1.144697 > c o v (x, x+1) [1] 1.144697 151. 
> c o r (x, x) [1] 1 > c o r (x, x+1) [1] 1 > c o v (x, -x) [1] -1.144697 > c o r (x, -x) [1] -1 > c o r (x, 0.5*x) [1] 1 > y = r u n i f (10, 1, 5) > y [1] 1.114662 2.358270 2.089179 4.581484 4.170922 2.630044 1.450336 1.320637 [9] 1.705649 3.506064 > c o r (x, y) [1] -0.04560485 > c o r (y, y) [1] 1 > 152. R ; 153. (Weak law) : (Strong law) : 1 154. 2 () 1/4 3 () 1/9 4 () 1/16 k () 1/k2 40 50 10 30 10 n 0 1 20 ( ) 155. n X n n Z n ( 20 ) n 156. $frac{x_1+x_2+...+x_n}{n}=bar{x}$ 1. 2. 3. 4. 5. 6. > p n o r m (1)-p n o r m (-1) [1] 0.6826895 > p n o r m (2)-p n o r m (-2) [1] 0.9544997 > p n o r m (3)-p n o r m (-3) 157. [1] 0.9973002 > p n o r m (4)-p n o r m (-4) [1] 0.9999367 > p n o r m (5)-p n o r m (-5) [1] 0.9999994 > p n o r m (6)-p n o r m (-6) [1] 1 > o p t i o n s (digits=10) > p n o r m (6)-p n o r m (-6) [1] 0.999999998 99.9999998% n R t.test R R R > u y h i s t (u[,1]) > h i s t (y) > ?apply > 1. u 50 uniform 50*10000 2. y u apply ( u, 2, mean ) mean(col of u) 3. y Uniform Distribution 50 4. hist(y) 159. hist(u[,1]) 160. hist(y) CLT = function(x) { op L2=p n o r m (2, mean=0, sd=1) > L1=p n o r m (-2, mean=0, sd=1) > L1 [1] 0.02275013 > L2 [1] 0.9772499 > L2-L1 [1] 0.9544997 > 1.0-(L2-L1) [1] 0.04550026 Z (-2, 2) 0.9544997 0.04550026 mean sd 176. R N(mean=5, sd=3) (3, 6) > L2=p n o r m (6, mean=5, sd=3) > L1=p n o r m (3, mean=5, sd=3) > L1 [1] 0.2524925 > L2 [1] 0.6305587 > L2-L1 [1] 0.3780661 (3, 6) 0.3780661 N(mean=5, sd=3) 10 > x=r n o r m (10, mean=5, sd=3) > x [1]6.3871687.2920184.6802022.225559 11.2082457.0401072.739477 177. [8]2.3161054.4826584.913032> 3 x 3 s u m (3 s u m (3 x=r n o r m (100, mean=5, sd=3) > s u m (3 s u m (3 x=r n o r m (100000, mean=5, sd=3) > s u m (3 s u m (3 L1=q n o r m (0.01, mean=5, sd=3) > L1 [1] -1.979044 > L2=q n o r m (0.99, mean=5, sd=3) > L2 [1] 11.97904 > P1=p n o r m (L1, mean=5, sd=3) > P1 [1] 0.01 179. 
> P2=p n o r m (L2, mean=5, sd=3) > P2 [1] 0.99 > P2 (100-98)% = 2% qnorm(0.01, mean=5, sd=3) L1 L2=qnorm(0.99, mean=5, sd=3) L2 98% (L1=-1.979044, L2=11.97904) P1=pnorm(L1, mean=5, sd=3) L1 0.01 ( 1%) L2 0.99 (99%) P2 - P1 = 0.99-0.01 = 0.98 98% R > x = r n o r m (100000, mean=5, sd=3) > p = s u m (L1 p [1] 0.98012 0.9812 180. 0.98 N(mean=5, sd=3) mean=5 5 () mean R > x = r n o r m (25, mean=mu, sd=2) > x [1] 10.8299237.7863206.9750806.9803638.9995097.3434105.9280519.15897.0420438.434972 10.5301587.2584138.9905318.4844759.10447.0119666.40576211 [9] 10.116548 62 [17] 141.2235684.449411 11.4654737.382751 10.305355 10.2018 181. [25] 11.802796 > m e a n (x) [1] 8.168483 > s d (x) [1] 2.332146 > sd=2 mean=mu ( mu ) mu 25 mean(x) = 8.168483 sd(x) = 2.332145 mean ( mu) () mean(x) 8.168483 mean = mu 8.0 8.168483 ( ) mu 182. mean.range = function(x, alpha=0.05, sd) { n = l e n g t h (x) # n = mx = m e a n (x) # mx mu r1 = q n o r m (alpha/2) # alpha/2 r2 = q n o r m (1-alpha/2) # alpha/2 L1 = mx-r2*sd/s q r t (n) # L2 = mx-r1*sd/s q r t (n) # range = c (L1, mx, L2) # 183. } > mean.range = function(x, alpha=0.05, sd) { +n = l e n g t h (x) # n = +mx = m e a n (x) # mx mu +r1 = q n o r m (alpha/2) # alpha/2+r2 = q n o r m (1-alpha/2) # alpha/2+L1 = mx-r2*sd/s q r t (n) # +L2 = mx-r1*sd/s q r t (n) # +range = c (L1, mx, L2) # + } > m e a n . r a n g e (x, sd=2) > R = m e a n . r a n g e (x, sd=2) > R [1] 7.384497 8.168483 8.952468 x 95% ( alpha=0.05, 1-alpha=0.95) (7.384497, 8.952468) mean(x) 8.168483 184. mu sd ( R )T sd T mean T T S () T 185. T T (+1) T > c u r v e (dnorm, from=-3, to=3, col="black") > c u r v e (d t (x, df=25), from=-3, to=3, add=T, ylab="T25", col="blue") > c u r v e (d t (x, df=10), from=-3, to=3, add=T, ylab="T10", col="red") > c u r v e (d t (x, df=3), from=-3, to=3, add=T, ylab="T3", col="green") > 186. T (=3, 10, 25 4, 11, 26 ) T sd > t . 
test(x)

        One Sample t-test

data:  x
t = 17.5128, df = 24, p-value = 3.562e-15
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
 7.205820 9.131145
sample estimates:
mean of x
 8.168483

With the T distribution, the 95% confidence interval is (7.205820, 9.131145). Because t.test is a test function, it also reports a hypothesis test. The alternative hypothesis is mu≠0 (true mean is not equal to 0), and the p-value for mu=0 is 3.562e-15, which is extremely small. This is expected: the sample came from N(mean=8, sd=2), so a mean of 0 is practically impossible. Setting mu to 8 instead gives:

> t.test(x, mu=8)

        One Sample t-test

data:  x
t = 0.3612, df = 24, p-value = 0.7211
alternative hypothesis: true mean is not equal to 8
95 percent confidence interval:
 7.205820 9.131145
sample estimates:
mean of x
 8.168483

With mu=8 the p-value = 0.7211 is far larger than the p-value = 3.562e-15 obtained with mu=0, so mu=8 is a much more plausible hypothesis. The value df = 24 is the number of degrees of freedom of the T distribution (25-1=24). The following code finds the central interval of the t distribution with df=24 that contains probability 0.7211:

> half = (1-0.7211)/2
> df1 = 24
> from = qt(half, df=df1)
> to = qt(1-half, df=df1)
> pt(to, df=df1)-pt(from, df=df1)
[1] 0.7211

Hypotheses H1 and H0

A test involves two hypotheses. H1 is the research hypothesis, also called the alternative hypothesis. H0 is the null hypothesis. H1 and H0 are complementary (H1 = not H0). In the example above, testing mu = 0 means

H0: mu=0
H1: mu≠0

Two kinds of error are possible. A Type I error rejects H0 although H0 is true. A Type II error accepts H0 although H1 is true. For H0 : mu=0 and H1 : mu≠0, a Type I error concludes mu≠0 when in fact mu=0, and a Type II error concludes mu=0 when in fact mu≠0.

Confidence interval and test: mu=8 lies inside the 95% confidence interval (7.205820 9.131145), so the hypothesis mu=8 is not rejected. mu=0 lies outside the 95% confidence interval (7.205820 9.131145), so mu=0 is rejected and we conclude mu≠0.

P value (p-value): for mu=0 the p-value is 3.562e-15, so H0:mu=0 is rejected in favour of H1:mu≠0. For mu=8 the p-value is 0.7211, so mu=8 cannot be rejected.

Plotting a test in R

The visualizationTools package can draw the result of a test. After install.packages("visualizationTools"), call plot(t.test(x, mu=5)).

> x = rnorm(25, mean=5.5, sd=2)
> t.test(x, mu=5)

        One Sample t-test

data:  x
t = 0.8254, df = 24, p-value = 0.4173
alternative hypothesis: true mean is not equal to 5
95 percent confidence interval:
 4.508086 6.147536
sample estimates:
mean of x
 5.327811

> install.packages("visualizationTools")
--- Please select a CRAN mirror for use in this session ---
trying URL 'http://cran.csie.ntu.edu.tw/bin/windows/contrib/3.0/visualizationTools_0.2.05.zip'
Content type 'application/zip' length 102524 bytes (100 Kb)
opened URL
downloaded 100 Kb

package 'visualizationTools' successfully unpacked and MD5 sums checked

The downloaded binary packages are in
        C:\Users\ccc\AppData\Local\Temp\RtmpyCqQmq\downloaded_packages
> library("visualizationTools")
> plot(t.test(x, mu=5))
>

[Figure: plot(t.test(x, mu=5))]

Exercises

Exercise 1: use the T distribution with n-1 degrees of freedom on the following sample x.

x = c(46.26534, 49.30766, 53.79364, 53.18000, 48.97584, 51.92664, 44.58280, 62.26655, 54.52493, 55.08502, 56.78329, 45.00972, 46.99871, 43.88388, 52.63184, 53.15600, 48.39374, 51.07595, 47.36923, 52.09186, 46.54074, 54.46617, 47.87038, 42.94228, 48.69307)

1. Find the 95% confidence interval for mu.
2. Find the 98% confidence interval for mu.
3. Test the hypothesis mu=50:
   a. What is H0?
   b. What is H1?
   c. What is the p-value?
   d. Can the hypothesis mu=50 be accepted?
   e. Can the hypothesis mu≠50 be accepted?

Exercise 2: generate the sample x as follows.

mu = runif(1, 0, 10)
sd1 = runif(1, 1, 2)
x = rnorm(25, mean=mu, sd=sd1)
x

1. Find the 95% confidence interval for mu.
2. Find the 98% confidence interval for mu.
3. Test the hypothesis mu=5:
   a. What is H0?
   b. What is H1?
   c. What is the p-value?
   d. Can the hypothesis mu=5 be accepted?
   e. Can the hypothesis mu≠5 be accepted?

References

Probability and Statistics (using R) -- http://ccckmit.wikidot.com/st:main
(Milton 4/e), Milton, 2008/4, ISBN 9789861574080

Hypothesis tests in R

Each test states H0 and H1 for one parameter: the mean mu, the proportion p, or the median M.

Test for the mean (mu) -- T test
Test for the proportion (p) -- Z test
Test for the median (M) -- Wilcoxon Signed-Rank test

Test for the mean

> x = rnorm(25, mean=5, sd=2)
> x
 [1] 6.6148290 8.4660415 4.7084610 8.0959357 5.0618158 3.6971976 7.7887572
 [8] 5.2229378 4.7763453 4.3595627 4.7674163 2.8655986 4.5051726 1.2974370
[15] 6.9794643 0.4042951 8.0391053 6.7884780 6.5557084 3.7146943 0.3457576
[22] 7.4302876 6.7216046 9.1046976 7.0879767
> sd(x)
[1] 2.430731
> mean(x)
[1] 5.415983
> t.test(x, alternative="greater", mu=4.8)

        One Sample t-test

data:  x
t = 1.2671, df = 24, p-value = 0.1086
alternative hypothesis: true mean is greater than 4.8
95 percent confidence interval:
 4.584244      Inf
sample estimates:
mean of x
 5.415983

> t.test(x, alternative="less", mu=4.8)

        One Sample t-test

data:  x
t = 1.2671, df = 24, p-value = 0.8914
alternative hypothesis: true mean is less than 4.8
95 percent confidence interval:
     -Inf 6.247722
sample estimates:
mean of x
 5.415983

> t.test(x, alternative="two.sided", mu=4.8)

        One Sample t-test

data:  x
t = 1.2671, df = 24, p-value = 0.2173
alternative hypothesis: true mean is not equal to 4.8
95 percent confidence interval:
 4.412627 6.419339
sample estimates:
mean of x
 5.415983

For t.test(x, alternative="greater", mu=4.8), t = 1.2671, df = 24, p-value = 0.1086. The p-value=0.1086 is the upper tail probability:

> 1-pt(1.2671, df=24)
[1] 0.1086388

For t.test(x, alternative="less", mu=4.8), t = 1.2671, df = 24, p-value = 0.8914. The p-value=0.8914 is the lower tail probability:

> pt(1.2671, df=24)
[1] 0.8913612

For t.test(x, alternative="two.sided", mu=4.8), t = 1.2671, df = 24, p-value = 0.2173. The p-value=0.2173 is the probability of both tails:

> 1-(pt(1.2671, df=24)-pt(-1.2671, df=24))
[1] 0.2172776
>

Test for the proportion p

> prop.test(25, 100, correct=T, p=0.25)

        1-sample proportions test without continuity correction

data:  25 out of 100, null probability 0.25
X-squared = 0, df = 1, p-value = 1
alternative hypothesis: true p is not equal to 0.25
95 percent confidence interval:
 0.1754521 0.3430446
sample estimates:
   p
0.25

> prop.test(25, 100, correct=T, p=0.01)

        1-sample proportions test with continuity correction

data:  25 out of 100, null probability 0.01
X-squared = 557.8283, df = 1, p-value < 2.2e-16
alternative hypothesis: true p is not equal to 0.01
95 percent confidence interval:
 0.1711755 0.3483841
sample estimates:
   p
0.25

Warning message:
In prop.test(25, 100, correct = T, p = 0.01) :
  Chi-squared approximation may be incorrect
> prop.test(25, 100, correct=T, p=0.2)

        1-sample proportions test with continuity correction

data:  25 out of 100, null probability 0.2
X-squared = 1.2656, df = 1, p-value = 0.2606
alternative hypothesis: true p is not equal to 0.2
95 percent confidence interval:
 0.1711755 0.3483841
sample estimates:
   p
0.25

Test for the median M

> wilcox.
t e s t (x, mu=4.8) Wilcoxon signed rank test data:xV = 207, p-value = 0.2411 alternative hypothesis: true location is not equal to 4.8 () T (pooled T test) -- T T T () T T 205. S T 206. R T > x=r n o r m (25, mean=3.0, sd=2) > y=r n o r m (25, mean=3.2, sd=2) > x [1]5.12770813 -0.692018413.113595321.937150937.768801723.54159714[7]1.471593314.275559753.484212322.251914423.467429887.85327689[13]3.524936675.410721904.396684690.29868134 -0.195210051.30992501[19]2.554715683.892143936.01076126 -0.02217834[25]4.158521901.036814575.68719430> y [1]4.05655813.96179626.35133764.99982174.44192586.3198375[7] -1.04836225.18098457.54353072.60480845.67646632.66871812.7981462 -0.35643320.86371994.20323714.58797453.14287645.95774572.33345313.26621931.6285190[13][19] -0.3657162 [25]4.04002082.2731483> t . t e s t (x, y, var.equal=T) ## () T (pooled T test) -- T T 207. ## T () Two Sample t-test data:x and yt = -0.3409, df = 48, p-value = 0.7346 alternative hypothesis: true difference in means is not equal to 0 95 percent confidence interval: -1.5080211.070757sample estimates: mean of x mean of y 3.2665813.485213> t . t e s t (x,y, pair=T)## () T (Paired T Test)## (1) 2 (normally distributed) ##(2) (mutually independently) Paired t-testdata:x and y 208. t = -0.3438, df = 24, p-value = 0.734 alternative hypothesis: true difference in means is not equal to 0 95 percent confidence interval: -1.5311341.093870sample estimates: mean of the differences -0.218632 H0 H1 209. F X, Y SX2 X,Y F R var.test() F var.test() > v a r . t e s t (x,y) F test to compare two variances data:x and yF = 1.0973, num df = 24, denom df = 24, p-value = 0.8219 210. alternative hypothesis: true ratio of variances is not equal to 1 95 percent confidence interval: 0.4835609 2.4901548 sample estimates: ratio of variances 1.097334 H0 H1> x=c (100, 200) > y=c (300, 400) > p r o p . t e s t (x,y) 2-sample test for equality of proportions with continuity 211. 
correction

data:  x out of y
X-squared = 18.7698, df = 1, p-value = 1.475e-05
alternative hypothesis: two.sided
95 percent confidence interval:
 -0.24201562 -0.09131771
sample estimates:
   prop 1    prop 2
0.3333333 0.5000000

Wilcoxon Rank-Sum test (two samples X, Y; tests H0 against H1)

> x = rnorm(20, mean=5, sd=2)
> y = rnorm(20, mean=5.5, sd=2)
> x
 [1] 3.962665 4.592900 2.708658 4.302144 9.140617 6.579571 4.711547 4.842238
 [9] 5.634979 8.826325 7.492737 5.349967 6.028533 5.326150 3.280819 2.589442
[17] 6.391175 3.299716 5.681381 3.188571
> y
 [1] 7.537479 5.810962 7.340678 4.048306 6.179672 5.152021 6.780724 3.354434
 [9] 6.484613 8.752706 4.116139 4.939286 4.074703 2.954187 4.489012 5.697258
[17] 5.260137 6.299990 8.188696 5.743851
> wilcox.test(x, y, exact=F, correct=F)

        Wilcoxon rank sum test

data:  x and y
W = 162, p-value = 0.304
alternative hypothesis: true location shift is not equal to 0

Wilcoxon Signed-Rank test (paired samples (X, Y), pairs numbered 1..n)

With $R_i$ the signed rank of the i-th difference, the statistics are $W_+ = \sum_{R_i > 0} R_i$ and $W_- = \sum_{R_i < 0} |R_i|$.

> wilcox.test(x, y, exact=F, correct=F, paired=T)

        Wilcoxon signed rank test

data:  x and y
V = 83, p-value = 0.4115
alternative hypothesis: true location shift is not equal to 0

References
R -- http://ccckmit.wikidot.com/r:main
(with R) -- http://ccckmit.wikidot.com/st:main

Analysis of Variance (ANOVA)

Review: the two-sample T test (pooled T test) in R

> x = rnorm(20, 5, 1)
> x
 [1] 6.240855 4.229226 5.349843 6.023241 5.613232 5.300235 4.696128 5.452365
 [9] 4.567735 5.260747 3.800322 6.168725 6.196059 4.969572 6.251078 3.549983
[17] 6.432844 5.308146 4.978811 4.944134
> y = rnorm(20, 5, 1)
> y
 [1] 5.969639 5.400434 4.231995 4.804537 3.098015 5.481365 6.016810 2.769489
 [9] 6.687201 4.240217 6.602660 4.777928 4.825274 4.110038 5.651073 5.829578
[17] 4.651262 6.036818 3.459562 5.993473
> t.test(x, y, var.equal=TRUE)

        Two Sample t-test

data:  x and y
t = 0.7519, df = 38, p-value = 0.4567
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.3973599  0.8669515
sample estimates:
mean of x mean of y
 5.266664  5.031868

> z = rnorm(20, 4, 1)
> t.test(x, z, var.equal=TRUE)

        Two Sample t-test

data:  x and z
t = 5.9399, df = 38, p-value = 6.883e-07
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 1.079955 2.196671
sample estimates:
mean of x mean of y
 5.266664  3.628351

Both x and y were drawn with rnorm(20, 5, 1) (mean 5, standard deviation 1). For t.test(x, y, var.equal=TRUE) the p-value = 0.4567 is larger than 1-95% = 0.05 and the confidence interval (-0.3973599, 0.8669515) contains 0, so H0 (equal means) is not rejected.

In t.test(x, z, var.equal=TRUE), z was drawn with rnorm(20, 4, 1) (mean 4, standard deviation 1). The p-value = 6.883e-07 is far below 0.05 and the confidence interval (1.079955, 2.196671) does not contain 0, so H0 is rejected.

Analysis of Variance (ANOVA)

With k groups, pairwise T tests would have to be run on every pair (1, 2), (1, 3), ... (1, k), (2, 3), (2, 4), ... (2, k), .... (k-1, k). ANOVA instead tests a single hypothesis H0: all group means are equal.

ANOVA in R

> X = rnorm(40, 5, 1)   # 40 samples (mean 5, standard deviation 1)
> X
 [1] 5.584603 4.052913 4.434469 5.844309 5.286695 5.188169 4.796683 3.913132
 [9] 5.467150 5.740397 4.528423 4.395270 4.994147 4.014513 6.259213 6.898331
[17] 3.792135 3.879688 5.334643 5.887895 5.647250 5.603816 5.465186 6.703394
[25] 5.153999 4.855386 2.129850 5.477026 4.785934 4.138114 5.726216 3.581281
[33] 5.255695 4.515353 6.391714 3.726963 5.744328 5.314164 4.647955 4.848313
> A = factor(rep(1:4, each=10))   # group labels 1, 2, 3, 4, each repeated 10 times
> A
 [1] 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4
[37] 4 4 4 4
> XA = data.frame(X, A)            # XA: X together with its group label 1, 2, 3, 4
> aov.XA = aov(X~A, data=XA)       # analysis of variance of X by A
> summary(aov.XA)
            Df Sum Sq Mean Sq F value Pr(>F)
A            3  5.015  1.6718   2.119  0.115
Residuals   36 28.408  0.7891
> plot(XA$X~XA$A)                  # plot X~A

X = rnorm(40, 5, 1) draws the samples, A = factor(rep(1:4, each=10)) labels them as groups 1, 2, 3, 4 with 10 samples each, and XA = data.frame(X, A) combines them into a data frame. After `aov.XA = aov(X~A, data=XA)', summary(aov.XA) reports Pr(>F) = 0.115, which is larger than (1-95%) = 0.05, so H0 is not rejected. plot(XA$X~XA$A) draws box plots of X for each level of A.

Next, 10 samples drawn with rnorm(10, 4, 1) are appended to X: the first four groups have mean 5 and the new group has mean 4 (group 5).

> Y = c(X, rnorm(10, 4, 1))   # append 10 samples with mean 4 to X, giving Y
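> Y
 [1] 5.584603 4.052913 4.434469 5.844309 5.286695 5.188169 4.796683 3.913132
 [9] 5.467150 5.740397 4.528423 4.395270 4.994147 4.014513 6.259213 6.898331
[17] 3.792135 3.879688 5.334643 5.887895 5.647250 5.603816 5.465186 6.703394
[25] 5.153999 4.855386 2.129850 5.477026 4.785934 4.138114 5.726216 3.581281
[33] 5.255695 4.515353 6.391714 3.726963 5.744328 5.314164 4.647955 4.848313
[41] 3.516310 4.174873 2.541251 2.851404 4.862902 2.739729 2.848565 3.169462
[49] 4.245488 3.543660
> B = c(A, rep(5, 10))   # append ten 5s to A, giving B (label for the 10 new samples)
> B
 [1] 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4
[37] 4 4 4 4 5 5 5 5 5 5 5 5 5 5
> YB = data.frame(Y, B)            # YB: like XA, with the 10 new samples added
> aov.YB = aov(Y~B, data=YB)       # analysis of variance of Y by B
> summary(aov.YB)
            Df Sum Sq Mean Sq F value  Pr(>F)
B            1  10.15  10.152    9.84 0.00292 **
Residuals   48  49.52   1.032
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> plot(YB$Y~YB$B)                  # plot Y~B

For aov.YB = aov(Y~B, data=YB), Pr(>F) = 0.00292 is smaller than 0.05, so H0 is rejected: the means of Y differ across B. Note that c(A, rep(5, 10)) drops the factor class, so B is a numeric vector and aov treats it as a single numeric predictor (Df = 1); wrapping it as factor(B) gives a five-group one-way ANOVA with Df = 4.

The model behind one-way ANOVA

With k groups and observation j of group i written $X_{ij}$, the model is $X_{ij} = \mu + \alpha_i + \varepsilon_{ij}$, where $\alpha_i$ is the effect of group i. The hypothesis is $H_0: \alpha_1 = \alpha_2 = \dots = \alpha_k = 0$. The between-group and within-group sums of squares are $SS_A = \sum_i n_i (\bar{X}_{i\cdot} - \bar{X})^2$ and $SS_E = \sum_i \sum_j (X_{ij} - \bar{X}_{i\cdot})^2$, and the test statistic is $F = \frac{SS_A/(k-1)}{SS_E/(n-k)}$.

Pairwise comparisons

ANOVA shows that Y~B has a difference, but not which groups differ. pairwise.t.test answers that. First for X~A:

> pairwise.t.test(X, A)

        Pairwise comparisons using t tests with pooled SD

data:  X and A

  1 2 3
2 1 - -
3 1 1 -
4 1 1 1

P value adjustment method: holm

All pairs (1,2), (1,3), (1,4), (2,3), (2,4), (3,4) have an adjusted p-value of 1, so no pair differs. Next, pairwise.t.test(Y, B) for Y and B:

> pairwise.t.test(Y, B)

        Pairwise comparisons using t tests with pooled SD

data:  Y and B

  1      2      3      4
2 1.0000 -      -      -
3 1.0000 1.0000 -      -
4 1.0000 1.0000 1.0000 -
5 0.0053 0.0060 0.0060 0.0060

P value adjustment method: holm

The row "5 0.0053 0.0060 0.0060 0.0060" shows that group 5 differs from every other group. These p-values are adjusted by the holm method; the unadjusted p-values for (X, A) and (Y, B) are:

> pairwise.t.test(X, A, p.adjust.method="none")

        Pairwise comparisons using t tests with pooled SD

data:  X and A

  1    2    3
2 0.94 -    -
3 0.94 1.00 -
4 0.90 0.96 0.96

P value adjustment method: none

> pairwise.t.test(Y, B, p.adjust.method="none")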
        Pairwise comparisons using t tests with pooled SD

data:  Y and B

  1       2       3       4
2 0.93936 -       -       -
3 0.93482 0.99545 -       -
4 0.89612 0.95654 0.96108 -
5 0.00053 0.00067 0.00068 0.00079

P value adjustment method: none

The row "5 0.00053 0.00067 0.00068 0.00079" is well below 0.05 in every column, so group 5 differs from all other groups. In R, aov() tests whether any difference exists and pairwise.t.test() locates it.

Reference
ISBN: 9787040250626

Regression analysis

Regression describes how y depends on x. In the simple linear case y = a + b * x, the coefficients a and b are estimated from observed pairs (x,y): (x1, y1) (x2, y2) ..... (xk, yk).

The R function lm()

In R, linear models are fitted with lm():

lm(formula, data, subset, weights, na.action, method = "qr", model = TRUE, x = FALSE, y = FALSE, qr = TRUE, singular.ok = TRUE, contrasts = NULL, offset, ...)

The two main arguments are formula and data. Given 25 samples xy = (x1, y1) (x2, y2) .... (x25, y25), the linear model of x, y is fitted with lm(y~x, xy). For a model y = a + b1 * x1 + b2 * x2, the coefficients a, b1, b2 are fitted with lm(y~x1+x2, xy).

Simple regression in R

> x = sample(1:10, 25, replace=TRUE)   # draw 25 values from 1 to 10
> x
 [1]  7  1  9  4  1  2 10  6  3  3 10  9  5  9  5 10  2  1  8  4  8  7  4  6
[25]  9
> y = 1+3*x                            # compute y from x
> y
 [1] 22  4 28 13  4  7 31 19 10 10 31 28 16 28 16 31  7  4 25 13 25 22 13 19
[25] 28
> plot(x,y)                            # plot (x,y)

[Figure: plot(x,y), the points lie on the line y=1+3*x with slope 3]

> xy = data.frame(x, y)                # put (x,y) in a data frame, the data argument of lm(formula, data)
> xy
    x  y
1   7 22
2   1  4
3   9 28
4   4 13
5   1  4
6   2  7
7  10 31
8   6 19
9   3 10
10  3 10
11 10 31
12  9 28
13  5 16
14  9 28
15  5 16
16 10 31
17  2  7
18  1  4
19  8 25
20  4 13
21  8 25
22  7 22
23  4 13
24  6 19
25  9 28
> model = lm(y~x, data=xy)             # fit the linear model
> model

Call:
lm(formula = y ~ x, data = xy)

Coefficients:
(Intercept)            x
          1            3

lm finds intercept = 1 and x coefficient = 3, that is, y=1+3*x.

Regression with random error added to y

> x = sample(1:10, 25, replace=TRUE)   # draw 25 values from 1 to 10
> x
 [1] 10  1  9  9  4  3  5  6  4  5  9  4  6  8  8  1  8  3  5  4  8  2  5  4
[25]  3
> y = 1 + 3*x + rnorm(25, mean=0, sd=1)   # compute y from x, adding rnorm() noise
> y
 [1] 32.084945  2.594432 26.981934 29.395315 11.737131 11.347425 14.292503
 [8] 18.543848 12.025822 15.848780 27.211809 13.063985 19.708064 24.161863
[15] 25.271777  4.317383 25.694476  9.167196 15.871665 14.244046 24.165393
[22]  5.994046 16.639490 12.682938 10.978489
> plot(x,y)

[Figure: plot(x,y) for y = 1 + 3*x + rnorm(25, mean=0, sd=1)]

> xy = data.frame(x,y)                 # put (x,y) in a data frame, the data argument of lm(formula, data)
> xy
    x         y
1  10 32.084945
2   1  2.594432
3   9 26.981934
4   9 29.395315
5   4 11.737131
6   3 11.347425
7   5 14.292503
8   6 18.543848
9   4 12.025822
10  5 15.848780
11  9 27.211809
12  4 13.063985
13  6 19.708064
14  8 24.161863
15  8 25.271777
16  1  4.317383
17  8 25.694476
18  3  9.167196
19  5 15.871665
20  4 14.244046
21  8 24.165393
22  2  5.994046
23  5 16.639490
24  4 12.682938
25  3 10.978489
> model2 = lm(y~x, xy, x=T)            # fit the linear model, keeping the model matrix
> model2                               # intercept 0.5345, x coefficient 3.1447: y=0.5345+3.1447*x, close to y = 1 + 3*x + noise

Call:
lm(formula = y ~ x, data = xy, x = T)

Coefficients:
(Intercept)            x
     0.5345       3.1447

> model2$x
   (Intercept)  x
1            1  5
2            1  7
3            1  8
4            1  4
5            1  3
6            1  2
7            1  3
8            1  4
9            1  5
10           1  4
11           1  7
12           1  7
13           1  2
14           1  4
15           1  2
16           1 10
17           1  7
18           1  3
19           1  3
20           1  2
21           1  7
22           1  5
23           1 10
24           1 10
25           1  7
attr(,"assign")
[1] 0 1

The x column printed by model2$x does not match the xy data frame above; this listing appears to come from a different random draw.

The regression model

For n observations (X_i, Y_i), i=1,2,....,n, the simple linear model is $Y_i = a + b X_i + \varepsilon_i$.

Significance tests for a regression:
1. t test
2. F test
3. R (correlation coefficient) test

Multiple regression in R: y=a + b1 * x1 + b2 * x2

> x1 = sample(1:10, 25, replace=TRUE)  # draw 25 values
> x2 = sample(1:8, 25, replace=TRUE)   # draw 25 values
> y = 5 + 3 * x1 - 2 * x2              # compute y from (x1, x2)
> x1
 [1]  8  8  8  2  6  3  4  1  5  4  2  1  6  4  2  4  1  5  7  2  9  2 10  4
[25]  5
> x2
 [1] 7 7 1 8 5 5 5 2 6 8 5 7 4 6 8 5 6 8 2 5 7 2 7 6 5
> y
 [1] 15 15 27 -5 13  4  7  4  8  1  1 -6 15  5 -5  7 -4  4 22  1 18  7 21  5
[25] 10
> yx12 = data.frame(y, x1, x2)         # put (y, x1, x2) in a data frame, the data argument of lm(formula, data)
> yx12.model = lm(y~x1+x2, yx12)       # fit the linear model
> yx12.model                           # intercept 5, x1 coefficient 3, x2 coefficient -2: y=5+3*x1-2*x2

Call:
lm(formula = y ~ x1 + x2, data = yx12)

Coefficients:
(Intercept)           x1           x2
          5            3           -2

Multiple regression with random error added

> x1 = sample(1:10, 25, replace=TRUE)  # draw 25 values
> x2 = sample(1:8, 25, replace=TRUE)   # draw 25 values
> y2 = 5 + 3*x1-2*x2 + rnorm(25, mean=0, sd=5)
> y2x12 = data.frame(y2, x1, x2)       # put (y2, x1, x2) in a data frame, the data argument of lm(formula, data)
> y2x12
           y2 x1 x2
1  10.2069412  8  7
2  11.5760467  8  7
3  24.8724883  8  1
4  -3.4406110  2  8
5   9.0650415  6  5
6   8.2621227  3  5
7  18.7755635  4  5
8  -5.1753518  1  2
9  14.1795708  5  6
10 -2.9588236  4  8
11  4.4931402  2  5
12 -9.1706740  1  7
13 15.7826413  6  4
14 11.1684672  4  6
15 -4.2108325  2  8
16 14.0557877  4  5
17  2.9787818  1  6
18  0.2277253  5  8
19 31.3466157  7  2
20 11.2311146  2  5
21 17.9397316  9  7
22  1.1189083  2  2
23 17.5177323 10  7
24  6.1773147  4  6
25 15.5696626  5  5
> y2x12.model = lm(y2~x1+x2, y2x12)    # fit the linear model
> y2x12.model                          # intercept 5.315, x1 coefficient 2.886, x2 coefficient -1.997: y=5.315+2.886*x1-1.997*x2, close to y = 5 + 3*x1-2*x2 + noise

Call:
lm(formula = y2 ~ x1 + x2, data = y2x12)

Coefficients:
(Intercept)           x1           x2
      5.315        2.886       -1.997
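The three significance checks listed for a regression (t test, F test, R) can all be read off summary() of an lm() fit. The following is a minimal sketch under assumptions: the seed and the names d, fit, s, coefs, fstat, f_pval and r2 are illustrative and do not come from the original session, and the simulated data only mimic the noisy multiple-regression example.

```r
# Sketch: read the t tests, the overall F test and R-squared from an lm() fit.
# The seed and all variable names here are assumptions made for illustration.
set.seed(1)
x1 <- sample(1:10, 25, replace = TRUE)
x2 <- sample(1:8, 25, replace = TRUE)
y2 <- 5 + 3 * x1 - 2 * x2 + rnorm(25, mean = 0, sd = 5)
d  <- data.frame(y2, x1, x2)

fit <- lm(y2 ~ x1 + x2, data = d)
s   <- summary(fit)

# One row per coefficient: estimate, standard error, t value, p-value (t test)
coefs <- s$coefficients

# Overall F test: statistic, numerator df, denominator df
fstat  <- s$fstatistic
f_pval <- pf(fstat[1], fstat[2], fstat[3], lower.tail = FALSE)

# Coefficient of determination
r2 <- s$r.squared

print(coefs)
print(c(F = unname(fstat[1]), p = unname(f_pval), R2 = r2))
```

A small p-value in a coefficient row suggests that coefficient differs from 0, and a small F-test p-value suggests the model as a whole explains more than the intercept alone.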
References
R -- http://ccckmit.wikidot.com/r:main
(with R) -- http://ccckmit.wikidot.com/st:main

Principal Component Analysis

For an n*n matrix [A] with n variables, principal component analysis looks for k components that carry most of the variance. The number of components needed is tied to the rank of the data: a 3*3 matrix of rank 2, for example, has one component with variance 0.

Example 1 (Rank=2)

In R we generate 4 variables with 25 samples each. Variables 3 and 4 are linear combinations of variables 1 and 2, so the 4*25 data matrix has rank 2.

> x1=rnorm(25, mean=5, sd=1)           # x1: 25 samples
> x2=rnorm(25, mean=5, sd=1)           # x2: 25 samples
> x3=x1+x2                             # x3=x1+x2, a linear combination of x1, x2
> x4=x1+2*x3                           # x4=x1+2*x3=x1+2*(x1+x2)=3x1+2x2, a linear combination of x1, x2
> x14 = data.frame(x1, x2, x3, x4)     # combine into the data frame x14
> pr = princomp(x14, cor=TRUE)         # principal component analysis
> summary(pr, loadings=TRUE)
Importance of components:
                          Comp.1    Comp.2       Comp.3       Comp.4
Standard deviation     1.7281767 1.0066803 4.712161e-08 8.339758e-09
Proportion of Variance 0.7466487 0.2533513 5.551115e-16 1.738789e-17
Cumulative Proportion  0.7466487 1.0000000 1.000000e+00 1.000000e+00

Loadings:
   Comp.1 Comp.2 Comp.3 Comp.4
x1 -0.449  0.626  0.637
x2 -0.367 -0.768  0.495  0.176
x3 -0.576        -0.311 -0.750
x4 -0.576        -0.502  0.638

The Cumulative Proportion is 0.7466487 at Comp.1 and reaches 1.0 at Comp.2. From the Loadings, Comp.1 = -0.449 x1 - 0.367 x2 - 0.576 x3 - 0.576 x4 explains about 75% of the variance (0.7466487); adding Comp.2 = 0.626 x1 - 0.768 x2 explains 100% (because the data have rank 2).

Example 2 (Rank=3)

Here x3 is drawn independently and x4=3x1+2x2+x3.

> x1=rnorm(25, mean=5, sd=1)           # x1: 25 samples
> x2=rnorm(25, mean=5, sd=1)           # x2: 25 samples
> x3=rnorm(25, mean=5, sd=1)           # x3: 25 samples
> x4=3*x1+2*x2+x3                      # x4=3*x1+2*x2+x3, a linear combination of x1, x2, x3
> x14 = data.frame(x1, x2, x3, x4)     # combine into the data frame x14
> pr = princomp(x14, cor=TRUE)         # principal component analysis
> summary(pr, loadings=TRUE)
Importance of components:
                          Comp.1    Comp.2   Comp.3       Comp.4
Standard deviation     1.4659862 1.1233489 0.767445 4.712161e-08
Proportion of Variance 0.5372789 0.3154782 0.147243 5.551115e-16
Cumulative Proportion  0.5372789 0.8527570 1.000000 1.000000e+00

Loadings:
   Comp.1 Comp.2 Comp.3 Comp.4
x1  0.634  0.104  0.458  0.615
x2  0.310 -0.669 -0.625  0.259
x3  0.194  0.736 -0.632  0.146
x4  0.682               -0.731

The Cumulative Proportion (0.5372789, 0.8527570, 1.000000, 1.000000e+00) reaches 1.0 at the third component. The Standard deviation and Proportion of Variance of Comp.4 are essentially 0 (4.712161e-08, 5.551115e-16).

Example 3 (Rank=3 with random error added to x4)
> x1=rnorm(25, mean=5, sd=1)                      # x1: 25 samples
> x2=rnorm(25, mean=5, sd=1)                      # x2: 25 samples
> x3=rnorm(25, mean=5, sd=1)                      # x3: 25 samples
> x4=3*x1+2*x2+x3+rnorm(25, mean=0, sd=1)         # x4=3*x1+2*x2+x3 plus random error
> x14 = data.frame(x1, x2, x3, x4)                # combine into the data frame x14
> pr = princomp(x14, cor=TRUE)                    # principal component analysis
> summary(pr, loadings=TRUE)
Importance of components:
                          Comp.1    Comp.2    Comp.3      Comp.4
Standard deviation     1.4565751 1.1233728 0.7704314 0.151189097
Proportion of Variance 0.5304027 0.3154916 0.1483911 0.005714536
Cumulative Proportion  0.5304027 0.8458943 0.9942855 1.000000000

Loadings:
   Comp.1 Comp.2 Comp.3 Comp.4
x1 -0.642  0.117  0.410  0.637
x2 -0.306 -0.662 -0.645  0.228
x3 -0.173  0.740 -0.641  0.103
x4 -0.681               -0.729

The first three components reach a Cumulative Proportion of 99.4% (0.9942855), and all four are needed for 100%. Because of the added noise, the Standard deviation (0.151189097) and Proportion of Variance (0.005714536) of Comp.4 are small but no longer 0.

Factor Analysis

> x1=rnorm(25, mean=5, sd=1)                      # x1: 25 samples
> x2=rnorm(25, mean=5, sd=1)                      # x2: 25 samples
> x3=rnorm(25, mean=5, sd=1)                      # x3: 25 samples
> x4=3*x1+2*x2+x3+rnorm(25, mean=0, sd=1)         # x4=3*x1+2*x2+x3 plus random error
> x14 = data.frame(x1, x2, x3, x4)                # combine into the data frame x14
> fa = factanal(x14, factors=2)
Error in factanal(x14, factors = 2) :
  2 factors is too many for 4 variables
> fa = factanal(x14, factors=1)
> fa

Call:
factanal(x = x14, factors = 1)

Uniquenesses:
   x1    x2    x3    x4
0.126 0.834 0.951 0.005

Loadings:
   Factor1
x1 0.935
x2 0.407
x3 0.222
x4 0.998

               Factor1
SS loadings      2.084
Proportion Var   0.521

Test of the hypothesis that 1 factor is sufficient.
The chi square statistic is 21.97 on 2 degrees of freedom.
The p-value is 1.7e-05
> fa = factanal(x14, factors=3)
Error in factanal(x14, factors = 3) :
  3 factors is too many for 4 variables

Reference
ISBN: 9787040250626
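The link between the rank of the data and the number of principal components with non-zero variance, which the PCA examples rely on, can be checked directly. This is a minimal sketch under assumptions: it uses prcomp() with scale. = TRUE instead of princomp(cor=TRUE) (both analyze standardized variables, but the algorithms differ), and the seed, the 1e-6 tolerance and the names X, pca, prop, cum and rank_est are illustrative choices, not part of the original session.

```r
# Sketch: count the principal components with non-negligible variance
# when two of four variables are exact linear combinations of the others.
# Seed, tolerance and variable names are assumptions for illustration.
set.seed(1)
x1 <- rnorm(25, mean = 5, sd = 1)
x2 <- rnorm(25, mean = 5, sd = 1)
x3 <- x1 + x2              # linear combination of x1, x2
x4 <- x1 + 2 * x3          # = 3*x1 + 2*x2
X  <- cbind(x1, x2, x3, x4)

pca  <- prcomp(X, scale. = TRUE)            # PCA on standardized variables
prop <- pca$sdev^2 / sum(pca$sdev^2)        # proportion of variance
cum  <- cumsum(prop)                        # cumulative proportion

rank_est <- sum(pca$sdev > 1e-6)            # components with non-zero spread

print(round(cum, 6))
print(rank_est)
```

Only two standard deviations are non-negligible, so the cumulative proportion is already 1 at the second component, in line with the Rank=2 example.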
Appendix A: common probability distributions

Binomial distribution

The binomial distribution is built on the Bernoulli trial, an experiment with two outcomes, YES or NO (1 or 0). It gives the number of successes in n independent trials.

dbinom(x; n, p) is the probability of x successes in n trials (success probability p).
R formula: dbinom(x; n, p) = p(x) = choose(n,x) p^x (1-p)^(n-x)
R functions: binom(size=n: number of trials, prob=p: success probability)
Documentation: http://stat.ethz.ch/R-manual/R-patched/library/stats/html/Binomial.html

Example

> y <- rbinom(50, 25, .4)
> y
 [1] 12  9  9  9  7 10  9  9 12 11 10 11  4 10 10  5  9 10 13  7  8  7 16  8
[25] 10 14  6 12 13  9 12  8 11 11 10 10  9  9 13  7 12 15  7 13  5  8  5 11
[49] 13  8
> m1
[1] 9.72
> m2
[1] 19.44
> m3
[1] 6.8816

y was generated with rbinom(50, 25, .4): 50 values, each the number of successes in 25 trials, 50*25 Bernoulli trials in all. The assignments that defined m1, m2 and m3 did not survive in the source; the printed values are consistent with m1 = mean(y), m2 = sum(y)/25 and m3 = sum((y-m1)^2)/50.

Plotting the binomial distribution in R

> n=10; p=0.3; k=seq(0,n)
> plot(k, dbinom(k,n,p), type='h', main='dbinom(0:10, n=10, p=0.3)', xlab='k')

Normal approximation to the binomial

When n is large enough that n*min(p, 1-p) > 5, the binomial distribution is close to a normal distribution with mean n*p and standard deviation sqrt(n*p*(1-p)).

op=par(mfrow=c(2,2))
n=3; p=0.3; k=seq(0,n)
plot(k, dbinom(k,n,p), type='h', main='dbinom(n=3, p=0.3)', xlab='k')
curve(dnorm(x,n*p,sqrt(n*p*(1-p))), add=T, col='blue')
n=5; p=0.3; k=seq(0,n)
plot(k, dbinom(k,n,p), type='h', main='dbinom(n=5, p=0.3)', xlab='k')
curve(dnorm(x,n*p,sqrt(n*p*(1-p))), add=T, col='blue')
n=10; p=0.3; k=seq(0,n)
plot(k, dbinom(k,n,p), type='h', main='dbinom(n=10, p=0.3)', xlab='k')
curve(dnorm(x,n*p,sqrt(n*p*(1-p))), add=T, col='blue')
n=100; p=0.3; k=seq(0,n)
plot(k, dbinom(k,n,p), type='h', main='dbinom(n=100, p=0.3)', xlab='k')
curve(dnorm(x,n*p,sqrt(n*p*(1-p))), add=T, col='blue')

Histogram of binomial random samples in R

> x = rbinom(100000, 100, 0.8)
> hist(x, nclass=max(x)-min(x)+1)

References
Distributions in the stats package -- http://stat.ethz.ch/R-manual/R-patched/library/stats/html/Distributions.html
Wikipedia (Chinese): binomial distribution -- http://zh.wikipedia.org/wiki/%E4%BA%8C%E9%A0%85%E5%88%86%E4%BD%88
Wikipedia:Binomial_distribution -- http://en.wikipedia.org/wiki/Binomial_distribution

Negative binomial distribution

The negative binomial distribution gives the number of trials x needed to obtain r successes, with r=1,2,3,.... and x= r, r+1, r+2, ....: $P(X = x) = \binom{x-1}{r-1} p^r (1-p)^{x-r}$.

R functions: nbinom(size, prob); r corresponds to size (number of successes) and p to prob (success probability).
Documentation: http://stat.ethz.ch/R-manual/R-patched/library/stats/html/NegBinomial.html
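Plotting the negative binomial distribution in R

> n=20; p=0.4; k=seq(0,50)
> plot(k, dnbinom(k,n,p), type='h', main='dnbinom(k,n=20,p=0.4)', xlab='k')

> x = rnbinom(100000, 100, 0.8)
> hist(x, nclass=max(x)-min(x)+1)

R example (from the R documentation, ?NegBinomial)

require(graphics)
x <- 0:15
size <- (1:20)/4
persp(x, size, dnb <- outer(x, size, function(x,s) dnbinom(x, s, prob = 0.4)),
      xlab = "x", ylab = "s", zlab = "density", theta = 150)
title(tit <- "negative binomial density(x,s, pr = 0.4)  vs.  x & s")

image(x, size, log10(dnb), main = paste("log [", tit, "]"))
contour(x, size, log10(dnb), add = TRUE)

## Alternative parametrization
x1 <- rnbinom(500, mu = 4, size = 1)
x2 <- rnbinom(500, mu = 4, size = 10)
x3 <- rnbinom(500, mu = 4, size = 100)
h1 <- hist(x1, breaks = 20, plot = FALSE)
h2 <- hist(x2, breaks = h1$breaks, plot = FALSE)
h3 <- hist(x3, breaks = h1$breaks, plot = FALSE)
barplot(rbind(h1$counts, h2$counts, h3$counts),
        beside = TRUE, col = c("red","blue","cyan"),
        names.arg = round(h1$breaks[-length(h1$breaks)]))

References
Wikipedia:Negative_binomial_distribution

Geometric distribution

The geometric distribution is the negative binomial distribution with r = 1: the number of trials x needed to obtain the first success, $P(X = x) = (1-p)^{x-1} p$ for x = 1, 2, 3, ....

R functions: geom(prob); p corresponds to prob. In R, x counts the failures before the first success, so R's x corresponds to (x-1) here.
Documentation: http://stat.ethz.ch/R-manual/R-patched/library/stats/html/Geometric.html

Plotting the geometric distribution in R

p=0.7; k=seq(0,10)
plot(k, dgeom(k, p), type='h', main='dgeom(p=0.7)', xlab='k')

R example (from the R documentation, ?Geometric)

> qgeom((1:9)/10, prob = .2)
[1]  0  0  1  2  3  4  5  7 10
> Ni <- rgeom(20, prob = 1/4); table(factor(Ni, 0:max(Ni)))

References
Wikipedia:Geometric_distribution

Hypergeometric distribution

A population of N objects contains r objects of one kind and N-r of the other. Drawing n objects without replacement, x is the number drawn of the first kind.

R formula: dhyper(x, m, n, k) = choose(m, x) choose(n, k-x) / choose(m+n, k)
In R's notation the population has m+n objects, m of the first kind and n of the second; k objects are drawn and x of them are of the first kind (sampling without replacement).
Documentation: http://stat.ethz.ch/R-manual/R-patched/library/stats/html/Hypergeometric.html
Mapping to R's argument names: N=>m+n; n=>k; r=>m

Plotting the hypergeometric distribution in R

m=10; n=5; k=8
x=seq(0,10)
plot(x, dhyper(x, m, n, k), type='h', main='dhyper(m=10,n=5,k=8)', xlab='x')

Poisson distribution

In R, see ?dpois. For example, with mean 10 the probability of x=8 is:

> dpois(8, 10)
[1] 0.112599
> 10^8*exp(-10)/prod(1:8)
[1] 0.112599

Plotting the Poisson distribution in R

lambda=5.0; k=seq(0,20);
plot(k, dpois(k, lambda), type='h', main='dpois(lambda=5.0)', xlab='k')

R example (from the R documentation, ?Poisson)

> require(graphics)
> -log(dpois(0:7, lambda=1) * gamma(1+ 0:7)) # == 1
[1] 1 1 1 1 1 1 1 1
> Ni <- rpois(50, lambda = 4); table(factor(Ni, 0:max(Ni)))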
> 1 - ppois(10*(15:25), lambda=100)   # becomes 0 (cancellation)
 [1] 1.233094e-06 1.261664e-08 7.085799e-11 2.252643e-13 4.440892e-16
 [6] 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00
[11] 0.000000e+00
> ppois(10*(15:25), lambda=100, lower.tail=FALSE)   # no cancellation
 [1] 1.233094e-06 1.261664e-08 7.085800e-11 2.253110e-13 4.174239e-16
 [6] 4.626179e-19 3.142097e-22 1.337219e-25 3.639328e-29 6.453883e-33
[11] 7.587807e-37
> par(mfrow = c(2, 1))
> x <- seq(-0.01, 5, 0.01)
> plot(x, ppois(x, 1), type="s", ylab="F(x)", main="Poisson(1) CDF")
> plot(x, pbinom(x, 100, 0.01), type="s", ylab="F(x)",
+      main="Binomial(100, 0.01) CDF")

References
Wikipedia:Poisson_distribution

Uniform distribution

The uniform distribution has constant density on the interval (a,b).

R functions: unif(a:min, b:max)
R formula: f(x) = 1/(max-min), with a:min and b:max

Plotting the uniform distribution in R

> op=par(mfrow=c(2,2))
> curve(dunif(x, 0, 1), -2, 10)
> curve(dunif(x, 1, 5), -2, 10)
> curve(dunif(x, -1, 9), -2, 10)
> curve(dunif(x, 10, 110), 0, 200)

Normal Distribution

Plotting normal densities in R with standard deviation 1, 2, 3:

curve(dnorm(x), -5, 5, col="black")
curve(dnorm(x, sd=2), -5, 5, col="blue", add=T)
curve(dnorm(x, sd=3), -5, 5, col="green", add=T)

R functions: norm(mean, sd)
R formula: $f(x) = \frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{(x-\mu)^2}{2\sigma^2}}$

Probability within n standard deviations of the mean (n = 1, 2, 3, ...):

> pnorm(1)-pnorm(-1)
[1] 0.6826895
> pnorm(2)-pnorm(-2)
[1] 0.9544997
> pnorm(3)-pnorm(-3)
[1] 0.9973002
> pnorm(4)-pnorm(-4)
[1] 0.9999367
> pnorm(5)-pnorm(-5)
[1] 0.9999994
> pnorm(6)-pnorm(-6)
[1] 1

Normal density values and plots in R

> dnorm(0)
[1] 0.3989423
> dnorm(0.5)
[1] 0.3520653
> dnorm(2.5)
[1] 0.0175283
> curve(dnorm(x), from = -3.5, to = 3.5, ylab="pdf", main="N(0,1)")

> x = rnorm(100)
> hist(x, nclass=8)

> x = rnorm(1000)
> hist(x, nclass=50)
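The 1, 2, 3 standard deviation probabilities computed with pnorm() can be cross-checked in two other ways: by integrating the density dnorm() and by simulation with rnorm(). This is a minimal sketch under assumptions; the seed, the sample size 100000, the parameters mean=5, sd=4 and the names k, theory, by_integral, z and empirical are illustrative choices.

```r
# Sketch: three routes to P(|Z| <= k) for k = 1, 2, 3.
# Seed, sample size and variable names are assumptions for illustration.
set.seed(1)
k <- 1:3

# 1. From the CDF
theory <- pnorm(k) - pnorm(-k)

# 2. By numerically integrating the density over (-k, k)
by_integral <- sapply(k, function(s) integrate(dnorm, -s, s)$value)

# 3. By simulation: standardize samples from N(mean=5, sd=4)
x <- rnorm(100000, mean = 5, sd = 4)
z <- (x - 5) / 4
empirical <- sapply(k, function(s) mean(abs(z) <= s))

print(round(rbind(theory, by_integral, empirical), 4))
```

The first two rows agree to numerical precision, while the simulated row only agrees approximately and moves closer as the sample size grows.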
