
Introduction to R

Ricardo Ehlers

ehlers@icmc.usp.br

Departamento de Matemática Aplicada e Estatística

Universidade de São Paulo

Introduction

“A big computer, a complex algorithm and a long time does not equal science.” Robert Gentleman

“Far better an approximate answer to the right question than the exact answer to the wrong question.” John Tukey

Some journals:

• Communications in Statistics - Simulation and Computation

• Computational Statistics

• Computational Statistics & Data Analysis

• Journal of Computational and Graphical Statistics

• Journal of Statistical Computation and Simulation

• Journal of Statistical Software

• The R Journal

• Statistics and Computing

The R Package

A free, open-source statistical package, available at

http://www.r-project.org

for Unix, Windows, and Mac OS X systems.

• Programmable.

• Excellent graphics capabilities.

• A large number of statistical tools.

• Simulation from probability distributions.

• Numerical optimization.

• Fitting of many standard models (regression, GLMs, etc.).

• Runs precompiled Fortran and C routines.

• Etc.

Packages not included in the base distribution can be installed.

• The available packages are listed at http://CRAN.R-project.org/web/packages/

• Special topics (task views): http://CRAN.R-project.org/web/views/

• Several manuals are available at http://CRAN.R-project.org/manuals.html

Some Simple Commands

> x = c(1,2,3,4,5,6)

> x

[1] 1 2 3 4 5 6

> y = c(x,0,0,x,1)

> y

[1] 1 2 3 4 5 6 0 0 1 2 3 4 5 6 1

> z = c(1,3,5,7,9,11)

> 3 * x + z

[1] 4 9 14 19 24 29

> range(x)

[1] 1 6

> length(z)

[1] 6

> seq(from=0, to=10, by=0.5)

[1] 0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0 5.5 6.0 6.5 7.0

[16] 7.5 8.0 8.5 9.0 9.5 10.0

> seq(from=0, to=10, length=10)

[1] 0.000000 1.111111 2.222222 3.333333 4.444444 5.555556 6.666667

[8] 7.777778 8.888889 10.000000

> rep(x,times=2)

[1] 1 2 3 4 5 6 1 2 3 4 5 6

> rep(x,each=2)

[1] 1 1 2 2 3 3 4 4 5 5 6 6

> log(x)

[1] 0.0000000 0.6931472 1.0986123 1.3862944 1.6094379 1.7917595

> exp(x)

[1] 2.718282 7.389056 20.085537 54.598150 148.413159 403.428793

> sqrt(x)

[1] 1.000000 1.414214 1.732051 2.000000 2.236068 2.449490

> x > 4

[1] FALSE FALSE FALSE FALSE TRUE TRUE

> ! (x > 4)

[1] TRUE TRUE TRUE TRUE FALSE FALSE

> x[x < 4]

[1] 1 2 3

> (x+1)[x < 4]

[1] 2 3 4

Numeric Representation

On a 32-bit computer, an integer $u$ can be thought of as being represented by

$$u = \sum_{i=1}^{32} x_i 2^{i-1} - 2^{31},$$

where $x = (x_1, \ldots, x_{32})$ is a vector of 0's and 1's. In this representation, the largest integer that can be stored is obtained by setting every $x_i = 1$,

$$u = \sum_{i=1}^{32} 2^{i-1} - 2^{31} = 2147483647,$$

and the smallest integer is obtained by setting every $x_i = 0$,

$$u = -2^{31} = -2147483648.$$

> u <- function(x){

+ i = 1:32

+ aux = x*2^(i-1)   # weight 2^(i-1) for each bit x_i

+ k = sum(aux) - 2^31

+ return(k)

+ }

> u(rep(0,32))

[1] -2147483648

> u(rep(1,32))

[1] 2147483647
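As a small worked check of the representation above, the sketch below redefines the function u from the text (so that the block runs on its own) and evaluates it for two more bit patterns. The variable names zero_pattern and one_pattern are introduced here only for illustration.

```r
# Offset representation from the text: u = sum(x_i * 2^(i-1)) - 2^31
u <- function(x) {
  i <- 1:32
  sum(x * 2^(i - 1)) - 2^31
}

# Only the highest bit (i = 32) set: 2^31 - 2^31 = 0
zero_pattern <- c(rep(0, 31), 1)
u(zero_pattern)

# Lowest and highest bits set: 1 + 2^31 - 2^31 = 1
one_pattern <- c(1, rep(0, 30), 1)
u(one_pattern)
```

In this scheme the highest bit acts as an offset: with it set, the remaining 31 bits count upwards from zero.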

In R, the numeric characteristics of your machine are stored in the variable .Machine:

> class(.Machine)

[1] "list"

> names(.Machine)

[1] "double.eps" "double.neg.eps" "double.xmin"

[4] "double.xmax" "double.base" "double.digits"

[7] "double.rounding" "double.guard" "double.ulp.digits"

[10] "double.neg.ulp.digits" "double.exponent" "double.min.exp"

[13] "double.max.exp" "integer.max" "sizeof.long"

[16] "sizeof.longlong" "sizeof.longdouble" "sizeof.pointer"

> .Machine$integer.max

[1] 2147483647

> is.integer(.Machine$integer.max)

[1] TRUE

> .Machine$integer.max + as.integer(1)

[1] NA
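The NA above comes from integer overflow. A minimal sketch of the difference between integer and double arithmetic at this boundary, using only base R (the names big, overflow and promoted are illustrative):

```r
big <- .Machine$integer.max

# Integer + integer overflows to NA (R also emits a warning, silenced here)
overflow <- suppressWarnings(big + 1L)
is.na(overflow)

# Adding a double (1 rather than 1L) promotes the result to double,
# which can represent 2^31 exactly
promoted <- big + 1
class(promoted)
promoted == 2^31
```

Writing 1L is equivalent to as.integer(1); the plain literal 1 is a double.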

Floating-Point Arithmetic

A computer can represent only a finite set of real numbers. Which of the infinitely many real numbers can be represented? Those of the form

$$\underbrace{(-1)^{s}}_{\text{sign}} \, (\underbrace{d_0 d_1 d_2 \ldots d_{t-1}}_{\text{mantissa}}) \, (\underbrace{\beta}_{\text{base}})^{e},$$

where $E_{\min} < e < E_{\max}$, $0 \le d_i \le \beta - 1$, and the number of digits $t$ is the precision.
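A short illustration of what finite precision means in practice, assuming the usual IEEE double-precision arithmetic (base 2 with 53 digits, as reported by .Machine on common platforms):

```r
.Machine$double.base     # the base (beta in the text)
.Machine$double.digits   # the number of mantissa digits (t in the text)

eps <- .Machine$double.eps   # smallest positive x such that 1 + x != 1

(1 + eps) > 1        # TRUE: eps is still visible next to 1
(1 + eps / 2) == 1   # TRUE: eps/2 is rounded away

# 0.1, 0.2 and 0.3 have no exact binary representation
(0.1 + 0.2) == 0.3
isTRUE(all.equal(0.1 + 0.2, 0.3))   # comparison with a tolerance
```

For this reason, floating-point results are usually compared with all.equal rather than with ==.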

Finite numbers, infinite numbers, and NaN

> pi / 0

[1] Inf

> is.finite(pi / 0)

[1] FALSE

> is.infinite(pi / 0)

[1] TRUE

> 0 / 0

[1] NaN

> is.nan(0/0)

[1] TRUE
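The same predicates work elementwise on vectors. The sketch below also includes a missing value, NA (which appears later in the airquality data), to show how it differs from NaN; the vector vals is made up for illustration.

```r
vals <- c(1, NA, 0 / 0, -1 / 0)

is.na(vals)       # TRUE for both NA and NaN
is.nan(vals)      # TRUE only for NaN
is.finite(vals)   # FALSE for NA, NaN, Inf and -Inf

# Non-finite values propagate through summaries unless removed
mean(vals)
mean(vals[is.finite(vals)])
```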

Creating Matrices

> matrix(0,nrow=3,ncol=3)

[,1] [,2] [,3]

[1,] 0 0 0

[2,] 0 0 0

[3,] 0 0 0

> matrix(1:9,nrow=3,ncol=3)

[,1] [,2] [,3]

[1,] 1 4 7

[2,] 2 5 8

[3,] 3 6 9

> matrix(c(-1,2.2,3,4.1,5,6,7,8.32,9), nrow=3, ncol=3)

[,1] [,2] [,3]

[1,] -1.0 4.1 7.00

[2,] 2.2 5.0 8.32

[3,] 3.0 6.0 9.00

> matrix(1:9, nrow=3, ncol=4)

[,1] [,2] [,3] [,4]

[1,] 1 4 7 1

[2,] 2 5 8 2

[3,] 3 6 9 3

> A = matrix(1:9, nrow=3, ncol=3, byrow=TRUE)

> A

[,1] [,2] [,3]

[1,] 1 2 3

[2,] 4 5 6

[3,] 7 8 9

> A = matrix(0, nrow=3, ncol=3)

> A[1,1] = 2

> A[1,3] = 4

> A[3,2] = 5

> A[3,3] = 6

> A[2,1] = 3

> A

[,1] [,2] [,3]

[1,] 2 0 4

[2,] 3 0 0

[3,] 0 5 6

Working with matrices

> nrow(A)

[1] 3

> ncol(A)

[1] 3

> A[1,]

[1] 2 0 4

> A[,2]

[1] 0 0 5

> A[1:2,2:3]

[,1] [,2]

[1,] 0 4

[2,] 0 0

> diag(A)

[1] 2 0 6

> diag(diag(A))

[,1] [,2] [,3]

[1,] 2 0 0

[2,] 0 0 0

[3,] 0 0 6

Transpose, determinant, and inverse of a matrix

> t(A)

[,1] [,2] [,3]

[1,] 2 3 0

[2,] 0 0 5

[3,] 4 0 6

> det(A)

[1] 60

> solve(A)

[,1] [,2] [,3]

[1,] 0.00 0.3333333 0.0

[2,] -0.30 0.2000000 0.2

[3,] 0.25 -0.1666667 0.0
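A quick check of the inverse, together with the more common use of solve for a linear system. The matrix A is rebuilt here so the block runs on its own; the right-hand side b is a made-up example chosen so that the solution is a vector of ones.

```r
A <- matrix(c(2, 0, 4,
              3, 0, 0,
              0, 5, 6), nrow = 3, byrow = TRUE)

# A times its inverse recovers the identity up to rounding error
all.equal(A %*% solve(A), diag(3))

# To solve A x = b, pass b directly instead of forming the inverse
b <- c(6, 3, 11)   # illustrative right-hand side, equal to A %*% c(1, 1, 1)
x <- solve(A, b)
x
```

Calling solve(A, b) avoids computing the full inverse, which is generally cheaper and numerically safer than solve(A) %*% b.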

Eigenvalues and eigenvectors of a matrix

> eigen(A)

eigen() decomposition

$values

[1] 7.468906+0.000000i 0.265547+2.821842i 0.265547-2.821842i

$vectors

[,1] [,2] [,3]

[1,] 0.5744240+0i 0.0559305+0.5943459i 0.0559305-0.5943459i

[2,] 0.2307262+0i 0.6318703+0.0000000i 0.6318703+0.0000000i

[3,] 0.7853677+0i -0.4435397-0.2182595i -0.4435397+0.2182595i
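The output of eigen can be checked against the defining relation A v = λ v. A minimal sketch, rebuilding A so the block is self-contained (the names e, lambda and v are illustrative):

```r
A <- matrix(c(2, 0, 4,
              3, 0, 0,
              0, 5, 6), nrow = 3, byrow = TRUE)

e <- eigen(A)
lambda <- e$values[1]    # first eigenvalue
v <- e$vectors[, 1]      # corresponding eigenvector (a column)

# Largest absolute entry of A v - lambda v; should be numerically zero
max(Mod(A %*% v - lambda * v))

# The eigenvalues sum to the trace and multiply to the determinant
sum(e$values)
prod(e$values)
```

Because A is not symmetric, two of its eigenvalues are complex conjugates, which is why eigen returns complex values and vectors here.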

Matrix operations

> B = matrix(c(1,2.3,-3,4,5,6.7,7,-8,-9.1),

+ nrow=3, ncol=3)

> B

[,1] [,2] [,3]

[1,] 1.0 4.0 7.0

[2,] 2.3 5.0 -8.0

[3,] -3.0 6.7 -9.1

> A * B

[,1] [,2] [,3]

[1,] 2.0 0.0 28.0

[2,] 6.9 0.0 0.0

[3,] 0.0 33.5 -54.6

> A %*% B

[,1] [,2] [,3]

[1,] -10.0 34.8 -22.4

[2,] 3.0 12.0 21.0

[3,] -6.5 65.2 -94.6

> cbind(A,B)

[,1] [,2] [,3] [,4] [,5] [,6]

[1,] 2 0 4 1.0 4.0 7.0

[2,] 3 0 0 2.3 5.0 -8.0

[3,] 0 5 6 -3.0 6.7 -9.1

> rbind(A,B)

[,1] [,2] [,3]

[1,] 2.0 0.0 4.0

[2,] 3.0 0.0 0.0

[3,] 0.0 5.0 6.0

[4,] 1.0 4.0 7.0

[5,] 2.3 5.0 -8.0

[6,] -3.0 6.7 -9.1

Lists

> lista = list(a=1:5, b=c("x","y","z"), c=c(-1,4,7), d=TRUE)

> names(lista)

[1] "a" "b" "c" "d"

> lista

$a

[1] 1 2 3 4 5

$b

[1] "x" "y" "z"

$c

[1] -1 4 7

$d

[1] TRUE
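A brief sketch of how the components of such a list are usually accessed. The list is recreated so the block runs on its own, and the extra component e is a made-up example.

```r
lista <- list(a = 1:5, b = c("x", "y", "z"), c = c(-1, 4, 7), d = TRUE)

lista$a          # component by name
lista[["b"]]     # the same idea, with the name given as a string
lista[[3]][2]    # second element of the third component

# Single brackets return a sub-list, not the component itself
class(lista["a"])
class(lista[["a"]])

# Components can be added by assignment
lista$e <- "novo"   # illustrative extra component
length(lista)
```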

Reading Data

> data(package="MASS")

Data sets in package MASS:

Aids2 Australian AIDS Survival Data

Animals Brain and Body Weights for 28 Species

Boston Housing Values in Suburbs of Boston

Cars93 Data from 93 Cars on Sale in the USA in 1993

Cushings Diagnostic Tests on Patients with Cushings Syndrome

DDT DDT in Kale

GAGurine Level of GAG in Urine of Children

Insurance Numbers of Car Insurance claims

Melanoma Survival from Malignant Melanoma

OME Tests of Auditory Perception in Children with

Pima.te Diabetes in Pima Indian Women

Pima.tr Diabetes in Pima Indian Women

Pima.tr2 Diabetes in Pima Indian Women

Rabbit Blood Pressure in Rabbits

Rubber Accelerated Testing of Tyre Rubber

SP500 Returns of the Standard and Poors 500

Sitka Growth Curves for Sitka Spruce Trees in 1988

Sitka89 Growth Curves for Sitka Spruce Trees in 1989

Skye AFM Compositions of Aphyric Skye Lavas

Traffic Effect of Swedish Speed Limits on Accidents

> library(MASS)

> Animals

body brain

Mountain beaver 1.350 8.1

Cow 465.000 423.0

Grey wolf 36.330 119.5

Goat 27.660 115.0

Guinea pig 1.040 5.5

Dipliodocus 11700.000 50.0

Asian elephant 2547.000 4603.0

Donkey 187.100 419.0

Horse 521.000 655.0

Potar monkey 10.000 115.0

Cat 3.300 25.6

Giraffe 529.000 680.0

Gorilla 207.000 406.0

Human 62.000 1320.0

African elephant 6654.000 5712.0

Triceratops 9400.000 70.0

Rhesus monkey 6.800 179.0

Kangaroo 35.000 56.0

Golden hamster 0.120 1.0

Mouse 0.023 0.4

Rabbit 2.500 12.1

Sheep 55.500 175.0

Jaguar 100.000 157.0

Chimpanzee 52.160 440.0

Rat 0.280 1.9

Brachiosaurus 87000.000 154.5

Mole 0.122 3.0

Pig 192.000 180.0

> road

deaths drivers popden rural temp fuel

Alabama 968 158 64.0 66.0 62 119.0

Alaska 43 11 0.4 5.9 30 6.2

Arizona 588 91 12.0 33.0 64 65.0

Arkanas 640 92 34.0 73.0 51 74.0

Calif 4743 952 100.0 118.0 65 105.0

Colo 566 109 17.0 73.0 42 78.0

Conn 325 167 518.0 5.1 37 95.0

Dela 118 30 226.0 3.4 41 20.0

DC 115 35 12524.0 0.0 44 23.0

Florida 1545 298 91.0 57.0 67 216.0

Georgia 1302 203 68.0 83.0 54 162.0

Idaho 262 41 8.1 40.0 36 29.0

Ill 2207 544 180.0 102.0 33 350.0

Ind 1410 254 129.0 89.0 37 196.0

Iowa 833 150 49.0 100.0 30 109.0

Kansas 669 136 27.0 124.0 42 94.0

Kent 911 147 76.0 65.0 44 104.0

Louis 1037 146 72.0 40.0 65 109.0

Maine 1196 46 31.0 19.0 30 37.0

Maryl 616 157 314.0 29.0 44 113.0

Mass 766 255 655.0 17.0 37 166.0

Mich 2120 403 137.0 95.0 33 306.0

Minn 841 189 43.0 110.0 22 132.0

Miss 648 85 46.0 59.0 57 77.0

Mo 1289 234 63.0 100.0 40 180.0

Mont 259 38 4.6 72.0 29 31.0

> help(road)

road package:MASS R Documentation

Road Accident Deaths in US States

Description:

A data frame with the annual deaths in road accidents for half the

US states.

Usage:

Format:

Columns are:

"state" name.

"deaths" number of deaths.

"drivers" number of drivers (in 10,000s).

"popden" population density in people per square mile.

"rural" length of rural roads, in 1000s of miles.

"temp" average daily maximum temperature in January.

"fuel" fuel consumption in 10,000,000 US gallons per year.

Source:

Imperial College, London M.Sc. exercise

> class(road)

[1] "data.frame"

> rownames(road)

[1] "Alabama" "Alaska" "Arizona" "Arkanas" "Calif" "Colo" "Conn"

[8] "Dela" "DC" "Florida" "Georgia" "Idaho" "Ill" "Ind"

[15] "Iowa" "Kansas" "Kent" "Louis" "Maine" "Maryl" "Mass"

[22] "Mich" "Minn" "Miss" "Mo" "Mont"

> colnames(road)

[1] "deaths" "drivers" "popden" "rural" "temp" "fuel"

> dim(road)

[1] 26 6

> transform(road, temp2 = temp**2)

deaths drivers popden rural temp fuel temp2

Alabama 968 158 64.0 66.0 62 119.0 3844

Alaska 43 11 0.4 5.9 30 6.2 900

Arizona 588 91 12.0 33.0 64 65.0 4096

Arkanas 640 92 34.0 73.0 51 74.0 2601

Calif 4743 952 100.0 118.0 65 105.0 4225

> summary(road)

deaths drivers popden rural

Min. : 43.0 Min. : 11.0 Min. : 0.40 Min. : 0.00

1st Qu.: 571.5 1st Qu.: 86.5 1st Qu.: 31.75 1st Qu.: 30.00

Median : 799.5 Median :148.5 Median : 66.00 Median : 65.50

Mean :1000.7 Mean :191.2 Mean : 595.74 Mean : 60.71

3rd Qu.:1265.8 3rd Qu.:226.2 3rd Qu.: 135.00 3rd Qu.: 93.50

Max. :4743.0 Max. :952.0 Max. :12524.00 Max. :124.00

temp fuel

Min. :22.00 Min. : 6.20

1st Qu.:33.75 1st Qu.: 67.25

Median :41.50 Median :104.50

Mean :43.69 Mean :115.24

3rd Qu.:53.25 3rd Qu.:154.50

Max. :67.00 Max. :350.00

> colMeans(road)

deaths drivers popden rural temp fuel

1000.65385 191.19231 595.73462 60.70769 43.69231 115.23846

> ttemp = cut(road$temp, breaks=c(22,35,60,67))

> ttemp

[1] (60,67] (22,35] (60,67] (35,60] (60,67] (35,60] (35,60] (35,60] (35,60]

[10] (60,67] (35,60] (35,60] (22,35] (35,60] (22,35] (35,60] (35,60] (60,67]

[19] (22,35] (35,60] (35,60] (22,35] <NA> (35,60] (35,60] (22,35]

Levels: (22,35] (35,60] (60,67]

> levels(ttemp)=c("baixa","media","alta")

> ttemp

[1] alta baixa alta media alta media media media media alta media media

[13] baixa media baixa media media alta baixa media media baixa <NA> media

[25] media baixa

Levels: baixa media alta

> table(ttemp)

baixa media alta

6 14 5
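The table above counts only 25 of the 26 states: the smallest temperature, 22, coincides with the lowest break and falls outside the left-open interval (22,35], so it becomes NA. A sketch of one way to keep it, using the include.lowest argument of cut; the temperatures are copied from the road listing above and ttemp2 is an illustrative name.

```r
# January temperatures copied from the road listing above
temp <- c(62, 30, 64, 51, 65, 42, 37, 41, 44, 67, 54, 36, 33,
          37, 30, 42, 44, 65, 30, 44, 37, 33, 22, 57, 40, 29)

# Default intervals are open on the left, so 22 is not in (22,35]
sum(is.na(cut(temp, breaks = c(22, 35, 60, 67))))

# include.lowest = TRUE closes the first interval: [22,35]
ttemp2 <- cut(temp, breaks = c(22, 35, 60, 67), include.lowest = TRUE,
              labels = c("baixa", "media", "alta"))
table(ttemp2)
```

The labels argument assigns the level names in the same call, as an alternative to setting levels afterwards.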

Use the read.table command to read from a file or URL into a data.frame. Use scan to read from the console, a file, or a URL.

> write.table(road, file="road.txt")

> x = read.table(file="road.txt",header=TRUE)

> x

deaths drivers popden rural temp fuel

Alabama 968 158 64.0 66.0 62 119.0

Alaska 43 11 0.4 5.9 30 6.2

Arizona 588 91 12.0 33.0 64 65.0

Arkanas 640 92 34.0 73.0 51 74.0

Calif 4743 952 100.0 118.0 65 105.0

Colo 566 109 17.0 73.0 42 78.0

Conn 325 167 518.0 5.1 37 95.0

Dela 118 30 226.0 3.4 41 20.0

DC 115 35 12524.0 0.0 44 23.0

Florida 1545 298 91.0 57.0 67 216.0

Georgia 1302 203 68.0 83.0 54 162.0

Idaho 262 41 8.1 40.0 36 29.0

Ill 2207 544 180.0 102.0 33 350.0

Ind 1410 254 129.0 89.0 37 196.0

Iowa 833 150 49.0 100.0 30 109.0

Kansas 669 136 27.0 124.0 42 94.0

Kent 911 147 76.0 65.0 44 104.0

Louis 1037 146 72.0 40.0 65 109.0

Maine 1196 46 31.0 19.0 30 37.0

Maryl 616 157 314.0 29.0 44 113.0

Mass 766 255 655.0 17.0 37 166.0

Mich 2120 403 137.0 95.0 33 306.0

Minn 841 189 43.0 110.0 22 132.0

Miss 648 85 46.0 59.0 57 77.0

Mo 1289 234 63.0 100.0 40 180.0

Mont 259 38 4.6 72.0 29 31.0
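The text mentions scan without showing it. The sketch below does a write/read round trip through a temporary file and then uses scan on a text string. The small data frame d and the file path are made up for illustration, so the block does not depend on the MASS package.

```r
# Small illustrative data frame (first rows of the road listing)
d <- data.frame(deaths = c(968, 43, 588), temp = c(62, 30, 64),
                row.names = c("Alabama", "Alaska", "Arizona"))

path <- tempfile(fileext = ".txt")   # temporary file, removed at the end
write.table(d, file = path)
d2 <- read.table(file = path, header = TRUE)
all.equal(d, d2)

# scan() returns a plain vector; here it reads numbers from a text string
v <- scan(text = "1.5 2 3.25", quiet = TRUE)
v

unlink(path)
```

read.table returns a data.frame with named columns, whereas scan returns a single vector, which suits unstructured streams of values.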

Programming

> x = 1

> if (x>5) x else -x

[1] -1

> ifelse(x>5,x,-x)

[1] -1

> x= c(-3.2, 2,3,-4.5,5,6,0)

> log(x)

[1] NaN 0.6931472 1.0986123 NaN 1.6094379 1.7917595 -Inf

> any(x<=0)

[1] TRUE

> if (any(x<=0)) x[x<=0]

[1] -3.2 -4.5 0.0

> which(x<=0)

[1] 1 4 7

> all(x==0)

[1] FALSE

> for (i in 1:5) print(i**2)

[1] 1

[1] 4

[1] 9

[1] 16

[1] 25

> i = 1

> while (i<=5) {

+ print(i**2)

+ i = i+1

+ }

[1] 1

[1] 4

[1] 9

[1] 16

[1] 25

However, it is best to avoid loops in R where possible.

> args(apply)

function (X, MARGIN, FUN, ...)

NULL

> A

[,1] [,2] [,3]

[1,] 2 0 4

[2,] 3 0 0

[3,] 0 5 6

> apply(A, MARGIN = 1,FUN = mean)

[1] 2.000000 1.000000 3.666667

> apply(A, MARGIN = 2,FUN = sum)

[1] 5 5 10

> apply(A, MARGIN = 1,FUN = function(x) ifelse(x>0,log(x),0))

[,1] [,2] [,3]

[1,] 0.6931472 1.098612 0.000000

[2,] 0.0000000 0.000000 1.609438

[3,] 1.3862944 0.000000 1.791759
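Note that the last result is transposed: each column holds the values computed from one row of A. A sketch of loop-free alternatives to the apply calls above, using base R functions that operate on whole matrices (A is rebuilt so the block runs on its own; logA, sq_loop and sq_vec are illustrative names):

```r
A <- matrix(c(2, 0, 4,
              3, 0, 0,
              0, 5, 6), nrow = 3, byrow = TRUE)

# Dedicated functions replace the most common apply() calls
rowMeans(A)   # same values as apply(A, 1, mean)
colSums(A)    # same values as apply(A, 2, sum)

# ifelse() and log() are vectorised over whole matrices, and the
# result keeps the orientation of A (no transposition)
logA <- ifelse(A > 0, log(A), 0)
logA

# A loop and its vectorised equivalent
sq_loop <- numeric(5)
for (i in 1:5) sq_loop[i] <- i^2
sq_vec <- (1:5)^2
identical(sq_loop, sq_vec)
```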

Creating Functions

Let $X \sim \text{Exponential}(2)$. Its probability density function is

$$f(x) = 2 \exp(-2x)\, I(x \ge 0).$$

> f1 <- function(x) {

+ fx = ifelse(x < 0, 0, 2 * exp(-2 * x))

+ return(fx)

+ }

Leaving the parameter of the distribution free, $X \sim \text{Exponential}(\beta)$,

$$f(x) = \beta \exp(-\beta x)\, I(x \ge 0), \quad \beta > 0.$$

> dexp <- function(x,b) {   # note: this masks stats::dexp in the session

+ fdp = ifelse(x < 0, 0, b *exp(-b*x))

+ return(fdp)

+ }

> f1(x=3)

[1] 0.004957504

> dexp(x=3,b=2)

[1] 0.004957504

> integrate(dexp,-Inf,Inf,b=2)

1 with absolute error < 5e-07

> integrate(f1,0,2)

0.9816844 with absolute error < 1.1e-14

> integrate(f1,2,Inf)

0.01831564 with absolute error < 2.8e-06
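R already provides the exponential density and distribution function as dexp and pexp in the stats package. The sketch below compares a hand-written density with the built-in one; the name dexp_custom is chosen here, as an assumption for illustration, to avoid masking stats::dexp.

```r
# Hand-written density; the name avoids masking stats::dexp
dexp_custom <- function(x, b) {
  ifelse(x < 0, 0, b * exp(-b * x))
}

x <- c(-1, 0, 0.5, 3)
cbind(x, custom = dexp_custom(x, b = 2), builtin = stats::dexp(x, rate = 2))

# P(X <= 2) by numerical integration and from the distribution function
integrate(dexp_custom, 0, 2, b = 2)$value
pexp(2, rate = 2)
```

Extra arguments to integrate, such as b = 2, are passed on to the integrand.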

Graphics

> par(mfrow=c(1,2))

> boxplot(road$deaths , xlab="deaths", border="red")

> boxplot(road$drivers, xlab="drivers", cex.lab=2)

[Figure: boxplots of deaths (left) and drivers (right)]

> par(mfrow=c(1,2))

> boxplot(road$popden , xlab="popden" , outline=FALSE)

> boxplot(road$rural , xlab="rural", horizontal=TRUE)

[Figure: boxplot of popden without outliers (left) and horizontal boxplot of rural (right)]

> par(mfrow=c(1,2))

> boxplot(road$temp, xlab="temp", width=2)

> boxplot(road$fuel, xlab="fuel", notch=TRUE)

> boxplot(road, outline=FALSE)

> par(mfrow=c(1,3))

> hist(road[,1],main=colnames(road[1]),xlab="")

> hist(road[,2],main=colnames(road[2]),xlab="",col="grey")

> hist(road[,4],main=colnames(road[4]),xlab="",nclass=15)

[Figure: histograms of deaths, drivers, and rural]

> par(mfrow=c(1,3))

> for (i in 1:3) {

+ qqnorm(road[,i],main=colnames(road[i]))

+ qqline(road[,i])

+ }

[Figure: normal Q-Q plots (sample quantiles against theoretical quantiles) of deaths, drivers, and popden]

> head(airquality)

Ozone Solar.R Wind Temp Month Day

1 41 190 7.4 67 5 1

2 36 118 8.0 72 5 2

3 12 149 12.6 74 5 3

4 18 313 11.5 62 5 4

5 NA NA 14.3 56 5 5

6 28 NA 14.9 66 5 6

> names(airquality)

[1] "Ozone" "Solar.R" "Wind" "Temp" "Month" "Day"

> dim(airquality)

[1] 153 6

> attach(airquality, pos=2)

> par(mfrow=c(1,2))

> plot(Month,Ozone)

> boxplot(Ozone~Month)

[Figure: Ozone against Month as a scatter plot (left) and as boxplots by month, 5 to 9 (right)]

> par(mfrow=c(2,2), mar=c(3,3,1,0), mgp=c(1.7,1,0))

> plot(Temp,Ozone)

> plot(Ozone~Temp, subset = Month <= 6, main="Meses 5 e 6")

> plot(Ozone~Temp, subset = Month==7 | Month==8, main="Meses 7 e 8")

> plot(Ozone~Temp, subset = Month==9, main="Mes 9")

[Figure: Ozone against Temp in four panels: all months, "Meses 5 e 6", "Meses 7 e 8", and "Mes 9"]

> par(mfrow=c(2,2), mar=c(3,3,1,0), mgp=c(1.7,1,0))

> plot(Ozone, ty="l", xlab="")

> plot(Solar.R, ty="l", xlab="")

> plot(Wind, ty="l", xlab="")

> plot(Temp, ty="l", xlab="")

[Figure: line plots of Ozone, Solar.R, Wind, and Temp against observation index]

> plot(airquality[,1:4])

[Figure: scatter-plot matrix of Ozone, Solar.R, Wind, and Temp]

> library(lattice)

> barley[1:15,]

yield variety year site

1 27.00000 Manchuria 1931 University Farm

2 48.86667 Manchuria 1931 Waseca

3 27.43334 Manchuria 1931 Morris

4 39.93333 Manchuria 1931 Crookston

5 32.96667 Manchuria 1931 Grand Rapids

6 28.96667 Manchuria 1931 Duluth

7 43.06666 Glabron 1931 University Farm

8 55.20000 Glabron 1931 Waseca

9 28.76667 Glabron 1931 Morris

10 38.13333 Glabron 1931 Crookston

11 29.13333 Glabron 1931 Grand Rapids

12 29.66667 Glabron 1931 Duluth

13 35.13333 Svansota 1931 University Farm

14 47.33333 Svansota 1931 Waseca

15 25.76667 Svansota 1931 Morris

> table(barley$year,barley$site)

Grand Rapids Duluth University Farm Morris Crookston Waseca

1932 10 10 10 10 10 10

1931 10 10 10 10 10 10

> figura = barchart(yield ~ variety | site,

+ data = barley,groups = year,

+ layout = c(1,6),

+ ylab = "Barley Yield (bushels/acre)",

+ scales = list(x = list(abbreviate = TRUE,minlength = 5)))

> plot(figura)

[Figure: bar charts of barley yield (bushels/acre) by variety, grouped by year, with one panel per site: Grand Rapids, Duluth, University Farm, Morris, Crookston, Waseca]

3D Plot

> x=seq(-pi,pi,length=50)

> y = x

> f = outer(x,y,function(x,y)cos(y)/(1+x^2))

> persp(x,y,f,theta=30,phi=40)

See also:

CRAN Task View: Graphic Displays, etc.

The R Graph Gallery

R Graphics by Paul Murrell

Building Tables in LaTeX

> m = lm(deaths ~ temp + popden, data=road)

> summary(m)

Call:

lm(formula = deaths ~ temp + popden, data = road)

Residuals:

Min 1Q Median 3Q Max

-903.5 -519.4 -235.9 231.2 3236.0

Coefficients:

Estimate Std. Error t value Pr(>|t|)

(Intercept) 82.35216 646.30806 0.127 0.900

temp 22.03153 14.16167 1.556 0.133

popden -0.07437 0.07559 -0.984 0.335

Residual standard error: 921.4 on 23 degrees of freedom

Multiple R-squared: 0.1287, Adjusted R-squared: 0.05295

F-statistic: 1.699 on 2 and 23 DF, p-value: 0.205

> library(xtable)

> tab = xtable(m,caption="Exemplo de regress~ao.",label="tab1",

+ digits=2)

> print(tab)

            Estimate  Std. Error  t value  Pr(>|t|)
(Intercept)    82.35      646.31     0.13      0.90
temp           22.03       14.16     1.56      0.13
popden         -0.07        0.08    -0.98      0.34

Table 1: Exemplo de regressão.

\begin{table}[ht]

\begin{center}

\begin{tabular}{rrrrr}

\hline

& Estimate & Std. Error & t value & Pr($>$$|$t$|$) \\

\hline

(Intercept) & 82.35 & 646.31 & 0.13 & 0.90 \\

temp & 22.03 & 14.16 & 1.56 & 0.13 \\

popden & -0.07 & 0.08 & -0.98 & 0.34 \\

\hline

\end{tabular}

\caption{Exemplo de regress~ao.}

\label{tab1}

\end{center}

\end{table}

Probability Distributions

Example. Let $X \sim \text{Binomial}(n, \theta)$ with $n = 10$ and $\theta = 0.3$. Then

$$P(X = x) = p(x) = \binom{10}{x}\, 0.3^{x} (1 - 0.3)^{10-x}, \quad x = 0, \ldots, 10.$$

The commands below compute the probabilities and the cumulative probabilities:

> x = 0:10

> px = choose(10, x) * (0.3)^x * (1-0.3)^(10-x)

> fx = dbinom(x,10,0.3)

> Fx = cumsum(px)

> cbind(x,px,fx,Fx)

x px fx Fx

[1,] 0 0.0282475249 0.0282475249 0.02824752

[2,] 1 0.1210608210 0.1210608210 0.14930835

[3,] 2 0.2334744405 0.2334744405 0.38278279

[4,] 3 0.2668279320 0.2668279320 0.64961072

[5,] 4 0.2001209490 0.2001209490 0.84973167

[6,] 5 0.1029193452 0.1029193452 0.95265101

[7,] 6 0.0367569090 0.0367569090 0.98940792

[8,] 7 0.0090016920 0.0090016920 0.99840961

[9,] 8 0.0014467005 0.0014467005 0.99985631

[10,] 9 0.0001377810 0.0001377810 0.99999410

[11,] 10 0.0000059049 0.0000059049 1.00000000

> par(mfrow=c(1,2))

> plot(x, px, type="h")

> plot(x, Fx, type="s")

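[Figure: probability function p(x) (left) and distribution function F(x) (right) of the Binomial(10, 0.3), for x from 0 to 10]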
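Besides dbinom, R provides the matching distribution, quantile, and simulation functions pbinom, qbinom, and rbinom. A minimal sketch for the same Binomial(10, 0.3); the seed and the number of simulated values are arbitrary choices.

```r
x <- 0:10
Fx <- cumsum(dbinom(x, size = 10, prob = 0.3))

# pbinom() gives the cumulative probabilities directly
all.equal(Fx, pbinom(x, size = 10, prob = 0.3))

# qbinom() inverts the distribution function: smallest x with F(x) >= p
qbinom(0.5, size = 10, prob = 0.3)

# rbinom() simulates from the distribution; the seed is arbitrary
set.seed(1)
sims <- rbinom(1000, size = 10, prob = 0.3)
mean(sims)   # should be close to n * theta = 3
```

The same d/p/q/r prefix convention applies to the other distributions in the stats package, for example dnorm, pnorm, qnorm, and rnorm.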

Example. Let $X \sim N(\mu, \sigma^2)$, whose density function is

$$f(x) = (2\pi\sigma^2)^{-1/2} \exp\{-0.5\,(x-\mu)^2/\sigma^2\}, \quad x \in \mathbb{R},\ \mu \in \mathbb{R},\ \sigma^2 > 0.$$

We can write an R function for the density above,

> dnormal <- function(x,mu,sigma2){

+ logdens = -0.5*(log(2*pi*sigma2) + (x-mu)^2/sigma2)

+ return(exp(logdens))

+ }

or use the built-in function dnorm,

> args(dnorm)

function (x, mean = 0, sd = 1, log = FALSE)
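One difference between the two is easy to miss: the hand-written dnormal is parameterised by the variance, while dnorm takes the standard deviation. A short sketch comparing them, with dnormal redefined so the block runs on its own (the grid x and the parameter values are illustrative):

```r
dnormal <- function(x, mu, sigma2) {
  logdens <- -0.5 * (log(2 * pi * sigma2) + (x - mu)^2 / sigma2)
  exp(logdens)
}

x <- seq(-4, 4, length.out = 9)

# dnorm() takes the standard deviation, so sigma2 = 4 corresponds to sd = 2
all.equal(dnormal(x, mu = -1, sigma2 = 4), dnorm(x, mean = -1, sd = 2))

# The density should integrate to 1
integrate(dnormal, -Inf, Inf, mu = -1, sigma2 = 4)$value
```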

> x = seq(-4,4,l=100)

> plot (x,dnorm(x,0,1),xlim=c(-4,3),type="l",

+ ylab=expression(f(x)))

> lines(x,dnorm(x,-1,2), col=2, lty=2)

> lines(x,dnorm(x, 1,1), col=4, lty=4, lwd=2)

> legend(-3.5,0.35,leg=c("N(0,1)","N(-1,4)","N(1,1)"),

+ col=c(1,2,4), lty=c(1,2,4), lwd=c(1,1,2), bty="n")

[Figure: three normal density curves f(x) for x from −4 to 3, with a legend identifying each distribution]

> plot (x,pnorm(x),xlim=c(-3,3),type="l",ylab=expression(F(x)))

> lines(x,pnorm(x,-1,2),lty=2, lwd=2)

> lines(x,pnorm(x,1,.5),lty=3, lwd=2)

> legend(-2.5,.8,leg=c("N(0,1)","N(-1,4)","N(1,0.25)"),lty=1:3,bty="n")

[Figure: distribution functions F(x) of N(0,1), N(−1,4), and N(1,0.25) for x from −3 to 3]
