Chapter 2 Frequency Distributions, Stem-and-leaf displays, and Histograms

Upload: wesley

Post on 05-Jan-2016



Page 1: Chapter 2

Chapter 2

Frequency Distributions, Stem-and-leaf displays, and Histograms

Page 2: Chapter 2

Where have we been?

Page 3: Chapter 2

To calculate SS, the variance, and the standard deviation: find the deviations from μ, square and sum them (SS), divide by N (σ²), and take the square root (σ).

Example: Scores on a Psychology quiz

Student     X     X - μ     (X - μ)²
John        7     +1.00     1.00
Jennifer    8     +2.00     4.00
Arthur      3     -3.00     9.00
Patrick     5     -1.00     1.00
Marie       7     +1.00     1.00

ΣX = 30     N = 5     μ = 6.00
Σ(X - μ) = 0.00
Σ(X - μ)² = SS = 16.00
σ² = SS/N = 3.20     σ = √3.20 = 1.79
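The quiz example can be checked with a few lines of Python. This is only a sketch of the arithmetic on the slide; the variable names (scores, mu, ss, sigma) are illustrative choices, not part of the original material.

```python
import math

# Quiz scores from the example (John, Jennifer, Arthur, Patrick, Marie).
scores = [7, 8, 3, 5, 7]

n = len(scores)
mu = sum(scores) / n                     # mean: 30 / 5
deviations = [x - mu for x in scores]    # X - mu for each student
ss = sum(d ** 2 for d in deviations)     # sum of squared deviations (SS)
variance = ss / n                        # population variance: divide by N
sigma = math.sqrt(variance)              # population standard deviation

print(f"mu = {mu:.2f}, SS = {ss:.2f}, variance = {variance:.2f}, sigma = {sigma:.2f}")
# prints: mu = 6.00, SS = 16.00, variance = 3.20, sigma = 1.79
```

Note that the sketch divides by N, as the slide does, so it gives the population variance rather than a sample estimate.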

Page 4: Chapter 2

Ways of showing how scores are distributed around the mean

• Frequency distributions
• Stem-and-leaf displays
• Histograms

Page 5: Chapter 2

Some definitions

Frequency Distribution - a tabular display of the way scores are distributed across all the possible values of a variable.

Absolute Frequency Distribution - displays the count of each score.

Cumulative Frequency Distribution - displays the total number of scores at and below each score.

Relative Frequency Distribution - displays the proportion of each score.

Relative Cumulative Frequency Distribution - displays the proportion of scores at and below each score.

Page 6: Chapter 2

Example Data

Traffic accidents by bus drivers

•Studied 708 bus drivers.

•Recorded all accidents for a period of 4 years.

•Data look like:

3, 0, 6, 0, 0, 2, 1, 4, 1, … 6, 0, 2

Page 7: Chapter 2

Frequency Distributions

# of accidents   Absolute Freq.   Relative Frequency
 0               117              .165
 1               157              .222
 2               158              .223
 3               115              .162
 4                78              .110
 5                44              .062
 6                21              .030
 7                 7              .010
 8                 6              .008
 9                 1              .001
10                 3              .004
11                 1              .001
Total            708              .998

Calculate relative frequency: divide each absolute frequency by N.

For example, 117/708 = .165

Notice the rounding error (the relative frequencies total .998 rather than 1.000).
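A short sketch of the same calculation in Python, using the absolute frequencies from the table. The list name `absolute` and the choice to round to three decimals are assumptions made to match the slide's presentation.

```python
# Absolute frequencies for 0 through 11 accidents, copied from the table.
absolute = [117, 157, 158, 115, 78, 44, 21, 7, 6, 1, 3, 1]
n = sum(absolute)  # total number of drivers

# Relative frequency = absolute frequency / N, rounded as on the slide.
relative = [round(f / n, 3) for f in absolute]

for accidents, (f, rel) in enumerate(zip(absolute, relative)):
    print(f"{accidents:>2}  {f:>4}  {rel:.3f}")

# Rounded proportions need not add up to exactly 1.
print("sum of rounded relative frequencies:", round(sum(relative), 3))
```

Summing the unrounded proportions would give 1; the .998 total appears only because each entry is rounded first.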

Page 8: Chapter 2

What can you answer?

# of accidents:   0     1     2     3     4     5     6     7     8     9     10    11    Total
Relative Freq.:   .165  .222  .223  .162  .110  .062  .030  .010  .008  .001  .004  .001  .998

Proportion with at most 1 accident?
= .165 + .222 = .387; .387 * 100 = 38.7%

Proportion with 8 or more accidents?
= .008 + .001 + .004 + .001 = .014 = 1.4%

Proportion with between 4 and 7 accidents?
= .110 + .062 + .030 + .010 = .212 = 21.2%
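Each of these questions amounts to adding relative frequencies over a range of scores. A minimal sketch, assuming the rounded relative frequencies from the table and a hypothetical helper named `proportion`:

```python
# Rounded relative frequencies for 0 through 11 accidents, from the table.
relative = {0: .165, 1: .222, 2: .223, 3: .162, 4: .110, 5: .062,
            6: .030, 7: .010, 8: .008, 9: .001, 10: .004, 11: .001}

def proportion(low, high):
    """Sum the relative frequencies for scores from low to high, inclusive."""
    return round(sum(p for score, p in relative.items() if low <= score <= high), 3)

print("at most 1 accident:   ", proportion(0, 1))
print("8 or more accidents:  ", proportion(8, 11))
print("between 4 and 7:      ", proportion(4, 7))
```

Multiplying any of these proportions by 100 gives the percentage shown on the slide.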

Page 9: Chapter 2

Cumulative Frequencies

# of accidents   Absolute Frequency   Cumulative Frequency   Cumulative Relative Frequency
 0               117                  117                     .165
 1               157                  274                     .387
 2               158                  432                     .610
 3               115                  547                     .773
 4                78                  625                     .883
 5                44                  669                     .945
 6                21                  690                     .975
 7                 7                  697                     .984
 8                 6                  703                     .993
 9                 1                  704                     .994
10                 3                  707                     .999
11                 1                  708                    1.000
Total            708

Cumulative frequencies show number of scores at or below each point.

Calculate by adding the frequencies of all scores at or below each point.

Cumulative relative frequencies show the proportion of scores at or below each point.

Calculate by dividing cumulative frequencies by N at each point.
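The two calculations above are a running total followed by a division. One way to sketch them, assuming Python's standard `itertools.accumulate` for the running total (the variable names are illustrative):

```python
from itertools import accumulate

# Absolute frequencies for 0 through 11 accidents, from the table.
absolute = [117, 157, 158, 115, 78, 44, 21, 7, 6, 1, 3, 1]
n = sum(absolute)

# Running total: number of scores at or below each point.
cumulative = list(accumulate(absolute))

# Proportion of scores at or below each point.
cumulative_relative = [round(c / n, 3) for c in cumulative]

for accidents, (c, cr) in enumerate(zip(cumulative, cumulative_relative)):
    print(f"{accidents:>2}  {c:>4}  {cr:.3f}")
```

Because the division uses the unrounded cumulative counts, the last entry is exactly 1.000 and no rounding error accumulates; for 7 accidents the computation gives 697/708, which rounds to .984.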

Page 10: Chapter 2

Grouped Frequency Example

2.72 2.84 2.63 2.51 2.54 2.98 2.61 2.93 2.87 2.76 2.58 2.66 2.86 2.86
2.58 2.60 2.63 2.62 2.73 2.80 2.79 2.96 2.58 2.50 2.82 2.83 2.90 2.91
2.87 2.87 2.74 2.70 2.52 2.75 2.99 2.66 2.58 2.71 2.51 2.87 2.87 2.75
2.85 2.61 2.54 2.73 2.96 2.90 2.75 2.76 2.93 2.64 2.85 2.70 2.56 2.51
2.83 2.79 2.76 2.75 2.86 2.58 2.87 2.89 2.89 2.52 2.59 2.54 2.54 2.85
2.83 2.96 2.93 2.89 2.92 2.98 2.59 2.81 2.78 2.95 2.96 2.95 2.56 2.59
2.87 2.84 2.84 2.80 2.65 2.70 2.61 2.89 2.83 2.85 2.52 2.66 2.74 2.73
2.88 2.85

100 High school students’ average time in seconds to read ambiguous sentences.

Values range between 2.50 seconds and 2.99 seconds.

Page 11: Chapter 2

Grouped Frequencies

Needed when the number of values is large OR the values are continuous.

To calculate group intervals:

• First find the range.
• Determine a “good” interval based on the number of resulting intervals, the meaning of the data, and common, regular numbers.
• List intervals from largest to smallest.

Page 12: Chapter 2

Grouped Frequencies

Range = 2.99 - 2.50 = .49 ≈ .50

i = .1, #i = 5

Reading Time   Frequency
2.90-2.99      16
2.80-2.89      31
2.70-2.79      20
2.60-2.69      12
2.50-2.59      21

i = .05, #i = 10

Reading Time   Frequency
2.95-2.99       9
2.90-2.94       7
2.85-2.89      20
2.80-2.84      11
2.75-2.79      10
2.70-2.74      10
2.65-2.69       4
2.60-2.64       8
2.55-2.59      10
2.50-2.54      11
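Both tables can be reproduced from the 100 raw reading times. The sketch below is one possible approach, not the method used to build the slides: it converts each time to whole hundredths of a second so that interval boundaries are not affected by floating-point rounding. The function name `grouped_frequencies` and its parameters are hypothetical.

```python
from collections import Counter

# The 100 reading times (seconds) from the grouped frequency example.
times = [
    2.72, 2.84, 2.63, 2.51, 2.54, 2.98, 2.61, 2.93, 2.87, 2.76, 2.58, 2.66, 2.86, 2.86,
    2.58, 2.60, 2.63, 2.62, 2.73, 2.80, 2.79, 2.96, 2.58, 2.50, 2.82, 2.83, 2.90, 2.91,
    2.87, 2.87, 2.74, 2.70, 2.52, 2.75, 2.99, 2.66, 2.58, 2.71, 2.51, 2.87, 2.87, 2.75,
    2.85, 2.61, 2.54, 2.73, 2.96, 2.90, 2.75, 2.76, 2.93, 2.64, 2.85, 2.70, 2.56, 2.51,
    2.83, 2.79, 2.76, 2.75, 2.86, 2.58, 2.87, 2.89, 2.89, 2.52, 2.59, 2.54, 2.54, 2.85,
    2.83, 2.96, 2.93, 2.89, 2.92, 2.98, 2.59, 2.81, 2.78, 2.95, 2.96, 2.95, 2.56, 2.59,
    2.87, 2.84, 2.84, 2.80, 2.65, 2.70, 2.61, 2.89, 2.83, 2.85, 2.52, 2.66, 2.74, 2.73,
    2.88, 2.85,
]

def grouped_frequencies(values, width, lowest=250):
    """Count values per class interval.

    width and lowest are in hundredths of a second (width=10 means i = .1).
    Returns (lower, upper, frequency) tuples, largest interval first.
    """
    counts = Counter()
    for v in values:
        h = round(v * 100)  # e.g. 2.51 -> 251
        start = lowest + (h - lowest) // width * width
        counts[start] += 1
    return [(s / 100, (s + width - 1) / 100, counts[s])
            for s in sorted(counts, reverse=True)]

for width in (10, 5):
    print(f"i = {width / 100}")
    for lower, upper, freq in grouped_frequencies(times, width):
        print(f"  {lower:.2f}-{upper:.2f}  {freq}")
```

Calling the function with width=10 gives the five-interval table and width=5 gives the ten-interval table. An interval with no scores would simply be absent from the output in this sketch.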

Page 13: Chapter 2

Either is acceptable.

Use whichever display seems most informative.

In this case, the smaller intervals and the 10-category table seem more informative.

Sometimes it goes the other way, and a less detailed presentation is necessary to prevent the reader from missing the forest for the trees.

Page 14: Chapter 2

Stem and Leaf Displays

Used when seeing all of the values is important.

Shows:

• the data grouped
• all of the values
• a visual summary

Page 15: Chapter 2

Stem and Leaf Display

Reading time data (i = .05, #i = 10)

Reading Time   Leaves
2.9            5,5,6,6,6,6,8,8,9
2.9            0,0,1,2,3,3,3
2.8            5,5,5,5,5,6,6,6,7,7,7,7,7,7,7,8,9,9,9,9
2.8            0,0,1,2,3,3,3,3,4,4,4
2.7            5,5,5,5,6,6,6,8,9,9
2.7            0,0,0,1,2,3,3,3,4,4
2.6            5,6,6,6
2.6            0,1,1,1,2,3,3,4
2.5            6,6,8,8,8,8,8,9,9,9
2.5            0,1,1,1,2,2,2,4,4,4,4

Page 16: Chapter 2

Stem and Leaf Display

Reading time data (i = .1, #i = 5)

Reading Time   Leaves
2.9            0,0,1,2,3,3,3,5,5,6,6,6,6,8,8,9
2.8            0,0,1,2,3,3,3,3,4,4,4,5,5,5,5,5,6,6,6,7,7,7,7,7,7,7,8,9,9,9,9
2.7            0,0,0,1,2,3,3,3,4,4,5,5,5,5,6,6,6,8,9,9
2.6            0,1,1,1,2,3,3,4,5,6,6,6
2.5            0,1,1,1,2,2,2,4,4,4,4,6,6,8,8,8,8,8,9,9,9
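A stem-and-leaf display with i = .1 can be built by splitting each score into a stem (the first two digits) and a leaf (the last digit). The sketch below is an illustration only: the function name `stem_and_leaf` is hypothetical, and the sample is just the first 14 reading times rather than the full data set.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group two-decimal scores by stem (tenths) with sorted leaves (hundredths digit).

    Stems are returned as integers in tenths (29 means 2.9), largest first.
    """
    rows = defaultdict(list)
    for v in values:
        h = round(v * 100)          # work in whole hundredths, e.g. 2.84 -> 284
        rows[h // 10].append(h % 10)
    return {stem: sorted(leaves)
            for stem, leaves in sorted(rows.items(), reverse=True)}

# First 14 reading times from the grouped frequency example.
sample = [2.72, 2.84, 2.63, 2.51, 2.54, 2.98, 2.61,
          2.93, 2.87, 2.76, 2.58, 2.66, 2.86, 2.86]

for stem, leaves in stem_and_leaf(sample).items():
    print(f"{stem / 10:.1f} | {','.join(str(leaf) for leaf in leaves)}")
```

Splitting each stem into a low half (leaves 0-4) and a high half (leaves 5-9) would give the i = .05 version of the display.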

Page 17: Chapter 2

Transition to Histograms

[Figure: the stem-and-leaf display (i = .05) turned on its side, with the leaves stacked above each class interval from 2.50-2.54 to 2.95-2.99.]

Page 18: Chapter 2

Histogram of reading times

[Figure: histogram. Horizontal axis: Reading Time (seconds), in classes from 2.50-2.54 to 2.95-2.99. Vertical axis: Frequency, from 0 to 20.]

Page 19: Chapter 2

Histogram concepts - 1

Used to display continuous data.

Discrete data are shown on a bar graph.

But most psychology data are continuous, even if they are measured with integers.

Page 20: Chapter 2

Histogram concepts - 2

Use bar graphs, not histograms, for discrete data.

You rarely see data that are really discrete.

Discrete data are categories or rankings.

If you have continuous data, you can use histograms, but remember real class limits.

Histograms can be used for relative frequencies as well.

Page 21: Chapter 2

What are the real limits of each class?

[Figure: histogram of reading times. Horizontal axis: classes from 2.50-2.54 to 2.95-2.99. Vertical axis: Frequency, from 0 to 20.]

Real limits of the fifth class are ???? - ????

Real limits of the highest class are ???? - ????

Page 22: Chapter 2

What are the real limits of each class?

[Figure: histogram of reading times. Horizontal axis: classes from 2.50-2.54 to 2.95-2.99. Vertical axis: Frequency, from 0 to 20.]

Real limits of the fifth class are 2.695 - 2.745

Real limits of the highest class are 2.945 - 2.995
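The answers follow from extending each class by half of the measurement unit on both sides; with times recorded to the hundredth of a second, that half unit is .005. A minimal sketch, assuming Python's `decimal` module to keep the arithmetic exact (the function name `real_limits` is hypothetical):

```python
from decimal import Decimal

def real_limits(lower, upper, unit="0.01"):
    """Return the real limits of a class given its apparent limits.

    The real limits lie half a measurement unit below the apparent lower
    limit and half a unit above the apparent upper limit.
    """
    half = Decimal(unit) / 2
    return Decimal(lower) - half, Decimal(upper) + half

print(real_limits("2.70", "2.74"))   # fifth class
print(real_limits("2.95", "2.99"))   # highest class
```

Adjacent classes share a real limit (2.745 is both the top of 2.70-2.74 and the bottom of 2.75-2.79), which is why the bars of a histogram touch.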

Page 23: Chapter 2

Predicting from Theoretical Distributions

Theoretical distributions show how scores can be expected to be distributed around the mean. (Mean = 2.755 for the reading data.)

Distributions are named after the shapes of their histograms:

• Rectangular
• J-shaped
• Bell (Normal)
• many others

Page 24: Chapter 2

Rectangular Distribution of scores

Page 25: Chapter 2

Flipping a coin

100 flips - how many heads and tails do you expect?

[Figure: bar graph with one bar each for Heads and Tails. Vertical axis: frequency, from 0 to 100.]

Page 26: Chapter 2

Rolling a die

120 rolls - how many of each number do you expect?

[Figure: bar graph with one bar for each face, 1 through 6. Vertical axis: frequency, from 0 to 100.]

Page 27: Chapter 2

Rolling 2 dice

How many combinations are possible?

Dice Total   Absolute Freq.   Relative Frequency
 1            0               .000
 2            1               .028
 3            2               .056
 4            3               .083
 5            4               .111
 6            5               .139
 7            6               .167
 8            5               .139
 9            4               .111
10            3               .083
11            2               .056
12            1               .028
Total        36              1.001
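The 36 combinations can be listed exhaustively rather than counted by hand. A brief sketch, assuming two fair six-sided dice and using `itertools.product` to enumerate every pair of faces:

```python
from collections import Counter
from itertools import product

# Every ordered pair of faces (first die, second die): 6 * 6 = 36 combinations.
totals = Counter(a + b for a, b in product(range(1, 7), repeat=2))
n_combinations = sum(totals.values())

# Theoretical relative frequency of each total, rounded as in the table.
relative = {t: round(totals[t] / n_combinations, 3) for t in range(2, 13)}

for total in range(2, 13):
    print(f"{total:>2}  {totals[total]}  {relative[total]:.3f}")
```

A total of 1 never occurs, which is why the table lists it with a frequency of 0.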

Page 28: Chapter 2

Rolling 2 dice

360 rolls - how many of each number do you expect?

[Figure: bar graph with one bar for each dice total, 1 through 12. Vertical axis: frequency, from 0 to 100.]

Page 29: Chapter 2

Normal Curve

Page 30: Chapter 2

J Curve

Occurs when socially normative behaviors are measured. Most people follow the norm, but there are always a few outliers.

Page 31: Chapter 2

Principles of Theoretical Curves

Expected frequency = Theoretical relative frequency * N

Expected frequencies are your best estimates because they are closer, on the average, than any other estimate when we square the error.

Law of Large Numbers - The more observations that we have, the closer the relative frequencies should come to the theoretical distribution.
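Both principles can be illustrated with the two-dice example. The sketch below is an assumption-laden illustration rather than part of the original slides: it computes expected frequencies as theoretical relative frequency times N, then simulates rolls with Python's `random` module to compare observed and theoretical proportions. The function names and the seed are hypothetical choices.

```python
import random
from collections import Counter
from fractions import Fraction

# Theoretical relative frequency of each total for two fair dice.
theoretical = {t: Fraction(6 - abs(t - 7), 36) for t in range(2, 13)}

def expected_frequencies(n):
    """Expected frequency = theoretical relative frequency * N."""
    return {t: float(p * n) for t, p in theoretical.items()}

def observed_relative(n, seed=0):
    """Simulate n rolls of two dice; return the observed relative frequencies."""
    rng = random.Random(seed)  # seeded so repeated runs give the same rolls
    counts = Counter(rng.randint(1, 6) + rng.randint(1, 6) for _ in range(n))
    return {t: counts[t] / n for t in theoretical}

print(expected_frequencies(360))

# Largest gap between observed and theoretical proportions for growing N.
for n in (36, 360, 36000):
    observed = observed_relative(n)
    worst = max(abs(observed[t] - float(theoretical[t])) for t in theoretical)
    print(n, round(worst, 4))
```

For 360 rolls the expected frequencies run from 10 (totals of 2 and 12) up to 60 (a total of 7). The simulated gaps should tend to shrink as N grows, although any single run can deviate from that pattern.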

Page 32: Chapter 2

Q & A: Continuous data

HOW IS THE FACT THAT WE ARE DISPLAYING CONTINUOUS DATA SHOWN ON A HISTOGRAM AS OPPOSED TO A BAR GRAPH?

The bars of a histogram meet at the real limits of each interval.

IF DATA CAN ONLY BE INTEGERS (SUCH AS NUMBER OF TRUE/FALSE QUESTIONS ANSWERED CORRECTLY ON A PSYCH QUIZ), HOW COME IT IS CALLED CONTINUOUS DATA?

Whether data are continuous or discrete depends on what you are measuring, not on the accuracy of your measuring instrument. For example, distance is continuous whether you measure it with a yardstick or a micrometer. Knowledge, like self-confidence and other psychological variables, is probably best thought of as a continuous variable.

Page 33: Chapter 2

Determining “i” (the size of the interval)

WHAT IS THE RULE FOR DETERMINING THE SIZE OF INTERVALS TO USE IN WHICH TO GROUP DATA?

Whatever intervals seem appropriate to present the data most informatively. It is a matter of judgment. Usually we use 6 to 12 same-size intervals, each of which uses intuitively obvious endpoints (e.g., 5s and 0s).