chapter 02 - collection and presentation of data

Upload: clemence-cruz

Post on 12-Jan-2016

DESCRIPTION

Covers different methods of collecting and presenting data.

TRANSCRIPT

Page 1: Chapter 02 - Collection and Presentation of Data
Page 2: Chapter 02 - Collection and Presentation of Data

• What is a variable?

• Characteristic or attribute

• Can assume different values

Page 3: Chapter 02 - Collection and Presentation of Data

DISCRETE VS. CONTINUOUS

• DISCRETE

• Finite, can be counted

• Counting or enumeration

• Examples: Sex, STFAP Bracket

• CONTINUOUS

• Infinitely many values

• Real line

• Examples: Height, Life span

QUALITATIVE VS. QUANTITATIVE

• QUALITATIVE

• Categorical

• Descriptive

• QUANTITATIVE

• Numerical

• Represents amount or quantity

Page 4: Chapter 02 - Collection and Presentation of Data
Page 5: Chapter 02 - Collection and Presentation of Data

• WHAT IS MEASUREMENT?

• Process

• Determine value or label of variable

• Based on observations

• WHY MEASURE?

• Important

• Determine appropriate tool

Page 6: Chapter 02 - Collection and Presentation of Data

• NOMINAL

• ORDINAL

• INTERVAL

• RATIO

Page 7: Chapter 02 - Collection and Presentation of Data

NOMINAL

• Numbers serve as classification

• Categories are

• Distinct

• Mutually exclusive

• Exhaustive

• Weakest level of measurement

• No absolute value

• No operations permissible

• Examples

• Sex

• Civil Status

• Enrollment Status

Page 8: Chapter 02 - Collection and Presentation of Data

ORDINAL

• Classification + Ranking

• Arrange categories according to magnitude

• No exact measurement between two orders

• Examples

• Shirt sizes

• Awards

• Year level

• Evaluation scores

Page 9: Chapter 02 - Collection and Presentation of Data

INTERVAL

• Contains all properties of ordinal and nominal

• Equal intervals are present

• Multiplication and division are not possible

• Fixed unit of measure

• No absolute zero; zero does not indicate absence of the characteristic

• Examples

• IQ

• Calendar dates

• Temperature reading

Page 10: Chapter 02 - Collection and Presentation of Data

RATIO

• Contains all properties of ordinal, nominal, and interval levels

• Has magnitude

• Has fixed unit of measurement

• Has an absolute zero

• Strongest level of measurement

• Any arithmetic operation is permissible

• Examples

• Weekly allowance

• Speed of a car in kph

• Rainfall in cm
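As a small illustration of why ratios are meaningful only at the ratio level, the sketch below compares two temperature readings in Celsius (interval, no absolute zero) and in Kelvin (ratio, absolute zero). The readings and the Kelvin conversion are assumptions added for illustration; they are not part of the original slides.

```python
# Interval vs. ratio level: Celsius has no absolute zero, Kelvin does.
def celsius_to_kelvin(c):
    return c + 273.15

t1_c, t2_c = 10.0, 20.0   # illustrative readings (assumed values)

# The Celsius ratio is 2.0, but "twice as hot" is not a valid statement.
ratio_c = t2_c / t1_c

# On the Kelvin scale the ratio is meaningful, and it is nowhere near 2.
ratio_k = celsius_to_kelvin(t2_c) / celsius_to_kelvin(t1_c)

# Differences are meaningful on both scales because intervals are equal.
diff_c = t2_c - t1_c
diff_k = celsius_to_kelvin(t2_c) - celsius_to_kelvin(t1_c)

print(ratio_c, round(ratio_k, 3), diff_c, round(diff_k, 2))
```

The equal differences reflect the "equal intervals" property; the differing ratios show why multiplication and division are reserved for the ratio level.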

Page 11: Chapter 02 - Collection and Presentation of Data
Page 12: Chapter 02 - Collection and Presentation of Data

• General misconception: new data must always be collected

• Identify all pertinent data already available

• Previous studies

• Data compiled by agencies

• NSO

• NSCB

• BAS

• BSP

• Examine

• Source

• Availability

• Scope

Page 13: Chapter 02 - Collection and Presentation of Data

• What is a survey?

• Data collection method

• Asking questions

• People who answer questions are RESPONDENTS

• Sample survey

• More common

• Respondents: chosen objectively

• Several ways for communication

• Personal interviews

• Telephone

• Self-administered

• Online surveys

• Focus group discussion

• Weigh pros and cons

• Accuracy of data obtained

• Cost and time

• Ability of method to obtain data needed

Page 14: Chapter 02 - Collection and Presentation of Data

PERSONAL INTERVIEW

• Suitable for many types of problems

• Usually provides most accurate and complete responses

• Ability to read and write not necessary (for respondents)

• Most expensive and time consuming

• Relatively high response rate

• The INTERVIEWER is the most crucial element

• Reliability of data should be ensured

• Training

• Editing

• Field

• Central

Page 15: Chapter 02 - Collection and Presentation of Data

TELEPHONE INTERVIEW

PROS

• Shares some advantages of the personal interview: the INTERVIEWER

• Easier to supervise work of interviewers

• Cost- and time-efficient

CONS

• Respondents that can be reached are limited

• Time for interview is limited

Page 16: Chapter 02 - Collection and Presentation of Data

SELF-ADMINISTERED QUESTIONNAIRE

PROS

• Cost-efficient

• Convenient for respondents

• Own time and pace

• Freedom in expressing self

CONS

May be

remedied

• More prone to misinterpretation of questions

• More prone to vague answers

• Response rate is low

• Delay in responses

Page 17: Chapter 02 - Collection and Presentation of Data

ONLINE SURVEYS

• Closely linked to self-administered questionnaires

• Same pros and cons as self-administered questionnaires

• Additional con

• Internet forms can be manipulated

Page 18: Chapter 02 - Collection and Presentation of Data

FOCUS GROUP DISCUSSION

• In-depth discussion among participants

• Small group

• Moderator

• Data obtained

• Sentiments

• Ideas

• Attitudes

• Results not always conclusive

• Mostly used to

• Formulate hypotheses

• Explain results of previous studies

Page 19: Chapter 02 - Collection and Presentation of Data

• Method of collecting data

• Direct human intervention

• Determine effects of a certain treatment

• Control group and a treatment group

• Dependent, explanatory, and extraneous variables

• Also has disadvantages

• Uncommon

• Volunteer subjects

• Applicability of results

• Basic steps

1. Specify the response variable and explanatory variables

2. Identify possible extraneous variables

3. Determine how to control extraneous variables

4. Assign treatment at random and apply assigned treatment

5. Measure response variable for each subject at the end of experiment

6. Analyze the data
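Step 4 above (assigning treatment at random) can be sketched in a few lines. This is a minimal illustration, assuming a single treatment and an even split between groups; the subject labels and the function name `assign_treatments` are hypothetical.

```python
import random

def assign_treatments(subjects, seed=None):
    """Randomly split subjects into a control group and a treatment group."""
    rng = random.Random(seed)
    shuffled = list(subjects)      # copy so the caller's list is untouched
    rng.shuffle(shuffled)          # chance, not the researcher, decides the order
    half = len(shuffled) // 2
    return {"control": shuffled[:half], "treatment": shuffled[half:]}

# Hypothetical volunteer subjects.
subjects = ["S1", "S2", "S3", "S4", "S5", "S6"]
groups = assign_treatments(subjects, seed=1)
print(groups)
```

Random assignment helps balance extraneous variables across the two groups, so differences in the response variable can more plausibly be attributed to the treatment.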

Page 20: Chapter 02 - Collection and Presentation of Data

• Method of data collection

• Recording observations

• As phenomenon happens

• Useful for

• Studying reactions and behaviors

• Subjects unable to express themselves

• Major approaches

• Duration recording (how long behavior lasts)

• Frequency count recording (how often behavior happens)

• Latency recording (length of time between stimulus and first occurrence)

• Interval recording (partition the time; count the intervals in which the behavior occurs)

• Time sampling (checks for behavior at a specified time)

• Two types

• Participant

• Non-participant

• Use of observation is limited to characteristics that can be observed

• Usually more successful than surveys (nonverbal behavior)

• More successful than experiments in getting realistic data

• Objective sampling procedures are difficult to use

Page 21: Chapter 02 - Collection and Presentation of Data

• Internal data generated from operation and administration

• Data generated from registration

• Computer simulation

Page 22: Chapter 02 - Collection and Presentation of Data
Page 23: Chapter 02 - Collection and Presentation of Data

• Commonly used in surveys

• Statistical sampling theory

• Said to be accurate

• Based on probability theory

• Sample

• Subset of population of interest

• Should be representative

• Homogeneous population not always a requirement

• Range of data is important

• Probability theory

• Sample survey

• Use depends on

• Type of problem

• Population of interest

• Amount of resources

• Sampling design

• High precision, low cost

• Select the design which meets

• Budget

• Time

• Precision requirement

Page 24: Chapter 02 - Collection and Presentation of Data

• Population

• Target population. The population about which information is desired

• Sampled or sampling population. The population from which sample is actually obtained.

• Sampling Frame. List of units or members in a sampling population

• Sampling Unit. Member or unit of the sampling population

• Survey

• Census

• Sample survey

• Bias. Systematic tendency for sample to misrepresent population

• Precision. How closely estimates from repeated sampling agree with one another

• Low precision: values tend to be widely spread out

• Probability. Measure of relative occurrence or non-occurrence of one of the possible outcomes of an experiment or procedure

• Sampling Error. Error that arises because only a sample, not the whole population, is observed

• Non-sampling Error. Error due to other factors

• Total Error. Deviation of an estimate from the true value it is supposed to estimate.

Page 25: Chapter 02 - Collection and Presentation of Data

• More economical

• Accomplished faster

• Wider scope

• More accurate

• Most feasible method

Page 26: Chapter 02 - Collection and Presentation of Data

PROBABILITY SAMPLING

• Each unit has a known, nonzero probability of being in the sample

• Rules and procedures present for sample selection and estimation

• Objective: make inferences

NON-PROBABILITY SAMPLING

• Probabilities of selection are not specified

• Some elements may not have a chance to be in the sample

• No objective way of assessing results obtained

• Pros

• Convenience

• Economical

• Easy

• Cons

• Error in sampling can’t be measured

• Sample may not be representative

Page 27: Chapter 02 - Collection and Presentation of Data

• Accidental/Convenience Sampling. Whatever items or units come to hand are used as the sample.

• Judgment/Purposive Sampling. Sample selected in accordance with an expert’s subjective judgment. We choose only those who best meet the purpose of the study.

• Quota Sampling. Interviewer is required to interview a certain number of persons with a given set of characteristics.

• Snowball Sampling

Page 28: Chapter 02 - Collection and Presentation of Data

• Simple Random Sampling

• Stratified Sampling

• Systematic Sampling

• Cluster Sampling

• Multi-stage Sampling

Page 29: Chapter 02 - Collection and Presentation of Data

SIMPLE RANDOM SAMPLING

• Random sample

• n observations

• Each subset of n observations of the population has the same chance of being selected

• With replacement or without replacement

• Appropriate for homogeneous populations

• Advantages

• Theory involved is much easier to understand

• Estimation methods are simple and easy

• Disadvantages

• Sample chosen may be widely spread

• Population list (frame) is needed

• May not be applicable for heterogeneous populations
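Simple random sampling with and without replacement can be sketched with Python's standard `random` module. The frame of 20 unit IDs and the sample size are assumed values for illustration only.

```python
import random

rng = random.Random(42)          # fixed seed so the draw can be repeated

frame = list(range(1, 21))       # hypothetical sampling frame: unit IDs 1..20
n = 5                            # desired sample size

# Without replacement: a unit can appear at most once in the sample.
srs_without = rng.sample(frame, n)

# With replacement: each draw is made from the full frame, so repeats are possible.
srs_with = [rng.choice(frame) for _ in range(n)]

print(srs_without, srs_with)
```

Note that both versions need the complete population list (the frame), which is the disadvantage listed above.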

Page 30: Chapter 02 - Collection and Presentation of Data

STRATIFIED SAMPLING

• Population should be divided or stratified into homogeneous groups

• Select simple random sample from each subgroup

• Strata

• Related to topic being studied

• Different from each other, homogeneous within

• Advantages

• May increase precision of estimates

• Comprehensive analysis

• Convenient

• Disadvantages

• List for each stratum is needed

• Additional prior info is needed

• Population

• Sub-populations
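A minimal sketch of stratified sampling follows: a simple random sample is drawn separately from each stratum. Proportional allocation (the same sampling fraction in every stratum) is an assumption made here; the slides do not specify an allocation rule. The strata and the function name `stratified_sample` are hypothetical.

```python
import random

def stratified_sample(strata, fraction, seed=None):
    """Draw a simple random sample from each stratum (proportional allocation)."""
    rng = random.Random(seed)
    sample = {}
    for name, units in strata.items():
        k = max(1, round(len(units) * fraction))  # at least one unit per stratum
        sample[name] = rng.sample(units, k)       # SRS without replacement
    return sample

# Hypothetical frame: one list of students per year level (one list per stratum).
strata = {
    "Year 1": [f"Y1-{i}" for i in range(40)],
    "Year 2": [f"Y2-{i}" for i in range(30)],
    "Year 3": [f"Y3-{i}" for i in range(20)],
}
sample = stratified_sample(strata, fraction=0.10, seed=1)
sizes = {name: len(units) for name, units in sample.items()}
print(sizes)
```

The need for a separate list per stratum is visible in the input structure, which matches the first disadvantage above.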

Page 31: Chapter 02 - Collection and Presentation of Data

SYSTEMATIC SAMPLING

• Take every kth unit from an ordered population

• First unit is selected at random

• k is the sampling interval

• Advantages

• Easy to draw sample

• Possible to select sample without frame

• Sample is spread evenly

• Likely to give more precise estimates compared to simple random sampling (SRS)

• Disadvantage

• Sample may consist of only similar types if there are periodic regularities in the list
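Systematic sampling can be sketched as a random start followed by every kth unit. The population of 100 numbered units and the interval k = 10 are assumed values; the function name `systematic_sample` is hypothetical.

```python
import random

def systematic_sample(units, k, seed=None):
    """Pick a random start among the first k units, then take every kth unit."""
    rng = random.Random(seed)
    start = rng.randrange(k)     # index 0..k-1, chosen at random
    return units[start::k]       # every kth unit from the ordered list

population = list(range(1, 101))     # hypothetical ordered population, units 1..100
sample = systematic_sample(population, k=10, seed=7)
print(sample)
```

If the list had a periodic pattern with the same period as k, every selected unit would fall at the same point of the cycle, which is the disadvantage noted above.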

Page 32: Chapter 02 - Collection and Presentation of Data

CLUSTER SAMPLING

• Sample of distinct groups or clusters of smaller units (elements)

• Clusters are mutually exclusive

• Each cluster is heterogeneous

• aka Area Sample

• M is the size of the cluster

• N is the number of clusters

• Advantages

• Population list not needed

• Only list of clusters needed

• Reduced transportation cost

• Disadvantages

• Difficult estimation procedures

• Costs and problems of statistical analyses are greater
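In cluster sampling, whole clusters are selected at random and every element in a chosen cluster is included. The sketch below assumes one-stage cluster sampling with equal selection chances; the area names, household labels, and the function name `cluster_sample` are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Select whole clusters at random; every element of a chosen cluster is kept."""
    rng = random.Random(seed)
    # Only the list of cluster names is needed to make the selection.
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: list(clusters[name]) for name in chosen}

# Hypothetical clusters: households grouped by area.
clusters = {
    "Area A": ["A1", "A2", "A3"],
    "Area B": ["B1", "B2"],
    "Area C": ["C1", "C2", "C3", "C4"],
    "Area D": ["D1", "D2", "D3"],
}
sample = cluster_sample(clusters, n_clusters=2, seed=3)
print(sample)
```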

Page 33: Chapter 02 - Collection and Presentation of Data

MULTI-STAGE SAMPLING

• Sampling accomplished in two or more stages

• First stage: primary units; second stage: secondary units

• Further stages may be added

• Advantages

• Reduced listing cost

• Reduced transportation cost

• Disadvantages

• Estimation procedures are difficult

• Much planning is needed
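A two-stage design can be sketched as follows: primary units are sampled first, then secondary units are sampled within each chosen primary unit. Simple random sampling at both stages is an assumption; the school and student labels and the function name `two_stage_sample` are hypothetical.

```python
import random

def two_stage_sample(primary_units, n_primary, n_secondary, seed=None):
    """Stage 1: sample primary units. Stage 2: sample secondary units within each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(primary_units), n_primary)
    return {
        name: rng.sample(primary_units[name],
                         min(n_secondary, len(primary_units[name])))
        for name in chosen
    }

# Hypothetical primary units (schools) with their secondary units (students).
schools = {
    "School 1": ["s1a", "s1b", "s1c", "s1d"],
    "School 2": ["s2a", "s2b", "s2c"],
    "School 3": ["s3a", "s3b", "s3c", "s3d", "s3e"],
}
sample = two_stage_sample(schools, n_primary=2, n_secondary=2, seed=5)
print(sample)
```

Only the chosen primary units need a list of their secondary units, which is where the reduced listing cost comes from.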

Page 34: Chapter 02 - Collection and Presentation of Data

• Consider efficiency of the scheme

• Larger sample, more confidence in conclusions

• Avoid bias

Page 35: Chapter 02 - Collection and Presentation of Data

• Dependent on

• Population

• Nature

• Size

• Purpose of study

• At least minimum sample size to properly represent population

• Keep in mind

• Cost

• Reliability of estimates obtained

• With little to no variation, there is no reason to take a large sample just because the population is large

• With greater variation, a larger sample size is needed, balanced against cost

Page 36: Chapter 02 - Collection and Presentation of Data
Page 37: Chapter 02 - Collection and Presentation of Data

• Textual

• Tabular

• Graphical

Page 38: Chapter 02 - Collection and Presentation of Data

• Put important figures in the text of the report

• Highlight significant figures

• Should give reader clearer understanding of the significance of the figures about conclusions made in the research problem

• Not advisable for large masses of data

Page 39: Chapter 02 - Collection and Presentation of Data

Political crises in the Middle East and in North Africa resulted in zero growth in Net Primary Income (NPI). This zero growth, in turn, impeded the growth in gross national income (GNI) from 11.5 percent the previous year to 3.6 percent this quarter.

In terms of seasonally adjusted data, GDP grew by 1.9 percent whereas GNI grew, albeit at a slower pace, by 0.9 percent. Growth for the Agriculture, Hunting, Fishery, and Forestry sector was recorded at 0.9 percent. This continued growth is attributed to the rebound of production of palay, sugarcane, and corn. Meanwhile, the strong performance of Manufacturing, Mining & Quarrying, and Construction neutralized the contraction of Electricity, Gas, and Water Supply. As a result, the Industry sector grew by 3.2 percent this quarter, from its 2.1 percent gain from the fourth quarter of 2010. Services sector likewise perked up in the first quarter. The sector grew by 1.3 percent after a 0.7 percent decline in the fourth quarter of 2010. All service subsectors posted positive growth, apart from Public Administration and Defense.

As projected population reached 95.1 million, per capita GDP rose by 2.9 percent. Per capita GNI and Household Final Consumption Expenditure also grew by 1.7 and 2.9 percent, respectively.

Page 40: Chapter 02 - Collection and Presentation of Data

• Most common method of data presentation

• Allows for comparison and pattern/relationship recognition

• Rows and columns

• Minimal discussion or lengthy explanations in text

• Three types

• Leader work

• Text tabulation

• Formal statistical table

Page 41: Chapter 02 - Collection and Presentation of Data

• Leader Work

• Simplest layout

• No table title or column headings

• Within text; support

• Descriptive or introductory statement is needed

• Text Tabulation

• Has column headings and table borders

• No table title or number

• Introductory statement needed

• Formal Statistical Table

• Most complete

• Can stand alone

• Has parts

Parts of a formal statistical table

• Heading

• Table number

• Table title

• Head note

• Box Head

• Spanner head

• Column heading

• Panel

• Stub

• Stub head

• Center head

• Row caption

• Block

• Field

• Line

• Column

• Cell

• Footnote

• Source note

Page 42: Chapter 02 - Collection and Presentation of Data

[Figure: sample formal statistical table with numbered parts. 1: Heading, 2: Stub, 3: Notes, 4: Boxhead, 5: Field]

Page 43: Chapter 02 - Collection and Presentation of Data

• Portrays numerical figures or relationships among variables in pictorial form

• General picture

• Good chart must be

• Accurate

• Clear

• Simple

• Professional

• Well-designed

• Types

• Line chart

• Vertical bar chart

• Horizontal bar chart

• Pictograph

• Pie chart

• Statistical map

Page 44: Chapter 02 - Collection and Presentation of Data

• Chart title

• Coordinate axes

• Point of origin

• Scale divisions

• Grid lines or coordinate lines

• Scale figures

• Scale labels or legends

• Curves

• Curve legends

• Footnote

• Source note

Page 45: Chapter 02 - Collection and Presentation of Data

Figure 1. Growth of Exports & Imports. Year-on-year growth rates in percent (%), 2009-Q1 to 2011-Q1 (in constant 2000 prices)

[Line chart: two curves, Exports and Imports; vertical scale from (15.00) to 30.00; horizontal scale in quarters, 2010:1 to 2012:2]

Source: NSCB

Note: Previous issues use data with 1985 as the base year; starting 2011, the NSCB uses 2000 as the new base year.

Page 46: Chapter 02 - Collection and Presentation of Data
Page 47: Chapter 02 - Collection and Presentation of Data

• Organization facilitates analyses and interpretation

• Data characterization

• Raw data

• Array

• Frequency distribution table

Page 48: Chapter 02 - Collection and Presentation of Data

RAW DATA

• Also known as ungrouped or unclassified data

• Data in its original form

• Not organized yet

• Recorded in the order observed

Page 49: Chapter 02 - Collection and Presentation of Data

ARRAY

• Arrangement of data according to magnitude

• Ascending or descending

• Advantages

• Easier to detect smallest and largest values

• Easy to infer concentration of data values

• Disadvantages

• Inconvenient as data becomes voluminous

• Does not picture clearly the distribution for large masses of data
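Turning raw data into an array is just a sort. The five values below are assumed for illustration.

```python
# Raw data: recorded in the order observed (illustrative values).
raw = [16500, 10850, 11850, 7500, 13500]

# Array: the same observations arranged by magnitude.
array_ascending = sorted(raw)
array_descending = sorted(raw, reverse=True)

# Smallest and largest values are now at the ends of the array.
smallest, largest = array_ascending[0], array_ascending[-1]

print(array_ascending, smallest, largest)
```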

Page 50: Chapter 02 - Collection and Presentation of Data

FREQUENCY DISTRIBUTION TABLE

• Summarized table

• Classes are distinct values or intervals with frequency counts

• Two types

• Single-value grouping. Frequency count of observed values where classes are distinct values.

• Grouping by class intervals. Frequency count of observed values where classes are intervals.
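Single-value grouping amounts to counting how often each distinct value occurs. A minimal sketch with `collections.Counter` follows; the data (number of siblings reported by ten respondents) are hypothetical.

```python
from collections import Counter

# Hypothetical discrete data: number of siblings for ten respondents.
siblings = [0, 2, 1, 2, 3, 1, 2, 0, 1, 2]

freq = Counter(siblings)          # each distinct value is its own class
table = sorted(freq.items())      # (value, frequency) pairs in ascending order

for value, count in table:
    print(value, count)
```

Grouping by class intervals works the same way, except that each observation is first mapped to the interval that contains it.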

Page 51: Chapter 02 - Collection and Presentation of Data

• Class interval. Numbers defining a class

• Class frequency. Number of observations falling under a class interval

• Class limits. End numbers of a class interval

• Lower class limit (LCL)

• Upper class limit (UCL)

• Open class interval. Class interval with either no LCL or UCL

• Class boundaries. True class limits; the number of decimal places is one more than that of the class limits.

• Class size. Size of the class interval; difference between two successive LCLs or UCLs

• Class mark. Midpoint of a class interval

• Modal class. Class interval with the highest frequency
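The terms above can be made concrete for one class interval. The sketch assumes whole-number data (unit of measurement 1), so each boundary lies half a unit beyond the corresponding limit; the interval 10 to 19 and the function name `class_summary` are hypothetical.

```python
def class_summary(lcl, ucl, next_lcl, unit=1):
    """Boundaries, size, and mark of a class interval with limits lcl..ucl."""
    half = unit / 2                       # half of the smallest unit of measurement
    return {
        "boundaries": (lcl - half, ucl + half),   # true class limits
        "size": next_lcl - lcl,                   # difference of successive LCLs
        "mark": (lcl + ucl) / 2,                  # midpoint of the interval
    }

# Hypothetical classes 10-19 and 20-29 for whole-number data.
info = class_summary(lcl=10, ucl=19, next_lcl=20)
print(info)
```

Here the class limits have no decimal places and the boundaries 9.5 and 19.5 have one, in line with the definition of class boundaries.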

Page 52: Chapter 02 - Collection and Presentation of Data

1. Determine adequate number of classes, K.

K = 1 + 3.322 log10(n), where n is the total number of observations

2. Determine the range, R.

R = maximum - minimum

3. Calculate the approximate class size, C’.

C’ = R/K

4. Determine the class size C by rounding off C’ to a number that is easy to work with.

5. List the required number of class intervals
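Steps 1 to 3 can be sketched as below. Rounding K to the nearest whole number before dividing is an assumption (the steps do not say when to round), and the ten data values are hypothetical. Step 4 is a judgment call and step 5 depends on it, so both are left out.

```python
import math

def number_of_classes(n):
    """Step 1: K = 1 + 3.322 log10(n)."""
    return 1 + 3.322 * math.log10(n)

def approximate_class_size(values, k):
    """Steps 2 and 3: R = maximum - minimum, then C' = R / K."""
    r = max(values) - min(values)
    return r / k

values = [12, 25, 31, 47, 52, 58, 63, 70, 84, 92]   # hypothetical observations

k_exact = number_of_classes(len(values))   # 1 + 3.322 * 1 for n = 10
k = round(k_exact)                         # assumption: round K to a whole number
c_approx = approximate_class_size(values, k)

print(round(k_exact, 3), k, c_approx)
```

When listing the classes in step 5, check that the last class actually contains the maximum; depending on the starting limit, one extra class may be needed.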

Page 53: Chapter 02 - Collection and Presentation of Data

• Less than cumulative frequency distribution (<CF)

• Number of observations with values smaller than the UPPER class boundary

• Greater than cumulative frequency distribution (>CF)

• Number of observations with values greater than the LOWER class boundary
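Both cumulative distributions are running totals of the class frequencies, taken from opposite ends. The five frequencies below are assumed for illustration.

```python
from itertools import accumulate

# Hypothetical class frequencies, listed from the lowest class to the highest.
freqs = [3, 7, 4, 4, 2]

# <CF: running total from the first class down the table.
less_than_cf = list(accumulate(freqs))

# >CF: running total from the last class up the table.
greater_than_cf = list(accumulate(reversed(freqs)))[::-1]

print(less_than_cf)
print(greater_than_cf)
```

The last <CF entry and the first >CF entry both equal the total number of observations, which is a quick check on the table.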

Page 54: Chapter 02 - Collection and Presentation of Data

• To get the relative frequency, divide the frequency of each class interval by the total number of observations.

• Sum of the relative frequency column is equal to 1.

• Multiply the relative frequency by 100 to get the relative frequency percentage.
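A short sketch of the computation, using assumed class frequencies:

```python
# Hypothetical class frequencies.
freqs = [3, 7, 4, 4, 2]
n = sum(freqs)                                  # total number of observations

relative = [f / n for f in freqs]               # relative frequency per class
percentage = [100 * r for r in relative]        # relative frequency percentage

for f, r, p in zip(freqs, relative, percentage):
    print(f, round(r, 2), round(p, 1))
```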

Page 55: Chapter 02 - Collection and Presentation of Data

• Given the following data, construct a frequency distribution table (FDT)

16500 10850 11850 7500

13500 23500 16500 10500

11000 12500 4500 5250

16500 9950 13950 18950

24000 15000 10000 9900
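One possible solution can be sketched as follows. With n = 20, K = 1 + 3.322 log10(20) ≈ 5.32, taken here as 5 classes; R = 24000 - 4500 = 19500; C’ = 19500 / 5 = 3900, rounded to C = 4000. Starting the first class at the minimum and rounding to 4000 are choices made for this sketch; other class sizes and starting limits are equally valid.

```python
data = [
    16500, 10850, 11850, 7500,
    13500, 23500, 16500, 10500,
    11000, 12500, 4500, 5250,
    16500, 9950, 13950, 18950,
    24000, 15000, 10000, 9900,
]

num_classes = 5          # K, from 1 + 3.322 log10(20), rounded (a choice)
class_size = 4000        # C, from C' = 3900 rounded to an easy number (a choice)
first_lcl = min(data)    # assumption: start the first class at the minimum

fdt = []
for i in range(num_classes):
    lcl = first_lcl + i * class_size
    ucl = lcl + class_size - 1          # whole-number data: limits are 1 unit apart
    freq = sum(lcl <= x <= ucl for x in data)
    fdt.append((lcl, ucl, freq))

for lcl, ucl, freq in fdt:
    print(f"{lcl} - {ucl}: {freq}")
```

Under these choices the modal class is 8500 - 12499, and the frequencies add up to the 20 observations.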

Page 56: Chapter 02 - Collection and Presentation of Data

• Three ways

• Frequency histogram

• Shape of distribution

• Class boundaries on the horizontal axis, class frequencies on the vertical axis

• Frequency polygon

• Class frequencies on the vertical axis, class marks on the horizontal axis

• Closed shape

• Ogives

• For less than or greater than cumulative frequencies

• Less than ogive: plots the less than CF

• Greater than ogive: plots the greater than CF

Page 57: Chapter 02 - Collection and Presentation of Data

• Alternative method to describe a data set

• Histogram-like picture

• Allows for retention of data

• Partly tabular, partly graphical

• Observations are divided into two: STEM and LEAF

• Types

• Ordered

• Split

Page 58: Chapter 02 - Collection and Presentation of Data

• List the stem values, in order, in a vertical column

• Draw a vertical line to the right of the stem value

• For each observation, record the leaf portion of that observation in the row corresponding to the appropriate stem

• Reorder leaves from lowest to highest within each stem row.

• If the number of leaves appearing in each row is too large, divide the stem into two groups: one whose leaves are from 0 to 4, and the other whose leaves are from 5 to 9.

• Provide a key to stem-and-leaf coding so that the reader can recreate actual measurements.

Page 59: Chapter 02 - Collection and Presentation of Data

• Given the following data for the price (in pesos) per gallon of a sample of brands of sparkling mineral water sold in supermarkets, construct a stem-and-leaf display.

31 40 28 30 63

35 38 33 42 22

36 68 31 32 36

34 46 34 34 28
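An ordered stem-and-leaf display for these prices can be sketched as below, with the tens digit as the stem and the units digit as the leaf. Showing stems that have no leaves (so gaps in the data stay visible) is a design choice of this sketch, and the function name `stem_and_leaf` is hypothetical.

```python
from collections import defaultdict

prices = [
    31, 40, 28, 30, 63,
    35, 38, 33, 42, 22,
    36, 68, 31, 32, 36,
    34, 46, 34, 34, 28,
]

def stem_and_leaf(values):
    """Return {stem: sorted leaves}, with stem = tens digit, leaf = units digit."""
    rows = defaultdict(list)
    for v in values:
        stem, leaf = divmod(v, 10)
        rows[stem].append(leaf)
    # Keep empty stems between the smallest and largest stem (design choice).
    return {s: sorted(rows.get(s, [])) for s in range(min(rows), max(rows) + 1)}

display = stem_and_leaf(prices)
for stem, leaves in display.items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
print("Key: 2 | 8 means 28 pesos")
```

Stem 3 carries twelve leaves, so it is a natural candidate for a split display with one row for leaves 0 to 4 and another for leaves 5 to 9.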