Analyze Statistic by Using SPSS, 3rd Day (Fadwa Flemban, 70 slides)

Upload: angelica-rosales

Post on 01-Jan-2016


Fadwa Flemban1

Analyze Statistic by Using SPSS

3rd Day

The Numerical Miracle in the Holy Quran

The number 7 has great significance in the Quran, the universe, and life. The Arabic alphabet (the language of the Quran) has 28 letters (that is, 7 × 4), and the authentic hadith ("The Quran was revealed in seven ahruf") confirms that the number 7 is connected to the Quran. Allah the Almighty created seven heavens and seven earths and made the week seven days. The rites of Hajj rely on the number 7 (seven circuits in both the tawaf and the sa'i) and seven pebbles (jamarat). Whoever does not believe in all this is recompensed with the fire of Hell, for which Allah the Almighty created seven gates.

Note: the word "Jahannam" (Hell) occurs 77 times in the Quran, that is, 7 × 11. Nor should we forget that the greatest surah in the Quran is Al-Fatiha, which Allah named "the seven oft-repeated" and which has 7 verses. Likewise, the phrase "the seven heavens" ("and seven heavens") occurs exactly 7 times in the Quran.

The word "seven" occurs 4 times in the Quran, in the following verses:

1. "... then a fast of three days during the Hajj and seven when you have returned" [Al-Baqarah: 196]

2. "It has seven gates; for each gate there is an assigned portion of them" [Al-Hijr: 44]

3. "... and they say: seven, and the eighth of them was their dog" [Al-Kahf: 22]

4. "... with seven more seas to replenish it, the words of Allah would not be exhausted" [Luqman: 27]

Surah          Al-Baqarah   Al-Hijr   Al-Kahf   Luqman
Verse number   196          44        22        27

272244196 = 7 × 38892028 = 7 × 7 × 5556004

Therefore, the number formed by writing the four verse numbers side by side (27, 22, 44, 196) is divisible by 7 twice in succession. Who arranged the positions of this word in such striking harmony with the number 7? Is it not Allah?

Chi-Squared Tests

(1) Goodness-of-fit tests

(2) Independence tests

(3) Homogeneity tests

(1) Goodness-of-fit tests

• Used to compare the distribution of the data with one of several probability distributions:

1. Normal distribution
2. Poisson distribution
3. Exponential distribution
4. Uniform distribution

(1) Goodness-of-fit tests

Hypotheses of the test:

• H₀: The data are consistent with distribution S (the data follow distribution S).

• H₁: The data are not consistent with distribution S (the data do not follow distribution S).

Goodness-of-fit tests: Example

These data represent the number of people who ate dinner in a small restaurant on each of 50 days:

20 12 16 19 24 6 10 1 15 23

8 30 25 7 10 8 16 24 22 8

12 10 5 14 27 20 21 16 18 12

16 23 20 4 17 27 19 16 8 6

9 7 12 14 19 22 20 16 14 15

Does the number of people who ate dinner in the restaurant follow the normal distribution at the 0.05 level of significance?

Solution

• H₀: The data are consistent with the normal distribution.

• H₁: The data are not consistent with the normal distribution.

Normality test, two ways:

Way (1): Analyze → Descriptive Statistics → Explore

Plots → check "Normality plots with tests"

Normality test for (male)

Normality test for (female)

Output

• Since all the points lie on or around the straight line, the sample follows the normal distribution.

Normality test, two ways:

Way (2): Analyze → Nonparametric Tests → 1-Sample Kolmogorov-Smirnov test

Output: P-value (0.898) > α (0.05), so we do not reject H₀.

At the 95% confidence level, the number of people who ate dinner in the restaurant is consistent with the normal distribution.
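The one-sample Kolmogorov-Smirnov check can also be sketched outside SPSS. The Python below is a minimal illustration, not the SPSS procedure itself. It assumes the normal distribution is fitted with the sample mean and sample standard deviation, and that the p-value comes from the asymptotic Kolmogorov series. The function name `ks_normal` is a hypothetical helper. Under these assumptions the p-value should come out near 0.9, in line with the output above.

```python
import math
from statistics import NormalDist, mean, stdev

# Number of diners on each of the 50 days (data from the example).
diners = [
    20, 12, 16, 19, 24, 6, 10, 1, 15, 23,
    8, 30, 25, 7, 10, 8, 16, 24, 22, 8,
    12, 10, 5, 14, 27, 20, 21, 16, 18, 12,
    16, 23, 20, 4, 17, 27, 19, 16, 8, 6,
    9, 7, 12, 14, 19, 22, 20, 16, 14, 15,
]

def ks_normal(data):
    """Return (D, asymptotic two-sided p-value) against a fitted normal."""
    xs = sorted(data)
    n = len(xs)
    fitted = NormalDist(mean(xs), stdev(xs))  # assumption: parameters estimated from the sample
    d = 0.0
    for i, x in enumerate(xs):
        f = fitted.cdf(x)
        # Largest gap between the empirical and the theoretical CDF.
        d = max(d, (i + 1) / n - f, f - i / n)
    lam = math.sqrt(n) * d
    # Asymptotic Kolmogorov distribution: 2 * sum (-1)^(k-1) * exp(-2 k^2 lam^2).
    p = 2 * sum((-1) ** (k - 1) * math.exp(-2 * k * k * lam * lam) for k in range(1, 101))
    return d, min(max(p, 0.0), 1.0)

d_stat, p_value = ks_normal(diners)
print(f"D = {d_stat:.3f}, p = {p_value:.3f}")
print("do not reject H0" if p_value > 0.05 else "reject H0")
```
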

Repeat the same steps, but choose Poisson as the test distribution.

Output: P-value (0.047) < α (0.05), so we reject H₀.

At the 95% confidence level, the number of people who ate dinner in the restaurant does not follow the Poisson distribution.

(2) Independence tests

Hypotheses of the test:

• H₀: The variables are independent.

• H₁: The variables are not independent, that is, there is a relationship between them.

Independence tests: Example

In a study of the relationship between university students' grades and their gender:

Is there a relationship between a student's grade and the student's gender?

Female:

F F F F B A D A B

C B B C D D D F A

D F D D D F B B C

C C A B A C C C

Male:

F C C C F B F B B A

B C F F F F F D A

D A A F F D D A C

A C C D F F C C B

Solution

• H₀: The student's grade and gender are independent.

• H₁: There is a relationship between the student's grade and gender.

Analyze → Descriptive Statistics → Crosstabs

Crosstabs window:

Press the Statistics button

Check Chi-square for the independence test

P-value = 0.656 > 0.05

We do not reject H₀: the two variables are independent.
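As a cross-check of the Crosstabs result, the chi-square statistic can be recomputed from the raw grades. This is a minimal Python sketch, not SPSS output. The helper names `chi_square` and `chi2_sf_even_df` are hypothetical, and the closed-form tail probability used here is valid only for an even number of degrees of freedom (here df = 4).

```python
import math
from collections import Counter

# Grades from the example, one string per gender (spaces separate the slide rows).
female = "FFFFBADAB CBBCDDDFA DFDDDFBBC CCABACCC".replace(" ", "")
male = "FCCCFBFBBA BCFFFFFDA DAAFFDDAC ACCDFFCCB".replace(" ", "")

def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a list of rows."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    n = sum(row_tot)
    stat = sum(
        (obs - rt * ct / n) ** 2 / (rt * ct / n)
        for row, rt in zip(table, row_tot)
        for obs, ct in zip(row, col_tot)
    )
    return stat, (len(table) - 1) * (len(col_tot) - 1)

def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; this closed form holds only for even df."""
    half = x / 2
    return math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(df // 2))

grades = "ABCDF"
table = [[Counter(group)[g] for g in grades] for group in (female, male)]
stat, df = chi_square(table)
p_value = chi2_sf_even_df(stat, df)
print(f"chi-square = {stat:.3f}, df = {df}, p = {p_value:.3f}")
```

With these counts the p-value agrees with the 0.656 reported above, so the conclusion of independence is unchanged.
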

(3) Homogeneity tests

• Used to answer the question: are the observed frequencies distributed homogeneously (identically) across the groups of the population?

Hypotheses of the test:

• H₀: Pi1 = Pi2 = … = Pis OR σ²1 = σ²2 = … = σ²i

• H₁: At least one of the null hypothesis statements is false.

Homogeneity tests: Example for clarification

In a study of the television viewing habits of children, a developmental psychologist selects a random sample of 300 first graders: 100 boys and 200 girls.

Do the boys' preferences for these TV programs differ significantly from the girls' preferences? Use a 0.05 level of significance.

               Lone Ranger   Sesame Street   The Simpsons   Row total
Boys                50             30              20           100
Girls               50             80              70           200
Column total       100            110              90           300

Mathematical solution

H₀: P(boys who prefer Lone Ranger) = P(girls who prefer Lone Ranger)
H₀: P(boys who prefer Sesame Street) = P(girls who prefer Sesame Street)
H₀: P(boys who prefer The Simpsons) = P(girls who prefer The Simpsons)

H₁: At least one of the null hypothesis statements is false.

DF = (r - 1) * (c - 1) = (2 - 1) * (3 - 1) = 2

E(r,c) = (n_r * n_c) / n
E(1,1) = (100 * 100) / 300 = 10000/300 = 33.3
E(1,2) = (100 * 110) / 300 = 11000/300 = 36.7
E(1,3) = (100 * 90) / 300 = 9000/300 = 30.0
E(2,1) = (200 * 100) / 300 = 20000/300 = 66.7
E(2,2) = (200 * 110) / 300 = 22000/300 = 73.3
E(2,3) = (200 * 90) / 300 = 18000/300 = 60.0

χ² = Σ [ (O(r,c) - E(r,c))² / E(r,c) ]
χ² = (50 - 33.3)²/33.3 + (30 - 36.7)²/36.7 + (20 - 30)²/30 + (50 - 66.7)²/66.7 + (80 - 73.3)²/73.3 + (70 - 60)²/60
χ² = (16.7)²/33.3 + (-6.7)²/36.7 + (-10.0)²/30 + (-16.7)²/66.7 + (6.7)²/73.3 + (10)²/60
χ² = 8.38 + 1.22 + 3.33 + 4.18 + 0.61 + 1.67 = 19.39 (19.32 if the expected counts are not rounded)

P(χ² > 19.39) ≈ 0.0001

Since the P-value (about 0.0001) is less than the significance level (0.05), we reject the null hypothesis: the boys' preferences differ significantly from the girls' preferences.
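The hand calculation can be verified with a short script. The sketch below recomputes each expected count and each cell's contribution without intermediate rounding. It is only an illustration: the dictionary layout and variable names are assumptions, and the `exp(-chi2 / 2)` shortcut for the p-value is exact only because this table has 2 degrees of freedom.

```python
import math

# Observed counts from the TV preference table.
observed = {
    "Boys": {"Lone Ranger": 50, "Sesame Street": 30, "The Simpsons": 20},
    "Girls": {"Lone Ranger": 50, "Sesame Street": 80, "The Simpsons": 70},
}
programs = list(observed["Boys"])
row_total = {group: sum(counts.values()) for group, counts in observed.items()}
col_total = {prog: sum(observed[group][prog] for group in observed) for prog in programs}
n = sum(row_total.values())

chi2 = 0.0
for group in observed:
    for prog in programs:
        expected = row_total[group] * col_total[prog] / n
        contribution = (observed[group][prog] - expected) ** 2 / expected
        chi2 += contribution
        print(f"{group:5s} {prog:13s} E = {expected:5.1f}  contribution = {contribution:.2f}")

df = (len(observed) - 1) * (len(programs) - 1)
p_value = math.exp(-chi2 / 2)  # upper tail of chi-square; exact only when df == 2
print(f"chi-square = {chi2:.2f}, df = {df}, p = {p_value:.5f}")
```
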

Homogeneity tests: Example

We have the following data:

Calories (in millions of calories)

Factory 1: 8400 8230 8380 7860 7930

Factory 2: 7510 7690 7720 8070 7660

1- Are the two factories homogeneous (equal variances)?

2- Test the hypothesis that the two factories have the same mean calories. Use a 0.05 level of significance.

Solution

• NOTE: we have two variables (one scale, one nominal).

• Hypotheses of the homogeneity test:

H₀: σ²1 = σ²2    H₁: σ²1 ≠ σ²2

Analyze → Compare Means → Independent-Samples T Test

Define Groups

Output: P-value = 0.330

P-value > α, so we do not reject H₀: the samples are homogeneous (the variances are equal).

Also:

• From the t-test for equality of means:

H₀: µ1 = µ2    H₁: µ1 ≠ µ2

Sig. = 0.018, α = 0.05

Sig. < α, so we reject H₀: the means of the two factories are not equal.
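The t-test decision can be reproduced by hand. The sketch below computes the pooled-variance t statistic, assuming equal variances as the homogeneity test suggested. Because the Python standard library has no Student's t distribution, it compares the statistic with the tabulated two-sided 5% critical value for 8 degrees of freedom (2.306) rather than computing Sig. directly. Variable names are illustrative.

```python
import math
from statistics import mean, variance

# Calories (in millions) for the two factories.
factory1 = [8400, 8230, 8380, 7860, 7930]
factory2 = [7510, 7690, 7720, 8070, 7660]

n1, n2 = len(factory1), len(factory2)
df = n1 + n2 - 2

# Pooled sample variance (equal variances assumed).
pooled_var = ((n1 - 1) * variance(factory1) + (n2 - 1) * variance(factory2)) / df
std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
t_stat = (mean(factory1) - mean(factory2)) / std_error

T_CRIT_8DF = 2.306  # tabulated two-sided 5% critical value of Student's t with 8 df
reject_h0 = abs(t_stat) > T_CRIT_8DF
print(f"t = {t_stat:.3f}, df = {df}, reject H0: {reject_h0}")
```

The statistic exceeds the critical value, which matches the decision to reject H₀ at α = 0.05.
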

Summary (in nominal variables)

Normality test:

• Data from a normal distribution → make a homogeneity test → t-test

• Data not from a normal distribution → nonparametric tests

Regression & Correlation

Regression

• The regression line equation is used for future prediction.

• The regression line equation is used to predict values "within" the range of values of the independent variable.

Simple Linear Regression

• Linear regression is used to estimate the coefficient of the independent variable in the linear equation, in order to estimate the dependent variable.

• When there is a single independent variable, the equation of the line takes the form:

Y = a + b*X

• where X denotes the independent variable and Y denotes the dependent variable.

Example

Suppose that X denotes the temperature between 3:00 pm and 4:00 pm during the summer season, and Y denotes electricity consumption, measured in levels from 1 to 10, where level 10 is the highest consumption. The data were recorded over a period of 10 days:

X: 38 38 30 32 23 30 34 25 31 21

Y: 9.5 9 6 6 4.5 7 8 5 7 4

1- Draw the scatter diagram for these data.
2- Estimate the linear regression equation of Y on X.
3- If X = 35, then the level of electricity consumption = ……

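Mathematical solution

$n = 10;\quad \sum x_i = 302;\quad \sum y_i = 66;\quad \sum x_i^2 = 9424;\quad \sum y_i^2 = 466.5;\quad \sum x_i y_i = 2086.5$

$\bar{x} = \frac{\sum x_i}{n} = 30.2;\quad \bar{y} = \frac{\sum y_i}{n} = 6.6$

$s_x^2 = \frac{1}{n}\sum x_i^2 - \bar{x}^2 = \frac{1}{10}(9424) - (30.2)^2 = 30.36 \;\Rightarrow\; s_x = \sqrt{30.36} = 5.51$

$s_y^2 = \frac{1}{n}\sum y_i^2 - \bar{y}^2 = \frac{1}{10}(466.5) - (6.6)^2 = 3.09 \;\Rightarrow\; s_y = \sqrt{3.09} = 1.7578$

$b = \frac{(1/n)\sum x_i y_i - \bar{x}\,\bar{y}}{s_x^2} = \frac{(2086.5/10) - (30.2)(6.6)}{30.36} = 0.3073$

$a = \bar{y} - b\,\bar{x} = 6.6 - (0.3073)(30.2) = -2.6808$ (using the unrounded slope)

The linear regression equation is: $\hat{y}_i = -2.6808 + 0.3073\,x_i$

At a temperature of 35: $\hat{y} = -2.6808 + 0.3073(35) = 8.075$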
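The same least-squares arithmetic can be checked in a few lines of Python. This is a minimal sketch of the textbook formulas used above (the covariance divided by the variance of X, both with divisor n), not the SPSS procedure. The `predict` helper is a hypothetical name.

```python
# Temperature (X) and electricity consumption level (Y) for the 10 days.
xs = [38, 38, 30, 32, 23, 30, 34, 25, 31, 21]
ys = [9.5, 9, 6, 6, 4.5, 7, 8, 5, 7, 4]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Variance of X and covariance of X and Y, both with divisor n as in the slides.
s_xx = sum(x * x for x in xs) / n - x_bar ** 2
s_xy = sum(x * y for x, y in zip(xs, ys)) / n - x_bar * y_bar

b = s_xy / s_xx          # slope
a = y_bar - b * x_bar    # intercept

def predict(x):
    """Predicted consumption level for temperature x."""
    return a + b * x

print(f"Y = {a:.4f} + {b:.4f} X")
print(f"prediction at X = 35: {predict(35):.3f}")
```

Note that 35 lies inside the observed temperature range (21 to 38), so this prediction is an interpolation, which is the intended use of the regression line.
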

SPSS Solution

1- Graphs → Legacy Dialogs → Scatter/Dot

Simple Scatter → Define

Output:

To add the regression line to the chart: double-click the chart → Add Fit Line at Total → Linear → Close

Output:

The straight line is the best representation of these data. The next step >>

2- Analyze → Regression → Linear

Output:

In the output, note the correlation coefficient and the coefficients a and b.

The linear regression equation:

Yi = -2.681 + 0.307 Xi

Prediction using the regression equation:

• Estimate the electricity consumption when the temperature is 35 degrees Celsius.

The regression line equation is Yi = -2.681 + 0.307 Xi

Since X = 35, the electricity consumption is estimated as:

Y = -2.681 + 0.307 (35)
Y ≈ 8.06 (8.075 with the unrounded coefficients -2.6808 and 0.3073)

Correlation

• The correlation coefficient is another measure that can be used to determine the strength of the relationship between phenomena.

• One of the most important goals of any research is to find relationships between variables; this is a fundamental goal of statistics.

• Before computing correlation coefficients for quantitative data, the data should be inspected in a scatter diagram, to see the nature of the relationship (linear or not) and to spot outliers, whose presence can lead to misleading results.

• The value of the correlation coefficient lies between -1 and +1. If the correlation coefficient equals +1, the correlation is perfectly positive (direct); if it equals -1, the correlation is perfectly negative (inverse).

Scatter Diagram

• A scatter diagram with r = 0: there is no relationship between the variables, or there is a relationship but it is not linear.

• A scatter diagram with r = -1 or r = +1: all points lie on the regression line, which describes the relationship between the variables (x, y).

• A scatter diagram with 0 < r < +1 or -1 < r < 0: the points are concentrated around the regression line.

Values of the correlation coefficient

r               Meaning                             r                 Meaning
+1              Perfect positive correlation        -1                Perfect negative correlation
0.90 to 0.99    Very strong positive correlation    -0.99 to -0.90    Very strong negative correlation
0.70 to 0.89    Strong positive correlation         -0.89 to -0.70    Strong negative correlation
0.50 to 0.69    Moderate positive correlation       -0.69 to -0.50    Moderate negative correlation
0.30 to 0.49    Weak positive correlation           -0.49 to -0.30    Weak negative correlation
0.01 to 0.29    Very weak positive correlation      -0.29 to -0.01    Very weak negative correlation
0               Zero correlation
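The table can be turned into a small lookup function. The sketch below is one possible encoding, with an assumption the table does not state: values that fall between the listed ranges (for example 0.895) are assigned by the lower bound of each band. The function name `describe_correlation` is hypothetical.

```python
def describe_correlation(r):
    """Return the verbal label from the table for a correlation coefficient r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie between -1 and +1")
    if r == 0:
        return "Zero correlation"
    size = abs(r)
    if size == 1:
        strength = "Perfect"
    elif size >= 0.90:
        strength = "Very strong"
    elif size >= 0.70:
        strength = "Strong"
    elif size >= 0.50:
        strength = "Moderate"
    elif size >= 0.30:
        strength = "Weak"
    else:
        strength = "Very weak"
    direction = "positive" if r > 0 else "negative"
    return f"{strength} {direction} correlation"

print(describe_correlation(-0.983))
print(describe_correlation(0.8))
```
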

Correlation coefficients according to the measurement level of the variables

Two different correlation techniques are available:

• for quantitative variables

1- Pearson correlation coefficient

• for ordinal scales

2- Spearman correlation coefficient

• for quantitative variables

1- Pearson correlation coefficient

Example

Find the correlation between the outside temperature (y) and the altitude of a plane in thousands of feet (x) at different times.

Height (x)        0    4    4    10   6

Temperature (y)   27   21   18   10   16

Calculate the coefficient of correlation between the height and the temperature.

Mathematical solution

No.   x    y    x²    y²     xy
1     0    27   0     729    0
2     4    21   16    441    84
3     4    18   16    324    72
4     10   10   100   100    100
5     6    16   36    256    96
∑     24   92   168   1850   352

$\bar{x} = 4.8;\quad \bar{y} = 18.4;\quad s_x = 3.2496;\quad s_y = 5.6071$

$r = \frac{(1/n)\sum x_i y_i - \bar{x}\,\bar{y}}{s_x\, s_y} = \frac{(352/5) - (4.8)(18.4)}{(3.2496)(5.6071)} = -0.983$

This means there is a very strong negative correlation between the height and the temperature.
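The coefficient can be recomputed directly from the five observations. The following Python sketch implements the same formula (covariance over the product of standard deviations, all with divisor n). The function name `pearson_r` is illustrative.

```python
import math

# Altitude in thousands of feet (x) and outside temperature (y).
height = [0, 4, 4, 10, 6]
temperature = [27, 21, 18, 10, 16]

def pearson_r(xs, ys):
    """Pearson correlation coefficient using divisor-n moments."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_x = math.sqrt(sum(x * x for x in xs) / n - x_bar ** 2)
    s_y = math.sqrt(sum(y * y for y in ys) / n - y_bar ** 2)
    cov = sum(x * y for x, y in zip(xs, ys)) / n - x_bar * y_bar
    return cov / (s_x * s_y)

r = pearson_r(height, temperature)
print(f"r = {r:.3f}")
```
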

SPSS Solution

1- Graphs → Legacy Dialogs → Scatter/Dot → Simple Scatter

To add the regression line to the chart: double-click the chart → Add Fit Line at Total → Linear → Close

Output:

The straight line is the best representation of these data. The next step >>

2- Analyze → Correlate → Bivariate

Bivariate Correlations window:

Output:

From the correlation output: r = -0.983, which means there is a very strong negative correlation between the height and the temperature.

• for ordinal scales

2- Spearman correlation coefficient

Example

Suppose we have the grades of 5 students in two subjects:

Statistics    A  C  D  F  B

Mathematics   B  C  F  D  A

Find the correlation between the students' grades in statistics and in mathematics.

Mathematical solution

Stat   Math   Rank of Stat   Rank of Math   d     d²
A      B      1              2              -1    1
C      C      3              3              0     0
D      F      4              5              -1    1
F      D      5              4              1     1
B      A      2              1              1     1
Total                                       0     4

$r_s = 1 - \frac{6 \sum d^2}{n(n^2 - 1)} = 1 - \frac{6(4)}{5(25 - 1)} = 1 - 0.2 = 0.8$

There is a strong positive correlation between the students' grades in statistics and in mathematics.
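The ranking and the formula can be sketched in Python as follows. The sketch assumes there are no tied grades within a subject, which holds for this example. With ties, average ranks would be needed and this shortcut formula would no longer apply exactly. The helper names are hypothetical.

```python
# Grades of the 5 students in each subject (same student order in both lists).
stat_grades = ["A", "C", "D", "F", "B"]
math_grades = ["B", "C", "F", "D", "A"]

GRADE_ORDER = "ABCDF"  # best grade first, so A receives rank 1

def ranks(grades):
    """Rank grades from best (1) to worst; assumes no ties."""
    ordered = sorted(grades, key=GRADE_ORDER.index)
    return [ordered.index(g) + 1 for g in grades]

def spearman_rs(first, second):
    """Spearman rank correlation via 1 - 6*sum(d^2) / (n*(n^2 - 1))."""
    n = len(first)
    d_squared = [(a - b) ** 2 for a, b in zip(ranks(first), ranks(second))]
    return 1 - 6 * sum(d_squared) / (n * (n ** 2 - 1))

r_s = spearman_rs(stat_grades, math_grades)
print(f"r_s = {r_s:.1f}")
```
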

Solution by SPSS

• By the same steps as in the previous example:

Analyze → Correlate → Bivariate (check Spearman)

Output:

From this table we find the same result: r = 0.8, so there is a strong positive correlation.

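Common mistakes

1) Using the Pearson correlation coefficient for data that are not linearly related; the "linearity" of the relationship between the two phenomena must therefore be checked first.

2) The Pearson correlation coefficient reflects the "linearity of the relationship".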

Question

A national consumer magazine reported the following correlations.

1- The correlation between car weight and car reliability is -0.30.
2- The correlation between car weight and annual maintenance cost is 0.20.

Which of the following statements are true?

I. Heavier cars tend to be less reliable.
II. Heavier cars tend to cost more to maintain.
III. Car weight is related more strongly to reliability than to maintenance cost.

Statistical Humor

A ONE-WAY ANOVA shouted at a TWO-WAY ANOVA: "STOP! Turn around - You are going the wrong way!" The TWO-WAY ANOVA yelled back: "Sorry! I will turn when I see an interaction!"