
STATISTICAL VALIDATION METHOD VIZ. STATISTICAL TREATMENT OF FINITE SAMPLE

By: Sachin Kumar, M.Pharm. (Pharmacology), Deptt. of Pharma. Sciences, M.D.U. Rohtak, 124001

CONTENTS
• INTRODUCTION
• ANALYSIS OF DATA
  1. MEASURES OF CENTRAL TENDENCY
  2. MEASURES OF DISPERSION
  3. SKEWNESS
  4. CORRELATION
  5. REGRESSION
• TEST OF SIGNIFICANCE
  1. T-TEST
  2. F-TEST
  3. ANOVA
• REFERENCES

INTRODUCTION
• Biostatistics :- The application of statistical methods to data derived from the biological sciences.
• Statistics :- The collection of methods used in planning an experiment and analyzing data in order to draw accurate conclusions.
  - It includes the collection, organization, presentation, analysis and interpretation of numerical data.
• Data :- Facts or figures from which conclusions can be drawn.
  - Data may be qualitative or quantitative.

ANALYSIS OF DATA
• Analysis can be done through different statistical techniques:-
  1. Measures of central tendency
  2. Measures of dispersion
  3. Skewness
  4. Correlation
  5. Regression

1. MEASURES OF CENTRAL TENDENCY

• The observations in a set of data exhibit a tendency to cluster around a specific value. This characteristic of the data is called central tendency.

• The value around which the individual observations are clustered is called the central value.

• There are three main measures of central tendency:
  1. Mean
  2. Median
  3. Mode

• MEAN- It is the mathematical average, denoted by x̅.

(a) Arithmetic mean (simple mean)-
  for ungrouped data- x̅ = (∑xᵢ) / n
  for grouped data- x̅ = (∑fᵢxᵢ) / ∑fᵢ

(b) Geometric mean- G.M. = (x₁ · x₂ · ... · xₙ)^(1/n)

(c) Harmonic mean- H.M. = n / ∑(1/xᵢ)
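The three means named above can be sketched in a few lines of Python. This is a minimal illustration using the standard textbook definitions; the observations, class midpoints and frequencies are hypothetical values chosen only for the example.

```python
import math

# Hypothetical ungrouped observations (not from the slides)
values = [2.0, 4.0, 8.0]
n = len(values)

arithmetic_mean = sum(values) / n                  # sum of x / n
geometric_mean = math.prod(values) ** (1 / n)      # n-th root of the product
harmonic_mean = n / sum(1 / v for v in values)     # n / sum of reciprocals

# Hypothetical grouped data: class midpoints x with frequencies f
midpoints = [5, 15, 25]
freqs = [2, 5, 3]
grouped_mean = sum(f * x for x, f in zip(midpoints, freqs)) / sum(freqs)

print(arithmetic_mean, geometric_mean, harmonic_mean, grouped_mean)
```

For positive data the three means always satisfy H.M. ≤ G.M. ≤ A.M., which is a quick sanity check on any calculation.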

• MEDIAN- Central value when the observations are arranged in ascending or descending order. Denoted by ‘M’.
  for ungrouped data-
    n is odd → M = [(n+1)/2]th value
    n is even → M = [(n/2)th value + (n/2 + 1)th value] / 2
  for grouped data-
    M = L + [(n/2 - F)/f] x c
    L = lower limit of the median class
    F = cumulative frequency of the class preceding the median class
    f = frequency of the median class
    c = width of the median class

• MODE- Most commonly occurring value.
  - for ungrouped data- the value which occurs the maximum no. of times
  - for grouped data- mode = L + [(f₁-f₀) / (2f₁-f₀-f₂)] x c
    L = lower limit of the mode class
    f₀ = frequency of the class preceding the mode class (m.c.)
    f₁ = frequency of the mode class
    f₂ = frequency of the class succeeding the m.c.
    c = width of the mode class
    mode class = class which has the maximum frequency
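The grouped-data formulas for median and mode can be sketched as below. The sketch assumes equal class widths, takes F as the cumulative frequency before the median class, and takes f₁ as the frequency of the mode class with f₀ and f₂ as the frequencies of its neighbours. The function names and the frequency table are hypothetical.

```python
def grouped_median(lower_limits, freqs, width):
    """M = L + ((n/2 - F) / f) * c for a grouped frequency table."""
    n = sum(freqs)
    if n <= 0:
        raise ValueError("total frequency must be positive")
    cumulative = 0  # F: cumulative frequency before the current class
    for lower, f in zip(lower_limits, freqs):
        if cumulative + f >= n / 2:  # first class reaching n/2 is the median class
            return lower + ((n / 2 - cumulative) / f) * width
        cumulative += f


def grouped_mode(lower_limits, freqs, width):
    """mode = L + ((f1 - f0) / (2*f1 - f0 - f2)) * c."""
    i = freqs.index(max(freqs))                       # mode class: highest frequency
    f1 = freqs[i]
    f0 = freqs[i - 1] if i > 0 else 0                 # preceding class
    f2 = freqs[i + 1] if i + 1 < len(freqs) else 0    # succeeding class
    # Note: undefined (division by zero) when f1 equals both neighbours
    return lower_limits[i] + ((f1 - f0) / (2 * f1 - f0 - f2)) * width


# Hypothetical classes 0-10, 10-20, 20-30, 30-40
lowers = [0, 10, 20, 30]
freqs = [3, 8, 6, 3]
median = grouped_median(lowers, freqs, width=10)
mode = grouped_mode(lowers, freqs, width=10)
print(median, mode)
```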

2. MEASURES OF DISPERSION
• For comparing two data sets, we require a measure of dispersion. Dispersion indicates the extent to which a distribution is stretched or squeezed.

• There are five main measures of dispersion:
  - Range
  - Interquartile range
  - Mean deviation
  - Standard deviation
  - Variance

• RANGE- It is the simplest measure of dispersion.
  Range = L-S
  L = largest observation
  S = smallest observation
• INTERQUARTILE RANGE- The range is unstable: it can change greatly from one sample to another, or when a new observation is added. So we calculate the interquartile range (I.R.).
  I.R. = Q₃-Q₁
  Q₁ = first quartile
  Q₂ = second quartile (the median)
  Q₃ = third quartile
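A minimal sketch of the range and the interquartile range in Python, using hypothetical observations. Several conventions exist for locating quartiles; here the default method of `statistics.quantiles` is assumed, so hand calculations that follow a different convention may give slightly different Q₁ and Q₃.

```python
import statistics

# Hypothetical observations (not from the slides)
data = [2, 4, 4, 5, 7, 9, 10, 12]

data_range = max(data) - min(data)            # Range = L - S

# n=4 splits the data into quarters and returns the cut points Q1, Q2, Q3
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1                                 # I.R. = Q3 - Q1

print(data_range, q1, q2, q3, iqr)
```

Adding one extreme value to `data` changes the range immediately but moves the quartiles far less, which is the instability the interquartile range is meant to avoid.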

• MEAN DEVIATION- The average absolute deviation from the central value of a data set is called the mean deviation (M = median, Z = mode).

  - For ungrouped data-
    M.D. about mean = (∑|xᵢ-x̅|) / n
    M.D. about median = (∑|xᵢ-M|) / n
    M.D. about mode = (∑|xᵢ-Z|) / n
  - For grouped data-
    M.D. about mean = (∑fᵢ|xᵢ-x̅|) / ∑fᵢ
    M.D. about median = (∑fᵢ|xᵢ-M|) / ∑fᵢ
    M.D. about mode = (∑fᵢ|xᵢ-Z|) / ∑fᵢ
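The mean deviation formulas differ only in the central value used, so one helper covers all three. This is a minimal sketch; the function names and data are hypothetical.

```python
import statistics


def mean_deviation(values, center):
    """Average absolute deviation of ungrouped values from a central value."""
    return sum(abs(x - center) for x in values) / len(values)


def grouped_mean_deviation(midpoints, freqs, center):
    """Frequency-weighted version for grouped data (x = class midpoint)."""
    return sum(f * abs(x - center) for x, f in zip(midpoints, freqs)) / sum(freqs)


# Hypothetical observations
values = [2, 4, 4, 6, 9]
md_mean = mean_deviation(values, statistics.mean(values))      # about the mean
md_median = mean_deviation(values, statistics.median(values))  # about the median
md_mode = mean_deviation(values, statistics.mode(values))      # about the mode

print(md_mean, md_median, md_mode)
```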

• STANDARD DEVIATION- It tells us how much the scores deviate from the mean. Denoted by σ (sigma) or S.

  Standard error of mean (SEM) = S / √n

• VARIANCE- It tells us how far a set of numbers is spread out from its mean.

  - Variance is the square of the standard deviation.
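A short sketch relating standard deviation, variance and SEM. The slides do not show the formula for S, so the sample form (divisor n-1) used by `statistics.stdev` is an assumption here; the data are hypothetical. Variance is computed as S², and SEM as S/√n.

```python
import math
import statistics

# Hypothetical sample
sample = [4, 8, 6, 5, 7]
n = len(sample)

s = statistics.stdev(sample)            # sample standard deviation (n-1 divisor, assumed)
variance = statistics.variance(sample)  # sample variance, equal to s squared
sem = s / math.sqrt(n)                  # standard error of the mean = S / sqrt(n)

print(s, variance, sem)
```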

3. SKEWNESS
• It is the measure of the degree of asymmetry of the distribution.
  (a) Symmetric- Mean, median and mode are the same.
  (b) Skewed left- Mean to the left of the median, long tail on the left.
  (c) Skewed right- Mean to the right of the median, long tail on the right.

• Coefficient of Skewness = (mean-mode)/ S.D
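The coefficient of skewness can be checked on a small data set. The sketch below uses hypothetical right-skewed observations and assumes the sample standard deviation; a positive coefficient indicates a long tail on the right, a negative one a long tail on the left.

```python
import statistics

# Hypothetical observations with one large value pulling the mean to the right
data = [2, 3, 3, 4, 5, 9]

mean = statistics.mean(data)
median = statistics.median(data)
mode = statistics.mode(data)
sd = statistics.stdev(data)        # sample S.D. (assumed)

skewness = (mean - mode) / sd      # Coefficient of Skewness = (mean - mode) / S.D.

print(mean, median, mode, skewness)
```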

4. CORRELATION
• In correlation we study the degree of relationship between two variables.
  - Types of correlation:
    (a) positive or negative correlation
    (b) simple or multiple correlation
• Correlation coefficient- It is a measure of correlation. Denoted by ‘r’; it lies between -1 and +1.
  when r = +1 (perfect +ve correlation)
  when r = -1 (perfect -ve correlation)
  when r = 0 (no linear correlation)
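The slides do not show a formula for r. The sketch below assumes the usual Pearson product-moment coefficient, written with the same kind of sums (∑x, ∑y, ∑xy, ∑x², ∑y²) used for the regression coefficients. The function name and data are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient from raw sums (assumed definition of r)."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    syy = sum(b * b for b in y)
    return (n * sxy - sx * sy) / math.sqrt((n * sxx - sx ** 2) * (n * syy - sy ** 2))


r_pos = pearson_r([1, 2, 3], [2, 4, 6])           # y rises exactly with x
r_neg = pearson_r([1, 2, 3], [6, 4, 2])           # y falls exactly with x
r_zero = pearson_r([1, 2, 3, 4], [1, -1, -1, 1])  # no linear trend

print(r_pos, r_neg, r_zero)
```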

5. REGRESSION
• It is the functional relationship between two variables.
  - The variable whose values are known is taken as the independent variable, and the variable whose values are to be predicted is taken as the dependent variable.

• Line of regression of Y on X- It is used for estimation of the variable Y for a given value of the variable X.
  X = Independent variable
  Y = Dependent variable

• Line of regression of X on Y- It is used for estimation of the variable X for a given value of the variable Y.
  Y = Independent variable
  X = Dependent variable

• Regression coefficient- It is a measure of regression. Denoted by ‘b’.
  bxy (X on Y) = (n∑xy - ∑x∑y) / (n∑y² - (∑y)²)
  byx (Y on X) = (n∑xy - ∑x∑y) / (n∑x² - (∑x)²)
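The two regression coefficients share a numerator and differ only in the denominator, as the sketch below shows. The data and function names are hypothetical, and the prediction step assumes the standard form of the line of Y on X, y - y̅ = byx (x - x̅), which the slides do not write out.

```python
def regression_coefficients(x, y):
    """Return (byx, bxy) from the raw-sum formulas."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    syy = sum(b * b for b in y)
    numerator = n * sxy - sx * sy
    byx = numerator / (n * sxx - sx ** 2)   # regression of Y on X
    bxy = numerator / (n * syy - sy ** 2)   # regression of X on Y
    return byx, bxy


def predict_y(x, y, x_new):
    """Estimate Y at x_new from the line of Y on X (assumed standard form)."""
    byx, _ = regression_coefficients(x, y)
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    return mean_y + byx * (x_new - mean_x)


# Hypothetical paired observations
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
byx, bxy = regression_coefficients(x, y)
y_at_6 = predict_y(x, y, 6)
print(byx, bxy, y_at_6)
```

The product byx · bxy equals r², so the two coefficients always have the same sign as the correlation coefficient.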

TEST OF SIGNIFICANCE
• It is the formal procedure for comparing observed data with a claim (also called a hypothesis) whose truth we want to assess.

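1. T-TEST
• Two types of t-test:
  (a) Unpaired t-test
  (b) Paired t-test
• UNPAIRED T-TEST- Used when there is no link between the data, i.e. the data are independent.
  - Testing the significance of a single mean-
    t = (x̅ - μ) / (S/√n), degrees of freedom = n-1
  - Testing the significance of the difference between two means-
    t = (x̅₁ - x̅₂) / [Sp √(1/n₁ + 1/n₂)], where Sp is the pooled standard deviation
  - Degrees of freedom = n₁ + n₂ - 2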
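The two unpaired cases can be sketched as follows. The slides give only the degrees of freedom n₁ + n₂ - 2, which corresponds to the pooled-variance (equal variance) form of the two-sample test, so that form is assumed here. Function names and data are hypothetical, and the resulting t is still to be compared with a tabulated critical value.

```python
import math
import statistics


def one_sample_t(sample, mu0):
    """t for a single mean against a claimed value mu0; returns (t, d.f.)."""
    n = len(sample)
    sem = statistics.stdev(sample) / math.sqrt(n)
    return (statistics.mean(sample) - mu0) / sem, n - 1


def pooled_two_sample_t(a, b):
    """t for the difference between two independent means (pooled variance assumed)."""
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * statistics.variance(a)
                  + (n2 - 1) * statistics.variance(b)) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (statistics.mean(a) - statistics.mean(b)) / se, df


# Hypothetical samples
t1, df1 = one_sample_t([4, 8, 6, 5, 7], mu0=5)
t2, df2 = pooled_two_sample_t([1, 2, 5], [2, 3, 4])
print(t1, df1, t2, df2)
```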

• PAIRED T-TEST- Used when the two samples are dependent. Two samples are said to be dependent when each observation in one sample is related to an observation in the other.

  - When the samples are dependent, they have equal sample sizes.
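A paired t-test works on the differences within each pair. The sketch below assumes the usual statistic t = d̅ / (S_d/√n) with n-1 degrees of freedom, which the slides do not write out; the before/after readings and the function name are hypothetical.

```python
import math
import statistics


def paired_t(first, second):
    """Paired t statistic from per-pair differences; returns (t, d.f.)."""
    if len(first) != len(second):
        raise ValueError("dependent samples must have equal sample sizes")
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    se = statistics.stdev(diffs) / math.sqrt(n)   # standard error of the mean difference
    return statistics.mean(diffs) / se, n - 1


# Hypothetical readings on the same four subjects before and after treatment
before = [10, 12, 9, 11]
after = [12, 14, 10, 13]
t_paired, df_paired = paired_t(before, after)
print(t_paired, df_paired)
```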

2. F-TEST
• Used to compare the precision of two sets of data.
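Precision is compared through the ratio of the two sample variances. The sketch below assumes the common convention of placing the larger variance in the numerator so that F ≥ 1; the data and function name are hypothetical, and the result is compared with a tabulated F value for the returned degrees of freedom.

```python
import statistics


def variance_ratio_f(a, b):
    """F = larger sample variance / smaller sample variance; returns (F, df_num, df_den)."""
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    if var_a >= var_b:
        return var_a / var_b, len(a) - 1, len(b) - 1
    return var_b / var_a, len(b) - 1, len(a) - 1


# Hypothetical replicate measurements from two methods
f_stat, df_num, df_den = variance_ratio_f([1, 2, 5], [2, 3, 4])
print(f_stat, df_num, df_den)
```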

3. ANOVA
• Developed by Sir Ronald A. Fisher in the 1920s.
• A statistical technique specially designed to test whether the means of more than two quantitative populations are equal.

• Types of ANOVA
  (a) One way ANOVA
  (b) Two way ANOVA
  - ONE WAY ANOVA- There is only one factor or independent variable.
  - TWO WAY ANOVA- There are two independent variables.

ONE WAY ANOVA
• Suppose we have three different groups:

Group A    Group B    Group C
1          2          2
2          4          3
5          2          4

• There are 5 steps:
1. Hypotheses- Two hypotheses.
   Null hypothesis H₀ = All means are equal.
   Alternate hypothesis H₁ = At least one mean differs from the others.

2. Calculate degrees of freedom (d.f.)-
   Between the groups = k-1 (k = no. of levels): 3-1 = 2
   Within the groups = N-k (N = total no. of observations): 9-3 = 6
   Total d.f. = N-1 = 8
   F critical value (d.f. 2 and 6, α = 0.05) = 5.14

3. Sum of squared deviations from the mean-
   Calculate the means- X̅ᴀ = 2.67, X̅ʙ = 2.67, X̅ᴄ = 3.00
   Grand mean X̅ = sum of all observations / total no. of observations = 25/9 = 2.78
   - Total sum of squares = ∑(X-X̅)² = 13.56
   - Sum of squares within the groups = ∑(Xᴀ-X̅ᴀ)² + ∑(Xʙ-X̅ʙ)² + ∑(Xᴄ-X̅ᴄ)² = 8.67 + 2.67 + 2.00 = 13.33
   - Sum of squares between the groups = total S.S. - S.S. within the groups = 13.56 - 13.33 ≈ 0.22 (0.222 before rounding)

4. Calculate variance-
   Between the groups = S.S. between groups / d.f. between groups = 0.22/2 = 0.11
   Within the groups = S.S. within groups / d.f. within groups = 13.33/6 = 2.22

5. F-value = variance between the groups / variance within the groups = 0.11/2.22 = 0.05

RESULT- 0.05 < 5.14, so we fail to reject the null hypothesis. Hence there is no significant difference between the means of these three groups.
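The five steps can be reproduced in Python for the three groups in the table. This is a minimal sketch using plain arithmetic; the variable names are illustrative, and the critical value 5.14 is taken from the text rather than computed. Carried through without intermediate rounding, the sums of squares come out near 13.56 (total), 13.33 (within) and 0.22 (between), giving F of about 0.05.

```python
# Observations from the worked example
groups = {
    "A": [1, 2, 5],
    "B": [2, 4, 2],
    "C": [2, 3, 4],
}

all_values = [x for values in groups.values() for x in values]
N = len(all_values)          # total no. of observations
k = len(groups)              # no. of levels (groups)

df_between = k - 1
df_within = N - k

grand_mean = sum(all_values) / N

# Total S.S.: squared deviations of every observation from the grand mean
ss_total = sum((x - grand_mean) ** 2 for x in all_values)

# S.S. within: squared deviations of each observation from its own group mean
ss_within = 0.0
for values in groups.values():
    group_mean = sum(values) / len(values)
    ss_within += sum((x - group_mean) ** 2 for x in values)

ss_between = ss_total - ss_within

ms_between = ss_between / df_between   # variance between the groups
ms_within = ss_within / df_within      # variance within the groups
f_value = ms_between / ms_within

f_critical = 5.14                      # from the text, d.f. (2, 6)
reject_null = f_value > f_critical

print(round(ss_total, 2), round(ss_within, 2), round(ss_between, 2))
print(round(f_value, 2), reject_null)
```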

REFERENCES
Mendham J, Denny RC, Barnes JD, Thomas M, Sivasanker B. Vogel’s textbook of quantitative chemical analysis. 6th ed. Delhi: Pearson Education Ltd; 2000: 110-133.

Patel GC, Jani GK. Basic biostatistics for pharmacy. 2nd ed. Ahmedabad: Atul Prakashan; 2007-2008.

Manikandan S. Measures of central tendency: Median and mode. J Pharmacol Pharmacother. 2011; 2(3): 214-215.
