Test of Independence - II: Dr. A. Ramesh

The document discusses the Chi-squared test of independence and the goodness of fit test for Poisson distributions using Python. It provides examples involving student data to test the independence of gender and student motivation, as well as a case study on customer arrivals at a parking garage. The document outlines the steps for conducting these tests, including hypothesis formulation, calculation of observed and expected frequencies, and the rejection criteria for the null hypothesis.

χ² Test of Independence - II

Dr. A. Ramesh
DEPARTMENT OF MANAGEMENT STUDIES

Agenda

• Using Python to test the independence of variables
• Understanding the goodness-of-fit test for the Poisson distribution

Example

• A record of 50 students studying at ABN School is taken at random. The first 10 entries are:

res_num  aa  pe  sm  ae  r  g  c
1        99  19  1   2   0  0  1
2        46  12  0   0   0  0  0
3        57  15  1   1   0  0  0
4        94  18  2   2   1  1  1
5        82  13  2   1   1  1  1
6        59  12  0   0   2  0  0
7        61  12  1   2   0  0  0
8        29  9   0   0   1  1  0
9        36  13  1   1   0  0  0
10       91  16  2   2   1  1  0

Example

Here:
• res_num = registration number
• aa = academic ability
• pe = parent education
• sm = student motivation
• r = religion
• g = gender

Python code
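
The code shown on this slide did not survive extraction. A minimal sketch, assuming pandas is available and using only the ten rows listed above (the full 50-record file and its name are not given here), could load the data as follows. The variable name `students` is an illustrative choice, not taken from the slides.

```python
import pandas as pd

# Column names as printed on the slide. Only the ten listed rows are used;
# the other 40 records are not available in this document.
columns = ["res_num", "aa", "pe", "sm", "ae", "r", "g", "c"]
rows = [
    [1, 99, 19, 1, 2, 0, 0, 1],
    [2, 46, 12, 0, 0, 0, 0, 0],
    [3, 57, 15, 1, 1, 0, 0, 0],
    [4, 94, 18, 2, 2, 1, 1, 1],
    [5, 82, 13, 2, 1, 1, 1, 1],
    [6, 59, 12, 0, 0, 2, 0, 0],
    [7, 61, 12, 1, 2, 0, 0, 0],
    [8, 29, 9, 0, 0, 1, 1, 0],
    [9, 36, 13, 1, 1, 0, 0, 0],
    [10, 91, 16, 2, 2, 1, 1, 0],
]
students = pd.DataFrame(rows, columns=columns)

# Show the two variables used in the independence test.
print(students[["res_num", "sm", "g"]])
```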

Hypothesis

• Test the hypothesis that gender and student motivation are independent.

Python code
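
The original code is missing here as well. One plausible way to build the observed-frequency table is a cross-tabulation of gender against student motivation. The sketch below assumes pandas and, because the full data set is not reproduced in this document, uses only the ten rows listed earlier, so its counts are smaller than the 50-student table that follows.

```python
import pandas as pd

# g (gender) and sm (student motivation) for the ten rows shown earlier.
sample = pd.DataFrame({
    "g":  [0, 0, 0, 1, 1, 0, 0, 1, 0, 1],
    "sm": [1, 0, 1, 2, 2, 0, 1, 0, 1, 2],
})

# Rows are gender, columns are motivation level; margins=True adds the
# row and column sums under the label "All".
observed_sample = pd.crosstab(sample["g"], sample["sm"], margins=True)
print(observed_sample)
```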

Observed values

Gender       Student motivation
             0 (Disagree)   1 (Not decided)   2 (Agree)   Row sum
0 (Male)     10             13                6           29
1 (Female)   4              9                 8           21
Column sum   14             22                14          50

Expected frequency (contingency table)

Gender   Student motivation
         0                   1       2
0        29*14/50 = 8.12     12.76   8.12
1        5.88                9.24    5.88
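
Each expected frequency is the row sum times the column sum divided by the grand total. The short sketch below recomputes the table from the observed counts using plain Python; the variable names are illustrative.

```python
# Observed counts: rows are gender (0, 1), columns are motivation (0, 1, 2).
observed = [[10, 13, 6],
            [4, 9, 8]]

row_sums = [sum(row) for row in observed]        # 29 and 21
col_sums = [sum(col) for col in zip(*observed)]  # 14, 22 and 14
n = sum(row_sums)                                # 50 students

# Expected count for each cell = row sum * column sum / grand total.
expected = [[r * c / n for c in col_sums] for r in row_sums]

for row in expected:
    print([round(value, 2) for value in row])
```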

Frequency Table

Gender   Student motivation
         0            1             2
0        fo = 10      fo = 13       fo = 6
         fe = 8.12    fe = 12.76    fe = 8.12
1        fo = 4       fo = 9        fo = 8
         fe = 5.88    fe = 9.24     fe = 5.88

Chi sq. calculation

$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$$

= 0.435 + 0.005 + 0.554 + 0.601 + 0.006 + 0.764
= 2.365
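
The hand calculation can be checked with a few lines of plain Python. This is only a sketch that sums the six cell contributions from the frequency table; individual terms may differ from the slide in the third decimal because of rounding.

```python
# Observed and expected counts, cell by cell, from the frequency table.
observed = [10, 13, 6, 4, 9, 8]
expected = [8.12, 12.76, 8.12, 5.88, 9.24, 5.88]

# One (fo - fe)^2 / fe term per cell, then the total.
terms = [(fo - fe) ** 2 / fe for fo, fe in zip(observed, expected)]
chi_sq = sum(terms)

print([round(term, 3) for term in terms])
print(round(chi_sq, 3))
```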

Python code
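
The slide's code is not available. A common way to run this test in Python is `scipy.stats.chi2_contingency`, and the sketch below assumes SciPy is installed and that this (or something equivalent) is what the slide showed. For a table larger than 2 x 2 no continuity correction is applied, so the statistic matches the hand calculation.

```python
from scipy.stats import chi2_contingency

# Observed counts: rows are gender (0, 1), columns are motivation (0, 1, 2).
observed = [[10, 13, 6],
            [4, 9, 8]]

# The result unpacks into the statistic, p-value, degrees of freedom,
# and the table of expected frequencies.
statistic, p_value, dof, expected = chi2_contingency(observed)

print(round(statistic, 3), round(p_value, 4))
```

A p-value well above common significance levels means the data give no evidence against independence.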

Python code

Degrees of freedom = (2 - 1) * (3 - 1) = 2
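
With the degrees of freedom known, the critical-value approach needs a chi-squared quantile. The sketch below assumes SciPy and a significance level of 0.05; the slides do not state the level used for this example, so that value is an assumption.

```python
from scipy.stats import chi2

rows, cols = 2, 3
dof = (rows - 1) * (cols - 1)

alpha = 0.05       # assumed significance level, not given on the slide
statistic = 2.365  # value from the hand calculation

# Upper-tail critical value: reject independence only if the statistic exceeds it.
critical = chi2.ppf(1 - alpha, dof)
reject_h0 = statistic > critical

print(dof, round(critical, 3), reject_h0)
```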

Python code

Contingency table
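
The expected-frequency table can also be obtained directly, without running the full test. The sketch below assumes SciPy and uses `scipy.stats.contingency.expected_freq`; whether the slide used this function or the fourth value returned by `chi2_contingency` is not known.

```python
from scipy.stats.contingency import expected_freq

# Observed counts: rows are gender (0, 1), columns are motivation (0, 1, 2).
observed = [[10, 13, 6],
            [4, 9, 8]]

# Expected counts under independence, as a NumPy array of the same shape.
expected = expected_freq(observed)
print(expected.round(2))
```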

χ² Goodness of Fit Test

χ² Goodness-of-Fit Test

• The χ² goodness-of-fit test compares expected (theoretical) frequencies of categories from a population distribution to the observed (actual) frequencies from a distribution to determine whether there is a difference between what was expected and what was observed.

χ² Goodness-of-Fit Test

$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$$

df = k - 1 - p

where:
fo = frequency of observed values
fe = frequency of expected values
k = number of categories
p = number of parameters estimated from the sample data

Goodness of Fit Test: Poisson Distribution
1. Set up the null and alternative hypotheses.
H0: Population has a Poisson probability distribution
Ha: Population does not have a Poisson distribution

2. Select a random sample and
• Record the observed frequency fi for each value of the Poisson random variable.
• Compute the mean number of occurrences μ.

3. Compute the expected frequency of occurrences ei for each value of the Poisson random variable.

Goodness of Fit Test: Poisson Distribution

4. Compute the value of the test statistic

$$\chi^2 = \sum_{i=1}^{k} \frac{(f_i - e_i)^2}{e_i}$$

where:
fi = observed frequency for category i
ei = expected frequency for category i
k = number of categories

Goodness of Fit Test: Poisson Distribution
5. Rejection rule:
p-value approach: Reject H0 if p-value < α
Critical value approach: Reject H0 if χ² ≥ χ²_α

where α is the significance level and there are k - 2 degrees of freedom

Goodness of Fit Test: Poisson Distribution
• Example: Parking Garage

In studying the need for an additional entrance to a city parking garage, a consultant has recommended an analysis approach that is applicable only in situations where the number of cars entering during a specified time period follows a Poisson distribution.

Goodness of Fit Test: Poisson Distribution
A random sample of 100 one-minute time intervals resulted in the customer arrivals listed below. A statistical test must be conducted to see if the assumption of a Poisson distribution is reasonable.

# Arrivals   0   1   2   3    4    5    6    7    8   9   10   11   12
Frequency    0   1   4   10   14   20   12   12   9   8   6    3    1

Goodness of Fit Test: Poisson Distribution

• Hypotheses
H0: Number of cars entering the garage during a one-minute interval is Poisson distributed
Ha: Number of cars entering the garage during a one-minute interval is not Poisson distributed

Python Code
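
The slide's code is not reproduced in the extracted text. As a starting point, the arrival data from the table above can be entered as two parallel lists; the names `arrivals` and `frequency` are illustrative.

```python
# Number of arrivals per one-minute interval and how often each count was seen.
arrivals = list(range(13))  # 0, 1, ..., 12
frequency = [0, 1, 4, 10, 14, 20, 12, 12, 9, 8, 6, 3, 1]

# The frequencies should add up to the 100 sampled intervals.
n_intervals = sum(frequency)

for x, f in zip(arrivals, frequency):
    print(x, f)
print("Intervals:", n_intervals)
```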

Goodness of Fit Test: Poisson Distribution

• Estimate of Poisson Probability Function

Total Arrivals = 0(0) + 1(1) + 2(4) + . . . + 12(1) = 600
Total Time Periods = 100
Estimate of μ = 600/100 = 6
Hence,

$$f(x) = \frac{6^x e^{-6}}{x!}$$
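
The estimate and the resulting probability function can be reproduced with the standard library alone. This is a sketch; the function name `poisson_pmf` is an illustrative choice.

```python
import math

arrivals = list(range(13))
frequency = [0, 1, 4, 10, 14, 20, 12, 12, 9, 8, 6, 3, 1]

# Total arrivals = sum of (arrival count * number of intervals with that count).
total_arrivals = sum(x * f for x, f in zip(arrivals, frequency))
total_periods = sum(frequency)
mu_hat = total_arrivals / total_periods  # sample mean, the estimate of the Poisson mean


def poisson_pmf(x, mu):
    """Poisson probability f(x) = mu^x * e^(-mu) / x!."""
    return mu ** x * math.exp(-mu) / math.factorial(x)


print(total_arrivals, total_periods, mu_hat)
print(round(poisson_pmf(5, mu_hat), 4))
```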

Goodness of Fit Test: Poisson Distribution
• Expected Frequencies

x       f(x)     nf(x)
0       .0025    .25
1       .0149    1.49
2       .0446    4.46
3       .0892    8.92
4       .1339    13.39
5       .1606    16.06
6       .1606    16.06
7       .1377    13.77
8       .1033    10.33
9       .0688    6.88
10      .0413    4.13
11      .0225    2.25
12+     .0201    2.01
Total   1.0000   100.00

Python code
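
The original code is missing. The expected-frequency table above can be regenerated with `scipy.stats.poisson`, assuming SciPy and NumPy are available: `pmf` gives f(x) for x = 0 to 11, and the survival function `sf(11, mu)` gives the open-ended "12+" probability so that the column sums to 1.

```python
import numpy as np
from scipy.stats import poisson

n = 100   # number of one-minute intervals
mu = 6.0  # estimated mean arrivals per interval

# Probabilities for x = 0..11, plus P(X >= 12) for the last category.
x = np.arange(12)
probs = np.append(poisson.pmf(x, mu), poisson.sf(11, mu))
expected = n * probs

labels = [str(value) for value in x] + ["12+"]
for label, p, e in zip(labels, probs, expected):
    print(f"{label:>3}  {p:.4f}  {e:6.2f}")
```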

Python code
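
The next table pools the sparse tail categories (0 to 2 arrivals, and 10 or more). The usual motivation, which the slides do not state, is the rule of thumb that every expected frequency should be at least 5. A sketch of that pooling using only the standard library:

```python
import math

n, mu = 100, 6.0
frequency = [0, 1, 4, 10, 14, 20, 12, 12, 9, 8, 6, 3, 1]


def poisson_pmf(x):
    """Poisson probability with the estimated mean mu."""
    return mu ** x * math.exp(-mu) / math.factorial(x)


low = [0, 1, 2]               # pooled into "0 or 1 or 2"
middle = list(range(3, 10))   # 3, 4, ..., 9 stay as single categories

# Observed counts: pooled low tail, single values, pooled high tail (10 or more).
observed = ([sum(frequency[x] for x in low)]
            + [frequency[x] for x in middle]
            + [sum(frequency[10:])])

# Expected counts: the high tail takes whatever probability remains.
p_low = sum(poisson_pmf(x) for x in low)
p_middle = [poisson_pmf(x) for x in middle]
p_high = 1 - p_low - sum(p_middle)
expected = [n * p_low] + [n * p for p in p_middle] + [n * p_high]

print(observed)
print([round(e, 2) for e in expected])
```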

Goodness of Fit Test: Poisson Distribution
• Observed and Expected Frequencies

i             fi    ei      fi - ei
0 or 1 or 2   5     6.20    -1.20
3             10    8.92    1.08
4             14    13.39   0.61
5             20    16.06   3.94
6             12    16.06   -4.06
7             12    13.77   -1.77
8             9     10.33   -1.33
9             8     6.88    1.12
10 or more    10    8.39    1.61

Python code
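
The slide's code is not available. A sketch of the statistic computed directly from the table above follows. Because it uses the expected frequencies rounded to two decimals, the total comes out at roughly 3.27; the 3.268 reported on the slides is reproduced when unrounded Poisson probabilities are used instead.

```python
# Pooled observed and (rounded) expected frequencies from the table.
observed = [5, 10, 14, 20, 12, 12, 9, 8, 10]
expected = [6.20, 8.92, 13.39, 16.06, 16.06, 13.77, 10.33, 6.88, 8.39]

# Differences fi - ei, then the sum of (fi - ei)^2 / ei over all categories.
differences = [fi - ei for fi, ei in zip(observed, expected)]
chi_sq = sum(d ** 2 / ei for d, ei in zip(differences, expected))

print([round(d, 2) for d in differences])
print(round(chi_sq, 3))
```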

Goodness of Fit Test: Poisson Distribution
• Rejection Rule
With α = .05 and k - p - 1 = 9 - 1 - 1 = 7 d.f. (where k = number of categories and p = number of population parameters estimated), χ²_.05 = 14.067.
Reject H0 if p-value < .05 or χ² > 14.067.
• Test Statistic

$$\chi^2 = \frac{(-1.20)^2}{6.20} + \frac{(1.08)^2}{8.92} + \cdots + \frac{(1.61)^2}{8.39} = 3.268$$

Python code
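
The original code is missing. One way to carry out the whole test, assuming SciPy and NumPy, is `scipy.stats.chisquare` with `ddof=1`, which removes one extra degree of freedom for the estimated mean. Whether the slide used this call is not known. Expected counts are left unrounded here, which reproduces the 3.268 quoted above.

```python
import numpy as np
from scipy.stats import chi2, chisquare, poisson

n, mu = 100, 6.0
observed = np.array([5, 10, 14, 20, 12, 12, 9, 8, 10])

# Unrounded expected counts for the pooled categories 0-2, 3, ..., 9, 10+.
p_low = poisson.cdf(2, mu)
p_mid = poisson.pmf(np.arange(3, 10), mu)
p_high = poisson.sf(9, mu)
expected = n * np.concatenate(([p_low], p_mid, [p_high]))

# ddof=1 gives df = k - 1 - 1 = 7 for the k = 9 categories.
statistic, p_value = chisquare(observed, f_exp=expected, ddof=1)

# Critical value at the 0.05 level with 7 degrees of freedom.
critical = chi2.ppf(0.95, df=len(observed) - 1 - 1)
reject_h0 = statistic > critical

print(round(statistic, 3), round(critical, 3), reject_h0)
```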

Goodness of Fit Test: Poisson Distribution

[Figure: χ² distribution with df = 7, showing the non-rejection region below the critical value 14.067 and an upper-tail area of 0.05.]

χ²_cal = 3.268 < 14.067, do not reject H0.

Thank You
