chi-square_test
Differences
This shows you the differences between two versions of the page.
chi-square_test [2016/05/16 07:36] – hkimscil | chi-square_test [2024/12/09 08:20] (current) – hkimscil | ||
---|---|---|---|
Line 1: | Line 1: | ||
{{keywords>" | {{keywords>" | ||
====== Short Explanation ====== | ====== Short Explanation ====== | ||
+ | To be filled... | ||
====== Chi-square test, explanation ====== | ====== Chi-square test, explanation ====== | ||
This is a rather redundant, long description of the chi-square test. | This is a rather redundant, long description of the chi-square test. | ||
Line 18: | Line 19: | ||
Let's start with what we know first. | Let's start with what we know first. | ||
- | Two variables:: Let's say you are interested in the relationships between the types of religions and opinions about abortion. | + | Two variables: Let's say you are interested in the relationships between the types of religions and opinions about abortion. |
You have a hunch that people who have a different religion will have a different opinion about abortion. This actually reveals that you think having a particular religion will affect what people think about abortion. Therefore, the particular religion will be the __IndependentVariable__ (IV), and the opinion about abortion will be the __DependentVariable__ (DV). | You have a hunch that people who have a different religion will have a different opinion about abortion. This actually reveals that you think having a particular religion will affect what people think about abortion. Therefore, the particular religion will be the __IndependentVariable__ (IV), and the opinion about abortion will be the __DependentVariable__ (DV). | ||
Line 26: | Line 27: | ||
{{19-419434-a-tshirt-camp-scils-rutgers.jpg? | {{19-419434-a-tshirt-camp-scils-rutgers.jpg? | ||
- | We have two variables here. What kinds of values (attributes) do you see for the first variable, types of religions? __Nominal variable__. Your initial response would be '' | + | We have two variables here. What kinds of values (attributes) do you see for the first variable, types of religions? __Nominal variable__. Your initial response would be Catholic, Protestant, and Judaism. . . . At this point you might not be sure whether you have thought of every kind of religion -- yes, certainly, there is Buddhism, too, and many others. Now you use your judgment about this __exhaustiveness__ problem. With your rationale, you can decide to make the categories as follows: Variable 1: Catholic, Protestant, and Others. The last one, Others, covers everything else, right? So you have just escaped from the exhaustiveness problem. As a side note, you may say, "Wait, my religion is not there... and it's unfair!" |
Now we need to take care of the other variable, the choice of abortion. The values of the variable, choice of abortion, may be rather simple: | Now we need to take care of the other variable, the choice of abortion. The values of the variable, choice of abortion, may be rather simple: | ||
Line 82: | Line 83: | ||
<WRAP clear /> | <WRAP clear /> | ||
{{ 11-419460-rusure.jpg? | {{ 11-419460-rusure.jpg? | ||
- | Chi-square test:: This is how the Chi-square method was involved in ... Anyway, now let's think about the chi-square test. Previously, you wanted to know if there are differences in the abortion opinions among religious groups, and you got the frequency table. Now think about what the table would look like if there were no differences in the abortion opinion among religious groups? | + | Chi-square test: This is how the Chi-square method was involved in ... Anyway, now let's think about the chi-square test. Previously, you wanted to know if there are differences in the abortion opinions among religious groups, and you got the frequency table. Now think about what the table would look like if there were no differences in the abortion opinion among religious groups? |
We already know from the first contingency, | We already know from the first contingency, | ||
Line 89: | Line 90: | ||
| | | | | | | | ||
| | | Catholic | | | | Catholic | ||
- | | <|3>Legal Abortion | + | | Legal Abortion |
- | | | | . | + | | ::: | | . |
- | | no | | | | 50 | | + | | ::: | no | | | | 50 | |
| | | | . | | | | | . | ||
| | Total | 40 | 42 | 18 | 100 | | | | Total | 40 | 42 | 18 | 100 | | ||
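Under the "no difference" idea, each expected cell is (row total x column total) / grand total. A minimal Python sketch of that arithmetic follows. The column labels Catholic, Protestant, and Others are taken from the categories chosen earlier and are assumed to match the column totals 40, 42, and 18 in that order. The row totals of 50 and 50 are the yes/no totals of the example.

```python
# Marginal totals of the contingency table.
# Assumption: the columns are Catholic, Protestant, Others, in that order.
row_totals = {"yes": 50, "no": 50}
col_totals = {"Catholic": 40, "Protestant": 42, "Others": 18}
grand_total = 100

# Expected count for a cell = row total * column total / grand total.
expected = {
    (row, col): row_totals[row] * col_totals[col] / grand_total
    for row in row_totals
    for col in col_totals
}

for row in row_totals:
    print(row, [expected[(row, col)] for col in col_totals])
```

Both rows come out as 20, 21, and 9 because the yes and no totals are equal.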
Line 160: | Line 161: | ||
Let's see the table. | Let's see the table. | ||
- | | | + | | |
| | Catholic | | | Catholic | ||
| yes | 5 | 32 | 13 | 50 | | | | yes | 5 | 32 | 13 | 50 | | | ||
| Expected value | (20) | (21) | (9) | 50 | | | | Expected value | (20) | (21) | (9) | 50 | | | ||
- | | (O-T)2 / T | (-15)^2^/ | + | | (O-T)2 / T | (-15)<sup>2</ |
| no | 35 | 10 | 5 | 50 | | | | no | 35 | 10 | 5 | 50 | | | ||
| | (20) | (21) | (9) | 50 | | | | | (20) | (21) | (9) | 50 | | | ||
- | | | (15)^2^/ | + | | | (15)<sup>2</ |
| Total | 40 | 42 | 18 | 100 | 37.58 | | | Total | 40 | 42 | 18 | 100 | 37.58 | | ||
Chi-square value = 37.58. \\ | Chi-square value = 37.58. \\ | ||
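The sum behind the 37.58 can be checked with a few lines of Python. This is only a sketch of the (O-T)<sup>2</sup> / T arithmetic, using the observed and expected counts from the table above. The variable names are illustrative.

```python
# Observed counts (rows: yes, no) and the expected counts in parentheses above.
observed = [[5, 32, 13], [35, 10, 5]]
expected = [[20, 21, 9], [20, 21, 9]]

# Chi-square = sum over all cells of (O - T)^2 / T.
chi_square = sum(
    (o - t) ** 2 / t
    for obs_row, exp_row in zip(observed, expected)
    for o, t in zip(obs_row, exp_row)
)

print(round(chi_square, 2))
```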
Line 174: | Line 175: | ||
I do not know exactly why degrees of freedom are important in a conceptual way, so I have difficulty explaining it. But the idea behind it is that if you know the column and row totals, and the values of two cells (as a minimum requirement; | I do not know exactly why degrees of freedom are important in a conceptual way, so I have difficulty explaining it. But the idea behind it is that if you know the column and row totals, and the values of two cells (as a minimum requirement; | ||
- | {{07-419434-a-tshirt-camp-scils-rutgers.jpg? | + | {{07-419434-a-tshirt-camp-scils-rutgers.jpg? |
The numbers you obtain from the book are 5.991 for the 0.05 probability and 9.210 for the 0.01 probability. They are called critical values. So the critical values are: | The numbers you obtain from the book are 5.991 for the 0.05 probability and 9.210 for the 0.01 probability. They are called critical values. So the critical values are: | ||
Line 180: | Line 181: | ||
5.991 at 0.05 probability | 5.991 at 0.05 probability | ||
9.210 at 0.01 probability | 9.210 at 0.01 probability | ||
+ | < | ||
+ | > qchisq(0.95, | ||
+ | [1] 5.991465 | ||
+ | > qchisq(0.99, | ||
+ | [1] 9.21034 | ||
+ | > | ||
+ | </ | ||
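If you have neither the book nor R at hand, the two critical values can be reproduced in plain Python for this particular case. The sketch relies on a known special property: with 2 degrees of freedom, the upper-tail probability of a chi-square value x is exp(-x/2), so the critical value at probability alpha is -2 ln(alpha). The helper name `chi2_critical_df2` is made up for illustration, and the shortcut does not hold for other degrees of freedom.

```python
import math


def chi2_critical_df2(alpha):
    """Critical value of the chi-square distribution with df = 2 only.

    For df = 2 the upper-tail probability is exp(-x / 2),
    so solving exp(-x / 2) = alpha gives x = -2 * ln(alpha).
    """
    return -2.0 * math.log(alpha)


print(round(chi2_critical_df2(0.05), 3))  # critical value at 0.05
print(round(chi2_critical_df2(0.01), 3))  # critical value at 0.01
```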
These critical values do not exceed the chi-square value you obtained from your table -- 37.58. How do you want to relate them together? Think about the expected values -- the ideal types. Suppose you obtained the same values (observed values) as those of expected values, what would be your chi-square value? --Yes, it is going to be zero. Why? If you look at the formula | These critical values do not exceed the chi-square value you obtained from your table -- 37.58. How do you want to relate them together? Think about the expected values -- the ideal types. Suppose you obtained the same values (observed values) as those of expected values, what would be your chi-square value? --Yes, it is going to be zero. Why? If you look at the formula | ||
Line 191: | Line 199: | ||
{{ princeton-nassau-inn-1.jpg? | {{ princeton-nassau-inn-1.jpg? | ||
- | Chi-square test, example:: Let's have an exercise for the chi-square thing. I hope you remember the below part which is from the last essay I wrote. Let's go through it, first. Let's look at another table. I will show the percentage in the table -- the ratio between the opinions in a religious group. | + | Chi-square test, example: Let's have an exercise for the chi-square thing. I hope you remember the below part which is from the last essay I wrote. Let's go through it, first. Let's look at another table. I will show the percentage in the table -- the ratio between the opinions in a religious group. |
| Abortion opinion and Religion | | Abortion opinion and Religion | ||
| | Catholic | | | Catholic | ||
Line 205: | Line 213: | ||
This may lead you to an idea that you may need some kinds of methods that can be used to aid your decision. And these "some kinds of methods" | This may lead you to an idea that you may need some kinds of methods that can be used to aid your decision. And these "some kinds of methods" | ||
- | | | + | | |
| | Catholic | | | Catholic | ||
| yes | 18 | 25 | 12 | 55 | | | yes | 18 | 25 | 12 | 55 | | ||
Line 217: | Line 225: | ||
__Note:__ Hey! The calculated expected values are all whole numbers! Guess how much time I spent making up the example table! My point is that you might not get whole numbers [such as 1, 2, 3, 4, ...] as the expected values in your table. Anyway, can you fill in all the cells now? It should look like the table below. | __Note:__ Hey! The calculated expected values are all whole numbers! Guess how much time I spent making up the example table! My point is that you might not get whole numbers [such as 1, 2, 3, 4, ...] as the expected values in your table. Anyway, can you fill in all the cells now? It should look like the table below. | ||
- | | | + | | |
| | Catholic | | | Catholic | ||
| yes | 18 | 25 | 12 | 55 | | | yes | 18 | 25 | 12 | 55 | | ||
Line 233: | Line 241: | ||
E is expected value. Sometimes, it is called theoretical value (T). | E is expected value. Sometimes, it is called theoretical value (T). | ||
<WRAP clear /> | <WRAP clear /> | ||
- | | | + | | |
| | Catholic | | | Catholic | ||
| yes | 18 | 25 | 12 | 55 | | | | yes | 18 | 25 | 12 | 55 | | | ||
| Expected Value | (22) | (22) | (11) | (55) | | | | Expected Value | (22) | (22) | (11) | (55) | | | ||
- | | (O-T)2 / T | (-4)2/ | + | | (O-T)<sup>2</ |
| no | 22 | 15 | 8 | 45 | | | | no | 22 | 15 | 8 | 45 | | | ||
| Expected Value | (18) | (18) | (9) | (45) | | | | Expected Value | (18) | (18) | (9) | (45) | | | ||
- | | (O-T)2 / T | (4)2/ | + | | (O-T)<sup>2</ |
| Total | 40 | 40 | 20 | 100 | 2.73 | | | Total | 40 | 40 | 20 | 100 | 2.73 | | ||
- | Chi-square value = The sum of the entire 6 yellow cells = 2.73. | + | **Chi-square value = The sum of the entire 6 yellow cells = 2.73.** \\ |
- | Degrees of Freedom (df) = (the # of columns-1) x (the # of rows-1)= (3-1) x (2-1) = 2 x 1 = 2. | + | **Degrees of Freedom (df) = (the # of columns-1) x (the # of rows-1)= (3-1) x (2-1) = 2 x 1 = 2.** \\ |
+ | \\ | ||
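The whole calculation for this exercise, from the observed counts to the chi-square value and the degrees of freedom, can be sketched as one small Python function. It derives the expected values from the row and column totals instead of typing them in. The function name `chi_square_from_table` is hypothetical and not part of any library.

```python
def chi_square_from_table(observed):
    """Return (chi-square value, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            # Expected value under "no difference".
            t = row_totals[i] * col_totals[j] / grand_total
            statistic += (o - t) ** 2 / t

    # df = (number of rows - 1) x (number of columns - 1)
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df


# Observed counts from the exercise table (rows: yes, no).
observed = [[18, 25, 12], [22, 15, 8]]
statistic, df = chi_square_from_table(observed)
print(round(statistic, 2), df)
```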
Look up the values in your textbook -- which is called " | Look up the values in your textbook -- which is called " | ||
+ | \\ | ||
They are: | They are: | ||
5.991 (0.05 probability) | 5.991 (0.05 probability) | ||
9.210 (0.01 probability) | 9.210 (0.01 probability) | ||
+ | |||
+ | OR | ||
+ | < | ||
+ | > pchisq(2.73, | ||
+ | [1] 0.7446193 | ||
+ | </ | ||
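The number printed by R above is the probability of getting a chi-square value at or below 2.73. What gets compared with 0.05 or 0.01 is the other side, the probability of a value at least as large as 2.73. A minimal Python sketch, again assuming df = 2 so that the upper-tail probability is simply exp(-x/2):

```python
import math

chi_square = 2.73  # value from the exercise table, df = 2

# For df = 2 only: P(X >= x) = exp(-x / 2).
upper_tail = math.exp(-chi_square / 2)
lower_tail = 1 - upper_tail  # the quantity R reports above

print(round(lower_tail, 7))
print(round(upper_tail, 4))
print(upper_tail < 0.05)  # is the result significant at the 0.05 level?
```

The upper-tail probability is well above 0.05, which agrees with 2.73 being smaller than the critical value 5.991.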
Now the rest of what you need to do is to compare the numbers (chi-square value and the critical values). | Now the rest of what you need to do is to compare the numbers (chi-square value and the critical values). | ||
Line 260: | Line 274: | ||
In the first place, you assumed that there would be no differences in the abortion issue among the religious groups to get the expected values. And you compared the expected values to the observed values. In other words, you tested your survey result (the observed values) against the idea of "no difference." | In the first place, you assumed that there would be no differences in the abortion issue among the religious groups to get the expected values. And you compared the expected values to the observed values. In other words, you tested your survey result (the observed values) against the idea of "no difference." | ||
- | {{raritan-river-01.jpg? | + | {{raritan-river-01.jpg? |
- | + | ||
- | Therefore, it would not have been safe, had you ever said, "Sure 45% and 62.5% are different." | + | |
__Another note:__ You might have a question... Hey, wait a minute... If I pick some other numbers from the chi-square distribution table, the result would be totally different! | __Another note:__ You might have a question... Hey, wait a minute... If I pick some other numbers from the chi-square distribution table, the result would be totally different! | ||
- | + | <WRAP clear /> | |
- | *** For your information, | + | For your information, |
| df | .30 | .20 | .10 | .05 | .02 | .01 | .001 | | | df | .30 | .20 | .10 | .05 | .02 | .01 | .001 | | ||
| 1 | 1.074 | 1.642 | 2.706 | 3.841 | 5.412 | 6.635 | 10.827 | | 1 | 1.074 | 1.642 | 2.706 | 3.841 | 5.412 | 6.635 | 10.827 | ||
Line 279: | Line 291: | ||
So, basically, choosing the probability means you decide the certainty of your decision. | So, basically, choosing the probability means you decide the certainty of your decision. | ||
- | {{princeton_park_river_1.jpg|202 |Princeton Park river}} By the same token, since the chi-square value is even bigger than the critical value at the 0.01 probability level, you can state that there is indeed a difference in legal abortion opinions among the religious groups. The chance that you are wrong about this decision is 0.01 out of 1 (1 out of 100; 1%). That is, even though the gap between the chi-square value and the critical value seems to be due to the fact that the religious groups indeed differ on the legal abortion issue, there is still a slight chance that such a big gap is due to random error in your sampling procedure. And that chance is 1 out of 100. | + | {{princeton_park_river_1.jpg?202 |Princeton Park river}} By the same token, since the chi-square value is even bigger than the critical value at the 0.01 probability level, you can state that there is indeed a difference in legal abortion opinions among the religious groups. The chance that you are wrong about this decision is 0.01 out of 1 (1 out of 100; 1%). That is, even though the gap between the chi-square value and the critical value seems to be due to the fact that the religious groups indeed differ on the legal abortion issue, there is still a slight chance that such a big gap is due to random error in your sampling procedure. And that chance is 1 out of 100. |
Yes, choosing the probability means that you decide the certainty of your decision. The chance of your being wrong in your statement -- that there are differences in the abortion issue among the religious groups -- was 3 out of 10. And unlike the probability of 0.05 or 0.01, this risk is too big to take. In other words, it is a bit meaningless. This is why Professor White said that social scientists usually take 0.05 as the criterion for their statistical tests. | Yes, choosing the probability means that you decide the certainty of your decision. The chance of your being wrong in your statement -- that there are differences in the abortion issue among the religious groups -- was 3 out of 10. And unlike the probability of 0.05 or 0.01, this risk is too big to take. In other words, it is a bit meaningless. This is why Professor White said that social scientists usually take 0.05 as the criterion for their statistical tests. | ||
- | |||
chi-square_test.1463353566.txt.gz · Last modified: 2016/05/16 07:36 by hkimscil