adjusted_r_squared
Differences
This shows you the differences between two versions of the page.
Next revision | Previous revision | ||
adjusted_r_squared [2016/05/11 07:17] – created hkimscil | adjusted_r_squared [2016/05/11 07:48] (current) – hkimscil | ||
---|---|---|---|
Line 13: | Line 13: | ||
| __Model Summary(b)__ | | __Model Summary(b)__ | ||
- | | Model | R | R Square | + | | Model | R | R \\ Square |
| 1 | 0.903696114 | | 1 | 0.903696114 | ||
<WRAP clear /> | <WRAP clear /> | ||
- | | + | **__r-square:__** |
- | | + | * $\displaystyle |
- | | + | * $\displaystyle |
- | | + | * Usually interpreted as a percentage (multiply $r^2$ by 100)
- | | + | **__Adjusted |
- | | + | * $\displaystyle |
- | | + | * This is equivalent to: $ \displaystyle |
- | | + | * $\text{Var} = \text{MS} = s^{2} = \displaystyle \frac {SS}{n} $ |
- | | + | * Here,
- | * for Var< | + | * $\displaystyle Var_{res} = \frac {SS_{res}}{n-p-1}$ |
- | * for Var< | + | * $\displaystyle Var_{total} = \frac {SS_{total}}{n-1}$ |
- | * This is the same logic by which we use n-1 instead of n to estimate the population standard deviation from a sample statistic. | + | * Therefore, |
+ | * $\displaystyle \text{Adjusted } R^{2} = 1 - \frac {\displaystyle \frac {SS_{res}}{n-p-1}}{\displaystyle \frac {SS_{total}}{n-1}} $ | ||
+ | * This is **the same logic** by which we use n-1 instead of n to estimate the population standard deviation from a sample statistic. | ||
* Therefore, the Adjusted r< | * Therefore, the Adjusted r< | ||
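+ | **__Why use the Adjusted R squared value?__**
+ | * As p grows, that is . . . .
+ | * the Adjusted R squared value tends to get smaller.
+ | * However, a larger p means that independent variables keep being added, and even if the independent variables X do not actually explain Y (that is, even if X and Y have no theoretical cause-and-effect relationship), naturally R< | ||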
+ | * < | ||
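+ | * For example, in the case above, the researcher may decide to use only the first three independent variables, because the Adjusted R squared value starts to decrease once the fourth variable is entered. The R squared value, in contrast, keeps increasing.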
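
The penalty that the number of predictors p places on Adjusted R squared can be sketched in a few lines of Python. This is a minimal illustration, not the page author's procedure: the function names and the values of `ss_res`, `ss_total`, and `n` are hypothetical, chosen only to show the effect. The second function uses the standard algebraic identity $1 - (1 - R^{2})\frac{n-1}{n-p-1}$, which gives the same value as the ratio-of-variances form.

```python
def adjusted_r2_from_ss(ss_res, ss_total, n, p):
    """Adjusted R^2 as a ratio of variances: 1 - (SS_res/(n-p-1)) / (SS_total/(n-1))."""
    return 1 - (ss_res / (n - p - 1)) / (ss_total / (n - 1))


def adjusted_r2_from_r2(r2, n, p):
    """The same quantity written in terms of the ordinary R^2."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


# Hypothetical numbers, chosen only for illustration (not taken from the page).
ss_res, ss_total, n = 20.0, 100.0, 30
r2 = 1 - ss_res / ss_total

# Hold the fit fixed and raise p: the degrees-of-freedom penalty alone
# pulls the adjusted value down, while R^2 itself does not change.
adjusted = [adjusted_r2_from_ss(ss_res, ss_total, n, p) for p in range(1, 7)]
for p, adj in zip(range(1, 7), adjusted):
    print(f"p={p}  R^2={r2:.4f}  adjusted R^2={adj:.4f}")
```

In a real analysis SS_res also shrinks a little with every added predictor, so Adjusted R squared only falls when that reduction is too small to offset the lost degree of freedom.
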
- | If we take a look at the ANOVA result: | ||
- | ^ __ANOVA__ | ||
- | | Model | ||
- | | 1 | ||
- | | | Residual | ||
- | | | Total | ||
- | | a Predictors: (Constant), | ||
- | | b Dependent Variable: y ||||||| | ||
- | <WRAP clear /> | ||
- | |||
- | * ANOVA, F-test, $F=\frac{MS_{between}}{MS_{within}}$ | ||
- | * MS_between? | ||
- | * MS_within? | ||
- | * MS for residual | ||
- | * $s = \sqrt{s^2} = \sqrt{\frac{SS_{res}}{n-2}} $ | ||
- | * random difference (MS< | ||
- | * MS for regression . . . Obtained difference | ||
- | * apply the same procedure as above for the MS for residual.
- | * but this time the degrees of freedom are k-1 (number of variables - 1), which is 1.
- | * Then what does F value mean? | ||
- | |||
- | Then, we take another look at coefficients result: | ||
- | |||
- | ^ __example__ | ||
- | | Model | ||
- | | B | ||
- | | 1 | ||
- | | | ||
- | | a Dependent Variable: y | ||
- | <WRAP clear /> | ||
- | |||
- | * Why do we run a t-test on the slope of the X variable? A mathematical explanation follows.
- | * Sampling distribution of Beta (or b):
- | * $\sigma_{\beta_{1}} = \frac{\sigma}{\sqrt{SS_{xx}}}$ | ||
- | * estimation of $\sigma_{\beta_{1}}$ : substitute sigma with s | ||
- | * t-test | ||
- | * $t=\frac{\beta_{1} - \text{Hypothesized value of }\beta_{1}}{s_{\beta_{1}}}$ | ||
- | * The hypothesized value of beta is usually 0. Therefore the t value is
- | * $t=\frac{\beta_{1}}{s_{\beta_{1}}}$ | ||
- | * $s_{\beta} = \displaystyle\sqrt{\frac {MS_{E}}{SS_{X}}} = \displaystyle\frac{\sqrt{\frac{SSE}{n-2}}}{\sqrt{SS_{X}}} = \displaystyle\frac{\sqrt{\frac{\Sigma{(Y-\hat{Y})^2}}{n-2}}}{\sqrt{\Sigma{(X_{i}-\bar{X})^2}}} $
- | |||
- | ^ X ^ Y ^ $X-\bar{X}$ ^ $(X-\bar{X})^2$ ^ $(X-\bar{X})(Y-\bar{Y})$ ^ $\hat{Y}$ ^ $\hat{Y}-Y$ ^ $(\hat{Y}-Y)^2$ ^
- | | 1 | 1 | -2 | 4 | 2 | 0.6 | -0.4 | 0.16 | | ||
- | | 2 | 1 | -1 | 1 | 1 | 1.3 | 0.3 | 0.09 | | ||
- | | 3 | 2 | 0 | 0 | 0 | 2 | 0 | 0 | | ||
- | | 4 | 2 | 1 | 1 | 0 | 2.7 | 0.7 | 0.49 | | ||
- | | 5 | 4 | 2 | 4 | 4 | 3.4 | -0.6 | 0.36 | | ||
- | | $\bar{X}$ = 3 | 2 | | SS< | ||
- | |||
- | Regression formula: y< | ||
- | SSE = Sum of Square Error | ||
- | The standard error of the slope beta (b) is obtained as follows.
- | $$se_{\beta} = \frac {\sqrt{SSE/(n-2)}}{\sqrt{SS_{X}}} = \frac {\sqrt{1.1/3}}{\sqrt{10}} \approx 0.191485$$
- | And b = 0.7
- | Therefore t = b / se = 3.655631
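
The slope t-test worked through in the table above can be checked with a short Python sketch. It assumes the five (X, Y) pairs from the table and the simple-regression formulas $b = SP / SS_{X}$ and $se_{b} = \sqrt{SSE/(n-2)} / \sqrt{SS_{X}}$; the variable names are illustrative only, and the script should reproduce b = 0.7 and t of about 3.6556.

```python
from math import sqrt

# Data from the worked table above.
X = [1, 2, 3, 4, 5]
Y = [1, 1, 2, 2, 4]
n = len(X)

x_bar = sum(X) / n
y_bar = sum(Y) / n

# Sum of squares of X and sum of cross products of X and Y.
ss_x = sum((x - x_bar) ** 2 for x in X)
sp = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))

# Least-squares slope and intercept.
b = sp / ss_x
a = y_bar - b * x_bar

# Sum of squared errors around the fitted line, then the standard error
# of the slope with n - 2 degrees of freedom.
sse = sum((y - (a + b * x)) ** 2 for x, y in zip(X, Y))
se_b = sqrt(sse / (n - 2)) / sqrt(ss_x)

# t statistic for H0: slope = 0.
t = b / se_b

print(f"b={b:.1f}  SSE={sse:.2f}  se={se_b:.6f}  t={t:.6f}")
```
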
adjusted_r_squared.1462920453.txt.gz · Last modified: 2016/05/11 07:17 by hkimscil