Differences

This shows you the differences between two versions of the page.

--- partial_and_semipartial_correlation [2024/10/16 08:04] – [e.g. Using ppcor.test with 4 var] hkimscil
+++ partial_and_semipartial_correlation [2025/06/04 08:37] (current) – [Errors in regression results when X1 and X2 are highly correlated] hkimscil
@@ Line 442: / Line 442: @@
 library(ppcor)
-reg.f.sh <- lm(FGPA ~ SATV + HSGPA)   # second regression
+reg.g.sh <- lm(GREV ~ SATV + HSGPA)
-res.f <- resid(reg.f.sh)     # second set of residuals - FGPA free of SATV and HSGPA
+res.g.sh <- resid(reg.g.sh)
-reg.s.fh <- lm(SATV ~ FGPA + HSGPA)
-res.s <- resid(reg.s.fh)
-reg.h.sf <- lm(HSGPA ~ FGPA + SATV)
-res.h <- resid(reg.h.sf)
-reg.all <- lm(GREV ~ HSGPA + FGPA + SATV)
-reg.1 <- lm(GREV ~ res.f)
-reg.2 <- lm(GREV ~ res.s)
-reg.3 <- lm(GREV ~ res.h)
-summary(reg.all)
-summary(reg.1)
-summary(reg.2)
-summary(reg.3)
-reg.1$coefficient[2]
-reg.2$coefficient[2]
-reg.3$coefficient[2]
-spr.y.f <- spcor.test(GREV, FGPA, scholar[,c("SATV", "HSGPA")])
-spr.y.s <- spcor.test(GREV, SATV, scholar[,c("HSGPA", "FGPA")])
-spr.y.h <- spcor.test(GREV, HSGPA, scholar[,c("SATV", "FGPA")])
-spr.y.f$estimate
-spr.y.s$estimate
-spr.y.h$estimate
-spr.y.f$estimate^2
-spr.y.s$estimate^2
-spr.y.h$estimate^2
-summary(reg.1)$r.square
-summary(reg.2)$r.square
-summary(reg.3)$r.square
-ca <- summary(reg.1)$r.square +
-  summary(reg.2)$r.square +
-  summary(reg.3)$r.square
-# so common explanation area should be
-summary(reg.all)$r.square - ca
-rm(list=ls())
-library(ggplot2)
-library(dplyr)
-library(tidyr)
-library(faux)
-set.seed(101)
-scholar <- rnorm_multi(n = 50,
-                       mu = c(3.12, 3.3, 540, 650),
-                       sd = c(.25, .34, 12, 13),
-                       r = c(0.15, 0.44, 0.47, 0.55, 0.45, 0.88),
-                       varnames = c("HSGPA", "FGPA", "SATV", "GREV"),
-                       empirical = FALSE)
-attach(scholar)
-# library(psych)
-describe(scholar) # provides descriptive information about each variable
-corrs <- cor(scholar) # find the correlations and set them into an object called 'corrs'
-corrs                 # print corrs
-pairs(scholar)        # pairwise scatterplots
+reg.g.fh <- lm(GREV ~ FGPA + HSGPA)
+res.g.fh <- resid(reg.g.fh)
-# install.packages("ppcor")
+reg.g.sf <- lm(GREV ~ SATV + FGPA)
-library(ppcor)
+res.g.sf <- resid(reg.g.sf)
 reg.f.sh <- lm(FGPA ~ SATV + HSGPA)   # second regression
@@ Line 530: / Line 469: @@
 summary(reg.2)
 summary(reg.3)
+reg.1a <- lm(res.g.sh~res.f)
+reg.2a <- lm(res.g.fh~res.s)
+reg.3a <- lm(res.g.sf~res.h)
 reg.1$coefficient[2]
 reg.2$coefficient[2]
 reg.3$coefficient[2]
+reg.1a$coefficient[2]
+reg.2a$coefficient[2]
+reg.3a$coefficient[2]
 spr.y.f <- spcor.test(GREV, FGPA, scholar[,c("SATV", "HSGPA")])
@@ Line 556: / Line 503: @@
 # so common explanation area should be
 summary(reg.all)$r.square - ca
 </code>
@@ Line 746: / Line 694: @@
 >
 </code>
+----
 {{:pasted:20241016-080226.png}}
 In the multiple regression analysis, the coefficients of the independent variables are, respectively:
-  * HSGPA         -25.475
+  * HSGPA         8.3214
-  * FGPA          17.679
+  * FGPA          1.3994
-  * SATV           0.131
+  * SATV          0.8143
 A t-test was run on each of these slopes to check whether the explanatory power of HSGPA and FGPA is significant. The R<sup>2</sup> value of this model was
-  * 0.672 (67.2%).
+  * 0.799.
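 These coefficients, however, turn out to be the same as the coefficients obtained by regressing the dependent variable on the unique explanatory part of each independent variable (the part obtained with spcor.test(GREV, x1, remaining variables controlled)). That is, <fc #ff0000>the b coefficients of the independent variables in a multiple regression are the same as the result of extracting each variable's unique explanatory part (spr) and regressing y (GREV) on it</fc>.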
-<code>
+reg.all
-> reg.1$coefficient[2]
+{{:pasted:20250604-082250.png?600}}
-res.f
+reg.1
-.68
+{{:pasted:20250604-082519.png?600}}
-> reg.2$coefficient[2]
+reg.2
- res.s
+{{:pasted:20250604-082635.png?600}}
-.1305
+reg.3
-> reg.3$coefficient[2]
+{{:pasted:20250604-082740.png?600}}
-res.h
+In addition, the portion that the three independent variables explain in common is
-.97
+  * 0.39
->
+as the results show.
-</code>
 ====== e.g., Explanatory power of each independent variable when the IVs are independent of one another ======
 In this example, the two IVs are orthogonal to each other (not correlated with each other). Hence, regressing res.y.x2 on x1 would not cause any problem.
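A minimal sketch of the orthogonal case, using simulated data rather than the page's own dataset. The names `x1`, `x2`, `y`, the model objects, and all numeric values are assumptions made for illustration. The sketch forces `x2` to be exactly uncorrelated with `x1` in the sample, then compares the slope of `x1` in three regressions (multiple, simple, and `res.y.x2` on `x1`) and checks whether the two separate R<sup>2</sup> values add up to the full-model R<sup>2</sup>.

```r
# Simulated data (assumed values, not the page's data)
set.seed(1)
n  <- 100
x1 <- rnorm(n)
# Residualize a second random variable on x1 so that cor(x1, x2) is 0 in this sample
x2 <- unname(resid(lm(rnorm(n) ~ x1)))
y  <- 2 + 1.5 * x1 - 0.8 * x2 + rnorm(n)

m.all <- lm(y ~ x1 + x2)   # multiple regression
m.1   <- lm(y ~ x1)        # simple regression on x1 only
m.2   <- lm(y ~ x2)        # simple regression on x2 only

# Part of y that x2 does not explain, regressed on x1
res.y.x2 <- resid(m.2)
m.1a     <- lm(res.y.x2 ~ x1)

# Slope of x1 from the three regressions
b <- c(multiple = unname(coef(m.all)["x1"]),
       simple   = unname(coef(m.1)["x1"]),
       residual = unname(coef(m.1a)["x1"]))
print(b)

# With orthogonal IVs there is no shared area, so the R^2 values are additive
r2.sum <- summary(m.1)$r.squared + summary(m.2)$r.squared
r2.all <- summary(m.all)$r.squared
print(c(sum = r2.sum, full = r2.all))
```

Because the IVs share no variance here, the three slopes coincide and nothing is left over as a common explanation area. With correlated IVs, as in the `scholar` example, neither equality is expected to hold.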
@@ Line 933: / Line 878: @@
 m <- lm(weights ~ LSS + RSS)
-## F-value is very small, but neither LSS or RSS are significant
+## The F-value is very large and significant,
+## but neither LSS nor RSS is significant on its own t-test
 summary(m)
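A hedged sketch of how a significant overall F-test can coexist with weak individual t-tests when two predictors are almost identical. The page's own data for `LSS`, `RSS`, and `weights` are not shown in this diff, so the values below are simulated assumptions, and `p.F` and `vif.LSS` are names introduced only for this illustration. Whether each t-test comes out non-significant depends on the simulated draw, so no output is asserted here.

```r
# Simulated stand-ins for the page's variables (assumed values)
set.seed(2)
n       <- 50
LSS     <- rnorm(n, mean = 26, sd = 1)
RSS     <- LSS + rnorm(n, sd = 0.05)   # almost a copy of LSS
weights <- 5 * LSS + rnorm(n, sd = 3)

m <- lm(weights ~ LSS + RSS)

# Overall F-test: do LSS and RSS together explain weights?
f   <- summary(m)$fstatistic
p.F <- pf(f[["value"]], f[["numdf"]], f[["dendf"]], lower.tail = FALSE)
print(p.F)

# Individual t-tests: the standard errors are inflated because
# LSS and RSS carry nearly the same information
print(coef(summary(m)))

# Variance inflation factor of LSS, computed by hand as 1 / (1 - R^2)
vif.LSS <- 1 / (1 - summary(lm(LSS ~ RSS))$r.squared)
print(vif.LSS)
```

The hand-computed variance inflation factor is one way to quantify the overlap: the closer the R<sup>2</sup> between the two predictors is to 1, the larger the standard errors of their individual slopes become.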