partial_and_semipartial_correlation

Differences

This shows you the differences between two versions of the page.

partial_and_semipartial_correlation [2024/10/15 15:52] hkimscilpartial_and_semipartial_correlation [2024/10/17 10:28] (current) – [e.g. Using ppcor.test with 4 var] hkimscil
Line 422: Line 422:
 library(faux) library(faux)
  
-set.seed(1011)+set.seed(101)
 scholar <- rnorm_multi(n = 50,  scholar <- rnorm_multi(n = 50, 
-                   mu = c(3.12, 3.3, 540, 650), +                       mu = c(3.12, 3.3, 540, 650), 
-                   sd = c(.25, .34, 12, 13), +                       sd = c(.25, .34, 12, 13), 
-                   r = c(0.21, 0.24, 0.5, 0.24, 0.6, 0.48),  +                       r = c(0.15, 0.44, 0.47, 0.55, 0.45, 0.88),  
-                   varnames = c("HSGPA", "FGPA", "SATV", "GREV"), +                       varnames = c("HSGPA", "FGPA", "SATV", "GREV"), 
-                   empirical = FALSE)+                       empirical = FALSE)
 attach(scholar) attach(scholar)
  
Line 441: Line 441:
 # install.packages("ppcor") # install.packages("ppcor")
 library(ppcor) library(ppcor)
 +
 +reg.g.sh <- lm(GREV ~ SATV + HSGPA)
 +res.g.sh <- resid(reg.g.sh)
 +
 +reg.g.fh <- lm(GREV ~ FGPA + HSGPA)
 +res.g.fh <- resid(reg.g.fh)
 +
 +reg.g.sf <- lm(GREV ~ SATV + FGPA)
 +res.g.sf <- resid(reg.g.sf)
  
 reg.f.sh <- lm(FGPA ~ SATV + HSGPA)   # second regression reg.f.sh <- lm(FGPA ~ SATV + HSGPA)   # second regression
Line 460: Line 469:
 summary(reg.2) summary(reg.2)
 summary(reg.3) summary(reg.3)
 +
 +reg.1a <- lm(res.g.sh~res.f)
 +reg.2a <- lm(res.g.fh~res.s)
 +reg.3a <- lm(res.g.sf~res.h)
 +
 +reg.1$coefficient[2]
 +reg.2$coefficient[2]
 +reg.3$coefficient[2]
 +
 +reg.1a$coefficient[2]
 +reg.2a$coefficient[2]
 +reg.3a$coefficient[2]
  
 spr.y.f <- spcor.test(GREV, FGPA, scholar[,c("SATV", "HSGPA")]) spr.y.f <- spcor.test(GREV, FGPA, scholar[,c("SATV", "HSGPA")])
 spr.y.s <- spcor.test(GREV, SATV, scholar[,c("HSGPA", "FGPA")]) spr.y.s <- spcor.test(GREV, SATV, scholar[,c("HSGPA", "FGPA")])
 spr.y.h <- spcor.test(GREV, HSGPA, scholar[,c("SATV", "FGPA")]) spr.y.h <- spcor.test(GREV, HSGPA, scholar[,c("SATV", "FGPA")])
 +
 +spr.y.f$estimate
 +spr.y.s$estimate
 +spr.y.h$estimate
  
 spr.y.f$estimate^2 spr.y.f$estimate^2
Line 472: Line 497:
 summary(reg.2)$r.square summary(reg.2)$r.square
 summary(reg.3)$r.square summary(reg.3)$r.square
 +
 +ca <- summary(reg.1)$r.square + 
 +  summary(reg.2)$r.square + 
 +  summary(reg.3)$r.square
 +# so common explanation area should be
 +summary(reg.all)$r.square - ca
 +
 +rm(list=ls())
 +
 +library(ggplot2)
 +library(dplyr)
 +library(tidyr)
 +library(faux)
 +
 +set.seed(101)
 +scholar <- rnorm_multi(n = 50, 
 +                       mu = c(3.12, 3.3, 540, 650),
 +                       sd = c(.25, .34, 12, 13),
 +                       r = c(0.15, 0.44, 0.47, 0.55, 0.45, 0.88), 
 +                       varnames = c("HSGPA", "FGPA", "SATV", "GREV"),
 +                       empirical = FALSE)
 +attach(scholar)
 +
 +# library(psych)
 +describe(scholar) # provides descriptive information about each variable
 +
 +corrs <- cor(scholar) # find the correlations and set them into an object called 'corrs'
 +corrs                 # print corrs
 +
 +pairs(scholar)        # pairwise scatterplots
 +
 +# install.packages("ppcor")
 +library(ppcor)
 +
 +reg.f.sh <- lm(FGPA ~ SATV + HSGPA)   # second regression
 +res.f <- resid(reg.f.sh)     # second set of residuals - FGPA free of SATV and HSGPA
 +
 +reg.s.fh <- lm(SATV ~ FGPA + HSGPA)   
 +res.s <- resid(reg.s.fh)    
 +
 +reg.h.sf <- lm(HSGPA ~ FGPA + SATV)   
 +res.h <- resid(reg.h.sf)    
 +
 +reg.all <- lm(GREV ~ HSGPA + FGPA + SATV)
 +reg.1 <- lm(GREV ~ res.f)
 +reg.2 <- lm(GREV ~ res.s)
 +reg.3 <- lm(GREV ~ res.h)
 +
 +summary(reg.all)
 +summary(reg.1)
 +summary(reg.2)
 +summary(reg.3)
 +
 +reg.1$coefficient[2]
 +reg.2$coefficient[2]
 +reg.3$coefficient[2]
 +
 +spr.y.f <- spcor.test(GREV, FGPA, scholar[,c("SATV", "HSGPA")])
 +spr.y.s <- spcor.test(GREV, SATV, scholar[,c("HSGPA", "FGPA")])
 +spr.y.h <- spcor.test(GREV, HSGPA, scholar[,c("SATV", "FGPA")])
 +
 +spr.y.f$estimate
 +spr.y.s$estimate
 +spr.y.h$estimate
 +
 +spr.y.f$estimate^2
 +spr.y.s$estimate^2
 +spr.y.h$estimate^2
 +
 +summary(reg.1)$r.square
 +summary(reg.2)$r.square
 +summary(reg.3)$r.square
 +
 +ca <- summary(reg.1)$r.square + 
 +  summary(reg.2)$r.square + 
 +  summary(reg.3)$r.square
 +# so common explanation area should be
 +summary(reg.all)$r.square - ca
 </code> </code>
  
 <code> <code>
 +
 > rm(list=ls()) > rm(list=ls())
  
Line 482: Line 584:
 > library(faux) > library(faux)
  
-> set.seed(1011)+> set.seed(101)
 > scholar <- rnorm_multi(n = 50,  > scholar <- rnorm_multi(n = 50, 
-                   mu = c(3.12, 3.3, 540, 650), +                       mu = c(3.12, 3.3, 540, 650), 
-                   sd = c(.25, .34, 12, 13), +                       sd = c(.25, .34, 12, 13), 
-                   r = c(0.21, 0.24, 0.5, 0.24, 0.6, 0.48),  +                       r = c(0.15, 0.44, 0.47, 0.55, 0.45, 0.88),  
-                   varnames = c("HSGPA", "FGPA", "SATV", "GREV"), +                       varnames = c("HSGPA", "FGPA", "SATV", "GREV"), 
-                   empirical = FALSE)+                       empirical = FALSE)
 > attach(scholar) > attach(scholar)
 The following objects are masked from scholar (pos = 3): The following objects are masked from scholar (pos = 3):
Line 494: Line 596:
     FGPA, GREV, HSGPA, SATV     FGPA, GREV, HSGPA, SATV
  
 +
 > # library(psych) > # library(psych)
 > describe(scholar) # provides descriptive information about each variable > describe(scholar) # provides descriptive information about each variable
       vars  n   mean    sd median trimmed   mad    min    max range  skew       vars  n   mean    sd median trimmed   mad    min    max range  skew
-HSGPA    1 50   3.13  0.29   3.14    3.13  0.30   2.39   3.67  1.29 -0.25 +HSGPA    1 50   3.13  0.24   3.11    3.13  0.16   2.35   3.62  1.26 -0.42 
-FGPA     2 50   3.27  0.40   3.27    3.25  0.42   2.50   4.24  1.74  0.43 +FGPA     2 50   3.34  0.35   3.32    3.33  0.33   2.50   4.19  1.68  0.27 
-SATV     3 50 539.52 10.92 539.54  539.51 11.69 511.15 565.69 54.54 -0.05 +SATV     3 50 541.28 11.43 538.45  540.50 10.85 523.74 567.97 44.24  0.58 
-GREV     4 50 648.41 12.96 649.14  648.14 13.47 622.89 682.17 59.28  0.19+GREV     4 50 651.72 11.90 649.70  651.29 10.55 629.89 678.33 48.45  0.35
       kurtosis   se       kurtosis   se
-HSGPA    -0.22 0.04 +HSGPA     1.21 0.03 
-FGPA     -0.18 0.06 +FGPA     -0.01 0.05 
-SATV     -0.25 1.54 +SATV     -0.60 1.62 
-GREV     -0.33 1.83+GREV     -0.54 1.68
  
 > corrs <- cor(scholar) # find the correlations and set them into an object called 'corrs' > corrs <- cor(scholar) # find the correlations and set them into an object called 'corrs'
 > corrs                 # print corrs > corrs                 # print corrs
        HSGPA   FGPA   SATV   GREV        HSGPA   FGPA   SATV   GREV
-HSGPA 1.0000 0.3965 0.2047 0.6175 +HSGPA 1.0000 0.3404 0.4627 0.5406 
-FGPA  0.3965 1.0000 0.2894 0.7300 +FGPA  0.3404 1.0000 0.5266 0.5096 
-SATV  0.2047 0.2894 1.0000 0.3461 +SATV  0.4627 0.5266 1.0000 0.8802 
-GREV  0.6175 0.7300 0.3461 1.0000+GREV  0.5406 0.5096 0.8802 1.0000
  
 > pairs(scholar)        # pairwise scatterplots > pairs(scholar)        # pairwise scatterplots
Line 541: Line 644:
 Residuals: Residuals:
     Min      1Q  Median      3Q     Max      Min      1Q  Median      3Q     Max 
--15.939  -4.001   0.451   4.301  16.377 +-13.541  -3.441   0.148   4.823   7.796 
  
 Coefficients: Coefficients:
             Estimate Std. Error t value Pr(>|t|)                 Estimate Std. Error t value Pr(>|t|)    
-(Intercept)  466.989     54.659    8.54  4.7e-11 *** +(Intercept) 180.2560    40.3988    4.46  5.2e-05 *** 
-HSGPA         16.971      4.160    4.08  0.00018 ***+HSGPA         8.3214     3.8050    2.19    0.034 *  
-FGPA          17.679      3.049    5.80  5.8e-07 *** +FGPA          1.3994     2.6311    0.53    0.597     
-SATV           0.131      0.105    1.24  0.22123    +SATV          0.8143     0.0867    9.40  2.8e-12 ***
 --- ---
 Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
  
-Residual standard error: 7.66 on 46 degrees of freedom +Residual standard error: 5.51 on 46 degrees of freedom 
-Multiple R-squared:  0.672, Adjusted R-squared:  0.65  +Multiple R-squared:  0.799, Adjusted R-squared:  0.786  
-F-statistic: 31.4 on 3 and 46 DF,  p-value: 3.43e-11+F-statistic: 60.on 3 and 46 DF,  p-value: 4.84e-16
  
 > summary(reg.1) > summary(reg.1)
Line 562: Line 665:
  
 Residuals: Residuals:
-    Min      1Q  Median      3Q     Max  +   Min     1Q Median     3Q    Max  
--24.346  -5.405  -0.617   6.863  24.967 +-21.76  -8.65  -2.08   7.83  26.10 
  
 Coefficients: Coefficients:
             Estimate Std. Error t value Pr(>|t|)                 Estimate Std. Error t value Pr(>|t|)    
-(Intercept)   648.41       1.61  401.76  < 2e-16 *** +(Intercept)   651.72       1.70  383.59   <2e-16 *** 
-res.f          17.68       4.54    3.89  0.00031 ***+res.f           1.40       5.74    0.24     0.81    
 --- ---
 Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
  
-Residual standard error: 11.4 on 48 degrees of freedom +Residual standard error: 12 on 48 degrees of freedom 
-Multiple R-squared:  0.24, Adjusted R-squared:  0.224  +Multiple R-squared:  0.00124, Adjusted R-squared:  -0.0196  
-F-statistic: 15.on 1 and 48 DF,  p-value: 0.000305+F-statistic: 0.0595 on 1 and 48 DF,  p-value: 0.808
  
 > summary(reg.2) > summary(reg.2)
Line 583: Line 686:
 Residuals: Residuals:
    Min     1Q Median     3Q    Max     Min     1Q Median     3Q    Max 
--24.71 -10.49   1.15   7.99  34.24 +-22.54  -4.94  -1.24   6.08  20.35 
  
 Coefficients: Coefficients:
             Estimate Std. Error t value Pr(>|t|)                 Estimate Std. Error t value Pr(>|t|)    
-(Intercept)  648.407      1.841  352.20   <2e-16 *** +(Intercept)  651.715      1.332   489.4  < 2e-16 *** 
-res.s          0.131      0.179    0.73     0.47    +res.s          0.814      0.148     5.5  1.4e-06 ***
 --- ---
 Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
  
-Residual standard error: 13 on 48 degrees of freedom +Residual standard error: 9.42 on 48 degrees of freedom 
-Multiple R-squared:  0.011, Adjusted R-squared:  -0.00963  +Multiple R-squared:  0.386, Adjusted R-squared:  0.374  
-F-statistic: 0.533 on 1 and 48 DF,  p-value: 0.469+F-statistic: 30.on 1 and 48 DF,  p-value: 1.45e-06
  
 > summary(reg.3) > summary(reg.3)
Line 602: Line 705:
  
 Residuals: Residuals:
-    Min      1Q  Median      3Q     Max  +   Min     1Q Median     3Q    Max  
--24.918  -7.537   0.222   6.491  28.827 +-22.71  -9.32  -1.30   7.92  26.43 
  
 Coefficients: Coefficients:
             Estimate Std. Error t value Pr(>|t|)                 Estimate Std. Error t value Pr(>|t|)    
-(Intercept)   648.41       1.74  373.13   <2e-16 *** +(Intercept)   651.72       1.68  387.43   <2e-16 *** 
-res.h          16.97       6.67    2.54    0.014 *  +res.h           8.32       8.21    1.01     0.32    
 --- ---
 Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
  
-Residual standard error: 12.on 48 degrees of freedom +Residual standard error: 11.on 48 degrees of freedom 
-Multiple R-squared:  0.119, Adjusted R-squared:   0. +Multiple R-squared:  0.0209, Adjusted R-squared:  0.000538  
-F-statistic: 6.47 on 1 and 48 DF,  p-value: 0.0142+F-statistic: 1.03 on 1 and 48 DF,  p-value: 0.316
  
 +
 +> reg.1$coefficient[2]
 +res.f 
 +1.399 
 +> reg.2$coefficient[2]
 + res.s 
 +0.8143 
 +> reg.3$coefficient[2]
 +res.h 
 +8.321 
  
 > spr.y.f <- spcor.test(GREV, FGPA, scholar[,c("SATV", "HSGPA")]) > spr.y.f <- spcor.test(GREV, FGPA, scholar[,c("SATV", "HSGPA")])
 > spr.y.s <- spcor.test(GREV, SATV, scholar[,c("HSGPA", "FGPA")]) > spr.y.s <- spcor.test(GREV, SATV, scholar[,c("HSGPA", "FGPA")])
 > spr.y.h <- spcor.test(GREV, HSGPA, scholar[,c("SATV", "FGPA")]) > spr.y.h <- spcor.test(GREV, HSGPA, scholar[,c("SATV", "FGPA")])
 +
 +> spr.y.f$estimate
 +[1] 0.03519
 +> spr.y.s$estimate
 +[1] 0.6217
 +> spr.y.h$estimate
 +[1] 0.1447
  
 > spr.y.f$estimate^2 > spr.y.f$estimate^2
-[1] 0.24+[1] 0.001238
 > spr.y.s$estimate^2 > spr.y.s$estimate^2
-[1] 0.01098+[1] 0.3865
 > spr.y.h$estimate^2 > spr.y.h$estimate^2
-[1] 0.1188+[1] 0.02094
  
 > summary(reg.1)$r.square > summary(reg.1)$r.square
-[1] 0.24+[1] 0.001238
 > summary(reg.2)$r.square > summary(reg.2)$r.square
-[1] 0.01098+[1] 0.3865
 > summary(reg.3)$r.square > summary(reg.3)$r.square
-[1] 0.1188+[1] 0.02094 
 +>  
 +> ca <- summary(reg.1)$r.square +  
 ++   summary(reg.2)$r.square +  
 ++   summary(reg.3)$r.square 
 +> # so common explanation area should be 
 +> summary(reg.all)$r.square - ca 
 +[1] 0.39
  
 </code> </code>
-{{:pasted:20241015-152559.png}}+{{:pasted:20241016-080226.png}}
  
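 In the multiple regression analysis, the coefficients of the independent variables are, respectively,  In the multiple regression analysis, the coefficients of the independent variables are, respectively, 
-  * HSGPA         16.971  +  * HSGPA         8.3214  
-  * FGPA          17.679 +  * FGPA          1.3994 
-  * SATV           0.131+  * SATV          0.8143
 A t-test was run on each of these slopes to check whether the explanatory power of HSGPA and FGPA is significant. The R<sup>2</sup> value in this case was  A t-test was run on each of these slopes to check whether the explanatory power of HSGPA and FGPA is significant. The R<sup>2</sup> value in this case was 
-  * 0.672 (67.2%).  +  * 0.799.  
-However, we can see that these coefficients are the same as the coefficients obtained by taking the unique explanatory part of each independent variable (the part obtained with spcor.test(GREV, x1, the remaining variables as controls)) and regressing the dependent variable on it.  +However, we can see that these coefficients are the same as the coefficients obtained by taking the unique explanatory part of each independent variable (the part obtained with spcor.test(GREV, x1, the remaining variables as controls)) and regressing the dependent variable on it. That is, <fc #ff0000>the b coefficients of the independent variables in a multiple regression are the same as the result of extracting each unique explanatory part (spr) and regressing y (GREV) on it</fc>. 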
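The equality described above can be checked on any data set. The following is a minimal sketch in base R, assuming simulated data: the variables ''x1'', ''x2'', ''x3'', ''y'' and the object names are illustrative and are not part of the page's ''scholar'' example. It residualizes one predictor on the others and compares the resulting slope with the multiple regression coefficient.

```r
# Simulated data; x1, x2, x3, y are hypothetical names used only for this sketch.
set.seed(1)
n  <- 200
x1 <- rnorm(n)
x2 <- 0.5 * x1 + rnorm(n)
x3 <- 0.3 * x1 + 0.4 * x2 + rnorm(n)
y  <- 1 + 2 * x1 - 1 * x2 + 0.5 * x3 + rnorm(n)

# Full multiple regression
fit_all <- lm(y ~ x1 + x2 + x3)

# Unique part of x1: what is left of x1 after removing x2 and x3
res_x1 <- resid(lm(x1 ~ x2 + x3))

# Regress y on that unique part only
fit_x1 <- lm(y ~ res_x1)

b_full   <- unname(coef(fit_all)["x1"])
b_unique <- unname(coef(fit_x1)["res_x1"])

# The two slopes agree up to floating-point tolerance
all.equal(b_full, b_unique)

# The R-squared of the second regression is the squared semipartial correlation
sr2_x1 <- cor(y, res_x1)^2
all.equal(sr2_x1, summary(fit_x1)$r.squared)
```

The same check can be repeated for ''x2'' and ''x3'' by swapping the roles of the predictors.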
-<WRAP box> +
-<code> +
-> reg.1$coefficient[2] +
-res.f  +
-17.68  +
-> reg.2$coefficient[2] +
- res.s  +
-0.1305  +
-> reg.3$coefficient[2] +
-res.h  +
-16.97  +
->  +
-</code> +
-</WRAP> +
- +
-correlation values +
-<WRAP box> +
-Coefficients: +
-            Estimate Std. Error t value Pr(>|t|)     +
-(Intercept)  466.989     54.659    8.54  4.7e-11 *** +
-HSGPA         16.971      4.160    4.08  0.00018 *** +
-FGPA          17.679      3.049    5.80  5.8e-07 *** +
-SATV           0.131      0.105    1.24  0.22123     +
---- +
-Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 +
- +
-Residual standard error: 7.66 on 46 degrees of freedom +
-Multiple R-squared:  0.672, Adjusted R-squared:  0.65  +
-F-statistic: 31.4 on 3 and 46 DF,  p-value: 3.43e-11 +
-</WRAP>  +
- +
-From the results above, we can see that <fc #ff0000>the b coefficients of a multiple regression are the same as the result of extracting each unique explanatory part (spr) and regressing y (GREV) on it</fc>.  +
  
 +We can also see that the part explained in common by the three independent variables is 
 +  * 0.39. 
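The arithmetic behind the common part (the full-model R<sup>2</sup> minus the sum of the squared semipartial correlations) can be sketched as follows. This is only an illustration on simulated data, with hypothetical names (''dat'', ''sr2'', ''shared''). It also uses a different but equivalent route to each squared semipartial correlation, namely the drop in R<sup>2</sup> when one predictor is removed from the full model, rather than the residual regressions used on this page.

```r
# Simulated data; all names here are hypothetical and used only for this sketch.
set.seed(2)
n  <- 200
x1 <- rnorm(n)
x2 <- 0.6 * x1 + rnorm(n)
x3 <- 0.6 * x1 + 0.3 * x2 + rnorm(n)
y  <- x1 + x2 + x3 + rnorm(n)
dat <- data.frame(y, x1, x2, x3)

preds   <- c("x1", "x2", "x3")
r2_full <- summary(lm(y ~ x1 + x2 + x3, data = dat))$r.squared

# Squared semipartial correlation of each predictor:
# the drop in R-squared when that predictor is left out of the full model
sr2 <- sapply(preds, function(v) {
  others  <- setdiff(preds, v)
  reduced <- lm(reformulate(others, response = "y"), data = dat)
  r2_full - summary(reduced)$r.squared
})

unique_total <- sum(sr2)               # variance explained uniquely by single predictors
shared       <- r2_full - unique_total # remainder attributed to overlap among predictors

sr2
shared
```

With correlated predictors the remainder is usually positive, but it is not guaranteed to be: under suppression the sum of the unique parts can exceed the full-model R<sup>2</sup>.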
 ====== e.g., the explanatory power of each independent variable when the independent variables are independent of one another ====== ====== e.g., the explanatory power of each independent variable when the independent variables are independent of one another ======
 In this example, the two IVs are orthogonal to each other (not correlated with each other). Hence, regressing res.y.x2 on x1 would not cause any problem.  In this example, the two IVs are orthogonal to each other (not correlated with each other). Hence, regressing res.y.x2 on x1 would not cause any problem. 
partial_and_semipartial_correlation.1728975163.txt.gz · Last modified: 2024/10/15 15:52 by hkimscil
