Mediation Analysis
Following the theory of planned behavior, a researcher builds the model below, collects data, and tests it. In particular, at this stage the researcher wants to separate the influence of Attitudes on Behavior into a part that is unique to Attitudes and a part that passes through Intention. This kind of statistical test is called mediation analysis.
The lavaan package (developed for path analysis and SEM) is commonly used to estimate the mediator's effect and the combined effect of the independent variables.
                Intention
               /         \
              a           b
             /             \
    Attitudes ---- c' ----> Behavior
In the diagram above, when we look at the influence of attitudes on behavior:
- When c' = 0: attitudes has no direct effect on behavior and gains its explanatory power only through the a and b paths. This is called complete mediation.
- When c' ≠ 0: part of the explanatory power runs through c' and part through the a and b paths, so this is called partial mediation.
In a mediation analysis, the effects (explanatory power) of the independent variables can also be divided into direct and indirect effects: a, b, and c' are direct effects, while the effect that travels through a and b is the indirect effect. There are several ways to quantify the size of the indirect effect; the most common are:
- taking the product of the a-path and b-path coefficients (a × b);
- taking the difference between the total and direct effects (c − c'), the so-called difference method, which is less commonly reported.
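In ordinary least-squares models the two quantities coincide: the product a·b is algebraically identical to the drop in the predictor's coefficient when the mediator is added to the model. A minimal sketch with simulated data (the variable names and coefficients are made up for illustration):

```r
set.seed(3)
n <- 500
x <- rnorm(n)                          # e.g., attitudes
m <- 0.6 * x + rnorm(n)                # mediator (true a = 0.6)
y <- 0.3 * x + 0.5 * m + rnorm(n)      # outcome (true c' = 0.3, b = 0.5)

c.total <- coef(lm(y ~ x))[["x"]]      # total effect (c)
cp      <- coef(lm(y ~ x + m))[["x"]]  # direct effect (c')
a       <- coef(lm(m ~ x))[["x"]]      # a path
b       <- coef(lm(y ~ x + m))[["m"]]  # b path

# product of coefficients vs. difference of coefficients
a * b
c.total - cp
all.equal(a * b, c.total - cp)         # identical in OLS
```

The identity total = direct + indirect holds exactly in linear regression fitted to the same sample, which is why the product and difference methods give the same point estimate here.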
When the indirect effect is measured as a × b, there are two families of tests for whether this product is significant:
- the Delta method and resampling methods.
- The former assumes that the sampling distribution of the indirect (product) coefficient is normal. Because this is rarely the case, the test is somewhat unrealistic, and it is inaccurate when the sample size is small.
- The latter, the resampling method, repeatedly re-draws samples, computes the indirect coefficient each time, and uses that empirical distribution to test the effect. Variants include:
- Percentile bootstrapping
- Bias-corrected bootstrapping
- Permutation method
- Monte Carlo method
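Both families of tests can be sketched by hand. The code below (simulated data; all names and coefficients are illustrative) computes the Sobel/Delta-method z for a·b and then a percentile bootstrap confidence interval:

```r
set.seed(1)
n <- 200
attitude  <- rnorm(n)
intention <- 0.5 * attitude + rnorm(n)
behavior  <- 0.1 * attitude + 0.4 * intention + rnorm(n)
d <- data.frame(attitude, intention, behavior)

fit.a <- lm(intention ~ attitude, data = d)             # a path
fit.b <- lm(behavior ~ attitude + intention, data = d)  # b and c' paths
a    <- coef(fit.a)[["attitude"]]
b    <- coef(fit.b)[["intention"]]
se.a <- coef(summary(fit.a))["attitude", "Std. Error"]
se.b <- coef(summary(fit.b))["intention", "Std. Error"]

# 1. Sobel test (Delta method): assumes a*b is normally distributed
z <- (a * b) / sqrt(b^2 * se.a^2 + a^2 * se.b^2)
p <- 2 * pnorm(abs(z), lower.tail = FALSE)

# 2. Percentile bootstrap: resample rows, re-estimate a*b each time
ab.boot <- replicate(1000, {
  db <- d[sample(n, replace = TRUE), ]
  coef(lm(intention ~ attitude, data = db))[["attitude"]] *
    coef(lm(behavior ~ attitude + intention, data = db))[["intention"]]
})
quantile(ab.boot, c(.025, .975))   # 95% percentile CI for a*b
```

If the percentile interval excludes 0, the indirect effect is judged significant; this is the same logic lavaan applies when `se="bootstrap"` with `boot.ci.type="perc"` is requested below.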
######################################################
## data file: PlannedBehavior.csv
######################################################
install.packages("readr")
library(readr)

df <- read.csv("http://commres.net/wiki/_media/r/plannedbehavior.csv")
head(df)
str(df)
# attitude
# norms
# control
# intention
# behavior

######################################################
# Mediation Analysis using lavaan
# in R (path analysis framework, SEM)
######################################################
# specifying the path analysis model
# with the lavaan package
install.packages("lavaan")
library(lavaan)

specmod <- "
  # path c (prime), identified by the label c
  behavior ~ c*attitude

  # path a
  intention ~ a*attitude

  # path b
  behavior ~ b*intention

  # indirect effect (a*b): Sobel test (Delta method)
  # obtained as a path x b path
  ab := a*b
"
# Fit/estimate the model
fitmod <- sem(specmod, data=df)
# summarize the model
summary(fitmod, fit.measures=TRUE, rsquare=TRUE)

##########################################
# bootstrapping instead of the Sobel test
##########################################
set.seed(101)
fitmod2 <- sem(specmod, data=df,
               se="bootstrap",
               bootstrap=100)   # bootstrap = 5000 is common
summary(fitmod2, fit.measures=TRUE, rsquare=TRUE)
parameterEstimates(fitmod2,
                   ci=TRUE, level=.95,
                   boot.ci.type="perc")
Output
> ######################################################
> ## data file: PlannedBehavior.csv
> ######################################################
> install.packages("readr")
Error in install.packages : Updating loaded packages
> library(readr)
> df <- read.csv("http://commres.net/wiki/_media/r/plannedbehavior.csv")
> head(df)
  attitude norms control intention behavior
1     2.31  2.31    2.03      2.50     2.62
2     4.66  4.01    3.63      3.99     3.64
3     3.85  3.56    4.20      4.35     3.83
4     4.24  2.25    2.84      1.51     2.25
5     2.91  3.31    2.40      1.45     2.00
6     2.99  2.51    2.95      2.59     2.20
> str(df)
'data.frame':	199 obs. of  5 variables:
 $ attitude : num  2.31 4.66 3.85 4.24 2.91 2.99 3.96 3.01 4.77 3.67 ...
 $ norms    : num  2.31 4.01 3.56 2.25 3.31 2.51 4.65 2.98 3.09 3.63 ...
 $ control  : num  2.03 3.63 4.2 2.84 2.4 2.95 3.77 1.9 3.83 5 ...
 $ intention: num  2.5 3.99 4.35 1.51 1.45 2.59 4.08 2.58 4.87 3.09 ...
 $ behavior : num  2.62 3.64 3.83 2.25 2 2.2 4.41 4.15 4.35 3.95 ...
> ######################################################
> # Mediation Analysis using lavaan
> # in R (path analysis framework, SEM)
> ######################################################
> install.packages("lavann")
Installing package into 'C:/Users/Hyo/Documents/R/win-library/4.1'
(as 'lib' is unspecified)
Warning in install.packages :
  package 'lavann' is not available for this version of R
A version of this package for your version of R might be available elsewhere,
see the ideas at
https://cran.r-project.org/doc/manuals/r-patched/R-admin.html#Installing-packages
> library(lavaan)
> specmod <- "
+   # path c (prime), identified by the label c
+   behavior ~ c*attitude
+
+   # path a
+   intention ~ a*attitude
+
+   # path b
+   behavior ~ b*intention
+
+   # indirect effect (a*b): Sobel test (Delta method)
+   # obtained as a path x b path
+   ab := a*b
+ "
> # Fit/estimate the model
> fitmod <- sem(specmod, data=df)
> # summarize the model
> summary(fitmod, fit.measures=TRUE, rsquare=T)
lavaan 0.6-9 ended normally after 11 iterations

  Estimator                                         ML
  Optimization method                           NLMINB
  Number of model parameters                         5
  Number of observations                           199

Model Test User Model:
  Test statistic                                 0.000
  Degrees of freedom                                 0

Model Test Baseline Model:
  Test statistic                               103.700
  Degrees of freedom                                 3
  P-value                                        0.000

User Model versus Baseline Model:
  Comparative Fit Index (CFI)                    1.000
  Tucker-Lewis Index (TLI)                       1.000

Loglikelihood and Information Criteria:
  Loglikelihood user model (H0)               -481.884
  Loglikelihood unrestricted model (H1)       -481.884
  Akaike (AIC)                                 973.767
  Bayesian (BIC)                               990.234
  Sample-size adjusted Bayesian (BIC)          974.393

Root Mean Square Error of Approximation:
  RMSEA                                          0.000
  90 Percent confidence interval - lower         0.000
  90 Percent confidence interval - upper         0.000
  P-value RMSEA <= 0.05                             NA

Standardized Root Mean Square Residual:
  SRMR                                           0.000

Parameter Estimates:
  Standard errors                             Standard
  Information                                 Expected
  Information saturated (h1) model          Structured

Regressions:
                   Estimate  Std.Err  z-value  P(>|z|)
  behavior ~
    attitude   (c)    0.029    0.071    0.412    0.680
  intention ~
    attitude   (a)    0.484    0.058    8.333    0.000
  behavior ~
    intention  (b)    0.438    0.075    5.832    0.000

Variances:
                   Estimate  Std.Err  z-value  P(>|z|)
   .behavior          0.698    0.070    9.975    0.000
   .intention         0.623    0.062    9.975    0.000

R-Square:
                   Estimate
    behavior          0.199
    intention         0.259

Defined Parameters:
                   Estimate  Std.Err  z-value  P(>|z|)
    ab                0.212    0.044    4.778    0.000

> ##########################################
> # bootstrapping instead of the Sobel test
> ##########################################
> set.seed(101)
> fitmod2 <- sem(specmod, data=df,
+                se="bootstrap",
+                bootstrap=100)
> # bootstrap = 5000 is common
> summary(fitmod2, fit.measures=TRUE, rsquare=TRUE)
lavaan 0.6-9 ended normally after 11 iterations

  Estimator                                         ML
  Optimization method                           NLMINB
  Number of model parameters                         5
  Number of observations                           199

Model Test User Model:
  Test statistic                                 0.000
  Degrees of freedom                                 0

Model Test Baseline Model:
  Test statistic                               103.700
  Degrees of freedom                                 3
  P-value                                        0.000

User Model versus Baseline Model:
  Comparative Fit Index (CFI)                    1.000
  Tucker-Lewis Index (TLI)                       1.000

Loglikelihood and Information Criteria:
  Loglikelihood user model (H0)               -481.884
  Loglikelihood unrestricted model (H1)       -481.884
  Akaike (AIC)                                 973.767
  Bayesian (BIC)                               990.234
  Sample-size adjusted Bayesian (BIC)          974.393

Root Mean Square Error of Approximation:
  RMSEA                                          0.000
  90 Percent confidence interval - lower         0.000
  90 Percent confidence interval - upper         0.000
  P-value RMSEA <= 0.05                             NA

Standardized Root Mean Square Residual:
  SRMR                                           0.000

Parameter Estimates:
  Standard errors                            Bootstrap
  Number of requested bootstrap draws              100
  Number of successful bootstrap draws             100

Regressions:
                   Estimate  Std.Err  z-value  P(>|z|)
  behavior ~
    attitude   (c)    0.029    0.066    0.446    0.655
  intention ~
    attitude   (a)    0.484    0.051    9.453    0.000
  behavior ~
    intention  (b)    0.438    0.075    5.843    0.000

Variances:
                   Estimate  Std.Err  z-value  P(>|z|)
   .behavior          0.698    0.061   11.464    0.000
   .intention         0.623    0.061   10.170    0.000

R-Square:
                   Estimate
    behavior          0.199
    intention         0.259

Defined Parameters:
                   Estimate  Std.Err  z-value  P(>|z|)
    ab                0.212    0.045    4.690    0.000

> parameterEstimates(fitmod2,
+                    ci=TRUE, level=.95,
+                    boot.ci.type="perc")
        lhs op       rhs label   est    se      z pvalue ci.lower ci.upper
1  behavior  ~  attitude     c 0.029 0.066  0.446  0.655   -0.085    0.182
2 intention  ~  attitude     a 0.484 0.051  9.453  0.000    0.376    0.588
3  behavior  ~ intention     b 0.438 0.075  5.843  0.000    0.294    0.612
4  behavior ~~  behavior       0.698 0.061 11.464  0.000    0.561    0.822
5 intention ~~ intention       0.623 0.061 10.170  0.000    0.492    0.724
6  attitude ~~  attitude       0.928 0.000     NA     NA    0.928    0.928
7        ab :=       a*b    ab 0.212 0.045  4.690  0.000    0.131    0.319
>
Question: what do the rsquare values above mean?
- First, looking at the specified model (specmod), you can see that three regressions were specified, as below.
# comments removed for brevity
specmod <- "
  behavior ~ c*attitude
  behavior ~ b*intention
  intention ~ a*attitude
  ab := a*b
"
- Picking out the dependent variables among these, there are two R-square values:
- the R-square from behavior ~ attitude + intention, and
- the R-square for intention from intention ~ attitude.
- In the "Poking the above" section below,
lm.ba.01 <- lm(behavior ~ attitude + intention, data=df)  # all
lm.ba.02 <- lm(behavior ~ intention, data=df)             # b path
lm.ba.03 <- lm(intention ~ attitude, data=df)             # a path
lm.ba.04 <- lm(attitude ~ intention, data=df)             # reverse a path
lm.ba.05 <- lm(behavior ~ attitude, data=df)              # c prime path
- these correspond to the regressions lm.ba.01 and lm.ba.03.
- The following commands print these values:
> summary(lm.ba.01)$r.square
[1] 0.1989125
> summary(lm.ba.03)$r.square
[1] 0.2586768
- These are the same values as the R-squares shown just above.
Question: why is the mediation effect the product of the two paths?
                Intention
               /         \
              a           b
             /             \
    Attitudes ---- c' ----> Behavior
Assume that a and b above are regression coefficients, with a = 2 and b = 1.5. Then,
- a = 2 means that intention increases by 2 whenever attitudes increases by one measurement unit.
- b = 1.5 means that behavior increases by 1.5 for each one-unit increase in intention. So each one-unit increase in attitudes raises intention by 2, and therefore behavior by 2 × 1.5 = 3. This 3 is what we use when speaking of the influence of the independent variable attitudes on behavior via intention. Hence the mediation effect can be thought of as ab = a × b.
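This unit logic can be checked with a small simulation in which the mediation is complete (c' = 0) and the true paths are a = 2 and b = 1.5 (all names and numbers here are illustrative, not from the PlannedBehavior data):

```r
set.seed(7)
n <- 1000
attitude  <- rnorm(n)
intention <- 2.0 * attitude + rnorm(n, sd = 0.1)   # a = 2
behavior  <- 1.5 * intention + rnorm(n, sd = 0.1)  # b = 1.5, c' = 0

a <- coef(lm(intention ~ attitude))[["attitude"]]
b <- coef(lm(behavior ~ attitude + intention))[["intention"]]
total <- coef(lm(behavior ~ attitude))[["attitude"]]

a * b   # indirect effect: close to 3
total   # total effect: also close to 3, because c' = 0
```

Because the direct path is zero here, the total effect of attitude on behavior is carried entirely by the product a × b, matching the 2 × 1.5 = 3 reasoning above.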
Poking the above
# poking around
# model:
#             a   Intention   b
#   Attitude ------- c ------- Behavior
#
lm.ba.01 <- lm(behavior ~ attitude + intention, data=df)  # all
lm.ba.02 <- lm(behavior ~ intention, data=df)             # b path
lm.ba.03 <- lm(intention ~ attitude, data=df)             # a path
lm.ba.04 <- lm(attitude ~ intention, data=df)             # reverse a path
lm.ba.05 <- lm(behavior ~ attitude, data=df)              # c prime path
From the models above, compare:
summary(lm.ba.05)
summary(lm.ba.01)
> summary(lm.ba.05)

Call:
lm(formula = behavior ~ attitude, data = df)

Residuals:
    Min      1Q  Median      3Q     Max
-1.9679 -0.6291  0.0882  0.6625  1.8128

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   2.3355     0.2221   10.51  < 2e-16 ***
attitude      0.2413     0.0669    3.61  0.00039 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.909 on 197 degrees of freedom
Multiple R-squared:  0.062,	Adjusted R-squared:  0.0572
F-statistic:   13 on 1 and 197 DF,  p-value: 0.000392

> summary(lm.ba.01)

Call:
lm(formula = behavior ~ attitude + intention, data = df)

Residuals:
    Min      1Q  Median      3Q     Max
-2.0192 -0.5728  0.0633  0.6373  1.7381

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   1.6962     0.2336    7.26  8.8e-12 ***
attitude      0.0294     0.0720    0.41     0.68
intention     0.4377     0.0756    5.79  2.8e-08 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.842 on 196 degrees of freedom
Multiple R-squared:  0.199,	Adjusted R-squared:  0.191
F-statistic: 24.3 on 2 and 196 DF,  p-value: 3.64e-10
Comparing the t-test results for attitude in these two outputs:
- in lm.ba.05, attitude significantly explains behavior;
- but in lm.ba.01, when attitude and intention are entered together,
- attitude's explanatory power disappears and only intention remains significant.
Note: the components stored in an lm regression result are listed below. Among them, residuals holds the residual values, i.e., the differences between the regression line's predictions and the actual observations.
> names(lm.ba.03)
 [1] "coefficients"  "residuals"     "effects"       "rank"
 [5] "fitted.values" "assign"        "qr"            "df.residual"
 [9] "xlevels"       "call"          "terms"         "model"
>
- fitted.values holds the regression line's predictions, so subtracting the mean of the Y variable (intention) from these predictions gives the explained (regression) part, i.e., the error overcome by the model. This is stored in reg.int below.
- res.int stores the residual values.
- The sums of the squares of these two are SS regression and SS residual, respectively.
- Therefore their sum equals each individual's intention score minus the mean. If you wanted to reconstruct the intention values exactly,
- you could use reg.int + res.int + mean(df$intention).
reg.int <- lm.ba.03$fitted.values - mean(df$intention)
res.int <- summary(lm.ba.03)$residuals
- We also know that SS reg + SS res = SS y = SS total = SS intention.
# just checking
sum(reg.int^2) + sum(res.int^2)
var(df$intention) * (length(df$intention) - 1)
Below, behavior is regressed on reg.int. This shows how the part of intention's SS explained by attitude (the regression part) in turn explains behavior.
# the part of intention contributed by attitude:
# does it explain behavior too?
lm.ba.021 <- lm(behavior ~ reg.int, data=df)
summary(lm.ba.021)
Conversely, res.int is what remains of intention after removing attitude's explanatory power; below we look at how it affects behavior. This shows pure intention, with attitude's influence not "mediated" but removed, influencing behavior.
# the pure intention part excluding
# what attitude contributes
lm.ba.022 <- lm(behavior ~ res.int, data=df)
summary(lm.ba.022)
- Also, as expected, if you add the residual and regression parts of intention back together and regress behavior on the sum, you get the same result as regressing behavior on intention itself (the b path).
int.all <- res.int + reg.int
lm.temp <- lm(behavior ~ int.all, data=df)
summary(lm.temp)
summary(lm.ba.02)
So,
- lm.ba.021 (behavior regressed on reg.int) shows the explanatory power of the part of intention that mediates attitude's influence, and
- lm.ba.022 (behavior regressed on res.int) shows the explanatory power of the part of intention that excludes attitude's influence.
Below, these are printed once more:
summary(lm.ba.021)
summary(lm.ba.022)
Alternatively, you can look at it as follows.
# The following sketches the path of influence
# attitude -> intention -> behavior.

# intention - behavior part
summary(lm.ba.02)$r.squared
# K - the part of intention explained by attitude (regression error)
summary(lm.ba.021)$r.squared
# J - the part attitude does not explain (residual error)
summary(lm.ba.022)$r.squared
# intention consists of K and J above; checking this:
summary(lm.ba.021)$r.squared + summary(lm.ba.022)$r.squared
Then what remains after removing the part of attitude's explanatory power that is mediated through intention? This is the same as asking how much the residual from lm.ba.04 (the reversed a path) influences behavior. Computing this:
# lm.ba.04 <- lm(attitude ~ intention, data=df)  # reverse a path
res.temp <- lm.ba.04$residuals
# res.temp is the part of attitude unexplained by (unrelated to) intention.
# Regressing behavior on it uses only the part of attitude's influence that
# does not pass through (is not mediated by) intention.
lm.temp <- lm(behavior ~ res.temp, data=df)
summary(lm.temp)
summary(lm.ba.01)
In the outputs above, you can see that the regression coefficient for attitude is identical (b = 0.02941).
> res.temp <- lm.ba.04$residuals
> lm.temp <- lm(behavior~res.temp, data=df)
> summary(lm.temp)

Call:
lm(formula = behavior ~ res.temp, data = df)

Residuals:
    Min      1Q  Median      3Q     Max
-2.1138 -0.6154  0.0725  0.6955  1.9098

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   3.1025     0.0665   46.66   <2e-16 ***
res.temp      0.0294     0.0802    0.37     0.71
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.938 on 197 degrees of freedom
Multiple R-squared:  0.000683,	Adjusted R-squared:  -0.00439
F-statistic: 0.135 on 1 and 197 DF,  p-value: 0.714

> summary(lm.ba.01)

Call:
lm(formula = behavior ~ attitude + intention, data = df)

Residuals:
    Min      1Q  Median      3Q     Max
-2.0192 -0.5728  0.0633  0.6373  1.7381

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   1.6962     0.2336    7.26  8.8e-12 ***
attitude      0.0294     0.0720    0.41     0.68
intention     0.4377     0.0756    5.79  2.8e-08 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.842 on 196 degrees of freedom
Multiple R-squared:  0.199,	Adjusted R-squared:  0.191
F-statistic: 24.3 on 2 and 196 DF,  p-value: 3.64e-10
Another modeling
A student, Seongjae, has a different idea. Below we implement it in the analysis.
library(lavaan)

specmod <- "
  # regressions
  behavior ~ c*attitude + m*control + n*norms
  intention ~ a*attitude + k*norms + j*control
  behavior ~ b*intention

  # indirect effects (Sobel test, Delta method)
  ab := a*b
  kb := k*b
  jb := j*b
"
# Fit/estimate the model
fitmod <- sem(specmod, data=df)
# summarize the model
summary(fitmod, fit.measures=TRUE, rsquare=TRUE)
Another modeling output
> library(lavaan)
> specmod <- "
+   # path c (prime)
+   behavior ~ c*attitude +
+              m*control +
+              n*norms
+
+   # path a
+   intention ~ a*attitude +
+               k*norms +
+               j*control
+
+   # path b
+   behavior ~ b*intention
+
+   # indirect effects: Sobel test (Delta method)
+   # obtained as a path x b path
+   ab := a*b
+   kb := k*b
+   jb := j*b
+
+   # attitude ~~ norms + control
+   # norms ~~ control
+ "
> # Fit/estimate the model
> fitmod <- sem(specmod, data=df)
> # summarize the model
> summary(fitmod, fit.measures=TRUE, rsquare=T)
lavaan 0.6-12 ended normally after 1 iterations

  Estimator                                         ML
  Optimization method                           NLMINB
  Number of model parameters                         9
  Number of observations                           199

Model Test User Model:
  Test statistic                                 0.000
  Degrees of freedom                                 0

Model Test Baseline Model:
  Test statistic                               137.622
  Degrees of freedom                                 7
  P-value                                        0.000

User Model versus Baseline Model:
  Comparative Fit Index (CFI)                    1.000
  Tucker-Lewis Index (TLI)                       1.000

Loglikelihood and Information Criteria:
  Loglikelihood user model (H0)               -464.922
  Loglikelihood unrestricted model (H1)       -464.922
  Akaike (AIC)                                 947.845
  Bayesian (BIC)                               977.484
  Sample-size adjusted Bayesian (BIC)          948.972

Root Mean Square Error of Approximation:
  RMSEA                                          0.000
  90 Percent confidence interval - lower         0.000
  90 Percent confidence interval - upper         0.000
  P-value RMSEA <= 0.05                             NA

Standardized Root Mean Square Residual:
  SRMR                                           0.000

Parameter Estimates:
  Standard errors                             Standard
  Information                                 Expected
  Information saturated (h1) model          Structured

Regressions:
                   Estimate  Std.Err  z-value  P(>|z|)
  behavior ~
    attitude   (c)    0.041    0.072    0.563    0.573
    control    (m)   -0.090    0.070   -1.285    0.199
    norms      (n)    0.042    0.069    0.605    0.545
  intention ~
    attitude   (a)    0.352    0.058    6.068    0.000
    norms      (k)    0.153    0.059    2.577    0.010
    control    (j)    0.275    0.058    4.740    0.000
  behavior ~
    intention  (b)    0.463    0.081    5.715    0.000

Variances:
                   Estimate  Std.Err  z-value  P(>|z|)
   .behavior          0.692    0.069    9.975    0.000
   .intention         0.530    0.053    9.975    0.000

R-Square:
                   Estimate
    behavior          0.206
    intention         0.369

Defined Parameters:
                   Estimate  Std.Err  z-value  P(>|z|)
    ab                0.163    0.039    4.160    0.000
    kb                0.071    0.030    2.349    0.019
    jb                0.127    0.035    3.648    0.000
>
What about this output?
library(lavaan)

specmod <- "
  # path a
  intention ~ a*attitude + k*norms + j*control

  # path b
  behavior ~ b*intention

  # indirect effects: Sobel test (Delta method)
  ab := a*b
  kb := k*b
  jb := j*b

  # attitude ~~ norms + control
  # norms ~~ control
"
# Fit/estimate the model
fitmod <- cfa(specmod, data=df)
# summarize the model
summary(fitmod, fit.measures=TRUE, rsquare=TRUE)
For the output below . . . see path analysis.
What about the chi-square test p-value?
What about CFI and TLI?
And the other indices?
Which model do you judge to explain the present phenomenon best?
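One way to answer these questions side by side is lavaan's fitMeasures() (and anova() for nested models). A self-contained sketch on simulated data, comparing a saturated model against a full-mediation model; the data and variable names here are made up for illustration, not the PlannedBehavior data:

```r
library(lavaan)

set.seed(11)
n <- 199
attitude  <- rnorm(n)
intention <- 0.5 * attitude + rnorm(n)
behavior  <- 0.4 * intention + rnorm(n)
d <- data.frame(attitude, intention, behavior)

# saturated model: direct path included (df = 0)
m.full <- "
  behavior ~ c*attitude + b*intention
  intention ~ a*attitude
"
# full-mediation model: direct path dropped (df = 1)
m.med <- "
  intention ~ a*attitude
  behavior ~ b*intention
"
fit.full <- sem(m.full, data = d)
fit.med  <- sem(m.med,  data = d)

# pull the indices discussed above for both models
idx <- c("chisq", "df", "pvalue", "cfi", "tli", "rmsea", "srmr", "aic", "bic")
round(rbind(full      = fitMeasures(fit.full, idx),
            mediation = fitMeasures(fit.med,  idx)), 3)

# chi-square difference test for the nested models
anova(fit.med, fit.full)
```

A non-significant chi-square (and chi-square difference), CFI/TLI near 1, and low RMSEA/SRMR would favor the more parsimonious mediation model, which is the comparison the questions above are pointing at.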
Output
> library(lavaan)
> specmod <- "
+   # path a
+   intention ~ a*attitude +
+               k*norms +
+               j*control
+
+   # path b
+   behavior ~ b*intention
+
+   # indirect effects: Sobel test (Delta method)
+   # obtained as a path x b path
+   ab := a*b
+   kb := k*b
+   jb := j*b
+
+   # attitude ~~ norms + control
+   # norms ~~ control
+ "
> # Fit/estimate the model
> fitmod <- cfa(specmod, data=df)
> # summarize the model
> summary(fitmod, fit.measures=TRUE, rsquare=T)
lavaan 0.6-12 ended normally after 1 iterations

  Estimator                                         ML
  Optimization method                           NLMINB
  Number of model parameters                         6
  Number of observations                           199

Model Test User Model:
  Test statistic                                 2.023
  Degrees of freedom                                 3
  P-value (Chi-square)                           0.568

Model Test Baseline Model:
  Test statistic                               137.622
  Degrees of freedom                                 7
  P-value                                        0.000

User Model versus Baseline Model:
  Comparative Fit Index (CFI)                    1.000
  Tucker-Lewis Index (TLI)                       1.017

Loglikelihood and Information Criteria:
  Loglikelihood user model (H0)               -465.934
  Loglikelihood unrestricted model (H1)       -464.922
  Akaike (AIC)                                 943.868
  Bayesian (BIC)                               963.628
  Sample-size adjusted Bayesian (BIC)          944.619

Root Mean Square Error of Approximation:
  RMSEA                                          0.000
  90 Percent confidence interval - lower         0.000
  90 Percent confidence interval - upper         0.103
  P-value RMSEA <= 0.05                          0.735

Standardized Root Mean Square Residual:
  SRMR                                           0.019

Parameter Estimates:
  Standard errors                             Standard
  Information                                 Expected
  Information saturated (h1) model          Structured

Regressions:
                   Estimate  Std.Err  z-value  P(>|z|)
  intention ~
    attitude   (a)    0.352    0.058    6.068    0.000
    norms      (k)    0.153    0.059    2.577    0.010
    control    (j)    0.275    0.058    4.740    0.000
  behavior ~
    intention  (b)    0.453    0.065    7.014    0.000

Variances:
                   Estimate  Std.Err  z-value  P(>|z|)
   .intention         0.530    0.053    9.975    0.000
   .behavior          0.699    0.070    9.975    0.000

R-Square:
                   Estimate
    intention         0.369
    behavior          0.198

Defined Parameters:
                   Estimate  Std.Err  z-value  P(>|z|)
    ab                0.160    0.035    4.589    0.000
    kb                0.069    0.029    2.419    0.016
    jb                0.125    0.032    3.927    0.000
Examples
library(lavaan)
attach(mtcars)
# x <- disp (displacement)
# y <- mpg  (mileage)
# m <- wt   (weight)

specmod <- "
  # path c (prime)
  mpg ~ c*disp

  # path a
  wt ~ a*disp

  # path b
  mpg ~ b*wt

  # indirect effect (a*b): Sobel test (Delta method)
  ab := a*b
"
# Fit/estimate the model
fitmod <- sem(specmod, data=mtcars)
summary(fitmod)

set.seed(101)
fitmod2 <- sem(specmod, data=mtcars,
               se="bootstrap",
               bootstrap=200)   # bootstrap = 5000 is common
summary(fitmod2, fit.measures=TRUE, rsquare=TRUE)
parameterEstimates(fitmod2,
                   ci=TRUE, level=.95,
                   boot.ci.type="perc")
install.packages("ISLR")
library(ISLR)
attach(Carseats)

mod.total <- lm(Sales ~ Price, data=Carseats)
summary(mod.total)
mod.m <- lm(CompPrice ~ Price, data=Carseats)
summary(mod.m)
mod.y <- lm(Sales ~ Price + CompPrice, data=Carseats)
summary(mod.y)

# install.packages("multilevel")
library(multilevel)
sob.m <- sobel(pred = Price, med = CompPrice, out = Sales)
sob.m
pnorm(abs(sob.m$z.value), lower.tail = FALSE) * 2

install.packages("bda")
library(bda)
mediation.test(mv=CompPrice, iv=Price, dv=Sales)

install.packages("nloptr")
library(nloptr)
install.packages("mediation")
library(mediation)
set.seed(101)
mod.mediation <- mediate(model.m = mod.m,
                         model.y = mod.y,
                         treat = "Price",
                         mediator = "CompPrice",
                         boot = TRUE,
                         sims = 500)
summary(mod.mediation)
# Total Effect
# ADE  (Average Direct Effect)           = direct effect
# ACME (Average Causal Mediation Effect) = indirect effect
plot(mod.mediation)
Example 2
https://advstats.psychstat.org/book/mediation/index.php
####################################
nlsy <- read.csv("http://commres.net/wiki/_media/r/nlsy.csv")
attach(nlsy)
# install.packages("bmem")
library(bmem)
library(sem)
# install.packages("cfa")
library(cfa)
##########################
nlsy.model <- specifyEquations(exog.variances=T)
math = b*HE + cp*ME
HE = a*ME
<ENTER>

effects <- c('a*b', 'cp+a*b')
nlsy.res <- bmem.sobel(nlsy, nlsy.model, effects)
##########################
m.me.he     <- lm(ME ~ HE)
m.math.me   <- lm(math ~ ME)
m.math.he   <- lm(math ~ HE)
m.math.mehe <- lm(math ~ ME + HE)
m.he.me     <- lm(HE ~ ME)
summary(m.he.me)

res.m.he.me <- resid(m.he.me)
m.temp <- lm(math ~ res.m.he.me)
summary(m.temp)

res.m.me.he <- resid(m.me.he)
m.temp2 <- lm(math ~ res.m.me.he)
summary(m.temp2)

library(lavaan)
specmod <- "
  # path c' (direct effect)
  math ~ c*ME

  # path a
  HE ~ a*ME

  # path b
  math ~ b*HE

  # indirect effect (a*b)
  # Sobel test (Delta method)
  ab := a*b
"
# fit/estimate model
fitmod <- sem(specmod, data=nlsy)

# summarize the output
summary(fitmod, fit.measures=TRUE, rsquare=TRUE)

# for a
summary(m.he.me)
# for b
summary(m.temp)
# for c prime
summary(m.temp2)

a <- summary(m.he.me)$coefficient[2]  # a
b <- summary(m.temp)$coefficient[2]   # b
c <- summary(m.temp2)$coefficient[2]  # c
a
b
c
a*b

c2  <- summary(fitmod)$pe$est[1]
a2  <- summary(fitmod)$pe$est[2]
b2  <- summary(fitmod)$pe$est[3]
ab2 <- summary(fitmod)$pe$est[7]
a2
b2
c2
ab2
> ####################################
> nlsy <- read.csv("http://commres.net/wiki/_media/r/nlsy.csv")
> attach(nlsy)
The following objects are masked from nlsy (pos = 3):
    HE, math, ME
The following objects are masked from nlsy (pos = 5):
    HE, math, ME
The following objects are masked from df:
    HE, math, ME
> # install.packages("bmem")
> library(bmem)
> library(sem)
> # install.packages("cfa")
> library(cfa)
>
> nlsy.model <- specifyEquations(exog.variances=T)
1: math = b*HE + cp*ME
2: HE = a*ME
3:
Read 2 items
NOTE: adding 3 variances to the model
> effects <- c('a*b', 'cp+a*b')
> nlsy.res <- bmem.sobel(nlsy, nlsy.model, effects)
            Estimate       S.E.   z-score      p.value
b         0.46450283 0.14304860  3.247168 1.165596e-03
cp        0.46281480 0.11977862  3.863918 1.115825e-04
a         0.13925694 0.04292438  3.244239 1.177650e-03
V[HE]     2.73092635 0.20078170 13.601471 0.000000e+00
V[math]  20.67659134 1.52017323 13.601471 0.000000e+00
V[ME]     4.00590078 0.29451968 13.601471 0.000000e+00
a*b       0.06468524 0.02818457  2.295059 2.172977e-02
cp+a*b    0.52750005 0.11978162  4.403848 1.063474e-05
>
> m.me.he <- lm(ME~HE)
> m.math.me <- lm(math~ME)
> m.math.he <- lm(math~HE)
> m.math.mehe <- lm(math~ME+HE)
> m.he.me <- lm(HE~ME)
> summary(m.he.me)

Call:
lm(formula = HE ~ ME)

Residuals:
    Min      1Q  Median      3Q     Max
-5.5020 -0.7805  0.2195  1.2195  3.3587

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  4.10944    0.49127   8.365 1.25e-15 ***
ME           0.13926    0.04298   3.240   0.0013 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.655 on 369 degrees of freedom
Multiple R-squared:  0.02766,	Adjusted R-squared:  0.02502
F-statistic:  10.5 on 1 and 369 DF,  p-value: 0.001305

> res.m.he.me <- resid(m.he.me)
> m.temp <- lm(math~res.m.he.me)
> summary(m.temp)

Call:
lm(formula = math ~ res.m.he.me)

Residuals:
     Min       1Q   Median       3Q      Max
-12.4263  -3.1496  -0.3499   2.2826  29.7795

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  12.1186     0.2427  49.936  < 2e-16 ***
res.m.he.me   0.4645     0.1471   3.159  0.00172 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 4.674 on 369 degrees of freedom
Multiple R-squared:  0.02633,	Adjusted R-squared:  0.02369
F-statistic: 9.978 on 1 and 369 DF,  p-value: 0.001715

> res.m.me.he <- resid(m.me.he)
> m.temp2 <- lm(math~res.m.me.he)
> summary(m.temp2)

Call:
lm(formula = math ~ res.m.me.he)

Residuals:
     Min       1Q   Median       3Q      Max
-14.1938  -2.7965  -0.3425   2.4081  29.5656

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  12.1186     0.2413   50.22  < 2e-16 ***
res.m.me.he   0.4628     0.1224    3.78 0.000183 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 4.648 on 369 degrees of freedom
Multiple R-squared:  0.03728,	Adjusted R-squared:  0.03467
F-statistic: 14.29 on 1 and 369 DF,  p-value: 0.0001828

> library(lavaan)
> specmod <- "
+   # path c' (direct effect)
+   math ~ c*ME
+
+   # path a
+   HE ~ a*ME
+
+   # path b
+   math ~ b*HE
+
+   # indirect effect (a*b)
+   # sobel test (Delta method)
+   ab := a*b
+ "
> # fit/estimate model
> fitmod <- sem(specmod, data=df)
> # summarize/result the output
> summary(fitmod, fit.measures=TRUE, rsquare=TRUE)
lavaan 0.6-12 ended normally after 1 iterations

  Estimator                                         ML
  Optimization method                           NLMINB
  Number of model parameters                         5
  Number of observations                           371

Model Test User Model:
  Test statistic                                 0.000
  Degrees of freedom                                 0

Model Test Baseline Model:
  Test statistic                                39.785
  Degrees of freedom                                 3
  P-value                                        0.000

User Model versus Baseline Model:
  Comparative Fit Index (CFI)                    1.000
  Tucker-Lewis Index (TLI)                       1.000

Loglikelihood and Information Criteria:
  Loglikelihood user model (H0)              -1800.092
  Loglikelihood unrestricted model (H1)      -1800.092
  Akaike (AIC)                                3610.184
  Bayesian (BIC)                              3629.765
  Sample-size adjusted Bayesian (BIC)         3613.901

Root Mean Square Error of Approximation:
  RMSEA                                          0.000
  90 Percent confidence interval - lower         0.000
  90 Percent confidence interval - upper         0.000
  P-value RMSEA <= 0.05                             NA

Standardized Root Mean Square Residual:
  SRMR                                           0.000

Parameter Estimates:
  Standard errors                             Standard
  Information                                 Expected
  Information saturated (h1) model          Structured

Regressions:
                   Estimate  Std.Err  z-value  P(>|z|)
  math ~
    ME         (c)    0.463    0.120    3.869    0.000
  HE ~
    ME         (a)    0.139    0.043    3.249    0.001
  math ~
    HE         (b)    0.465    0.143    3.252    0.001

Variances:
                   Estimate  Std.Err  z-value  P(>|z|)
   .math             20.621    1.514   13.620    0.000
   .HE                2.724    0.200   13.620    0.000

R-Square:
                   Estimate
    math              0.076
    HE                0.028

Defined Parameters:
                   Estimate  Std.Err  z-value  P(>|z|)
    ab                0.065    0.028    2.298    0.022

> # for a
> summary(m.he.me)
  (same output as above)
> # for b
> summary(m.temp)
  (same output as above)
> # for cprime
> summary(m.temp2)
  (same output as above)
> a <- summary(m.he.me)$coefficient[2]  # a
> b <- summary(m.temp)$coefficient[2]   # b
> c <- summary(m.temp2)$coefficient[2]  # c
> a
[1] 0.1392569
> b
[1] 0.4645028
> c
[1] 0.4628148
> a*b
[1] 0.06468524
> c2 <- summary(fitmod)$pe$est[1]
> a2 <- summary(fitmod)$pe$est[2]
> b2 <- summary(fitmod)$pe$est[3]
> ab2 <- summary(fitmod)$pe$est[7]
> a2
[1] 0.1392569
> b2
[1] 0.4645028
> c2
[1] 0.4628148
> ab2
[1] 0.06468524
>
temp
tests <- read.csv("http://commres.net/wiki/_media/r/tests_cor.csv")
colnames(tests) <- c("ser", "sat", "clep", "gpa")
tests <- subset(tests, select=c("sat", "clep", "gpa"))
attach(tests)
lm.gpa.clepsat <- lm(gpa ~ clep + sat, data = tests)
summary(lm.gpa.clepsat)
lm.gpa.clep <- lm(gpa ~ clep)
lm.gpa.sat <- lm(gpa ~ sat)
summary(lm.gpa.clep)
summary(lm.gpa.sat)
> lm.gpa.clepsat <- lm(gpa ~ clep + sat, data = tests)
> summary(lm.gpa.clepsat)

Call:
lm(formula = gpa ~ clep + sat, data = tests)

Residuals:
      Min        1Q    Median        3Q       Max
-0.197888 -0.128974 -0.000528  0.131170  0.226404

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept)  1.1607560  0.4081117   2.844   0.0249 *
clep         0.0729294  0.0253799   2.874   0.0239 *
sat         -0.0007015  0.0012564  -0.558   0.5940
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.1713 on 7 degrees of freedom
Multiple R-squared:  0.7778,	Adjusted R-squared:  0.7143
F-statistic: 12.25 on 2 and 7 DF,  p-value: 0.005175
>
0.7778 is the part of gpa explained jointly by the two variables clep and sat:
summary(lm.gpa.clepsat)$r.squared
= 0.7778
And in the output above, sat's explanatory power is not significant.
Then what if we look at gpa with sat alone?
> lm.gpa.clep <- lm(gpa ~ clep)
> lm.gpa.sat <- lm(gpa ~ sat)
> summary(lm.gpa.clep)

Call:
lm(formula = gpa ~ clep)

Residuals:
      Min        1Q    Median        3Q       Max
-0.190496 -0.141167 -0.002376  0.110847  0.225207

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  1.17438    0.38946   3.015 0.016676 *
clep         0.06054    0.01177   5.144 0.000881 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.1637 on 8 degrees of freedom
Multiple R-squared:  0.7679,	Adjusted R-squared:  0.7388
F-statistic: 26.46 on 1 and 8 DF,  p-value: 0.0008808

> summary(lm.gpa.sat)

Call:
lm(formula = gpa ~ sat)

Residuals:
     Min       1Q   Median       3Q      Max
-0.23544 -0.12184  0.00316  0.02943  0.56456

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.7848101  0.4771715   3.740   0.0057 **
sat         0.0024557  0.0008416   2.918   0.0193 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.2365 on 8 degrees of freedom
Multiple R-squared:  0.5156,	Adjusted R-squared:  0.455
F-statistic: 8.515 on 1 and 8 DF,  p-value: 0.01935
>
As shown above, each predictor is significant on its own:
summary(lm.gpa.clep)$r.squared   # 0.7679
summary(lm.gpa.sat)$r.squared    # 0.5156
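Notice that the two single-predictor R-squared values sum to more than the combined model's R-squared (0.7679 + 0.5156 > 0.7778): clep and sat overlap in what they explain about gpa. A minimal sketch of that shared portion, computed from the printed values rather than re-fit from the data:

```r
r2.both <- 0.7778   # gpa ~ clep + sat
r2.clep <- 0.7679   # gpa ~ clep alone
r2.sat  <- 0.5156   # gpa ~ sat alone

# variance of gpa that is counted twice when the single-predictor
# R-squared values are simply added, i.e., the portion clep and sat
# explain in common
r2.common <- r2.clep + r2.sat - r2.both
r2.common   # about 0.5057
```

This overlap is exactly why a mediation-style decomposition is needed before attributing sat's effect to sat itself.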
To see whether sat's influence on gpa operates through clep (i.e., is mediated by clep), first regress clep on sat:
> lm.clep.sat <- lm(clep ~ sat)
> summary(lm.clep.sat)

Call:
lm(formula = clep ~ sat)

Residuals:
    Min      1Q  Median      3Q     Max 
-2.5316 -1.2437 -0.2848  0.0949  5.6329 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 8.556962   4.813367   1.778  0.11334    
sat         0.043291   0.008489   5.100  0.00093 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 2.386 on 8 degrees of freedom
Multiple R-squared:  0.7648,	Adjusted R-squared:  0.7353 
F-statistic: 26.01 on 1 and 8 DF,  p-value: 0.0009303
# split clep into the part predicted by sat and the residual (sat-free) part
res.lm.clep.sat <- resid(lm.clep.sat)
reg.lm.clep.sat <- predict(lm.clep.sat) - mean(clep)
# regress gpa on each part separately
lm.gpa.sat.mediated.via.clep <- lm(gpa ~ reg.lm.clep.sat)
lm.gpa.clep.alone <- lm(gpa ~ res.lm.clep.sat)
summary(lm.gpa.sat.mediated.via.clep)
summary(lm.gpa.clep.alone)
> res.lm.clep.sat <- resid(lm.clep.sat)
> reg.lm.clep.sat <- predict(lm.clep.sat)-mean(clep)
> lm.gpa.sat.mediated.via.clep <- lm(gpa~reg.lm.clep.sat)
> lm.gpa.clep.alone <- lm(gpa~res.lm.clep.sat)
> 
> summary(lm.gpa.sat.mediated.via.clep)

Call:
lm(formula = gpa ~ reg.lm.clep.sat)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.23544 -0.12184  0.00316  0.02943  0.56456 

Coefficients:
                Estimate Std. Error t value Pr(>|t|)    
(Intercept)      3.16000    0.07480  42.246 1.09e-10 ***
reg.lm.clep.sat  0.05673    0.01944   2.918   0.0193 *  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.2365 on 8 degrees of freedom
Multiple R-squared:  0.5156,	Adjusted R-squared:  0.455 
F-statistic: 8.515 on 1 and 8 DF,  p-value: 0.01935

> summary(lm.gpa.clep.alone)

Call:
lm(formula = gpa ~ res.lm.clep.sat)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.34523 -0.23300 -0.04416  0.27577  0.36370 

Coefficients:
                Estimate Std. Error t value Pr(>|t|)    
(Intercept)      3.16000    0.09231  34.231  5.8e-10 ***
res.lm.clep.sat  0.07293    0.04326   1.686     0.13    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.2919 on 8 degrees of freedom
Multiple R-squared:  0.2622,	Adjusted R-squared:  0.1699 
F-statistic: 2.842 on 1 and 8 DF,  p-value: 0.1303
sat's influence on gpa, mediated via clep: Multiple R-squared = 0.5156
clep's own explanatory power, with sat's influence removed: Multiple R-squared = 0.2622
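These two R-squared values are an exact decomposition of the combined model's 0.7778, because the fitted and residual parts of clep (from clep ~ sat) are orthogonal. The significance of the indirect path can also be checked with the Sobel (Delta-method) test described earlier. A minimal sketch using the coefficients printed above, where a is the sat → clep path and b is the clep → gpa path controlling for sat (values copied from the output, so small rounding differences are possible):

```r
# estimates read off the summaries above
a    <- 0.043291    # sat -> clep      (lm.clep.sat)
se.a <- 0.008489
b    <- 0.0729294   # clep -> gpa | sat (lm.gpa.clepsat)
se.b <- 0.0253799

# Sobel z: indirect effect a*b divided by its Delta-method standard error
sobel.z <- (a * b) / sqrt(a^2 * se.b^2 + b^2 * se.a^2)
sobel.z                                    # roughly 2.5
2 * pnorm(abs(sobel.z), lower.tail=FALSE)  # two-tailed p-value, about .012
```

With such a small sample (n = 10), the normality assumption behind the Sobel test is shaky, which is why the bootstrap approach below is usually preferred.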
Mediation analysis with the mediation and lavaan packages
> #############################
> #############################
> # https://data.library.virginia.edu/introduction-to-mediation-analysis/
> # Download data online. This is a simulated dataset for this post.
> myData <- read.csv('http://static.lib.virginia.edu/statlab/materials/data/mediationData.csv')
> 
> model.0 <- lm(Y ~ X, myData)
> summary(model.0)

Call:
lm(formula = Y ~ X, data = myData)

Residuals:
    Min      1Q  Median      3Q     Max 
-5.0262 -1.2340 -0.3282  1.5583  5.1622 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   2.8572     0.6932   4.122 7.88e-05 ***
X             0.3961     0.1112   3.564 0.000567 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.929 on 98 degrees of freedom
Multiple R-squared:  0.1147,	Adjusted R-squared:  0.1057 
F-statistic: 12.7 on 1 and 98 DF,  p-value: 0.0005671

> # coefficient on path c = 0.396 at p = 0.0006
> 
> model.M <- lm(M ~ X, myData)
> summary(model.M)

Call:
lm(formula = M ~ X, data = myData)

Residuals:
    Min      1Q  Median      3Q     Max 
-4.3046 -0.8656  0.1344  1.1344  4.6954 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  1.49952    0.58920   2.545   0.0125 *  
X            0.56102    0.09448   5.938 4.39e-08 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.639 on 98 degrees of freedom
Multiple R-squared:  0.2646,	Adjusted R-squared:  0.2571 
F-statistic: 35.26 on 1 and 98 DF,  p-value: 4.391e-08

> # path a = 0.561, p close to 0
> 
> model.Y <- lm(Y ~ X + M, myData)
> summary(model.Y)

Call:
lm(formula = Y ~ X + M, data = myData)

Residuals:
    Min      1Q  Median      3Q     Max 
-3.7631 -1.2393  0.0308  1.0832  4.0055 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   1.9043     0.6055   3.145   0.0022 ** 
X             0.0396     0.1096   0.361   0.7187    
M             0.6355     0.1005   6.321 7.92e-09 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.631 on 97 degrees of freedom
Multiple R-squared:  0.373,	Adjusted R-squared:  0.3601 
F-statistic: 28.85 on 2 and 97 DF,  p-value: 1.471e-10

> #################################################
> ## path b + path c' (c becomes c' now that M is added)
> ## c' = 0.0396 ---
> ## b  = 0.6355 ***
> #################################################
> 
> library(mediation)
> results <- mediate(model.M, model.Y,
+                    treat='X',
+                    mediator='M',
+                    boot=TRUE, sims=500)
Running nonparametric bootstrap
> summary(results)

Causal Mediation Analysis 

Nonparametric Bootstrap Confidence Intervals with the Percentile Method

               Estimate 95% CI Lower 95% CI Upper p-value    
ACME             0.3565       0.2174         0.54  <2e-16 ***
ADE              0.0396      -0.1712         0.31    0.74    
Total Effect     0.3961       0.1890         0.64  <2e-16 ***
Prop. Mediated   0.9000       0.4707         1.69  <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Sample Size Used: 100 
Simulations: 500 

> # OR with lavaan
> library(lavaan)
> specmod <- "
+ # path c' (direct effect)
+ Y ~ c*X
+ 
+ # path a
+ M ~ a*X
+ 
+ # path b
+ Y ~ b*M
+ 
+ # indirect effect (a*b)
+ # sobel test (Delta method)
+ ab := a*b
+ "
> 
> # fit/estimate model
> fitmod <- sem(specmod, data=myData)
> 
> # summarize the output
> summary(fitmod, fit.measures=TRUE, rsquare=TRUE)
lavaan 0.6-12 ended normally after 1 iterations

  Estimator                                         ML
  Optimization method                           NLMINB
  Number of model parameters                         5
  Number of observations                           100

Model Test User Model:
  Test statistic                                 0.000
  Degrees of freedom                                 0

Model Test Baseline Model:
  Test statistic                                77.413
  Degrees of freedom                                 3
  P-value                                        0.000

User Model versus Baseline Model:
  Comparative Fit Index (CFI)                    1.000
  Tucker-Lewis Index (TLI)                       1.000

Loglikelihood and Information Criteria:
  Loglikelihood user model (H0)               -379.612
  Loglikelihood unrestricted model (H1)       -379.612
  Akaike (AIC)                                 769.225
  Bayesian (BIC)                               782.250
  Sample-size adjusted Bayesian (BIC)          766.459

Root Mean Square Error of Approximation:
  RMSEA                                          0.000
  90 Percent confidence interval - lower         0.000
  90 Percent confidence interval - upper         0.000
  P-value RMSEA <= 0.05                             NA

Standardized Root Mean Square Residual:
  SRMR                                           0.000

Parameter Estimates:
  Standard errors                             Standard
  Information                                 Expected
  Information saturated (h1) model          Structured

Regressions:
                   Estimate  Std.Err  z-value  P(>|z|)
  Y ~                                                 
    X          (c)    0.040    0.108    0.367    0.714
  M ~                                                 
    X          (a)    0.561    0.094    5.998    0.000
  Y ~                                                 
    M          (b)    0.635    0.099    6.418    0.000

Variances:
                   Estimate  Std.Err  z-value  P(>|z|)
   .Y                2.581    0.365    7.071    0.000
   .M                2.633    0.372    7.071    0.000

R-Square:
                   Estimate
    Y                 0.373
    M                 0.265

Defined Parameters:
                   Estimate  Std.Err  z-value  P(>|z|)
    ab                0.357    0.081    4.382    0.000

> # Resampling method
> set.seed(2019)
> 
> fitmod <- sem(specmod, data=myData, se="bootstrap", bootstrap=1000)
> summary(fitmod, fit.measures=TRUE, rsquare=TRUE)
lavaan 0.6-12 ended normally after 1 iterations

  Estimator                                         ML
  Optimization method                           NLMINB
  Number of model parameters                         5
  Number of observations                           100

Model Test User Model:
  Test statistic                                 0.000
  Degrees of freedom                                 0

Model Test Baseline Model:
  Test statistic                                77.413
  Degrees of freedom                                 3
  P-value                                        0.000

User Model versus Baseline Model:
  Comparative Fit Index (CFI)                    1.000
  Tucker-Lewis Index (TLI)                       1.000

Loglikelihood and Information Criteria:
  Loglikelihood user model (H0)               -379.612
  Loglikelihood unrestricted model (H1)       -379.612
  Akaike (AIC)                                 769.225
  Bayesian (BIC)                               782.250
  Sample-size adjusted Bayesian (BIC)          766.459

Root Mean Square Error of Approximation:
  RMSEA                                          0.000
  90 Percent confidence interval - lower         0.000
  90 Percent confidence interval - upper         0.000
  P-value RMSEA <= 0.05                             NA

Standardized Root Mean Square Residual:
  SRMR                                           0.000

Parameter Estimates:
  Standard errors                            Bootstrap
  Number of requested bootstrap draws             1000
  Number of successful bootstrap draws            1000

Regressions:
                   Estimate  Std.Err  z-value  P(>|z|)
  Y ~                                                 
    X          (c)    0.040    0.124    0.319    0.750
  M ~                                                 
    X          (a)    0.561    0.095    5.888    0.000
  Y ~                                                 
    M          (b)    0.635    0.102    6.224    0.000

Variances:
                   Estimate  Std.Err  z-value  P(>|z|)
   .Y                2.581    0.334    7.731    0.000
   .M                2.633    0.362    7.271    0.000

R-Square:
                   Estimate
    Y                 0.373
    M                 0.265

Defined Parameters:
                   Estimate  Std.Err  z-value  P(>|z|)
    ab                0.357    0.080    4.442    0.000

> parameterEstimates(fitmod, ci=TRUE, level=.95, boot.ci.type="perc")
  lhs op rhs label   est    se     z pvalue ci.lower ci.upper
1   Y  ~   X     c 0.040 0.124 0.319   0.75   -0.200    0.288
2   M  ~   X     a 0.561 0.095 5.888   0.00    0.391    0.765
3   Y ~~   Y       2.581 0.334 7.731   0.00    1.868    3.193
4   M ~~   M       2.633 0.362 7.271   0.00    1.890    3.311
5   Y  ~   M     b 0.635 0.102 6.224   0.00    0.439    0.835
6   X ~~   X       3.010 0.000    NA     NA    3.010    3.010
7  ab  := a*b    ab 0.357 0.080 4.442   0.00    0.225    0.538

> summary(lm(Y~X+M,data=myData))$r.square
[1] 0.3729992
> summary(lm(M~X,data=myData))$r.square
[1] 0.2645879
> var(myData$Y)
[1] 4.158687
> var(myData$M)
[1] 3.616566
> var(myData$X)
[1] 3.040303
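A quick sanity check on the output above: the indirect effect is the product of the a and b paths, and together with the direct effect c' it reproduces the total effect (ACME + ADE = Total Effect in the mediate() summary). A sketch using the printed estimates, not re-fit from the data:

```r
# estimates copied from the summaries above
a      <- 0.56102   # X -> M          (model.M)
b      <- 0.6355    # M -> Y | X     (model.Y)
cprime <- 0.0396    # X -> Y | M     (model.Y, direct effect)

indirect <- a * b              # about 0.3565 = ACME
total    <- cprime + indirect  # about 0.3961 = coefficient of lm(Y ~ X)
prop.mediated <- indirect / total  # about 0.90 = Prop. Mediated
```

Since c' is near zero and not significant while ab is significant, this example is close to complete mediation: nearly all of X's effect on Y travels through M.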