Differences

This shows you the differences between two versions of the page.

--- mediation_analysis [2023/06/18 07:30] – hkimscil
+++ mediation_analysis [2024/11/03 22:37] (current) – [What about this output] hkimscil
@@ Line 30: / Line 30: @@
 The other approach considers all three variables at once and asks how a unit change in attitudes ultimately affects behavior. Suppose the beta for a is 2 and the beta for b is 3. A one-unit change in attitudes then changes intention by 2 units; applying this to the second step (the size of b), a 2-unit change in intention changes behavior by 3 x 2 = 6 units. In other words, through the mediating effect of intention, one unit of attitudes changes behavior by 6 units (a beta coefficient * b beta coefficient).
+size of mediated effect
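+The size of ''ab:=a*b'' is obtained by multiplying it by sdx/sdy, which converts it into a beta coefficient value. . . .
 These quantities can be worked out with ordinary regression, but usually one uses the lavaan package (developed for path analysis) or the package named mediation to estimate the mediated effect of the independent variable.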
 <code>
-            -- Intention --
+                Intention
-        a --               -- b
+             /            \
-        --                    --
+        a   /              \  b
+           /                \
 Attitudes  --------c'---------   Behavior
@@ Line 47: / Line 52: @@
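 In mediation analysis the effects (explanatory power) of the independent variables can also be divided into direct and indirect effects: a, b, and c' are called direct effects, and the effect that passes through a and b is called the indirect effect. There are several ways to measure the size of the indirect effect, but the most commonly used one is
   * taking the product of the a-path and b-path coefficients
-  * another method is to take b - a, but it is not common
+  * another method is to take c - c' (the total effect minus the direct effect), but it is not common
 When the indirect effect is measured by multiplying a and b, there are two possible ways to test whether that value (effect) is significant: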
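As a minimal sketch of the two definitions in the list above, the following R code fits the three ordinary regressions on simulated data and compares a*b with c - c'. The data, the seed, and names such as `c.total` and `c.prime` are made up for illustration and do not come from the page's dataset. With OLS fitted on the same cases, the two quantities coincide.

```r
# Simulated data (hypothetical; not the plannedbehavior data)
set.seed(101)
n <- 200
attitude  <- rnorm(n)
intention <- 0.5 * attitude + rnorm(n)
behavior  <- 0.4 * intention + 0.1 * attitude + rnorm(n)

fit.total <- lm(behavior ~ attitude)              # total effect c
fit.a     <- lm(intention ~ attitude)             # path a
fit.b     <- lm(behavior ~ intention + attitude)  # path b and direct effect c'

a       <- unname(coef(fit.a)["attitude"])
b       <- unname(coef(fit.b)["intention"])
c.total <- unname(coef(fit.total)["attitude"])
c.prime <- unname(coef(fit.b)["attitude"])

indirect.product    <- a * b              # product-of-coefficients method
indirect.difference <- c.total - c.prime  # difference-in-coefficients method
print(c(product = indirect.product, difference = indirect.difference))
```

The equality is a property of linear regression on complete data; with other estimators or with missing values the two estimates can differ.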
@@ Line 409: / Line 414: @@
 </code>
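-Assume that a and b above are [[:beta coefficients]], with values a = 2 and b = 1.5. Then,
+Assume that a and b above are [[:coefficients]] values, with a = 2 and b = 1.5. Then,
   * a (2) means that intention increases by 2 when the attitudes measurement increases by one unit.
   * b (1.5) means that behavior increases by 1.5 for each one-unit increase in intention, so every one-point increase in attitudes raises behavior by 2*1.5 (3). That is, behavior increases by 3 for each one-unit increase in attitudes. This 3 is the value used to describe the influence of the independent variable attitudes as mediated by intention. Therefore ab (mediation effects) can be thought of as a * b.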
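The product ab described in the bullets above is in raw units. The page's note on the size of the mediated effect rescales it by sdx/sdy. A small sketch of that conversion follows, using simulated data with a = 2 and b = 1.5 as the generating values. All variable names and the helper `z()` are illustrative assumptions.

```r
# Simulated data with true a = 2 and b = 1.5 (hypothetical values)
set.seed(202)
n <- 300
attitude  <- rnorm(n, mean = 3, sd = 2)
intention <- 2 * attitude + rnorm(n, sd = 3)
behavior  <- 1.5 * intention + rnorm(n, sd = 5)

# Unstandardized paths
a  <- unname(coef(lm(intention ~ attitude))["attitude"])
b  <- unname(coef(lm(behavior ~ intention + attitude))["intention"])
ab <- a * b

# Standardized indirect effect: ab * sd(x) / sd(y)
ab.std <- ab * sd(attitude) / sd(behavior)

# Cross-check: product of the paths estimated on z-scored variables
z   <- function(x) as.numeric(scale(x))
a.z <- unname(coef(lm(z(intention) ~ z(attitude)))[2])
b.z <- unname(coef(lm(z(behavior) ~ z(intention) + z(attitude)))[2])

print(c(ab = ab, ab.std = ab.std, ab.z = a.z * b.z))
```

The last two printed values agree because standardizing each path multiplies a by sd(x)/sd(m) and b by sd(m)/sd(y), and the sd(m) terms cancel in the product.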
@@ Line 510: / Line 515: @@
 </code>
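-Below is the result of a regression with behavior as the dependent variable and reg.int as the independent variable. It is meant to show how the part of intention's SS that is explained by attitude (the regression part) accounts for behavior.
+Below is the result of a regression with behavior as the dependent variable and reg.int as the independent variable. It is meant to show how the part of intention's SS that is explained by attitude (the regression part) accounts for behavior. However, this is the mediating variable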
 <code>
 # the intention part contributed by attitudes
@@ Line 614: / Line 619: @@
 Multiple R-squared:  0.199,	Adjusted R-squared:  0.191
 F-statistic: 24.3 on 2 and 196 DF,  p-value: 3.64e-10
+</code>
+The R-squared of 0.199 for lm.ba.01 above says that the portion explained jointly by Attitude and Intention is about 19.9%, and that only a very small fraction of this 19.9% (0.000683, the R-squared of lm.temp) is explanatory power unique to Attitude. Looking more closely:
+<code>
+abc <- summary(lm.ba.01)$r.squared
+ab <- summary(lm.ba.02)$r.squared
+bc <- summary(lm.ba.05)$r.squared
+abc
+ab
+bc
+abbc <- ab + bc
+abbc
+# b can be obtained as below
+b <- abbc - abc
+b
+# compare with the summary(lm.ba.021)$r.squared value obtained above
+summary(lm.ba.021)$r.squared
+# likewise for a
+a <- abc - bc
+a
+# or
+summary(lm.ba.022)$r.squared
+# the same holds for c below
+c <- abc - ab
+c
+summary(lm.temp)$r.squared
+</code>
+<code>
+> abc <- summary(lm.ba.01)$r.square
+> ab <- summary(lm.ba.02)$r.square
+> bc <- summary(lm.ba.05)$r.square
+> abc
+[1] 0.1989125
+> ab
+[1] 0.1982297
+> bc
+[1] 0.06197255
+> abbc <- ab + bc
+> abbc
+[1] 0.2602023
+> # b can be obtained as below
+> b <- abbc - abc
+> b
+[1] 0.0612898
+> # compare with the summary(lm.ba.021)$r.squared value obtained above
+> summary(lm.ba.021)$r.squared
+[1] 0.06197255
+> # likewise for a
+> a <- abc - bc
+> a
+[1] 0.1369399
+> # or
+> summary(lm.ba.022)$r.squared
+[1] 0.1369399
+>
+> # the same holds for c below
+> c <- abc - ab
+> c
+[1] 0.0006827583
+> summary(lm.temp)$r.squared
+[1] 0.0006827583
+>
 </code>
+{{:pasted:20241031-085240.png}}
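The subtraction logic in the code above splits a full-model R-squared into two unique parts and one shared part. As a self-contained sketch on simulated data, the partition can be written as follows. The data and all object names are assumptions; how these models map onto the page's lm.ba.02, lm.ba.05 and lm.temp objects is not stated here and is assumed only loosely.

```r
# Simulated data (hypothetical; not the page's dataset)
set.seed(303)
n <- 200
attitude  <- rnorm(n)
intention <- 0.5 * attitude + rnorm(n)
behavior  <- 0.6 * intention + 0.05 * attitude + rnorm(n)

r2 <- function(model) summary(model)$r.squared

r2.full      <- r2(lm(behavior ~ attitude + intention))  # both predictors
r2.intention <- r2(lm(behavior ~ intention))             # intention alone
r2.attitude  <- r2(lm(behavior ~ attitude))              # attitude alone

unique.intention <- r2.full - r2.attitude                # only intention explains
unique.attitude  <- r2.full - r2.intention               # only attitude explains
common           <- r2.intention + r2.attitude - r2.full # shared by both

print(round(c(unique.intention = unique.intention,
              common           = common,
              unique.attitude  = unique.attitude,
              full             = r2.full), 4))
```

The three pieces add back up to the full-model R-squared by construction, which is the relationship the a, b and c values above rely on.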
 ===== Another modeling =====
 A student (성재) has a different idea. We will try to implement it in the analysis.
@@ Line 793: / Line 861: @@
 What about the other indices?
 Which model do you judge to best explain the phenomenon at hand?
+  * Model fit
+    * Chi-square Test: a p-value below the critical value (.05, for example) indicates that the model does not fit well enough; a p-value above the critical value means the model fits the data relatively well. The test is sensitive to the sample size and to the normality of the data.
+    * CFI (Comparative Fit Index): greater than .90 indicates good fit to the data. It is less sensitive to the sample size and normality of the data than the chi-square test.
+    * TLI (Tucker-Lewis Index): greater than .95 (sometimes .90) indicates good fit. It is less sensitive to the sample size.
+    * RMSEA (Root Mean Square Error of Approximation): equal to or less than .08 (sometimes .10 is used) indicates good fit to the data.
+    * SRMR (Standardized Root Mean Square Residual): less than or equal to .08 indicates good fit to the data.
+| $\chi^2$  | $\text{CFI}$  | $\text{TLI}$  | $\text{RMSEA}$  | $\text{SRMR}$   |
+| $p \ge .05$  | $\ge .90$  | $\ge .95$  | $\le .08$  | $\le .08$  |
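The cutoffs above can be applied mechanically. The sketch below assumes a named numeric vector of fit indices, which is the shape lavaan's `fitMeasures()` returns. The helper name `check_fit` and the example values are hypothetical and serve only to illustrate the rule-of-thumb comparison.

```r
# Hypothetical helper: compare fit indices with the rule-of-thumb cutoffs
check_fit <- function(fm) {
  c(chisq = unname(fm["pvalue"] >= .05),  # chi-square test not significant
    cfi   = unname(fm["cfi"]    >= .90),
    tli   = unname(fm["tli"]    >= .95),
    rmsea = unname(fm["rmsea"]  <= .08),
    srmr  = unname(fm["srmr"]   <= .08))
}

# Made-up index values, not results from any model on this page
example.fit <- c(pvalue = 0.21, cfi = 0.97, tli = 0.94,
                 rmsea = 0.05, srmr = 0.03)
result <- check_fit(example.fit)
print(result)
```

With a fitted lavaan model, a call along the lines of `check_fit(fitMeasures(fit, c("pvalue", "cfi", "tli", "rmsea", "srmr")))` would be the intended use. The cutoffs are conventions, so a FALSE flag is a prompt to look closer rather than a verdict.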
+Then what is SEM (Structural Equation Modeling)?
+  * Relationships within and among variables and constructs
@@ Line 905: / Line 987: @@
 </code>
+====== e.g. with a categorical variable ======
+<code>
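+# mediation analysis with the planned behavior data
+library(lavaan)
+df <- read.csv("http://commres.net/wiki/_media/r/plannedbehavior.csv")
+head(df)
+# assumption: tmp is not defined in the original; tertile cut points of norms are used here
+tmp <- quantile(df$norms, probs = c(0, 1/3, 2/3, 1))
+df<- within(df, {
+  norms.cat <- NA # need to initialize variable
+  norms.cat[norms < tmp[2]] <- "Low"
+  norms.cat[norms >= tmp[2] & norms < tmp[3]] <- "Middle"
+  norms.cat[norms >= tmp[3]] <- "High"
+} )
+head(df)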
+med.m.01 <- '
+    # mediator
+    intention ~ a*attitude
+    behavior ~ b*intention
+    # direct effect c
+    behavior ~ c*attitude
+    # indirect effect
+    ab := a*b
+    # total effect
+    tot := c + ab
+    '
+fit <- sem(med.m.01, data = df,
+           meanstructure = T,
+           se = "boot", bootstrap = 500)
+summary(fit, fit.measures = T,
+        standardized = T,
+        ci = T)
+###
+mod.m.02 <- '
+    # mediator
+    intention ~ c(ag1,ag2,ag3)*attitude
+    behavior ~ c(bg1,bg2,bg3)*intention
+    # direct effect
+    behavior ~ c(cg1,cg2,cg3)*attitude
+    # indirect effect
+    abg1 := ag1*bg1 # for group 1
+    abg2 := ag2*bg2
+    abg3 := ag3*bg3
+    # tot effect
+    totalg1 := cg1 + (ag1*bg1)
+    totalg2 := cg2 + (ag2*bg2)
+    totalg3 := cg3 + (ag3*bg3)
+'
+fit.by.norms.cat <- sem(mod.m.02, data = df,
+                        group = "norms.cat",
+                        se = "boot", bootstrap = 500,
+                        meanstructure = T)
+summary(fit.by.norms.cat,
+        fit.measures = T,
+        standardized = T,
+        ci = T)
+all.constraints <- '
+    ag1 == ag2 == ag3
+    bg1 == bg2 == bg3
+    cg1 == cg2 == cg3
+'
+lavTestWald(fit.by.norms.cat,
+            constraints = all.constraints)
+lavTestWald(fit.by.norms.cat,
+            constraints = "ag1==ag2==ag3")
+lavTestWald(fit.by.norms.cat,
+            constraints = "bg1==bg2==bg3")
+lavTestWald(fit.by.norms.cat,
+            constraints = "cg1==cg2==cg3")
+# or
+full.mod.mediation <- '
+    # mediator
+    intention ~ a*attitude
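+    # note: norms.cat is a character variable as created above; lavaan expects numeric (for example dummy-coded) predictors
+    behavior ~ b*intention + w*norms.cat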
+    # define moderator
+    Z := w*b
+    # direct effect
+    behavior ~ c*attitude
+    # indirect effect
+    ab := a*b
+    # tot effect
+    total := c + (a*b)
+'
+full.mod <- sem (full.mod.mediation, data = df,
+                 se = "boot", bootstrap = 500,
+                 meanstructure = T)
+summary(full.mod, fit.measures = T,
+        stand = T, ci = T)
+</code>
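The manual recoding of norms into Low / Middle / High in the code above can also be written with `cut()`. This is a sketch on simulated scores; the tertile cut points and the simulated `norms` vector are assumptions, since the page does not show how its cut points were chosen.

```r
# Simulated scores standing in for df$norms (hypothetical values)
set.seed(404)
norms <- round(runif(150, min = 1, max = 5), 2)

# Assumed cut points: minimum, two tertiles, maximum
cuts <- quantile(norms, probs = c(0, 1/3, 2/3, 1))

# right = FALSE gives [lower, upper) intervals, matching "<" and ">=" above;
# include.lowest = TRUE keeps the maximum value in the "High" group
norms.cat <- cut(norms, breaks = cuts,
                 labels = c("Low", "Middle", "High"),
                 right = FALSE, include.lowest = TRUE)
print(table(norms.cat))
```

`cut()` returns a factor with the levels in the order given, which also fixes the group order when the variable is later passed as a grouping variable.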
 ====== e.gs ======
 <code>