statistical_regression_methods
Differences
This shows you the differences between two versions of the page.
statistical_regression_methods [2017/11/13 10:08] – hkimscil | statistical_regression_methods [2022/11/13 23:01] (current) – hkimscil | ||
---|---|---|---|
Line 1: | Line 1: | ||
- | ====== Statistical Regression Methods ====== | + | ~~REDIRECT>statistical regression~~ |
- | A subtype of the selection method in multiple regression. In short, | + |
- | + | ||
- | Multiple Regression | + | |
- | - Enter method | + | |
- | - Selection method | + | |
- | - Statistical regression method | + | |
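- | - forward selection: Among the predictors, the variable most highly correlated with the dependent variable Y is entered first and the regression is computed. Because of its high correlation, the variable entered first is regarded, in theory, as an important factor in explaining the dependent variable. Each subsequent variable is then entered with the previously entered variables taken into account. | + |
- | - backward deletion: All independent variables are entered at once and the regression is computed. Then the X variables judged not to contribute statistically to the regression equation are removed one at a time, with the regression recomputed after each removal. | + |
- | - stepwise selection: The regression is computed as in forward selection, but the explanatory power of each entered variable is evaluated to decide whether to keep or drop it. That is, a t-test on each IV is used to judge whether that IV makes a significant contribution. | + |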
- | - Sequential regression method | + | |
- | + | ||
- | See also {{youtube> | + | |
- | See also {{http:// | + | |
- | ---- | + | |
- | The text below is from http:// | + |
- | <WRAP box 70%> | + | |
- | **Forward selection** begins with an empty equation. | + | |
- | + | ||
- | **Backward elimination** (or backward deletion) is the reverse process. | + | |
- | + | ||
- | + | ||
- | **Stepwise regression** is a combination of the forward and backward selection techniques. . . . Stepwise regression is a modification of the forward selection so that __after each step in which a variable was added, all candidate variables in the model are checked to see if their significance has been reduced below the specified tolerance level.__ If a nonsignificant variable is found, it is removed from the model. Stepwise regression requires two significance levels: one for adding variables and one for removing variables. The cutoff probability for adding variables should be less than the cutoff probability for removing variables so that the procedure does not get into an infinite loop. | + |
- | + | ||
- | + | ||
- | Sequential Regression Method of Entry: | + | |
- | + | ||
- | **Block-wise selection** is a version of forward selection that is achieved in blocks or sets. The predictors are grouped into blocks based on psychometric consideration or theoretical reasons and __a stepwise selection is applied__. | + | |
- | + | ||
- | Essentially, | + | |
- | + | ||
- | Two criteria are used to achieve the best set of predictors; these include meaningfulness to the situation and statistical significance. | + |
- | </ | + | |
- | + | ||
- | ====== e.g. 1 ====== | + | |
- | {{: | + | |
- | + | ||
- | < | + | |
- | names(lbw) <- tolower(names(lbw)) | + | |
- | </ | + | |
- | + | ||
- | < | + | |
- | lbw <- within(lbw, { | + | |
- | ## race relabeling | + | |
- | race.cat <- factor(race, | + | |
- | + | ||
- | ## ftv (frequency of visit) relabeling | + | |
- | ftv.cat <- cut(ftv, breaks = c(-Inf, 0, 2, Inf), labels = c(" | + | |
- | ftv.cat <- relevel(ftv.cat, | + | |
- | + | ||
- | ## ptl | + | |
- | preterm <- factor(ptl >= 1, levels = c(F,T), labels = c(" | + | |
- | })</ | + | |
- | + | ||
- | < | + | |
- | lm.null <- lm(bwt | + | |
- | </code> | + | |
- | + | ||
- | < | + | |
- | < | + | |
- | lm(formula = bwt ~ age + lwt + race.cat + smoke + preterm + ht + | + | |
- | ui + ftv.cat, data = lbw) | + | |
- | + | ||
- | Residuals: | + | |
- | | + | |
- | -1896.38 | + | |
- | + | ||
- | Coefficients: | + | |
- | Estimate Std. Error t value Pr(> | + | |
- | (Intercept) | + | |
- | age | + | |
- | lwt 4.205 1.717 2.448 0.015316 * | + | |
- | race.catBlack -467.043 | + | |
- | race.catOther -323.144 | + | |
- | smoke | + | |
- | preterm1+ | + | |
- | ht -568.111 | + | |
- | ui -494.168 | + | |
- | ftv.catNone | + | |
- | ftv.catMany | + | |
- | --- | + | |
- | Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 | + |
- | + | ||
- | Residual standard error: 646.9 on 178 degrees of freedom | + | |
- | Multiple R-squared: | + | |
- | F-statistic: | + | |
- | + | ||
- | </ | + | |
- | + | ||
- | < | + | |
- | </ | + | |
- | + | ||
- | < | + | |
- | + | ||
- | Model: | + | |
- | bwt ~ age + lwt + race.cat + smoke + preterm + ht + ui + ftv.cat | + | |
- | Df Sum of Sq RSS AIC F value Pr(>F) | + | |
- | < | + | |
- | age | + | |
- | lwt | + | |
- | race.cat | + | |
- | smoke | + | |
- | preterm | + | |
- | ht 1 | + | |
- | ui 1 | + | |
- | ftv.cat | + | |
- | + | ||
- | < | + | |
- | age | + | |
- | lwt * | + | |
- | race.cat ** | + | |
- | smoke ** | + | |
- | preterm | + | |
- | ht ** | + | |
- | ui *** | + | |
- | ftv.cat | + | |
- | --- | + | |
- | Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 | + |
- | > | + | |
- | > | + | |
- | </ | + | |
- | + |
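The backward deletion procedure described in the old revision (enter all predictors, then remove non-contributing ones one at a time) can be sketched in R roughly as follows. This is only an illustration under stated assumptions, not the page's original code: the function name `backward_eliminate` and the cutoff `alpha_remove` are hypothetical, and the built-in `mtcars` data stands in for the `lbw` data, which is not available here. The per-term table it relies on has the same columns (Df, Sum of Sq, RSS, AIC, F value, Pr(>F)) as the single-term deletion output shown in the diff.

```r
# Sketch only: backward elimination driven by partial F-tests.
# `backward_eliminate` and `alpha_remove` are hypothetical names, and the
# built-in `mtcars` data is a stand-in for the page's `lbw` data.
backward_eliminate <- function(model, alpha_remove = 0.10) {
  repeat {
    # Stop if only the intercept is left.
    if (length(attr(terms(model), "term.labels")) == 0) break

    # Single-term deletions: one row per term, plus a "<none>" row.
    tests <- drop1(model, test = "F")
    pvals <- setNames(tests[["Pr(>F)"]], rownames(tests))
    pvals <- pvals[!is.na(pvals)]  # the "<none>" row has no p-value

    # Stop once every remaining term passes the removal cutoff.
    if (length(pvals) == 0 || max(pvals) <= alpha_remove) break

    # Drop the least significant term and refit.
    worst <- names(pvals)[which.max(pvals)]
    model <- update(model, as.formula(paste(". ~ . -", worst)))
  }
  model
}

full    <- lm(mpg ~ wt + hp + qsec + drat, data = mtcars)
reduced <- backward_eliminate(full)
print(formula(reduced))
```

R's built-in `step()` selects terms by AIC rather than by p-values, so its result may differ from this F-test version; the sketch follows the significance-based description quoted in the old revision instead.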
statistical_regression_methods.1510537089.txt.gz · Last modified: 2017/11/13 10:08 by hkimscil