c:ma:2019:multiple_regression_exercise
Class Activities
Ex. 1
- Install the ISLR package
- Use the Carseats dataset
- Build regression models with Sales as the DV and IVs of your choice
- Use the ?Carseats command for an explanation of the dataset
- Use the str function to see the characteristics of each variable. Make sure that the ShelveLoc variable is a factor, not int or anything else.
- Make a full model (with all variables), then reduce the model until you find one that fits well.
- Make a null model (with no variables), then build up the model with additional IVs until you find one that fits well.
- Can we use the step or stepAIC (MASS package needed) function?
- Interpret the result
step(lm.full, direction="backward")
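The Ex. 1 steps can be sketched roughly as follows. This is only one possible workflow, not the expected answer: the object name lm.full comes from the exercise, while lm.null, lm.back, lm.fwd and lm.aic are names assumed here for illustration. It also assumes the ISLR package is already installed.

```r
library(ISLR)   # provides the Carseats data (install.packages("ISLR") if missing)
library(MASS)   # provides stepAIC

data(Carseats)
str(Carseats)                    # ShelveLoc should show up as a Factor

# Full model (all variables) and null model (intercept only)
lm.full <- lm(Sales ~ ., data = Carseats)
lm.null <- lm(Sales ~ 1, data = Carseats)

# Backward elimination starting from the full model
lm.back <- step(lm.full, direction = "backward", trace = 0)

# Forward selection starting from the null model;
# the scope tells step() which terms it is allowed to add
lm.fwd <- step(lm.null,
               scope = list(lower = formula(lm.null),
                            upper = formula(lm.full)),
               direction = "forward", trace = 0)

# The same backward search with MASS::stepAIC
lm.aic <- stepAIC(lm.full, direction = "backward", trace = 0)

formula(lm.back)
formula(lm.fwd)
summary(lm.back)
```

Setting trace = 0 only suppresses the step-by-step log; remove it to see which variable is dropped or added at each step and how the AIC changes.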
Ex. 2
- Install the tidyverse package
- Load the tidyverse
install.packages("car")
data("Salaries", package = "carData")
- Use the Salaries dataset
- Describe the data set
----
- Regress salary on the sex variable
- Write the regression model
- Discuss the difference
- Use the rank variable for the same purpose
----
- Regress salary on yrs.service + rank + discipline + sex
- How do you interpret the result?
----
If the Salaries data set above cannot be loaded
- download the data into R from here: salaries.csv
- use read.csv to import the data set.
Salaries <- read.csv("http://commres.net/wiki/_media/salaries.csv")
- for information about Salaries (its help page may not be loaded), use ??Salaries to describe the data set.
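When the data come from a CSV file rather than from the package, the categorical columns may need attention. The sketch below is an assumption-laden illustration: it assumes the CSV has the same columns as the head(Salaries) output on this page, and it uses six rows copied from that output as a stand-in for the real file (the name sal is made up here). By default the levels of rank are sorted alphabetically, so AssocProf would become the reference category; reordering them makes AsstProf the baseline, as in the summaries shown on this page.

```r
# Six rows copied from the head(Salaries) output; a stand-in for the full CSV file
csv_text <- "rank,discipline,yrs.since.phd,yrs.service,sex,salary
Prof,B,19,18,Male,139750
Prof,B,20,16,Male,173200
AsstProf,B,4,3,Male,79750
Prof,B,45,39,Male,115000
Prof,B,40,41,Male,141500
AssocProf,B,6,6,Male,97000"

# stringsAsFactors = TRUE turns the text columns into factors on import
sal <- read.csv(text = csv_text, stringsAsFactors = TRUE)
levels(sal$rank)    # alphabetical order by default

# Put the ranks in their natural order so AsstProf is the reference level
sal$rank <- factor(sal$rank, levels = c("AsstProf", "AssocProf", "Prof"))
levels(sal$rank)

str(sal)            # check the type of each variable
```

The same two steps should apply to the data frame returned by read.csv on the real file, provided its column names match.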
----
Please copy and paste the relevant R commands and their output into a txt file (use Notepad or some other text editor). You may use MS Word, but please make sure that you use a monospaced font such as "Courier New." The output below, as an example, includes the R command head(Salaries) and its output.
> head(Salaries)
       rank discipline yrs.since.phd yrs.service  sex salary
1      Prof          B            19          18 Male 139750
2      Prof          B            20          16 Male 173200
3  AsstProf          B             4           3 Male  79750
4      Prof          B            45          39 Male 115000
5      Prof          B            40          41 Male 141500
6 AssocProf          B             6           6 Male  97000
> lm.sal.sex <- lm(salary ~ sex, data=Salaries)
> summary(lm.sal.sex)

Call:
lm(formula = salary ~ sex, data = Salaries)

Residuals:
   Min     1Q Median     3Q    Max
-57290 -23502  -6828  19710 116455

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   101002       4809  21.001  < 2e-16 ***
sexMale        14088       5065   2.782  0.00567 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 30030 on 395 degrees of freedom
Multiple R-squared:  0.01921,   Adjusted R-squared:  0.01673
F-statistic: 7.738 on 1 and 395 DF,  p-value: 0.005667
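As a hedged illustration of "write the regression model," the coefficients reported in the salary ~ sex output can be written as an equation with a dummy variable. The notation is an assumption made here; the numbers are taken from the output on this page.

```latex
\[
\widehat{\text{salary}} = 101002 + 14088 \cdot \text{sexMale},
\qquad
\text{sexMale} =
\begin{cases}
1 & \text{if Male} \\
0 & \text{if Female}
\end{cases}
\]
\[
\text{Female: } \widehat{\text{salary}} = 101002,
\qquad
\text{Male: } \widehat{\text{salary}} = 101002 + 14088 = 115090
\]
```

Read this way, the intercept is the predicted mean salary for the reference group (Female), and the sexMale coefficient is the difference between the two group means.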
> lm.sal.rank <- lm(salary ~ rank, data=Salaries)
> summary(lm.sal.rank)

Call:
lm(formula = salary ~ rank, data = Salaries)

Residuals:
   Min     1Q Median     3Q    Max
-68972 -16376  -1580  11755 104773

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept)      80776       2887  27.976  < 2e-16 ***
rankAssocProf    13100       4131   3.171  0.00164 **
rankProf         45996       3230  14.238  < 2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 23630 on 394 degrees of freedom
Multiple R-squared:  0.3943,    Adjusted R-squared:  0.3912
F-statistic: 128.2 on 2 and 394 DF,  p-value: < 2.2e-16
> lm.sal.many <- lm(salary ~ yrs.service + rank + discipline + sex, data=Salaries)
> summary(lm.sal.many)

Call:
lm(formula = salary ~ yrs.service + rank + discipline + sex, data = Salaries)

Residuals:
   Min     1Q Median     3Q    Max
-64202 -14255  -1533  10571  99163

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept)   68351.67    4482.20  15.250  < 2e-16 ***
yrs.service     -88.78     111.64  -0.795 0.426958
rankAssocProf 14560.40    4098.32   3.553 0.000428 ***
rankProf      49159.64    3834.49  12.820  < 2e-16 ***
disciplineB   13473.38    2315.50   5.819 1.24e-08 ***
sexMale        4771.25    3878.00   1.230 0.219311
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 22650 on 391 degrees of freedom
Multiple R-squared:  0.4478,    Adjusted R-squared:  0.4407
F-statistic: 63.41 on 5 and 391 DF,  p-value: < 2.2e-16
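One way to start interpreting the multiple regression output is to write it as an equation and plug in one case. The sketch below uses the coefficients from the summary(lm.sal.many) output and, as an assumed example, the first row of head(Salaries) (a male Prof in discipline B with 18 years of service). The dummy-variable names follow the coefficient labels in the output.

```latex
\[
\widehat{\text{salary}} = 68351.67
  - 88.78 \cdot \text{yrs.service}
  + 14560.40 \cdot \text{rankAssocProf}
  + 49159.64 \cdot \text{rankProf}
  + 13473.38 \cdot \text{disciplineB}
  + 4771.25 \cdot \text{sexMale}
\]
\[
\widehat{\text{salary}}_{\text{row 1}}
  = 68351.67 - 88.78(18) + 49159.64 + 13473.38 + 4771.25
  = 134157.90
\]
```

The observed salary in that row is 139750, so the residual is about 5592. Note also that the sexMale coefficient falls from 14088 in the salary ~ sex model to 4771.25 here and is no longer significant (p = 0.219) once rank, discipline and years of service are in the model.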
Discussion
Common topics
- What affects students' GPA? Or what determines students' GPA?
Group topics
Making Questionnaire
Submit your questions at the ajoubb.
Then we will list the questions in Google Docs / Google survey.
c/ma/2019/multiple_regression_exercise.1573177879.txt.gz · Last modified: 2019/11/08 10:51 by hkimscil