c:ma:2019:multiple_regression_exercise
Class Activities
Ex. 1
- Install the ISLR package
- Use the Carseats dataset
- Build regression models with Sales as the DV and IVs of your choice
- Use the ?Carseats command for an explanation of the dataset
- Use the str function to see the characteristics of each variable. Make sure that the ShelveLoc variable is a factor, not int or anything else.
- Make a full model (with all variables), then reduce the model until you find one that fits well.
- Make a null model (with no variables), then build up the model with additional IVs until you find one that fits well.
- Can we use the step or stepAIC (MASS package needed) function?
- Interpret the result
step(lm.full, direction = "backward")
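The steps above can be sketched as follows. This is only one possible path through the exercise, assuming the ISLR package is already installed. The object names lm.null, lm.back, and lm.fwd are illustrative and not prescribed by the exercise.

```r
# Assumes install.packages("ISLR") has already been run.
library(ISLR)

data("Carseats", package = "ISLR")
str(Carseats)   # check the type of every variable

# If ShelveLoc were not a factor, this converts it explicitly.
Carseats$ShelveLoc <- factor(Carseats$ShelveLoc)

# Full model: Sales regressed on every other variable.
lm.full <- lm(Sales ~ ., data = Carseats)
# Null model: intercept only.
lm.null <- lm(Sales ~ 1, data = Carseats)

# Backward elimination from the full model (AIC-based).
lm.back <- step(lm.full, direction = "backward")

# Forward selection from the null model, up to the terms of the full model.
lm.fwd <- step(lm.null,
               scope = list(lower = ~ 1, upper = formula(lm.full)),
               direction = "forward")

summary(lm.back)
summary(lm.fwd)

# MASS::stepAIC takes the same arguments, if MASS is installed:
# lm.back2 <- MASS::stepAIC(lm.full, direction = "backward")
```

Both step and stepAIC select on AIC, so the model they stop at is not necessarily the one you would pick by looking at p-values alone.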
Ex. 2
- Install the tidyverse package
- Load the tidyverse
install.packages("car")
data("Salaries", package = "carData")  # Salaries ships in carData, which is installed together with car
- Use the Salaries dataset
- Describe the data set
----
- Regress salary on the sex variable
- Write the regression model
- Discuss the difference
- Use the rank variable for the same purpose
----
- Regress salary on yrs.service + rank + discipline + sex
- How do you interpret the result?
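The regressions listed above might be fitted as below. This is a sketch, assuming the carData package (installed together with car) is available. Apart from lm.sal.sex, the object names are illustrative.

```r
# Assumes the carData package is installed (it comes with car).
data("Salaries", package = "carData")

# Simple regressions: one categorical predictor at a time.
lm.sal.sex  <- lm(salary ~ sex,  data = Salaries)
lm.sal.rank <- lm(salary ~ rank, data = Salaries)

# Multiple regression with all four predictors.
lm.sal.all <- lm(salary ~ yrs.service + rank + discipline + sex,
                 data = Salaries)

summary(lm.sal.sex)
summary(lm.sal.rank)
summary(lm.sal.all)

# Each factor enters the model as dummy variables; contrasts() shows
# the coding and which level serves as the reference category.
contrasts(Salaries$sex)
contrasts(Salaries$rank)
```

Comparing the sex coefficient in lm.sal.sex with the one in lm.sal.all is one way to approach the interpretation question, since the second model holds rank, discipline, and years of service constant.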
----
If the Salaries data above cannot be loaded:
- download salaries.csv (linked here) into R
- use read.csv to import the data set.
Salaries <- read.csv("http://commres.net/wiki/_media/salaries.csv")
- For information about Salaries (the package may not be loaded), use ??Salaries to look up the description of the data set.
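The two ways of getting the data can be combined in one short script. This is a sketch that assumes the CSV file has the same columns as the packaged data set, which is not verified here. The stringsAsFactors = TRUE argument is added because read.csv in R 4.0 and later leaves text columns as character by default.

```r
# Try the packaged copy first; fall back to the CSV from the course wiki.
if (requireNamespace("carData", quietly = TRUE)) {
  data("Salaries", package = "carData")
} else {
  # Assumption: the CSV mirrors the packaged data set.
  Salaries <- read.csv("http://commres.net/wiki/_media/salaries.csv",
                       stringsAsFactors = TRUE)  # keep rank, discipline, sex as factors
}

str(Salaries)       # variable types
summary(Salaries)   # quick description of each variable
```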
----
Please copy and paste the R commands and their output into a txt file (use Notepad or another text editor). You could use MS Word, but please make sure to use a monospaced font such as "Courier New." The output below, as an example, includes the R command head(Salaries) and its output.
> head(Salaries)
       rank discipline yrs.since.phd yrs.service  sex salary
1      Prof          B            19          18 Male 139750
2      Prof          B            20          16 Male 173200
3  AsstProf          B             4           3 Male  79750
4      Prof          B            45          39 Male 115000
5      Prof          B            40          41 Male 141500
6 AssocProf          B             6           6 Male  97000

> lm.sal.sex <- lm(salary ~ sex, data=Salaries)
> summary(lm.sal.sex)

Call:
lm(formula = salary ~ sex, data = Salaries)

Residuals:
   Min     1Q Median     3Q    Max
-57290 -23502  -6828  19710 116455

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   101002       4809  21.001  < 2e-16 ***
sexMale        14088       5065   2.782  0.00567 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 30030 on 395 degrees of freedom
Multiple R-squared:  0.01921,	Adjusted R-squared:  0.01673
F-statistic: 7.738 on 1 and 395 DF,  p-value: 0.005667
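As a reading aid for the output above, the fitted model can be written as salary = 101002 + 14088 * sexMale, where sexMale is 1 for Male and 0 for Female. The short sketch below plugs in the two possible values. The coefficients are copied from the output, and the variable names b0, b1, pred.female, and pred.male are illustrative.

```r
# Coefficients copied from the summary(lm.sal.sex) output above.
b0 <- 101002   # (Intercept): predicted salary when sexMale = 0 (Female)
b1 <- 14088    # sexMale: difference between the Male and Female predictions

pred.female <- b0 + b1 * 0
pred.male   <- b0 + b1 * 1

pred.female   # 101002
pred.male     # 115090
```

With a single dummy predictor, these two predictions are the group means, so the sexMale coefficient is the difference in mean salary between the two groups. The R-squared of 0.01921 indicates that sex alone accounts for about 1.9% of the variance in salary.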
Discussion
Common topics
- What affects students' GPA? Or what determines students' GPA?
Group topics
Making Questionnaire
Submit your questions at ajoubb.
We will then list the questions in Google Docs and a Google survey.
c/ma/2019/multiple_regression_exercise.1573177353.txt.gz · Last modified: 2019/11/08 10:42 by hkimscil