r:data_structures

Differences

This shows you the differences between two versions of the page.

r:data_structures [2018/11/20 08:25] – [Selecting data frame columns by position] hkimscil
r:data_structures [2019/09/19 18:16] (current) – [Creating a Factor (Categorical Variable)] hkimscil
Line 54: Line 54:
  
 ^ Object  ^ Example  ^ Mode  ^
-| Number  | 3.1415  | numeric  |
-| Vector of numbers  | c(2.7182, 3.1415)  | numeric  |
-| Character string  | "Moe"  | character  |
-| Vector of character strings  | c("Moe", "Larry", "Curly")  | character  |
-| Factor  | factor(c("NY", "CA", "IL"))  | numeric  |
-| List  | list("Moe", "Larry", "Curly")  | list  |
-| Data frame  | data.frame(x=1:3, y=c("NY", "CA", "IL"))  | list  |
-| Function  | print  | function  |
+| Number  | ''%%3.1415%%''  | numeric  |
+| Vector of numbers  | ''%%c(2.7182, 3.1415)%%''  | numeric  |
+| Character string  | ''%%"Moe"%%''  | character  |
+| Vector of character strings  | ''%%c("Moe", "Larry", "Curly")%%''  | character  |
+| Factor  | ''%%factor(c("NY", "CA", "IL"))%%''  | numeric  |
+| List  | ''%%list("Moe", "Larry", "Curly")%%''  | list  |
+| Data frame  | ''%%data.frame(x=1:3, y=c("NY", "CA", "IL"))%%''  | list  |
+| Function  | ''%%print%%''  | function  |

 ===== Class =====
Line 135: Line 135:
  
 Grouping: This is a technique for labeling or tagging your data items according to their group. See the Introduction to Chapter 6.
 +
 +<code>> A <- c(1,2,2,3,3,4,4,4,4,2,1,2,3,3)
 +> A
 + [1] 1 2 2 3 3 4 4 4 4 2 1 2 3 3
 +> str(A)
 + num [1:14] 1 2 2 3 3 4 4 4 4 2 ...
 +> fA <- factor(A)
 +> fA
 + [1] 1 2 2 3 3 4 4 4 4 2 1 2 3 3
 +Levels: 1 2 3 4
 +> str(fA)
 + Factor w/ 4 levels "1","2","3","4": 1 2 2 3 3 4 4 4 4 2 ...
 +
 +</code>
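As a hedged aside on the factor example above: a factor stores integer codes together with a set of levels, which is what makes grouping and counting straightforward. The sketch below reuses the same A vector. The calls to levels(), table(), and as.character() are base R; the variable name back is introduced here only for illustration.

<code>
# Same data as in the example above
A  <- c(1,2,2,3,3,4,4,4,4,2,1,2,3,3)
fA <- factor(A)

# The distinct values become the levels (stored as character labels)
levels(fA)

# Counting items per group is a one-liner once the data is a factor
table(fA)

# To get the original numbers back, convert the labels, not the codes:
# as.numeric(fA) would return the internal integer codes instead.
back <- as.numeric(as.character(fA))
back
</code>

In this particular case the levels happen to be 1 to 4, so the internal codes and the labels coincide. With labels such as 10, 20, 30 they would differ, which is why the conversion goes through as.character().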
  
 ===== Data Frames =====
Line 227: Line 241:
 [1] 11 12 13 14 15 16
 </code>
-
-<WRAP box help>The code above is very useful, but the recycling rule is sometimes annoying. How can it be avoided?
-</WRAP>

 ====== Creating a Factor (Categorical Variable) ======
Line 252: Line 263:
  
 <code>> f <- factor(wday, c("Mon","Tue","Wed","Thu","Fri")) # the second argument, c(...), gives the levels, not the data
-> f
+> f  # note that no Fri value appears in the output below, although Fri is a level
  [1] Wed Thu Mon Wed Thu Thu Thu Tue Thu Tue
 Levels: Mon Tue Wed Thu Fri
Line 819: Line 830:
 ====== Selecting data frame columns by position ======
 <code csv suburbs.csv>
-                city   county state     pop
-1            Chicago     Cook    IL 2853114
-2            Kenosha  Kenosha    WI   90352
-3             Aurora     Kane    IL  171782
-4              Elgin     Kane    IL   94487
-5               Gary Lake(IN)    IN  102746
-6             Joliet  Kendall    IL  106221
-7         Naperville   DuPage    IL  147779
-8  Arlington Heights     Cook    IL   76031
-9        Bolingbrook     Will    IL   70834
-10            Cicero     Cook    IL   72616
-11          Evanston     Cook    IL   74239
-12           Hammond Lake(IN)    IN   83048
-13          Palatine     Cook    IL   67232
-14        Schaumburg     Cook    IL   75386
-15            Skokie     Cook    IL   63348
-16          Waukegan Lake(IL)    IL   91452
+city county state pop
+Chicago Cook IL 2853114
+Kenosha Kenosha WI 90352
+Aurora Kane IL 171782
+Elgin Kane IL 94487
+Gary Lake(IN) IN 102746
+Joliet Kendall IL 106221
+Naperville DuPage IL 147779
+Arlington Heights Cook IL 76031
+Bolingbrook Will IL 70834
+Cicero Cook IL 72616
+Evanston Cook IL 74239
+Hammond Lake(IN) IN 83048
+Palatine Cook IL 67232
+Schaumburg Cook IL 75386
+Skokie Cook IL 63348
+Waukegan Lake(IL) IL 91452
 </code>
-or {{:r:suburbs.csv|download file}}
+<code>suburbs <- read.csv("http://commres.net/wiki/_export/code/r/data_structures?codeblock=96", header=TRUE, sep=" ")</code>
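One caveat when a space is used as the field separator: a city name that itself contains a space, such as Arlington Heights, splits into more fields than the header has. The sketch below shows the field count and one possible workaround, assuming the data can be re-saved with commas as separators. The comma-separated excerpt and the names line, csv_text, and suburbs_demo are illustrative and not part of the wiki page.

<code>
# Splitting the Arlington Heights row on single spaces yields 5 fields, not 4
line <- "Arlington Heights Cook IL 76031"
length(strsplit(line, " ", fixed = TRUE)[[1]])

# Hypothetical comma-separated excerpt of the same table (an assumption:
# the wiki's code block itself is space-separated)
csv_text <- "city,county,state,pop
Chicago,Cook,IL,2853114
Arlington Heights,Cook,IL,76031
Gary,Lake(IN),IN,102746"

# With commas as the separator, the embedded space stays inside the city field
suburbs_demo <- read.csv(text = csv_text, stringsAsFactors = FALSE)
suburbs_demo$city
ncol(suburbs_demo)
</code>

Quoting the multi-word names in the space-separated file would be another option; the comma-separated form is used here only because it needs no quoting.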
  
 <code>> suburbs[[1]]
Line 866: Line 877:
 </code>

-<code>> suburbs[c(1,3)]
+<code>> suburbs[c(1,4)]
                 city     pop
 1            Chicago 2853114
Line 994: Line 1005:
 # then, close the edit window
 </code>
 +
 +<WRAP box help>Can you save it as "mat.csv" and then read it back into the R workspace?
 +
 +When you read the CSV file back, how would you avoid output like the one below, that is, how would you avoid the X column?
 +<code>  X before treatment  after
 +1 1 -0.818    -0.946 -0.611
 +2 2 -0.667    -0.205 -2.155
 +3 3 -0.494     0.385 -0.535
 +4 4 -0.819     1.531 -0.316</code>
 +
 +Or, alternatively, how would you save the CSV file without the X column in the first place?
 +</WRAP>
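One possible answer to the question in the box, as a minimal sketch: the X column typically appears because write.csv() stores row names as an unnamed first column, which read.csv() then labels X. The data frame below is a hypothetical stand-in for mat (the first four rows are copied from the output shown above), and the temporary file path is an assumption made so the example does not write into the working directory.

<code>
# Hypothetical stand-in for the edited object; values taken from the output above
mat <- data.frame(before    = c(-0.818, -0.667, -0.494, -0.819),
                  treatment = c(-0.946, -0.205,  0.385,  1.531),
                  after     = c(-0.611, -2.155, -0.535, -0.316))

path <- file.path(tempdir(), "mat.csv")   # temporary location (assumption)

# Option 1: do not write the row names, so no extra column is created
write.csv(mat, path, row.names = FALSE)
back1 <- read.csv(path)
names(back1)

# Option 2: keep the row names in the file, but tell read.csv()
# that column 1 holds row names rather than data
write.csv(mat, path)
back2 <- read.csv(path, row.names = 1)
names(back2)
</code>

Option 1 is usually the simpler choice when the row names are just 1, 2, 3, ...; Option 2 is worth considering when the row names carry information.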
  
 ====== Removing NAs from a Data Frame ======
r/data_structures.1542669940.txt.gz · Last modified: 2018/11/20 08:25 by hkimscil
