r:data_structures
Differences
This shows you the differences between two versions of the page.
r:data_structures [2018/11/20 08:25] – [Selecting data frame columns by position] hkimscil | r:data_structures [2019/09/19 18:16] (current) – [Creating a Factor (Categorical Variable)] hkimscil | ||
---|---|---|---|
Line 54: | Line 54: | ||
^ Object | ^ Object | ||
- | | Number | + | | Number |
- | | Vector of numbers | + | | Vector of numbers |
- | | Character string | + | | Character string |
- | | Vector of character strings | + | | Vector of character strings |
- | | Factor | + | | Factor |
- | | List | list(" | + | | List | '' |
- | | Data frame | data.frame(x=1: | + | | Data frame | '' |
- | | Function | + | | Function |
===== Class ===== | ===== Class ===== | ||
Line 135: | Line 135: | ||
Grouping: This is a technique for labeling or tagging your data items according to their group. See the Introduction to Chapter 6. | Grouping: This is a technique for labeling or tagging your data items according to their group. See the Introduction to Chapter 6. | ||
+ | |||
+ | < | ||
+ | > A | ||
+ | [1] 1 2 2 3 3 4 4 4 4 2 1 2 3 3 | ||
+ | > str(A) | ||
+ | num [1:14] 1 2 2 3 3 4 4 4 4 2 ... | ||
+ | > fA <- factor(A) | ||
+ | > fA | ||
+ | [1] 1 2 2 3 3 4 4 4 4 2 1 2 3 3 | ||
+ | Levels: 1 2 3 4 | ||
+ | > str(fA) | ||
+ | | ||
+ | > | ||
+ | </ | ||
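The transcript above can be reproduced with a few lines of base R. This is a minimal sketch: the values of `A` are copied from the printed vector, and `counts` and `back` are names introduced here only for illustration.

```r
# Values copied from the printed vector A above
A <- c(1, 2, 2, 3, 3, 4, 4, 4, 4, 2, 1, 2, 3, 3)

fA <- factor(A)       # levels become the sorted unique values of A
levels(fA)            # the level labels, stored as character strings
nlevels(fA)           # number of distinct levels

counts <- table(fA)   # how many items fall in each level

# A factor stores integer codes, not the original numbers.
# To recover the numbers, convert through character first.
back <- as.numeric(as.character(fA))
```

Calling `as.numeric(fA)` directly would return the internal level codes, which only happen to match the original values here because the levels are 1 through 4.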
===== Data Frames ===== | ===== Data Frames ===== | ||
Line 227: | Line 241: | ||
[1] 11 12 13 14 15 16 | [1] 11 12 13 14 15 16 | ||
</ | </ | ||
- | |||
- | <WRAP box help>The above code is very useful. But, sometimes the recycling rule is very annoying. How would I avoid it? | ||
- | </ | ||
====== Creating a Factor (Categorical Variable) ====== | ====== Creating a Factor (Categorical Variable) ====== | ||
Line 252: | Line 263: | ||
< | < | ||
- | > f | + | > f # note that Fri is a level but does not occur in the values below. | ||
[1] Wed Thu Mon Wed Thu Thu Thu Tue Thu Tue | [1] Wed Thu Mon Wed Thu Thu Thu Tue Thu Tue | ||
Levels: Mon Tue Wed Thu Fri | Levels: Mon Tue Wed Thu Fri | ||
Line 819: | Line 830: | ||
====== Selecting data frame columns by position ====== | ====== Selecting data frame columns by position ====== | ||
<code csv suburbs.csv> | <code csv suburbs.csv> | ||
- | | + | city county state pop |
- | 1 | + | Chicago Cook IL 2853114 |
- | 2 | + | Kenosha Kenosha WI 90352 |
- | 3 Aurora | + | Aurora Kane IL 171782 |
- | 4 | + | Elgin Kane IL 94487 |
- | 5 Gary Lake(IN) | + | Gary Lake(IN) IN 102746 |
- | 6 Joliet | + | Joliet Kendall IL 106221 |
- | 7 Naperville | + | Naperville DuPage IL 147779 |
- | 8 | + | Arlington Heights Cook IL 76031 |
- | 9 | + | Bolingbrook Will IL 70834 |
- | 10 | + | Cicero Cook IL 72616 |
- | 11 | + | Evanston Cook IL 74239 |
- | 12 Hammond Lake(IN) | + | Hammond Lake(IN) IN 83048 |
- | 13 | + | Palatine Cook IL 67232 |
- | 14 | + | Schaumburg Cook IL 75386 |
- | 15 | + | Skokie Cook IL 63348 |
- | 16 | + | Waukegan Lake(IL) IL 91452 |
</ | </ | ||
- | or {{:r:suburbs.csv|download file}} | + | < |
< | < | ||
Line 866: | Line 877: | ||
</ | </ | ||
- | < | + | < |
city pop | city pop | ||
1 Chicago 2853114 | 1 Chicago 2853114 | ||
Line 994: | Line 1005: | ||
# then, close the edit window | # then, close the edit window | ||
</ | </ | ||
+ | |||
+ | <WRAP box help>Can you save it as " | ||
+ | |||
+ | When you read the csv file back in, how would you avoid output like the one below, that is, how would you avoid the X column? | ||
+ | < | ||
+ | 1 1 -0.818 | ||
+ | 2 2 -0.667 | ||
+ | 3 3 -0.494 | ||
+ | 4 4 -0.819 | ||
+ | |||
+ | Or, better still, how would I save the csv file without the X column?  | ||
+ | </ | ||
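One possible answer to the question in the box, as a sketch rather than the page author's own solution: `write.csv` writes row names by default, and `read.csv` then reads that unnamed first column back as `X`. The data frame `df`, its column name `value`, and the temporary file path are assumptions made for this example; the numbers mirror the output shown in the box.

```r
# Hypothetical one-column data frame; the column name is an assumption
df <- data.frame(value = c(-0.818, -0.667, -0.494, -0.819))
path <- tempfile(fileext = ".csv")   # temporary file used only for this sketch

# Default behaviour: row names are written as an unnamed first column,
# which read.csv then names "X"
write.csv(df, path)
with_x <- read.csv(path)
names(with_x)

# Option 1: do not write the row names in the first place
write.csv(df, path, row.names = FALSE)
without_x <- read.csv(path)
names(without_x)

# Option 2: the file already contains row names, so tell read.csv
# to use the first column as row names instead of as data
write.csv(df, path)
fixed <- read.csv(path, row.names = 1)
names(fixed)
```

Option 1 is usually the simpler choice when you control how the file is saved; option 2 helps when the file was written by someone else.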
====== Removing NAs from a Data Frame ====== | ====== Removing NAs from a Data Frame ====== |
r/data_structures.1542669940.txt.gz · Last modified: 2018/11/20 08:25 by hkimscil