b:head_first_statistics:variability_and_spread
Differences
This shows you the differences between two versions of the page.
b:head_first_statistics:variability_and_spread [2020/09/21 13:52] – hkimscil | b:head_first_statistics:variability_and_spread [2023/09/13 08:59] (current) – [Variability and Spread] hkimscil | ||
---|---|---|---|
Line 59: | Line 59: | ||
> | > | ||
> | > | ||
+ | |||
> sapply(data, | > sapply(data, | ||
[1] 1.825742 1.563472 7.362065 | [1] 1.825742 1.563472 7.362065 | ||
Line 87: | Line 88: | ||
The problem of outliers (extreme values) | The problem of outliers (extreme values) | ||
- | '' | + | < |
- | b <- c(1, | + | a <- c(1, |
+ | b <- c(1, | ||
+ | </ | ||
range(a) vs. range(b) | range(a) vs. range(b) | ||
Line 112: | Line 115: | ||
</ | </ | ||
====== Percentile ====== | ====== Percentile ====== | ||
+ | <WRAP info> | ||
+ | How to find percentile | ||
+ | - First of all, line all your values up in ascending order. | ||
+ | - To find the position of the kth percentile out of n numbers, start off by calculating $ k(\frac{n}{100})$. | ||
+ | - If this gives you an integer, then your percentile is halfway between the value at position $ k(\frac{n}{100})$ and the next number along. Take the average of the numbers at these two positions to give you your percentile. | ||
+ | - If $ k(\frac{n}{100})$ is not an integer, then round it up. This then gives you the position of the percentile. | ||
+ | </ | ||
+ | |||
+ | < | ||
+ | > k <- c(1:125) | ||
+ | > length(k) | ||
+ | [1] 125 | ||
+ | > k | ||
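+ |   [1]   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18  19  20 | ||
+ |  [21]  21  22  23  24  25  26  27  28  29  30  31  32  33  34  35  36  37  38  39  40 | ||
+ |  [41]  41  42  43  44  45  46  47  48  49  50  51  52  53  54  55  56  57  58  59  60 | ||
+ |  [61]  61  62  63  64  65  66  67  68  69  70  71  72  73  74  75  76  77  78  79  80 | ||
+ |  [81]  81  82  83  84  85  86  87  88  89  90  91  92  93  94  95  96  97  98  99 100 | ||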
+ | [101] 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 | ||
+ | [121] 121 122 123 124 125 | ||
+ | > | ||
+ | </ | ||
+ | To find the 10th percentile: | ||
+ | 10 * ( 125 / 100) = 12.5 | ||
+ | This is not an integer, so round it up to 13: the 13th value is the 10th percentile (13). | ||
+ | |||
+ | < | ||
+ | > k <- c(1:10) | ||
+ | > length(k) | ||
+ | [1] 10 | ||
+ | > k | ||
+ |  [1]  1  2  3  4  5  6  7  8  9 10 | ||
+ | </ | ||
+ | |||
+ | To find the 20th percentile: | ||
+ | $ 20 * (10 /100) = 2 $, which is an integer, | ||
+ | so the percentile is the average of the values at the 2nd and 3rd positions, | ||
+ | |||
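The positional rule and the two worked examples above can be sketched in R as follows. This is only a minimal illustration, not the page author's code: the function name `hf_percentile` is hypothetical, and the position is computed as `k * n / 100` (algebraically the same as $ k(\frac{n}{100})$) to keep the integer check free of floating-point noise.

```r
# Hypothetical helper implementing the rule described in the text:
#   position = k * (n / 100)
#   integer position     -> average of the value there and the next one
#   non-integer position -> round up and take the value at that position
hf_percentile <- function(x, k) {
  stopifnot(k > 0, k < 100)      # assumption: only percentiles strictly between 0 and 100
  x <- sort(x)                   # line the values up in ascending order
  n <- length(x)
  pos <- k * n / 100             # same quantity as k * (n / 100)
  if (pos == floor(pos)) {
    mean(x[c(pos, pos + 1)])     # halfway between position pos and the next
  } else {
    x[ceiling(pos)]              # round the position up
  }
}

p10 <- hf_percentile(1:125, 10)  # position 12.5, rounded up to 13
p20 <- hf_percentile(1:10, 20)   # position 2, average of 2nd and 3rd values
print(p10)
print(p20)
```

Note that R's built-in `quantile()` uses a different interpolation rule by default, so its results may not match this positional rule.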
+ | ====== Boxplot ====== | ||
+ | < | ||
+ | # j <- c(6, | ||
+ | j <- c(7, | ||
+ | # m <- c(3, | ||
+ | m <- c(3, | ||
+ | |||
+ | median(j) | ||
+ | median(m) | ||
+ | </ | ||
+ | |||
+ | [{{hf.boxplot.ex.jpg}}] | ||
+ | |||
+ | |||
+ | < | ||
+ | boxplot(j) | ||
+ | boxplot(m) | ||
+ | </ | ||
+ | |||
+ | < | ||
+ | boxplot(j, m) | ||
+ | boxplot(j, m, horizontal = TRUE) | ||
+ | </ | ||
+ | |||
+ | |||
+ | |||
====== Variance ====== | ====== Variance ====== | ||
Line 130: | Line 197: | ||
* calculation of variance (an easy way) see [[: | * calculation of variance (an easy way) see [[: | ||
    * $ \displaystyle \frac{\sum(X_{i}^2)}{N} - \mu^2$ | * $ \displaystyle \frac{\sum(X_{i}^2)}{N} - \mu^2$ | ||
+ | * [{{variance.cal.jpg? | ||
+ | |||
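As a quick numerical check of the shortcut calculation, the sketch below compares the definitional population variance, $ \frac{\sum(X_{i}-\mu)^2}{N} $, with the shortcut read as $ \frac{\sum X_{i}^2}{N} - \mu^2 $. The data vector `x` is made up for illustration and is not taken from the page.

```r
# Hypothetical data, used only to illustrate the calculation
x  <- c(1, 2, 3, 4, 5)
mu <- mean(x)

# Definition: mean squared distance from the mean (population variance)
var_def <- mean((x - mu)^2)

# Shortcut: mean of the squares minus the square of the mean
var_short <- sum(x^2) / length(x) - mu^2

print(var_def)
print(var_short)
```

R's built-in `var()` divides by n - 1 (sample variance), so it would return a slightly larger value than either calculation here.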
[[:standard deviation]] | [[:standard deviation]] | ||
+ | ====== Standard score ====== | ||
[[:standard score]] | [[:standard score]] | ||
+ | $ z = \large\frac {x-\mu}{\sigma} $ |
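A minimal sketch of the standard score formula in R, assuming $\mu$ and $\sigma$ are the population mean and population standard deviation. The data vector `x` is hypothetical.

```r
# Hypothetical data, used only to illustrate the formula
x     <- c(1, 2, 3, 4, 5)
mu    <- mean(x)
sigma <- sqrt(mean((x - mu)^2))   # population standard deviation (divides by N)

# z = (x - mu) / sigma, applied to every value at once
z <- (x - mu) / sigma
print(z)
```

After standardising, the scores have mean 0 and population standard deviation 1. R's `scale()` performs a similar transformation but uses the sample standard deviation (n - 1), so its output would differ slightly.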
b/head_first_statistics/variability_and_spread.1600663964.txt.gz · Last modified: 2020/09/21 13:52 by hkimscil