b:head_first_statistics:variability_and_spread
Differences
This shows you the differences between two versions of the page.
| b:head_first_statistics:variability_and_spread [2020/09/21 13:52] – hkimscil | b:head_first_statistics:variability_and_spread [2025/09/17 08:12] (current) – [Standard score] hkimscil | ||
|---|---|---|---|
| Line 59: | Line 59: | ||
| > | > | ||
| > | > | ||
| + | |||
| > sapply(data, | > sapply(data, | ||
| [1] 1.825742 1.563472 7.362065 | [1] 1.825742 1.563472 7.362065 | ||
| Line 87: | Line 88: | ||
| The problem of outliers (extreme values) | The problem of outliers (extreme values) | ||
| - | '' | + | < |
| - | b <- c(1, | + | a <- c(1, |
| + | b <- c(1, | ||
| + | </ | ||
| range(a) vs. range(b) | range(a) vs. range(b) | ||
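A minimal sketch of why the range is sensitive to outliers. The vectors `x` and `x_out` below are hypothetical illustration data, not the `a` and `b` defined on this page.

```r
# Hypothetical data: x_out is x with one extreme value appended.
x     <- c(1, 2, 3, 4, 5)
x_out <- c(x, 50)

range(x)            # 1 5
range(x_out)        # 1 50: one outlier stretches the upper end

diff(range(x))      # width of the range: 4
diff(range(x_out))  # width of the range: 49

# The median barely moves, which is why the range alone can mislead.
median(x)           # 3
median(x_out)       # 3.5
```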
| Line 112: | Line 115: | ||
| </ | </ | ||
| ====== Percentile ====== | ====== Percentile ====== | ||
| + | <WRAP info> | ||
| + | How to find percentile | ||
| + | - First of all, line all your values up in ascending order. | ||
| + | - To find the position of the kth percentile out of n numbers, start off by calculating $ k(\frac{n}{100}) $. | ||
| + | - If this gives you an integer, then your percentile is halfway between the value at position $ k(\frac{n}{100})$ and the next number along. Take the average of the numbers at these two positions to give you your percentile. | ||
| + | - If $ k(\frac{n}{100})$ is not an integer, then round it up. This then gives you the position of the percentile. | ||
| + | </ | ||
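The steps in the box can be sketched as one small function. `hf_percentile` is a hypothetical helper name introduced here for illustration (it is not a base R function), and the sketch assumes 0 < k < 100 so that "the next number along" always exists.

```r
# Sketch of the percentile rule described above.
# hf_percentile is a hypothetical name, not part of base R.
hf_percentile <- function(x, k) {
  stopifnot(k > 0, k < 100)
  x <- sort(x)                  # step 1: ascending order
  n <- length(x)
  pos <- k * (n / 100)          # step 2: position of the kth percentile
  if (isTRUE(all.equal(pos, round(pos)))) {
    pos <- round(pos)
    # step 3: integer position, so average this value and the next one
    mean(x[c(pos, pos + 1)])
  } else {
    # step 4: non-integer position, so round it up
    x[ceiling(pos)]
  }
}

hf_percentile(1:125, 10)  # position 12.5 rounds up to 13, giving 13
hf_percentile(1:10, 20)   # position 2 is an integer, giving (2 + 3) / 2 = 2.5
```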
| + | |||
| + | < | ||
| + | > k <- c(1:125) | ||
| + | > length(k) | ||
| + | [1] 125 | ||
| + | > k | ||
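| + | [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 | ||
| + | [21] 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 | ||
| + | [41] 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 | ||
| + | [61] 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 | ||
| + | [81] 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 | ||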
| + | [101] 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 | ||
| + | [121] 121 122 123 124 125 | ||
| + | > | ||
| + | </ | ||
| + | To find the 10th percentile: | ||
| + | 10 * (125 / 100) = 12.5 | ||
| + | Rounding this number up gives 13, so the 13th value is the 10th percentile (13). | ||
| + | < | ||
| + | > k <- c(1:125) | ||
| + | > # x th percentile | ||
| + | > k[ceiling(10 * (125 / 100))] | ||
| + | [1] 13 | ||
| + | > k[ceiling(.1 * length(k))] | ||
| + | [1] 13 | ||
| + | > quantile(k, .1) | ||
| + | 10% | ||
| + | 13.4 | ||
| + | > | ||
| + | > # 50th percentile | ||
| + | > k[ceiling(50*(length(k)/ | ||
| + | [1] 63 | ||
| + | > k[ceiling(.5*(length(k)))] | ||
| + | [1] 63 | ||
| + | > median(k) | ||
| + | [1] 63 | ||
| + | > quantile(k, .5) | ||
| + | 50% | ||
| + | 63 | ||
| + | > | ||
| + | > | ||
| + | </ | ||
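The session above shows `quantile(k, .1)` returning 13.4 rather than 13. A short sketch of why: by default `quantile()` uses `type = 7`, which interpolates linearly between neighbouring values, while `type = 2` averages at integer positions and otherwise takes the next value up, which appears to correspond to the round-up / average rule used on this page.

```r
k <- 1:125

# Default (type = 7): linear interpolation between order statistics.
quantile(k, 0.1)               # 13.4

# type = 2: average at discontinuities, otherwise the next order statistic.
quantile(k, 0.1, type = 2)     # 13
quantile(1:10, 0.2, type = 2)  # 2.5

# For the median of 1:125 both definitions agree.
quantile(k, 0.5)               # 63
quantile(k, 0.5, type = 2)     # 63
```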
| + | |||
| + | |||
| + | < | ||
| + | > k <- c(1:10) | ||
| + | > length(k) | ||
| + | [1] 10 | ||
| + | > k | ||
| + | [1] 1 2 3 4 5 6 7 8 9 10 | ||
| + | </ | ||
| + | |||
| + | To find the 20th percentile: | ||
| + | $ 20 * (10 / 100) = 2 $ is an integer, so | ||
| + | the percentile is the average of the 2nd and 3rd values, | ||
| + | |||
| + | ====== Boxplot ====== | ||
| + | < | ||
| + | # j <- c(6, | ||
| + | j <- c(7, | ||
| + | # m <- c(3, | ||
| + | m <- c(3, | ||
| + | |||
| + | median(j) | ||
| + | median(m) | ||
| + | </ | ||
| + | |||
| + | [{{hf.boxplot.ex.jpg}}] | ||
| + | |||
| + | |||
| + | < | ||
| + | boxplot(j) | ||
| + | boxplot(m) | ||
| + | </ | ||
| + | |||
| + | < | ||
| + | boxplot(j, m) | ||
| + | boxplot(j, m, horizontal = T) | ||
| + | </ | ||
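To see the numbers a boxplot is drawn from, `fivenum()` and `boxplot.stats()` can be inspected directly. The vector `scores` below is hypothetical illustration data, not the `j` or `m` defined above.

```r
# Hypothetical scores with one extreme value.
scores <- c(1:9, 30)

# Minimum, lower hinge, median, upper hinge, maximum.
fivenum(scores)       # 1 3 5.5 8 30

# boxplot.stats() uses the same hinges, but whiskers stop at the most
# extreme points within 1.5 * (hinge spread) of the box.
bs <- boxplot.stats(scores)
bs$stats              # 1 3 5.5 8 9
bs$out                # 30 is drawn as a separate outlier point
```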
| + | |||
| + | |||
| + | |||
| ====== Variance ====== | ====== Variance ====== | ||
| Line 122: | Line 214: | ||
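| * Sum of the squared errors (the errors made when guessing with the mean) | * Sum of the squared errors (the errors made when guessing with the mean) | ||
| * Sum of squares (of the errors) | * Sum of squares (of the errors) | ||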
| + | * $ \sum \text{random}^2 $ | ||
| + | * $ \sum \text{residual}^2 $ | ||
| + | \begin{eqnarray*} | ||
| + | \text{Individual score } X_{i} & = & \text{group (common) part} + \text{random part} \\ | ||
| + | & = & \text{group mean} + \text{random or residual} \\ | ||
| + | \end{eqnarray*} | ||
| + | |||
| + | We commonly refer to this as follows: | ||
| * sum of squares | * sum of squares | ||
| * Sum of Squares (SS) | * Sum of Squares (SS) | ||
| Line 130: | Line 230: | ||
| * calculation of variance (an easy way) see [[: | * calculation of variance (an easy way) see [[: | ||
| * $ \displaystyle \frac{\sum X_{i}^{2}}{N} - \mu^2 $ | * $ \displaystyle \frac{\sum X_{i}^{2}}{N} - \mu^2 $ | ||
| + | * [{{variance.cal.jpg? | ||
| + | |||
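A small numerical check that the definition (sum of squares divided by N) and the shortcut form $ \frac{\sum X_{i}^{2}}{N} - \mu^2 $ give the same population variance. The vector `x` is hypothetical data; note that R's built-in `var()` divides by N - 1, so it is rescaled here for comparison.

```r
x  <- c(2, 4, 4, 4, 5, 5, 7, 9)   # hypothetical scores
N  <- length(x)
mu <- mean(x)                     # 5

ss <- sum((x - mu)^2)             # sum of squares (SS): 32
var_def  <- ss / N                # definition: 4
var_easy <- sum(x^2) / N - mu^2   # shortcut: 232 / 8 - 25 = 4

# var() is the sample variance (divides by N - 1).
var_from_builtin <- var(x) * (N - 1) / N   # 4
```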
| [[:standard deviation]] | [[:standard deviation]] | ||
| + | ====== Standard score ====== | ||
| + | see | ||
| [[:standard score]] | [[:standard score]] | ||
| + | [[:z score]] | ||
| + | $ z = \large\frac {x-\mu}{\sigma} $ | ||
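A minimal sketch of the z formula above, using hypothetical data and the population standard deviation (dividing by N). Base R's `scale()` and `sd()` divide by N - 1 instead, so their results differ slightly for small samples.

```r
x     <- c(2, 4, 4, 4, 5, 5, 7, 9)   # hypothetical scores
mu    <- mean(x)                     # 5
sigma <- sqrt(mean((x - mu)^2))      # population SD: 2

z <- (x - mu) / sigma
z   # -1.5 -0.5 -0.5 -0.5 0.0 0.0 1.0 2.0

# Standard scores have mean 0 and (population) SD 1.
mean(z)            # 0
sqrt(mean(z^2))    # 1
```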
b/head_first_statistics/variability_and_spread.1600663964.txt.gz · Last modified: by hkimscil
