binomial_distribution
Differences
This shows you the differences between two versions of the page.
| binomial_distribution [2019/11/04 15:13] – hkimscil | binomial_distribution [2025/10/11 08:26] (current) – [e.g.,] hkimscil | ||
|---|---|---|---|
| Line 1: | Line 1: | ||
| - | ====== Binomial | + | ====== Binomial |
| + | |||
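| + | - If the probability that a particular event A occurs in a single trial is p, | ||
| + | - then the probability distribution of the number of times A occurs in n (independent) trials | ||
| + | - is called the **binomial probability distribution**. | ||
| + | In the example below: | ||
| + | * The probability of getting each question right is 1/4, and the probability of getting it wrong is 3/4. | ||
| + | * With three questions (three trials), it is the probability distribution of the number of questions answered correctly. | ||
| + | * In the geometric distribution, by contrast, the count of right and wrong answers is not tracked; the trials keep failing until the first success, which ends the experiment. | ||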
| + | |||
| + | {{: | ||
| + | {{: | ||
| + | |||
| + | | x | P(X=x) | Power of 0.75 | Power of 0.25 | | ||
| + | | 0 | 0.75 * 0.75 * 0.75 | 3 | 0 | | ||
| + | | 1 | 3 * (0.75 * 0.75 * 0.25) | 2 | 1 | | ||
| + | | 2 | 3 * (0.75 * 0.25 * 0.25) | 1 | 2 | | ||
| + | | 3 | 0.25 * 0.25 * 0.25 | 0 | 3 | | ||
| + | {{: | ||
| + | |||
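| + | Evaluating each row of the table gives the full distribution for $n = 3$, $p = 0.25$. As a quick check, the four probabilities add up to 1: | ||
| + | \begin{eqnarray*} | ||
| + | P(X = 0) & = & 0.75^{3} = 0.421875 \\ | ||
| + | P(X = 1) & = & 3 \cdot 0.75^{2} \cdot 0.25 = 0.421875 \\ | ||
| + | P(X = 2) & = & 3 \cdot 0.75 \cdot 0.25^{2} = 0.140625 \\ | ||
| + | P(X = 3) & = & 0.25^{3} = 0.015625 \\ | ||
| + | \sum_{x=0}^{3} P(X = x) & = & 0.421875 + 0.421875 + 0.140625 + 0.015625 = 1 | ||
| + | \end{eqnarray*} | ||
| + | |||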
| + | $$P(X = r) = {\huge\text{?}} \cdot 0.25^{r} \cdot 0.75^{3-r}$$ | ||
| + | $$P(X = r) = {\huge_{3}C_{r}} \cdot 0.25^{r} \cdot 0.75^{3-r}$$ | ||
| + | |||
| + | Since $_{n}C_{r}$ is the number of ways to choose r items from n (without regard to order), there are $_{3}C_{1} = 3$ ways to get exactly one of the three questions right. | ||
| + | |||
| + | |||
| + | |||
| + | Probability of getting exactly one question right: | ||
| + | \begin{eqnarray*} | ||
| + | P(X = 1) & = & _{3}C_{1} \cdot 0.25^{1} \cdot 0.75^{3-1} \\ | ||
| + | & = & \frac{3!}{1! \cdot (3-1)!} \cdot 0.25 \cdot 0.75^2 \\ | ||
| + | & = & 3 \cdot 0.25 \cdot 0.5625 \\ | ||
| + | & = & 0.421875 | ||
| + | \end{eqnarray*} | ||
| + | |||
| + | $$P(X = r) = _{n}C_{r} \cdot 0.25^{r} \cdot 0.75^{n-r}$$ | ||
| + | $$P(X = r) = _{n}C_{r} \cdot p^{r} \cdot q^{n-r}$$ | ||
| + | |||
| + | - You're running a series of n independent trials. | ||
| + | - Each trial ends in either a success or a failure, and the probability of success is the same for each trial. | ||
| + | - There is a finite number of trials (n). Note that this differs from the geometric distribution, which has no fixed number of trials. | ||
| + | |||
| + | If X is the number of successful outcomes in n trials, the probability of exactly r successes is given by the formula below. | ||
| + | |||
| + | \begin{eqnarray*} | ||
| + | P(X = r) & = & _{n}C_{r} \cdot p^{r} \cdot q^{n-r} \;\;\; \text{where} \\ | ||
| + | _{n}C_{r} & = & \frac {n!}{r!(n-r)!} | ||
| + | \end{eqnarray*} | ||
| + | |||
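| + | p = probability of success in each trial (q = 1 - p is the probability of failure) | ||
| + | n = number of trials | ||
| + | r = number of successes (e.g., correct answers) | ||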
| + | |||
| + | $$X \sim B(n,p)$$ | ||
| + | |||
| \begin{eqnarray*} | \begin{eqnarray*} | ||
| - | {n \choose | + | P(X=2) & = & {{3} \choose {2}} \cdot \left(\frac{1}{6}\right)^{2} \cdot \left(\frac{5}{6}\right)^{1} \\ |
| + | & = & 0.0694 | ||
| \end{eqnarray*} | \end{eqnarray*} | ||
| - | **The number of successes in n independent Bernoulli trials has a binomial distribution.** | + | < |
| + | > dbinom(2, 3, 1/6) | ||
| + | [1] 0.06944444 | ||
| + | > | ||
| + | </ | ||
| - | n independent Bernoulli trials | + | ====== Expectation and Variance ====== |
| - | * There are n independent trials | + | Toss a fair coin once. What is the distribution of the number of heads? |
| - | * Each trial can result in one of two possible outcomes, labelled | + | * A single trial |
| - | * success can be a bad thing -- tire blow-up. | + | * The trial can result in one of two possible outcomes (success or failure) |
| - | * P(success) = p, | + | * P(success) = p |
| * P(failure) = 1-p | * P(failure) = 1-p | ||
| + | |||
| + | X = 0, 1 (failure and success) | ||
| + | $P(X=x) = p^{x}(1-p)^{1-x}$ or | ||
| + | $P(x) = p^{x}(1-p)^{1-x}$ | ||
| + | |||
| + | Note. | ||
| + | | x | 0 | 1 | | ||
| + | | p(x) | q = (1-p) | p | | ||
| + | |||
| + | When x = 0 (failure), $P(X = 0) = p^{0}(1-p)^{1-0} = (1-p)$ = Probability of failure | ||
| + | When x = 1 (success), $P(X = 1) = p^{1}(1-p)^{0} = p $ = Probability of success | ||
| + | |||
| + | |||
| + | This is called the Bernoulli distribution. | ||
| + | * Bernoulli distribution expands to binomial distribution, | ||
| + | * Binomial distribution = The distribution of the number of successes in n independent Bernoulli trials. | ||
| + | * Geometric distribution = The distribution of the number of trials needed to get the first success in independent Bernoulli trials. | ||
| + | |||
| + | $$X \sim B(1,p)$$ | ||
| \begin{eqnarray*} | \begin{eqnarray*} | ||
| - | P(X=x) = _{n}C_{x} \cdot p^{x} \cdot (1-p)^{n-x}, \;\; \text{for} | + | E(X) & = & \sum{x * p(x)} \\ |
| + | & = & (0*q) + (1*p) \\ | ||
| + | & = & p | ||
| + | \end{eqnarray*} | ||
| + | |||
| + | |||
| + | \begin{eqnarray*} | ||
| + | Var(X) & = & E((X - E(X))^{2}) \\ | ||
| + | & = & \sum_{x}(x-E(X))^2p(x) \\ | ||
| + | & = & (0 - p)^{2}*q + (1 - p)^{2}*p \\ | ||
| + | & = & (0^2 - 2p \cdot 0 + p^2)*q + (1-2p+p^2)*p \\ | ||
| + | & = & p^2*(1-p) + (1-2p+p^2)*p \\ | ||
| + | & = & p^2 - p^3 + p - 2p^2 + p^3 \\ | ||
| + | & = & p - p^2 \\ | ||
| + | & = & p(1-p) \\ | ||
| + | & = & pq | ||
| \end{eqnarray*} | \end{eqnarray*} | ||
| + | For generalization, | ||
| + | $$X \sim B(n,p)$$ | ||
| + | \begin{eqnarray*} | ||
| + | E(X) & = & E(X_{1}) + E(X_{2}) + ... + E(X_{n}) \\ | ||
| + | & = & n * E(X_{i}) \\ | ||
| + | & = & n * p | ||
| + | \end{eqnarray*} | ||
| \begin{eqnarray*} | \begin{eqnarray*} | ||
| - | X \sim B(n, p) \\ | + | Var(X) & = & Var(X_{1}) + Var(X_{2}) + ... + Var(X_{n}) \\ |
| + | & = & n * Var(X_{i}) \\ | ||
| + | & = & n * p * q | ||
| \end{eqnarray*} | \end{eqnarray*} | ||
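| + | As a worked check, take the three-question quiz from the first section ($n = 3$, $p = 0.25$, so $P(X=0) = 0.421875$, $P(X=1) = 0.421875$, $P(X=2) = 0.140625$, $P(X=3) = 0.015625$). Computing the mean and variance directly from the distribution, using the equivalent form $Var(X) = E(X^{2}) - E(X)^{2}$, agrees with $np$ and $npq$. Note that the variance of the sum splits into a sum of variances only because the trials are independent. | ||
| + | \begin{eqnarray*} | ||
| + | E(X) & = & 0 \cdot 0.421875 + 1 \cdot 0.421875 + 2 \cdot 0.140625 + 3 \cdot 0.015625 \\ | ||
| + | & = & 0.75 = 3 \cdot 0.25 = np \\ | ||
| + | E(X^{2}) & = & 0 \cdot 0.421875 + 1 \cdot 0.421875 + 4 \cdot 0.140625 + 9 \cdot 0.015625 = 1.125 \\ | ||
| + | Var(X) & = & E(X^{2}) - E(X)^{2} = 1.125 - 0.5625 \\ | ||
| + | & = & 0.5625 = 3 \cdot 0.25 \cdot 0.75 = npq | ||
| + | \end{eqnarray*} | ||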
| + | |||
| + | ===== Proof of Binomial Expected Value and Variance ===== | ||
| + | [[:Mean and Variance of Binomial Distribution|Mathematical proof of the expected value and variance of the binomial distribution]] | ||
| + | |||
| + | ====== e.g., ====== | ||
| + | <WRAP box> | ||
| + | In the latest round of Who Wants To Win A Swivel Chair, there are 5 questions. The probability of | ||
| + | getting a successful outcome in a single trial is 0.25. | ||
| + | - What’s the probability of getting exactly two questions right? | ||
| + | - What’s the probability of getting exactly three questions right? | ||
| + | - What’s the probability of getting two or three questions right? | ||
| + | - What’s the probability of getting no questions right? | ||
| + | - What are the expectation and variance? | ||
| + | </ | ||
| + | |||
| + | Ans 1. | ||
| + | < | ||
| + | p <- .25 | ||
| + | q <- 1-p | ||
| + | r <- 2 | ||
| + | n <- 5 | ||
| + | # combinations of 5,2 | ||
| + | c <- choose(n, r) | ||
| + | ans1 <- c*(p^r)*(q^(n-r)) | ||
| + | ans1 # or | ||
| + | |||
| + | choose(n, r)*(p^r)*(q^(n-r)) | ||
| + | |||
| + | dbinom(r, n, p) | ||
| + | |||
| + | </ | ||
| + | |||
| + | < | ||
| + | > p <- .25 | ||
| + | > q <- 1-p | ||
| + | > r <- 2 | ||
| + | > n <-5 | ||
| + | > # combinations of 5,2 | ||
| + | > c <- choose(n,r) | ||
| + | > ans1 <- c*(p^r)*(q^(n-r)) | ||
| + | > ans1 | ||
| + | [1] 0.2636719 | ||
| + | > | ||
| + | > choose(n, r)*(p^r)*(q^(n-r)) | ||
| + | [1] 0.2636719 | ||
| + | > | ||
| + | > dbinom(r, n, p) | ||
| + | [1] 0.2636719 | ||
| + | > | ||
| + | > | ||
| + | </ | ||
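| + | The same value can be worked by hand from the formula, as a check on the R output: | ||
| + | \begin{eqnarray*} | ||
| + | P(X = 2) & = & _{5}C_{2} \cdot 0.25^{2} \cdot 0.75^{5-2} \\ | ||
| + | & = & \frac{5!}{2! \cdot 3!} \cdot 0.0625 \cdot 0.421875 \\ | ||
| + | & = & 10 \cdot 0.0625 \cdot 0.421875 \\ | ||
| + | & = & 0.263671875 | ||
| + | \end{eqnarray*} | ||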
| + | |||
| + | |||
| + | |||
| + | |||
| + | |||
| + | |||
| + | Ans 2. | ||
| + | < | ||
| + | p <- .25 | ||
| + | q <- 1-p | ||
| + | r <- 3 | ||
| + | n <-5 | ||
| + | # combinations of 5,3 | ||
| + | c <- choose(n,r) | ||
| + | ans2 <- c*(p^r)*(q^(n-r)) | ||
| + | ans2 | ||
| + | |||
| + | choose(n, r)*(p^r)*(q^(n-r)) | ||
| + | |||
| + | dbinom(r, n, p) | ||
| + | |||
| + | </ | ||
| + | < | ||
| + | > p <- .25 | ||
| + | > q <- 1-p | ||
| + | > r <- 3 | ||
| + | > n <-5 | ||
| + | > # combinations of 5,3 | ||
| + | > c <- choose(n,r) | ||
| + | > ans2 <- c*(p^r)*(q^(n-r)) | ||
| + | > ans2 | ||
| + | [1] 0.08789062 | ||
| + | > | ||
| + | > choose(n, r)*(p^r)*(q^(n-r)) | ||
| + | [1] 0.08789062 | ||
| + | > | ||
| + | > dbinom(r, n, p) | ||
| + | [1] 0.08789063 | ||
| + | > | ||
| + | > | ||
| + | </ | ||
| + | |||
| + | Ans 3. (Important) | ||
| + | < | ||
| + | ans1 + ans2 | ||
| + | dbinom(2, 5, .25) + dbinom(3, 5, .25) | ||
| + | dbinom(2:3, 5, .25) | ||
| + | sum(dbinom(2:3, 5, .25)) | ||
| + | pbinom(3, 5, .25) - pbinom(1, 5, .25) | ||
| + | </ | ||
| + | |||
| + | < | ||
| + | > ans1 + ans2 | ||
| + | [1] 0.3515625 | ||
| + | > dbinom(2, 5, .25) + dbinom(3, 5, .25) | ||
| + | [1] 0.3515625 | ||
| + | > dbinom(2:3, 5, .25) | ||
| + | [1] 0.26367187 0.08789063 | ||
| + | > sum(dbinom(2:3, 5, .25)) | ||
| + | [1] 0.3515625 | ||
| + | > pbinom(3, 5, .25) - pbinom(1, 5, .25) | ||
| + | [1] 0.3515625 | ||
| + | > | ||
| + | </ | ||
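| + | The last line works because pbinom returns the cumulative probability $P(X \le x)$, so the subtraction removes the outcomes 0 and 1 and leaves only 2 and 3: | ||
| + | \begin{eqnarray*} | ||
| + | P(2 \le X \le 3) & = & P(X \le 3) - P(X \le 1) \\ | ||
| + | & = & 0.984375 - 0.6328125 \\ | ||
| + | & = & 0.3515625 | ||
| + | \end{eqnarray*} | ||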
| + | |||
| + | Ans 4. | ||
| + | < | ||
| + | p <- .25 | ||
| + | q <- 1-p | ||
| + | r <- 0 | ||
| + | n <-5 | ||
| + | # combinations of 5,0 | ||
| + | c <- choose(n,r) | ||
| + | ans4 <- c*(p^r)*(q^(n-r)) | ||
| + | ans4 | ||
| + | </ | ||
| + | |||
| + | < | ||
| + | > q <- 1-p | ||
| + | > r <- 0 | ||
| + | > n <-5 | ||
| + | > # combinations of 5,0 | ||
| + | > c <- choose(n,r) | ||
| + | > ans4 <- c*(p^r)*(q^(n-r)) | ||
| + | > ans4 | ||
| + | [1] 0.2373047 | ||
| + | > </ | ||
| + | |||
| + | Ans 5 | ||
| + | < | ||
| + | p <- .25 | ||
| + | q <- 1-p | ||
| + | n <- 5 | ||
| + | exp.x <- n*p | ||
| + | exp.x | ||
| + | </ | ||
| + | < | ||
| + | > q <- 1-p | ||
| + | > n <- 5 | ||
| + | > exp.x <- n*p | ||
| + | > exp.x | ||
| + | [1] 1.25</ | ||
| + | |||
| + | < | ||
| + | p <- .25 | ||
| + | q <- 1-p | ||
| + | n <- 5 | ||
| + | var.x <- n*p*q | ||
| + | var.x | ||
| + | </ | ||
| + | < | ||
| + | > q <- 1-p | ||
| + | > n <- 5 | ||
| + | > var.x <- n*p*q | ||
| + | > var.x | ||
| + | [1] 0.9375 | ||
| + | > </ | ||
| + | |||
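| + | Q. The probability of getting a question right is 1/4. If there are six questions in total, what is the probability of getting between 0 and 5 questions right? Use dbinom to find it. | ||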
| + | < | ||
| + | p <- 1/4 | ||
| + | q <- 1-p | ||
| + | n <- 6 | ||
| + | pbinom(5, n, p) | ||
| + | 1 - dbinom(6, n, p) | ||
| + | sum(dbinom(0:5, n, p)) | ||
| + | </ | ||
| + | |||
| + | < | ||
| + | > p <- 1/4 | ||
| + | > q <- 1-p | ||
| + | > n <- 6 | ||
| + | > pbinom(5, n, p) | ||
| + | [1] 0.9997559 | ||
| + | > 1 - dbinom(6, n, p) | ||
| + | [1] 0.9997559 | ||
| + | |||
| + | </ | ||
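| + | The second form works because getting 0 to 5 questions right is the complement of getting all six right: | ||
| + | \begin{eqnarray*} | ||
| + | P(0 \le X \le 5) & = & 1 - P(X = 6) \\ | ||
| + | & = & 1 - {_{6}C_{6}} \cdot \left(\frac{1}{4}\right)^{6} \cdot \left(\frac{3}{4}\right)^{0} \\ | ||
| + | & = & 1 - \frac{1}{4096} \\ | ||
| + | & \approx & 0.9997559 | ||
| + | \end{eqnarray*} | ||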
| + | |||
| + | Important: | ||
| + | < | ||
| + | # http:// | ||
| + | # ################################################################## | ||
| + | # | ||
| + | p <- 1/4 | ||
| + | q <- 1 - p | ||
| + | n <- 5 | ||
| + | r <- 0 | ||
| + | all.dens <- dbinom(0:n, n, p) | ||
| + | all.dens | ||
| + | sum(all.dens) | ||
| + | |||
| + | choose(5, 0)*(p^0)*(q^5) | ||
| + | choose(5, 1)*(p^1)*(q^4) | ||
| + | choose(5, 2)*(p^2)*(q^3) | ||
| + | choose(5, 3)*(p^3)*(q^2) | ||
| + | choose(5, 4)*(p^4)*(q^1) | ||
| + | choose(5, 5)*(p^5)*(q^0) | ||
| + | all.dens | ||
| + | |||
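| + | choose(5, 0)*(p^0)*(q^5) + | ||
| + | choose(5, 1)*(p^1)*(q^4) + | ||
| + | choose(5, 2)*(p^2)*(q^3) + | ||
| + | choose(5, 3)*(p^3)*(q^2) + | ||
| + | choose(5, 4)*(p^4)*(q^1) + | ||
| + | choose(5, 5)*(p^5)*(q^0) | ||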
| + | sum(all.dens) | ||
| + | # | ||
| + | (p+q)^n | ||
| + | # note that for any n, (p+q)^n = 1 | ||
| + | |||
| + | </ | ||
| + | |||
| + | < | ||
| + | > # http:// | ||
| + | > # ################################################################## | ||
| + | > # | ||
| + | > p <- 1/4 | ||
| + | > q <- 1 - p | ||
| + | > n <- 5 | ||
| + | > r <- 0 | ||
| + | > all.dens <- dbinom(0:n, n, p) | ||
| + | > all.dens | ||
| + | [1] 0.2373046875 0.3955078125 0.2636718750 0.0878906250 | ||
| + | [5] 0.0146484375 0.0009765625 | ||
| + | > sum(all.dens) | ||
| + | [1] 1 | ||
| + | > | ||
| + | > choose(5, 0)*(p^0)*(q^5) | ||
| + | [1] 0.2373047 | ||
| + | > choose(5, 1)*(p^1)*(q^4) | ||
| + | [1] 0.3955078 | ||
| + | > choose(5, 2)*(p^2)*(q^3) | ||
| + | [1] 0.2636719 | ||
| + | > choose(5, 3)*(p^3)*(q^2) | ||
| + | [1] 0.08789062 | ||
| + | > choose(5, 4)*(p^4)*(q^1) | ||
| + | [1] 0.01464844 | ||
| + | > choose(5, 5)*(p^5)*(q^0) | ||
| + | [1] 0.0009765625 | ||
| + | > all.dens | ||
| + | [1] 0.2373046875 0.3955078125 0.2636718750 0.0878906250 | ||
| + | [5] 0.0146484375 0.0009765625 | ||
| + | > | ||
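| + | > choose(5, 0)*(p^0)*(q^5) + | ||
| + | + choose(5, 1)*(p^1)*(q^4) + | ||
| + | + choose(5, 2)*(p^2)*(q^3) + | ||
| + | + choose(5, 3)*(p^3)*(q^2) + | ||
| + | + choose(5, 4)*(p^4)*(q^1) + | ||
| + | + choose(5, 5)*(p^5)*(q^0) | ||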
| + | [1] 1 | ||
| + | > sum(all.dens) | ||
| + | [1] 1 | ||
| + | > # | ||
| + | > (p+q)^n | ||
| + | [1] 1 | ||
| + | > # note that for any n, (p+q)^n = 1 | ||
| + | > | ||
| + | </ | ||
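| + | The total is 1 for any n because the binomial probabilities are exactly the terms of the binomial expansion of $(p+q)^{n}$, and $p + q = 1$: | ||
| + | \begin{eqnarray*} | ||
| + | \sum_{r=0}^{n} P(X = r) & = & \sum_{r=0}^{n} {_{n}C_{r}} \cdot p^{r} \cdot q^{n-r} \\ | ||
| + | & = & (p + q)^{n} \\ | ||
| + | & = & 1^{n} = 1 | ||
| + | \end{eqnarray*} | ||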
| + | |||
| + | |||
binomial_distribution.1572847993.txt.gz · Last modified: by hkimscil
