binomial_distribution

Differences

This shows you the differences between two versions of the page.

binomial_distribution [2020/11/27 19:42] hkimscilbinomial_distribution [2025/10/11 08:26] (current) – [e.g.,] hkimscil
Line 1: Line 1:
-====== Binomial Distribution ====== +====== Binomial Distributions ======
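-  - If the probability that a particular event A occurs in a single trial is p, +
-  - then the probability distribution of the number of times A occurs in n (independent) trials +
-  - is called the binomial probability distribution.+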
  
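 +  - If the probability that a particular event A occurs in a single trial is p, 
 +  - then the probability distribution of the number of times A occurs in n (independent) trials 
 +  - is called the **binomial probability distribution**.
 +In the example below:
 +  * The probability of getting any one question right is 1/4, and of getting it wrong is 3/4.
 +  * The distribution describes how many of the 3 questions (3 trials) are answered correctly.
 +  * In the geometric distribution, by contrast, we do not count right and wrong answers over a fixed number of questions; the trials continue through wrong answers and end at the first correct one.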
 +
 +{{:b:head_first_statistics:pasted:20191030-035316.png}}
 +{{:b:head_first_statistics:pasted:20191030-035452.png}}
 +
 +| x  | P(X=x)                    | power of .75  | power of .25  |
 +| 0  | 0.75 * 0.75 * 0.75        | 3  | 0  |
 +| 1  | 3 * (0.75 * 0.75 * 0.25)  | 2  | 1  |
 +| 2  | 3 * (0.75 * 0.25 * 0.25)  | 1  | 2  |
 +| 3  | 0.25 * 0.25 * 0.25        | 0  | 3  |
 +{{:b:head_first_statistics:pasted:20191030-040346.png}}
 +
 +$$P(X = r) = {\huge\text{?}} \cdot 0.25^{r} \cdot 0.75^{3-r} $$
 +$$P(X = r) = {\huge_{3}C_{r}} \cdot 0.25^{r} \cdot 0.75^{3-r}$$
 +
 +$_{n}C_{r}$ is the number of ways to choose r objects from n (without regard to order), so there are $_{3}C_{1} = 3$ ways to get exactly one of the 3 questions right.
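The counting argument above can be checked by brute force. The following is a minimal sketch, assuming the three questions are answered independently; the names `outcomes`, `n_right`, `prob`, `ways` and `pmf` are illustrative and not from the original page. It lists all 8 right/wrong patterns, counts how many patterns give each number of correct answers, and compares the result with `choose()` and `dbinom()`.

```r
# Enumerate all 2^3 right/wrong patterns for 3 questions (1 = right, 0 = wrong).
p <- 0.25
q <- 1 - p
outcomes <- expand.grid(q1 = 0:1, q2 = 0:1, q3 = 0:1)
n_right <- rowSums(outcomes)                  # number of correct answers in each pattern
prob <- p^n_right * q^(3 - n_right)           # probability of each individual pattern

ways <- as.vector(table(n_right))             # number of patterns for 0, 1, 2, 3 correct
pmf <- as.vector(tapply(prob, n_right, sum))  # P(X = 0), ..., P(X = 3)

ways
choose(3, 0:3)       # same counts, from the combination formula
pmf
dbinom(0:3, 3, p)    # same probabilities, from R's built-in function
```

The number of patterns with r correct answers is the coefficient that multiplies $0.25^{r} \cdot 0.75^{3-r}$ in the table.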
 +
 +
 +
 +Probability for getting one question right
 \begin{eqnarray*} \begin{eqnarray*}
-{\choose x\displaystyle \frac {n!}{x!(n-x)!}  \\ +P(X = 1) & = &  _{3}C_{1} \cdot 0.25^{1} \cdot 0.75^{3-1} \\ 
-\end{eqnarray*}+& = & \frac{3!}{1! \cdot (3-1)!} \cdot 0.25 \cdot 0.75^2 \\ 
 +& = & 3 \cdot 0.25 \cdot 0.5625 \\ 
 +& = & 0.421875 
 +\end{eqnarray*} 
  
-**The number of successes in independent Bernoulli trials has a binomial distribution.** +$$P(X = r) = _{n}C_{r} \cdot 0.25^{r} \cdot 0.75^{n-r}$$ 
 +$$P(X = r) = _{n}C_{r} \cdot p^{r} \cdot q^{n-r}$$
  
-This can be viewed as n independent Bernoulli trials. +  - You’re running a series of independent trials.
-  * There are n independent trials +  - There can be either a success or failure for each trial, and the probability of success (and hence of failure) is the same for each trial.
-  * Each trial can result in one of two possible outcomes, labelled success and failure. +  - There is a finite number of trials, n. Note that this differs from the geometric distribution, where the number of trials is not fixed in advance.
-    * success can be a bad thing -- tire blow-up. +
-  * P(success) = p,  +
-  * P(failure) = 1-p+
  
-In general, the binomial distribution is calculated as follows. +Let X denote the number of successes in n trials. To find the probability of exactly r successes, use the formula below.
  
-\begin{align*} +\begin{eqnarray*}  
-P(X=x) & = _{n}C_{x} \cdot p^{x} \cdot (1-p)^{n-x}\;\; \text{for} \;\; x = 0, 1, 2, . . ., n. \\ +P(X = r) & = & _{n}C_{r} \cdot p^{r} \cdot q^{n-r} \;\;\; \text{where} \\ 
-\text{or} \\ +_{n}C_{r} & = & \frac {n!}{r!(n-r)!} 
-P(X=x) & = {{n} \choose {x}} \cdot p^{x} \cdot (1-p)^{n-x}\;\; \text{for} \;\; x = 0, 1, 2, . . ., n. \\ +\end{eqnarray*} 
-\end{align*}+ 
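 +p = probability of success in each trial 
 +q = 1 - p = probability of failure in each trial 
 +n = number of trials 
 +r = number of successes (e.g., correct answers) whose probability is sought 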
 + 
 +$$X \sim B(n,p)$$
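As a minimal sketch, the formula can be written as a small R function. `binom_pmf` is a hypothetical helper name introduced here for illustration; it is not part of base R, whose equivalent is `dbinom(r, n, p)`.

```r
# Hypothetical helper: P(X = r) for X ~ B(n, p), written directly from the formula.
binom_pmf <- function(r, n, p) {
  q <- 1 - p                          # probability of failure in one trial
  choose(n, r) * p^r * q^(n - r)      # nCr * p^r * q^(n-r)
}

binom_pmf(1, 3, 0.25)   # one correct answer out of 3 questions
dbinom(1, 3, 0.25)      # R's built-in binomial probability, for comparison
```

Because `choose()` and `^` are vectorized, the helper also accepts a vector of r values, e.g. `binom_pmf(0:3, 3, 0.25)`.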
  
-A balanced dice is rolled 3 times. What is probability a 5 comes up exactly twice? 
  
-p = 1/6 
-n = 3 
-x = 2 
  
 \begin{eqnarray*} \begin{eqnarray*}
Line 42: Line 68:
 </code> </code>
  
 +====== Expectation and Variance ======
 +Toss a fair coin once. What is the distribution of the number of heads?
 +  * A single trial
 +  * The trial can be one of two possible outcomes -- success and failure
 +  * P(success) = p
 +  * P(failure) = 1-p
  
 +X = 0, 1 (failure and success)
 +$P(X=x) = p^{x}(1-p)^{1-x}$ or 
 +$P(x) = p^{x}(1-p)^{1-x}$
 +
 +For reference:
 +| x     | 0          | 1  |
 +| p(x)  | q = (1-p)  | p  | 
 +
 +When x = 0 (failure), $P(X = 0) = p^{0}(1-p)^{1-0} = (1-p)$ = Probability of failure
 +When x = 1 (success), $P(X = 1) = p^{1}(1-p)^{0} = p $ = Probability of success
 +
 +
 +This is called the Bernoulli distribution.
 +  * The Bernoulli distribution is the building block of the binomial distribution, the geometric distribution, etc.
 +  * Binomial distribution = the distribution of the number of successes in n independent Bernoulli trials.
 +  * Geometric distribution = the distribution of the number of trials needed to get the first success in independent Bernoulli trials.
 +
 +$$X \sim B(1,p)$$
  
 \begin{eqnarray*} \begin{eqnarray*}
-X \sim B(n, p) \\+E(X) & = & \sum{x * p(x)} \\ 
 +& = & (0*q) + (1*p) \\ 
 +& = & p  
 +\end{eqnarray*}  
 + 
 + 
 +\begin{eqnarray*} 
 +Var(X) & = & E((X - E(X))^{2}) \\ 
 +& = & \sum_{x}(x-E(X))^2p(x)   \ldots \ldots \ldots E(X) = p \\ 
 +& = & (0 - p)^{2}*q + (1 - p)^{2}*p  \\ 
 +& = & (0^2 - 2p0 + p^2)*q + (1-2p+p^2)*p \\ 
 +& = & p^2*(1-p) + (1-2p+p^2)*p \\ 
 +& = & p^2 - p^3 + p - 2p^2 + p^3 \\ 
 +& = & p - p^2 \\ 
 +& = & p(1-p) \\ 
 +& = & pq
 \end{eqnarray*} \end{eqnarray*}
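The two derivations can be checked numerically. This is a minimal sketch with an illustrative value p = 0.25 (any p between 0 and 1 works); the variable names `x`, `px`, `e_x` and `var_x` are chosen here for illustration.

```r
# Bernoulli distribution as a two-point table: x = 0 (failure), x = 1 (success).
p <- 0.25
q <- 1 - p
x <- c(0, 1)
px <- c(q, p)

e_x <- sum(x * px)                # E(X) = sum of x * p(x)
var_x <- sum((x - e_x)^2 * px)    # Var(X) = sum of (x - E(X))^2 * p(x)

e_x      # equals p
var_x    # equals p * q
```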
 +
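 +To generalize, let X be the number of successes in n independent Bernoulli trials, so that $X = X_{1} + X_{2} + ... + X_{n}$ with each $X_{i} \sim B(1,p)$. Expectations always add, and variances add because the trials are independent.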
 +
 +$$X \sim B(n,p)$$
 +
 +\begin{eqnarray*}
 +E(X) & = & E(X_{1}) + E(X_{2}) + ... + E(X_{n}) \\
 +& = & n * E(X_{i}) \\
 +& = & n * p 
 +\end{eqnarray*}
 +
 +\begin{eqnarray*}
 +Var(X) & = & Var(X_{1}) + Var(X_{2}) + ... + Var(X_{n}) \\
 +& = & n * Var(X_{i}) \\
 +& = & n * p * q 
 +\end{eqnarray*}
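A minimal sketch that checks $E(X) = np$ and $Var(X) = npq$ in two ways, assuming n = 5 and p = 0.25 as illustrative values: first exactly, from the probability table given by `dbinom()`, and then approximately, by simulating X as a sum of n Bernoulli draws. The seed and the number of replications are arbitrary choices.

```r
n <- 5
p <- 0.25
q <- 1 - p

# Exact: mean and variance computed from the full probability table.
x <- 0:n
px <- dbinom(x, n, p)
e_x <- sum(x * px)
var_x <- sum((x - e_x)^2 * px)
c(e_x, n * p)          # both entries should agree
c(var_x, n * p * q)    # both entries should agree

# Approximate: X simulated as the sum of n independent Bernoulli(p) draws.
set.seed(1)                                      # arbitrary seed, for reproducibility
sims <- replicate(10000, sum(rbinom(n, 1, p)))   # rbinom(n, 1, p) gives n Bernoulli draws
mean(sims)             # close to n * p
var(sims)              # close to n * p * q
```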
 +
 +===== Proof of Binomial Expected Value and Variance =====
 +[[:Mean and Variance of Binomial Distribution|Mathematical proof of the expected value and variance of the binomial distribution]]
 +
 +====== e.g., ======
 +<WRAP box>
 +In the latest round of Who Wants To Win A Swivel Chair, there are 5 questions. The probability of
 +getting a successful outcome in a single trial is 0.25.
 +  - What’s the probability of getting exactly two questions right?
 +  - What’s the probability of getting exactly three questions right? 
 +  - What’s the probability of getting two or three questions right? 
 +  - What’s the probability of getting no questions right?
 +  - What are the expectation and variance?
 +</WRAP>
 +
 +Ans 1. 
 +<code>
 +p <- .25
 +q <- 1-p
 +r <- 2
 +n <-5
 +# combinations of 5,2
 +c <- choose(n,r) 
 +ans1 <- c*(p^r)*(q^(n-r))
 +ans1    # or
 +
 +choose(n, r)*(p^r)*(q^(n-r))
 +
 +dbinom(r, n, p)
 +
 +</code>
 +
 +<code>
 +> p <- .25
 +> q <- 1-p
 +> r <- 2
 +> n <-5
 +> # combinations of 5,2
 +> c <- choose(n,r)
 +> ans1 <- c*(p^r)*(q^(n-r))
 +> ans1
 +[1] 0.2636719
 +>
 +> choose(n, r)*(p^r)*(q^(n-r))
 +[1] 0.2636719
 +>
 +> dbinom(r, n, p)
 +[1] 0.2636719
 +
 +
 +</code>
 +
 +
 +
 +
 +
 +
 +Ans 2. 
 +<code>
 +p <- .25
 +q <- 1-p
 +r <- 3
 +n <-5
 +# combinations of 5,3
 +c <- choose(n,r)
 +ans2 <- c*(p^r)*(q^(n-r))
 +ans2
 +
 +choose(n, r)*(p^r)*(q^(n-r))
 +
 +dbinom(r, n, p)
 +
 +</code>
 +<code>
 +> p <- .25
 +> q <- 1-p
 +> r <- 3
 +> n <-5
 +> # combinations of 5,3
 +> c <- choose(n,r)
 +> ans2 <- c*(p^r)*(q^(n-r))
 +> ans2
 +[1] 0.08789062
 +
 +> choose(n,r)*(p^r)*(q^(n-r))
 +[1] 0.08789062
 +
 +> dbinom(r, n, p)
 +[1] 0.08789063
 +
 +
 +</code>
 +
 +Ans 3. Important 
 +<code>
 +ans1 + ans2
 +dbinom(2, 5, .25) + dbinom(3, 5, .25) 
 +dbinom(2:3, 5, .25)
 +sum(dbinom(2:3, 5, .25))
 +pbinom(3, 5, .25) - pbinom(1, 5, .25)
 +</code>
 +
 +<code>
 +> ans1 + ans2
 +[1] 0.3515625
 +> dbinom(2, 5, .25) + dbinom(3, 5, .25) 
 +[1] 0.3515625
 +> dbinom(2:3, 5, .25)
 +[1] 0.26367187 0.08789063
 +> sum(dbinom(2:3, 5, .25))
 +[1] 0.3515625
 +> pbinom(3, 5, .25) - pbinom(1, 5, .25)
 +[1] 0.3515625
 +
 +</code>
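The last line of Ans 3 works because `pbinom(k, n, p)` gives the cumulative probability $P(X \le k)$, so $P(2 \le X \le 3) = P(X \le 3) - P(X \le 1)$. A minimal sketch of that idea as a reusable function follows; `binom_between` is a hypothetical helper name, and it assumes a and b are integers with 0 <= a <= b <= n.

```r
# Hypothetical helper: P(a <= X <= b) for X ~ B(n, p).
binom_between <- function(a, b, n, p) {
  # P(X <= b) minus P(X <= a - 1); pbinom() returns 0 for a negative first argument
  pbinom(b, n, p) - pbinom(a - 1, n, p)
}

binom_between(2, 3, 5, 0.25)   # two or three questions right
sum(dbinom(2:3, 5, 0.25))      # same probability, summing individual terms
```

Note that the lower bound is shifted by one: subtracting `pbinom(2, 5, .25)` instead of `pbinom(1, 5, .25)` would drop the case X = 2.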
 +
 +Ans 4. 
 +<code>
 +p <- .25
 +q <- 1-p
 +r <- 0
 +n <-5
 +# combinations of 5,0
 +c <- choose(n,r)
 +ans4 <- c*(p^r)*(q^(n-r))
 +ans4
 +</code>
 +
 +<code>> p <- .25
 +> q <- 1-p
 +> r <- 0
 +> n <-5
 +> # combinations of 5,0
 +> c <- choose(n,r)
 +> ans4 <- c*(p^r)*(q^(n-r))
 +> ans4
 +[1] 0.2373047
 +> </code>
 +
 +Ans 5
 +<code>
 +p <- .25
 +q <- 1-p
 +n <- 5
 +exp.x <- n*p
 +exp.x
 +</code>
 +<code>> p <- .25
 +> q <- 1-p
 +> n <- 5
 +> exp.x <- n*p
 +> exp.x
 +[1] 1.25</code>
 +
 +<code>
 +p <- .25
 +q <- 1-p
 +n <- 5
 +var.x <- n*p*q
 +var.x
 +</code>
 +<code>> p <- .25
 +> q <- 1-p
 +> n <- 5
 +> var.x <- n*p*q
 +> var.x
 +[1] 0.9375
 +> </code>
 +
 +Q. The probability of getting a question right is 1/4. With six questions in total, what is the probability of getting between 0 and 5 questions right? Solve it using dbinom.
 +<code>
 +p <- 1/4
 +q <- 1-p
 +n <- 6
 +pbinom(5, n, p)
 +1 - dbinom(6, n, p)
 +sum(dbinom(0:5, n, p))
 +</code> 
 +
 +<code>
 +> p <- 1/4
 +> q <- 1-p
 +> n <- 6
 +> pbinom(5, n, p)
 +[1] 0.9997559
 +> 1 - dbinom(6, n, p)
 +[1] 0.9997559
 +
 +</code>
 +
 +Important: 
 +<code>
 +# http://commres.net/wiki/mean_and_variance_of_binomial_distribution
 +# ##################################################################
 +#
 +p <- 1/4
 +q <- 1 - p
 +n <- 5
 +r <- 0
 +all.dens <- dbinom(0:n, n, p)
 +all.dens
 +sum(all.dens)
 +
 +choose(5,0)*p^0*(q^(5-0))
 +choose(5,1)*p^1*(q^(5-1))
 +choose(5,2)*p^2*(q^(5-2))
 +choose(5,3)*p^3*(q^(5-3))
 +choose(5,4)*p^4*(q^(5-4))
 +choose(5,5)*p^5*(q^(5-5))
 +all.dens
 +
 +choose(5,0)*p^0*(q^(5-0)) + 
 +  choose(5,1)*p^1*(q^(5-1)) + 
 +  choose(5,2)*p^2*(q^(5-2)) + 
 +  choose(5,3)*p^3*(q^(5-3)) + 
 +  choose(5,4)*p^4*(q^(5-4)) + 
 +  choose(5,5)*p^5*(q^(5-5))
 +sum(all.dens)
 +
 +(p+q)^n
 +# note that for any n, (p+q)^n = 1 because p + q = 1
 +
 +</code>
 +
 +<code>
 +> # http://commres.net/wiki/mean_and_variance_of_binomial_distribution
 +> # ##################################################################
 +> #
 +> p <- 1/4
 +> q <- 1 - p
 +> n <- 5
 +> r <- 0
 +> all.dens <- dbinom(0:n, n, p)
 +> all.dens
 +[1] 0.2373046875 0.3955078125 0.2636718750 0.0878906250
 +[5] 0.0146484375 0.0009765625
 +> sum(all.dens)
 +[1] 1
 +
 +> choose(5,0)*p^0*(q^(5-0))
 +[1] 0.2373047
 +> choose(5,1)*p^1*(q^(5-1))
 +[1] 0.3955078
 +> choose(5,2)*p^2*(q^(5-2))
 +[1] 0.2636719
 +> choose(5,3)*p^3*(q^(5-3))
 +[1] 0.08789062
 +> choose(5,4)*p^4*(q^(5-4))
 +[1] 0.01464844
 +> choose(5,5)*p^5*(q^(5-5))
 +[1] 0.0009765625
 +> all.dens
 +[1] 0.2373046875 0.3955078125 0.2636718750 0.0878906250
 +[5] 0.0146484375 0.0009765625
 +
 +> choose(5,0)*p^0*(q^(5-0)) + 
 ++   choose(5,1)*p^1*(q^(5-1)) + 
 ++   choose(5,2)*p^2*(q^(5-2)) + 
 ++   choose(5,3)*p^3*(q^(5-3)) + 
 ++   choose(5,4)*p^4*(q^(5-4)) + 
 ++   choose(5,5)*p^5*(q^(5-5))
 +[1] 1
 +> sum(all.dens)
 +[1] 1
 +> # 
 +> (p+q)^n
 +[1] 1
 +> # note that for any n, (p+q)^n = 1 because p + q = 1
 +
 +</code>
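The same check can be repeated for several values of n. This is a minimal sketch, with the values of n picked arbitrarily for illustration and `totals` as an illustrative name: by the binomial theorem the terms $_{n}C_{r} \cdot p^{r} \cdot q^{n-r}$ for r = 0, ..., n are the expansion of $(p+q)^{n}$, so they sum to 1 whenever p + q = 1.

```r
p <- 1/4
q <- 1 - p

# Sum of all binomial probabilities for a few arbitrary choices of n.
totals <- sapply(c(1, 5, 10, 50), function(n) {
  r <- 0:n
  sum(choose(n, r) * p^r * q^(n - r))
})
totals   # each entry should be 1, up to floating-point rounding
```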
 +
 +
binomial_distribution.1606473755.txt.gz · Last modified: by hkimscil
