Inferences Based on the MLE (MSE, Standard Error, Consistency, Confidence Interval)

MSE and Unbiased Estimator

The MLE gives us an estimator $\hat{\theta}$. We would like this estimator to be close to the true value $\theta$, so we need a measure that tells us whether an estimator is good or bad.

Mean-squared error (MSE)

The mean-squared error (MSE) of an estimator $\hat{\theta}$ of $\theta$ is defined as follows.
\[ \text{MSE}(\hat{\theta}) = E[(\hat{\theta} - \theta)^2] \]

Decomposition of MSE
\[ \text{MSE}(\hat{\theta}) = Var(\hat{\theta}) + [\text{Bias}(\hat{\theta})]^2 \]
Here $\text{Bias}(\hat{\theta})=E(\hat{\theta}) - \theta$, and if $\text{Bias}(\hat{\theta})=0$ then $\hat{\theta}$ is called an unbiased estimator of $\theta$.

How to calculate MSE

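The MSE is also defined as an expectation, so once a pdf is given it can be computed as an integral of the form $\int (\hat{\theta} - \theta)^2 f(\cdot)\ d(\cdot)$, where the density and the variable of integration depend on which pdf is available.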

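When the pdf of $\hat{\theta}$ is available

\[ \text{MSE}(\hat{\theta}) = \int (t - \theta)^2 f_{\hat{\theta}}(t)\ dt \]

Here $t$ ranges over the possible values of $\hat{\theta}$; the true value $\theta$ is a fixed constant, so the integral is not taken over $\theta$.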
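As a worked illustration of this integral, take an example that is not in the original notes: $X_1, \dots, X_n \sim \text{Uniform}(0, \theta)$ with MLE $\hat{\theta} = \max_i X_i$, whose pdf is $n t^{n-1} / \theta^n$ on $[0, \theta]$. The sketch below integrates numerically with a simple midpoint rule and compares the result with the closed form $2\theta^2 / ((n+1)(n+2))$. The function names and the chosen values of $\theta$ and $n$ are assumptions made for illustration.

```python
def mse_uniform_max(theta, n, steps=100_000):
    """Midpoint-rule approximation of the integral of (t - theta)^2 f(t) over [0, theta],
    where f(t) = n t^(n-1) / theta^n is the pdf of max(X_1..X_n), X_i ~ Uniform(0, theta)."""
    h = theta / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h                       # midpoint of the i-th subinterval
        density = n * t ** (n - 1) / theta ** n
        total += (t - theta) ** 2 * density * h
    return total


def mse_uniform_max_closed_form(theta, n):
    """Closed form of the same integral."""
    return 2 * theta ** 2 / ((n + 1) * (n + 2))


numeric = mse_uniform_max(theta=3.0, n=5)
exact = mse_uniform_max_closed_form(theta=3.0, n=5)
print(numeric, exact)
```

The two numbers should agree to many decimal places, which is a quick sanity check that the integral is taken over the values of the estimator and not over the parameter.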

When the joint pdf of $(X_1, \dots, X_n)$ is available

\[ \text{MSE}(\hat{\theta}) = \idotsint (\hat{\theta}(x_1, \dots, x_n) - \theta)^2 f(x_1, \dots, x_n)\, dx_1 \dots dx_n \]

Decomposition of MSE

\begin{align*} \text{MSE}(\hat{\theta}) &= E[(\hat{\theta} - \theta)^2] \\ &= E[(\hat{\theta} - E(\hat{\theta}) + E(\hat{\theta}) - \theta )^2] \\ &= E[(\hat{\theta} - E(\hat{\theta}) )^2] + 2\{E(\hat{\theta}) - \theta \}E[\hat{\theta} - E(\hat{\theta})] + E\bigg[ \left( E(\hat{\theta}) - \theta \right)^2 \bigg] \\ &= \text{Var}(\hat{\theta}) + \text{Bias}^2(\hat{\theta}) \end{align*}
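The decomposition can be checked numerically. The sketch below is a Monte Carlo experiment under assumed settings (normal data, the variance MLE that divides by $n$, and arbitrary values for $\mu$, $\sigma$, $n$ and the number of replications); none of these choices come from the original notes.

```python
import random
import statistics

random.seed(0)
mu, sigma, n, reps = 0.0, 2.0, 10, 5_000
theta = sigma ** 2  # true parameter: the variance

# The MLE of the variance divides by n (not n - 1), so it is biased downward.
estimates = []
for _ in range(reps):
    xs = [random.gauss(mu, sigma) for _ in range(n)]
    estimates.append(statistics.pvariance(xs))

mse = statistics.fmean([(t - theta) ** 2 for t in estimates])
var = statistics.pvariance(estimates)
bias = statistics.fmean(estimates) - theta

print(f"MSE          = {mse:.4f}")
print(f"Var + Bias^2 = {var + bias ** 2:.4f}")
print(f"Bias         = {bias:.4f}  (theory: {-theta / n:.4f})")
```

The first two printed values match up to floating-point error, because the identity also holds for the empirical moments of the simulated estimates. The simulated bias is only close to its theoretical value $-\sigma^2/n$, since it is itself a Monte Carlo estimate.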

Bias-Variance tradeoff

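(Especially in machine learning) a simpler model tends to have lower variance, usually at the cost of higher bias. Since $\text{MSE} = \text{Var} + \text{Bias}^2$, the estimator with the smallest MSE is not necessarily unbiased.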
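A small example can make the tradeoff concrete. The sketch below uses a hypothetical shrinkage estimator $c\overline{X}$ of a mean $\mu$, which is not discussed in the original notes: its variance is $c^2\sigma^2/n$ and its bias is $(c-1)\mu$, so shrinking ($c < 1$) lowers the variance while introducing bias. The parameter values are arbitrary.

```python
def mse_shrunk_mean(c, mu, sigma, n):
    """MSE, variance and bias of the estimator c * Xbar for the mean mu
    of i.i.d. data with variance sigma^2."""
    variance = c ** 2 * sigma ** 2 / n
    bias = (c - 1) * mu
    return variance + bias ** 2, variance, bias


for c in (1.0, 0.9, 0.5):
    mse, var, bias = mse_shrunk_mean(c, mu=1.0, sigma=3.0, n=10)
    print(f"c={c}: variance={var:.3f}, bias={bias:.3f}, MSE={mse:.3f}")
```

With these particular numbers the biased choices have a smaller MSE than the unbiased $c = 1$; with a larger $|\mu|$ or a larger $n$ the ordering can reverse.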

Standard Error of an Estimator

The standard error of $\hat{\theta}$ is the standard deviation of $\hat{\theta}$, that is, $\text{Sd}(\hat{\theta}) = \sqrt{\text{Var}(\hat{\theta})}$.

In practice, however, the estimated standard error is often also simply called the "standard error".

Normal distribution - $\mu$ estimator

$\text{MSE}(\overline{X}) = Var(\overline{X}) + \{ E(\overline{X} - \mu) \}^2 = \frac{\sigma^2}{n}$

$SE(\overline{X}) = \frac{\sigma}{\sqrt{n}}$

$\widehat{SE}(\overline{X}) = \frac{s}{\sqrt{n}}$
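The difference between the two quantities is easy to see on simulated data. This is a minimal sketch with assumed values for $\mu$, $\sigma$ and $n$; in a real problem only the estimated version is computable because $\sigma$ is unknown.

```python
import math
import random
import statistics

random.seed(1)
mu, sigma, n = 5.0, 2.0, 25

sample = [random.gauss(mu, sigma) for _ in range(n)]
s = statistics.stdev(sample)          # sample standard deviation (divides by n - 1)

se_true = sigma / math.sqrt(n)        # SE(Xbar); needs the unknown sigma
se_hat = s / math.sqrt(n)             # estimated SE; computable from the data alone

print(f"SE = {se_true:.3f}, estimated SE = {se_hat:.3f}")
```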

Consistency of Estimators

As the amount of data grows, we hope the estimator converges to the true value. This property is called consistency.

Consistency of Estimators

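Let $T_1, T_2, \dots$ be a sequence of estimators. If $T_n \overset{P}{\to} \theta$ as $n \to \infty$, the sequence is said to be consistent.

If $\text{MSE}(\hat{\theta}) \to 0$ as $n \to \infty$, then $\hat{\theta}$ is consistent. This follows from Chebyshev's inequality: for any $\epsilon > 0$, $P(|\hat{\theta} - \theta| \ge \epsilon) \le \text{MSE}(\hat{\theta}) / \epsilon^2 \to 0$.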
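The Chebyshev argument can be illustrated by simulation. The sketch below uses the sample mean of normal data, for which $\text{MSE}(\overline{X}) = \sigma^2/n$, and compares the simulated frequency of $|\overline{X} - \mu| > \epsilon$ with the bound $\text{MSE}/\epsilon^2$. The values of $\sigma$, $\epsilon$, the sample sizes and the number of replications are assumptions chosen for illustration.

```python
import random

random.seed(2)
mu, sigma, eps, reps = 0.0, 1.0, 0.5, 2_000


def tail_frequency(n):
    """Fraction of simulated sample means that miss mu by more than eps."""
    misses = 0
    for _ in range(reps):
        xbar = sum(random.gauss(mu, sigma) for _ in range(n)) / n
        if abs(xbar - mu) > eps:
            misses += 1
    return misses / reps


results = {}
for n in (5, 20, 80):
    bound = (sigma ** 2 / n) / eps ** 2     # MSE(Xbar) / eps^2
    results[n] = (tail_frequency(n), bound)
    print(f"n={n}: frequency={results[n][0]:.4f}, Chebyshev bound={bound:.4f}")
```

Both the bound and the simulated frequency shrink as $n$ grows. The bound is typically quite loose, but it is enough to prove convergence in probability.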

Confidence Intervals

$\gamma$-confidence interval

If $P(\theta \in C(X)) \ge \gamma$ for all $\theta \in \Omega$, the interval $C(X) = (l(X),\ u(X))$ is called a $\gamma$-confidence interval for $\theta$.

Likelihood Method
A confidence interval can be constructed from the likelihood.
\[ C(x) = \{ \theta: L(\theta | x) \ge k \} \]
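There are several ways to choose $k$:
(1) so that the coverage probability equals $\gamma$ exactly
(2) so that the width of the interval is minimized
(3) so that the interval is symmetric about the estimator (preferably)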

Likelihood Method

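Note: the probability statement is not about one specific realized interval $C(x)$, which either contains $\theta$ or does not. It means that under repeated sampling, the random interval $C(X)$ contains the true value $\theta$ in (at least) $100 \gamma \%$ of the samples.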
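The repeated-sampling interpretation can be simulated directly. The sketch below assumes a concrete choice of $C(X)$ for illustration: normal data with known standard deviation and the interval $\overline{X} \pm z_{(1+\gamma)/2}\, \sigma_0/\sqrt{n}$. All numeric settings are arbitrary.

```python
import math
import random
from statistics import NormalDist

random.seed(3)
mu_true, sigma0, n, gamma, reps = 10.0, 2.0, 20, 0.95, 5_000

z = NormalDist().inv_cdf((1 + gamma) / 2)
half_width = z * sigma0 / math.sqrt(n)

covered = 0
for _ in range(reps):
    # Draw a fresh sample, build its interval, and check whether it contains mu_true.
    xbar = sum(random.gauss(mu_true, sigma0) for _ in range(n)) / n
    if xbar - half_width <= mu_true <= xbar + half_width:
        covered += 1

coverage = covered / reps
print(f"empirical coverage = {coverage:.3f} (nominal {gamma})")
```

Each individual interval either covers `mu_true` or misses it; only the long-run fraction of covering intervals is close to $\gamma$.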

z-Confidence Intervals

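Let $(x_1, \dots, x_n)$ be a sample drawn from the normal distribution $N(\mu, \sigma_0^2)$. ($\mu$: unknown, $\sigma_0^2$: known)

Writing $x = (x_1, \dots, x_n)$ for short, and dropping the factors of the likelihood that do not depend on $\mu$, we get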

\begin{align*} C(x) &= \bigg\{\mu: \exp\left[ -\cfrac{n(\overline{x} - \mu)^2}{2\sigma_0^2} \right] \ge k \bigg\} \\ &= \bigg\{ \mu: -\cfrac{n(\overline{x} - \mu)^2}{2\sigma_0^2} \ge \ln{k} \bigg\} \\ &= \bigg\{\mu: \overline{x} - k^*\cfrac{\sigma_0}{\sqrt{n}} \le \mu \le \overline{x} + k^* \cfrac{\sigma_0}{\sqrt{n}} \bigg\} \\ &= \bigg[ \overline{x} - k^*\cfrac{\sigma_0}{\sqrt{n}} ,\ \overline{x} + k^* \cfrac{\sigma_0}{\sqrt{n}} \bigg] \end{align*}

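where $k^* = \sqrt{-2\ln{k}}$ (with $0 < k \le 1$).

Because the sample comes from a normal distribution, $Z=\cfrac{\overline{X} - \mu}{\sigma_0 / \sqrt{n}} \sim N(0,1)$ holds exactly (no CLT approximation is needed), and we use this to determine $k^*$.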

\begin{align*} \gamma &= P\bigg( \overline{X}-k^*\cfrac{\sigma_0}{\sqrt{n}} \le \mu \le \overline{X}+k^*\cfrac{\sigma_0}{\sqrt{n}} \bigg) \\ &= P(|Z| \le k^*) \\ &= 1-2\left( 1-\Phi(k^*) \right) \end{align*}

Hence $\Phi(k^*) = \cfrac{1+\gamma}{2}$, that is, $k^* = z_{(1+\gamma)/2}$.

The confidence interval obtained from the likelihood is therefore as follows.

\[ \bigg[ \overline{x} - z_{(1+\gamma)/2}\cfrac{\sigma_0}{\sqrt{n}},\ \overline{x} + z_{(1+\gamma)/2}\cfrac{\sigma_0}{\sqrt{n}} \bigg] \]

For example, for $\gamma=0.95$ ($95\%$) we have $\cfrac{1+0.95}{2}=0.975$, so we plug in $k^*=z_{0.975}=1.96$.
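The interval is straightforward to compute. The following is a minimal sketch using only the Python standard library; the function name `z_confidence_interval` and the data values are made up for illustration.

```python
import math
from statistics import NormalDist, fmean


def z_confidence_interval(xs, sigma0, gamma=0.95):
    """gamma-confidence interval for mu when the data are N(mu, sigma0^2) with sigma0 known."""
    n = len(xs)
    xbar = fmean(xs)
    k_star = NormalDist().inv_cdf((1 + gamma) / 2)   # z_{(1+gamma)/2}
    half_width = k_star * sigma0 / math.sqrt(n)
    return xbar - half_width, xbar + half_width


data = [4.9, 5.3, 5.1, 4.7, 5.6, 5.0, 4.8, 5.2]   # made-up measurements
low, high = z_confidence_interval(data, sigma0=0.3)
print(f"k* = {NormalDist().inv_cdf(0.975):.4f}")
print(f"95% CI for mu: [{low:.3f}, {high:.3f}]")
```

`NormalDist().inv_cdf` is the standard normal quantile function, so it plays the role of $\Phi^{-1}$ in the derivation above.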

t-Confidence Intervals

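Now consider a sample $(x_1, \dots, x_n)$ from a normal distribution $N(\mu, \sigma^2)$ in which both $\mu$ and $\sigma$ are unknown, and construct a confidence interval for $\mu$. Since $\sigma$ is unknown, we use the estimated standard error $\widehat{SE}(\overline{X})=S/ \sqrt{n}$.

We use the same argument as above, with the $t$ distribution in place of the standard normal. Under normality the $t$ distribution is exact, not a CLT approximation.

Applying $T= \left( \cfrac{\overline{X}-\mu}{\sigma / \sqrt{n}} \right) / \sqrt{\cfrac{(n-1)S^2}{\sigma^2} \bigg/ (n-1)} =\cfrac{\overline{X}-\mu}{S / \sqrt{n}} \sim t(n-1)$ gives the following confidence interval.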

\[ \bigg[ \overline{x} - t_{(1+\gamma)/2}(n-1) \cfrac{s}{\sqrt{n}},\ \overline{x} + t_{(1+\gamma)/2}(n-1) \cfrac{s}{\sqrt{n}} \bigg] \]
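The Python standard library has no $t$ quantile function, so the sketch below computes $t_{(1+\gamma)/2}(n-1)$ itself by integrating the $t$ density with Simpson's rule and inverting the CDF by bisection. This is only one possible approach, chosen to keep the example self-contained; all function names and the data are assumptions for illustration, and a statistics library would normally be used instead.

```python
import math
from statistics import fmean, stdev


def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    return math.exp(log_c) / math.sqrt(df * math.pi) * (1 + t * t / df) ** (-(df + 1) / 2)


def t_cdf(t, df, steps=2_000):
    """P(T <= t) for t >= 0: Simpson's rule on [0, t] plus the mass 0.5 below zero."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def t_quantile(p, df):
    """Quantile for p > 0.5, found by bisection on the CDF."""
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < p:      # grow the bracket until it contains the quantile
        hi *= 2
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def t_confidence_interval(xs, gamma=0.95):
    """gamma-confidence interval for mu when both mu and sigma are unknown."""
    n = len(xs)
    xbar, s = fmean(xs), stdev(xs)   # stdev divides by n - 1
    half_width = t_quantile((1 + gamma) / 2, n - 1) * s / math.sqrt(n)
    return xbar - half_width, xbar + half_width


data = [4.9, 5.3, 5.1, 4.7, 5.6, 5.0, 4.8, 5.2, 5.4, 4.6]   # made-up measurements
low, high = t_confidence_interval(data)
print(f"t quantile = {t_quantile(0.975, len(data) - 1):.3f}")
print(f"95% CI for mu: [{low:.3f}, {high:.3f}]")
```

For small $n$ the $t$ quantile is noticeably larger than $z_{0.975} = 1.96$, which widens the interval to account for estimating $\sigma$.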

Confidence Interval of Proportion

Let $(x_1, \dots, x_n) \sim Ber(\theta)$ be a sample drawn from a Bernoulli distribution, and construct a confidence interval for $\theta$.

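\[ C(x) = \{ \theta: \theta^{n\overline{x}} (1-\theta)^{n-n\overline{x}} \ge k \} = \{ \theta: l(x) \le \theta \le u(x) \} \]

The endpoints $l(x)$ and $u(x)$ have no simple closed form. For small $n$ they can still be computed, but for large $n$ the computation becomes very cumbersome.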
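To see what solving for the endpoints involves, the sketch below finds them numerically by bisection on the log-likelihood. As a convenience that is not in the original notes, the threshold is expressed relative to the maximum, $L(\theta)/L(\hat{\theta}) \ge$ `rel_k`, which is the same set as $L(\theta) \ge k$ with $k = \texttt{rel\_k} \cdot L(\hat{\theta})$. The value `rel_k=0.15` and the data are arbitrary.

```python
import math


def log_likelihood(theta, n, successes):
    """Bernoulli log-likelihood: n*xbar*log(theta) + (n - n*xbar)*log(1 - theta)."""
    return successes * math.log(theta) + (n - successes) * math.log(1 - theta)


def likelihood_interval(n, successes, rel_k=0.15, tol=1e-10):
    """Endpoints of {theta : L(theta) / L(theta_hat) >= rel_k}, assuming 0 < successes < n."""
    theta_hat = successes / n
    target = log_likelihood(theta_hat, n, successes) + math.log(rel_k)

    def solve(inside, outside):
        # Bisection: `inside` satisfies the inequality, `outside` does not.
        while abs(inside - outside) > tol:
            mid = (inside + outside) / 2
            if log_likelihood(mid, n, successes) >= target:
                inside = mid
            else:
                outside = mid
        return inside

    eps = 1e-12   # stay strictly inside (0, 1) so the logarithms are defined
    return solve(theta_hat, eps), solve(theta_hat, 1 - eps)


low, high = likelihood_interval(n=50, successes=20)
print(f"likelihood interval: [{low:.4f}, {high:.4f}]")
```

Bisection works here because the Bernoulli log-likelihood is concave in $\theta$, so the set is a single interval around $\hat{\theta} = \overline{x}$. Choosing the threshold so that the coverage is a given $\gamma$ is a separate problem that this sketch does not address.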

Instead of the CLT quantity $\cfrac{\sqrt{n}(\overline{X}-\theta)}{\sqrt{\theta(1-\theta)}}$, whose denominator contains the unknown $\theta$, we plug in $\overline{X}$ and use $\cfrac{\sqrt{n}(\overline{X}-\theta)}{\sqrt{\overline{X}(1-\overline{X})}} \overset{D}{\to} N(0, 1)$.

\begin{align*} \gamma &\approx P \bigg( -z_{(1+\gamma)/2} \le \cfrac{\sqrt{n}(\overline{X} - \theta)}{\sqrt{\overline{X}(1-\overline{X})}} \le z_{(1+\gamma)/2} \bigg) \\ &= P\bigg( \overline{X} - z_{(1+\gamma)/2} \sqrt{\frac{\overline{X}(1-\overline{X})}{n} } \le \theta \le \overline{X} + z_{(1+\gamma)/2} \sqrt{\frac{\overline{X}(1-\overline{X})}{n} } \bigg) \end{align*}

Unlike the normal case, this is an approximate (not exact) $\gamma$-confidence interval, because the mean and the variance are tied together through $\theta$. For sufficiently large $n$, however, it is accurate.
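The approximate interval takes only a few lines to compute. A minimal sketch, with a made-up function name and made-up counts:

```python
import math
from statistics import NormalDist


def proportion_confidence_interval(successes, n, gamma=0.95):
    """Approximate gamma-confidence interval xbar +/- z * sqrt(xbar (1 - xbar) / n)."""
    xbar = successes / n
    z = NormalDist().inv_cdf((1 + gamma) / 2)
    half_width = z * math.sqrt(xbar * (1 - xbar) / n)
    return xbar - half_width, xbar + half_width


low, high = proportion_confidence_interval(successes=40, n=100)
print(f"approximate 95% CI for theta: [{low:.3f}, {high:.3f}]")
```

When $\overline{x}$ is 0 or 1 the half-width collapses to zero, which is one sign that the approximation should not be trusted for small $n$ or for proportions near the boundary.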


궁금한게많은joon
