Central Limit Theorem (CLT)

Stat Digest: What is the intuition behind the Central Limit Theorem (CLT)?

Ref: Stat Digest: What is the intuition behind Central Limit Theorem?

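For any random variable (with finite variance), the sampling distribution of the sample mean approaches a normal distribution as the sample size grows, regardless of the shape of the population distribution.
The mean of the sampling distribution is a good estimate of the population mean.
What is a typical sample size? 30 or more.
Where does the CLT apply?
- The mean, the sum, the proportion, and any first-order measurement
Does the CLT also apply to the standard deviation? No!
What is the relationship between the standard deviation of the sampling distribution ($\sigma_s$, the standard error) and the population standard deviation ($\sigma$)?
- $\sigma_s = \frac{\sigma}{\sqrt{N}}$
- $N$: sample size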
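
A short derivation shows where the $\sqrt{N}$ comes from. It is only a sketch, and it assumes that the $N$ observations $X_1, \dots, X_N$ are independent draws with common variance $\sigma^2$ (the symbols $X_i$ and $\bar{X}$ are introduced here for illustration and do not appear in the source):

$$
\operatorname{Var}(\bar{X})
= \operatorname{Var}\left(\frac{1}{N}\sum_{i=1}^{N} X_i\right)
= \frac{1}{N^2}\sum_{i=1}^{N}\operatorname{Var}(X_i)
= \frac{\sigma^2}{N}
\quad\Longrightarrow\quad
\sigma_s = \sqrt{\operatorname{Var}(\bar{X})} = \frac{\sigma}{\sqrt{N}}
$$

Independence is what lets the variance of the sum split into a sum of variances; without it, covariance terms would appear in the middle step.
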
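Besides the sample size, what other requirements matter for the CLT to hold?
- Observations must be sampled at random.
- The sample size must stay within limits:
  - Sample size: at least 30
  - The sample size must be sufficiently small relative to the population size
  - As a rule of thumb, the sample size should be less than 10% of the population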
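
One common way to motivate the 10% rule of thumb is the finite population correction (FPC), which applies when sampling without replacement. This is an interpretation added here, not something the source states. The sketch below computes the factor $\sqrt{(M - N)/(M - 1)}$ that multiplies $\sigma/\sqrt{N}$, where $M$ is the population size. The function name `fpc` and the population size used are hypothetical, chosen only for illustration.

```julia
# Finite population correction (FPC) for sampling WITHOUT replacement:
# the standard error sigma/sqrt(N) is multiplied by sqrt((M - N) / (M - 1)),
# where M is the population size and N is the sample size.
# `fpc` is a hypothetical helper name, not part of any library.
fpc(M::Integer, N::Integer) = sqrt((M - N) / (M - 1))

M = 10_000   # hypothetical population size
for N in (30, 1_000, 5_000)
    println("N = ", N,
            "  fraction of population = ", N / M,
            "  FPC = ", round(fpc(M, N), digits=4))
end
```

When the sample is a small fraction of the population, the factor stays close to 1 and the draws behave almost as if they were independent, which is the situation the CLT assumes. When sampling with replacement, the draws are independent by construction and no correction is needed.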

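Example

Generate 10,000 random ages between 1 and 85 and use them as the population. From this population, draw samples of size 30 with replacement, repeating the draw 100, 1,000, and 10,000 times, and check whether the distribution of the sample means behaves as the CLT predicts.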

In [1]:

using Random
using Distributions
using StatsBase
using Statistics
using StatsPlots

In [2]:

Random.seed!(1);
population = rand(1:85,10000);

In [3]:

begin
    stephist(population,bins=30,
        xlabel="Age", ylabel="Frequency",
        legend=false);
end

Out[3]:

In [5]:

function sampling_distribution(population::Vector{Int64},
            sample_size::Int64,
            sampling_count::Int64)
    # Draw `sampling_count` samples of size `sample_size` with replacement
    # and record the mean of each sample.
    sample_means = Vector{Float64}();
    for i in 1:sampling_count
        sample = StatsBase.sample(population,sample_size,replace=true);
        sample_mean = mean(sample);
        push!(sample_means,sample_mean);
    end
    mean_of_samples_mean = mean(sample_means);
    # Plot the sampling distribution of the mean.
    # The title is kept on one line so it contains no stray newline or indentation.
    stephist(sample_means,
        xlabel="Sample Mean Age", ylabel="Frequency",
        title="Sample Size = $(sample_size), mean = $(round(mean_of_samples_mean, digits=2))",
        legend=false)
end
display(sampling_distribution(population, 30, 100));
display(sampling_distribution(population, 30, 1000));
display(sampling_distribution(population, 30, 10000));
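
The histograms give a visual impression. A numeric check can complement them. The sketch below is not part of the original notebook. It rebuilds the same kind of population using only the `Random` and `Statistics` standard libraries, then compares the mean and standard deviation of the sample means against the population mean and $\sigma/\sqrt{N}$. The variable names `mu` and `sigma` are introduced here for illustration.

```julia
using Random
using Statistics

Random.seed!(1)
population = rand(1:85, 10_000)   # population of ages, as in the cells above
sample_size = 30

# Population parameters that the CLT approximation refers to.
mu = mean(population)
sigma = std(population; corrected=false)   # divide by n, not n - 1

for sampling_count in (100, 1_000, 10_000)
    # rand(collection, n) draws n elements with replacement.
    sample_means = [mean(rand(population, sample_size)) for _ in 1:sampling_count]
    println("count = ", sampling_count,
            "  mean of means = ", round(mean(sample_means), digits=2),
            "  (population mean = ", round(mu, digits=2), ")",
            "  std of means = ", round(std(sample_means), digits=2),
            "  (sigma/sqrt(N) = ", round(sigma / sqrt(sample_size), digits=2), ")")
end
```

As the number of repeated samples grows, the two pairs of numbers should agree more closely. The sample size stays at 30 throughout, so the width of the sampling distribution itself is not expected to shrink.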
