Statistics: A Concise Tutorial

Statistics - Required Sample Size

A critical part of sampling is choosing the sample size, i.e. the number of units to be selected from the population for carrying out the research. There is no definite answer or formula for defining the most appropriate size. There are certain misconceptions about sample size, for example that the sample should be 10% of the population or that the sample size should be proportional to the size of the universe. These are only misconceptions. How large a sample should be is a function of the variation in the population parameters under study and the precision of estimation required by the researcher.

The decision on the optimum size of the sample can be approached from two angles: the subjective and the mathematical.

Subjective Approach to Determining Sample Size

The choice of sample size is affected by the various factors discussed below:

Nature of Population - The level of homogeneity or heterogeneity influences the size of the sample. If the population is homogeneous with respect to the characteristics of interest, then even a small sample is sufficient. If the population is heterogeneous, a larger sample is required to ensure adequate representativeness.
Nature of Respondent - If the respondents are easily accessible and available, the required information can be obtained from a small sample. If, however, the respondents are uncooperative and non-response is expected to be high, a larger sample is required.
Nature of Study - A one-time study can be conducted using a large sample. For research studies that are continuous in nature and are to be carried out intensively, a small sample is more suitable, as it is easier to manage and retain a small sample over a long span of time.
Sampling Technique Used - An important factor influencing the sample size is the sampling technique adopted. Firstly, a non-probability technique requires a larger sample than a probability technique. Secondly, within probability sampling, simple random sampling requires a larger sample than stratified sampling, where a smaller sample is sufficient.
Complexity of Tabulation - While deciding on the sample size, the researcher should also consider the number of categories and classes into which the findings are to be grouped and analysed. The more categories that are to be generated, the larger the sample size needs to be. Since every category should be adequately represented, a larger sample is required to give reliable measures of the smallest category.
Availability of Resources - The resources and the time available to the researcher influence the sample size. Research is a time- and money-intensive task, with activities like preparation of the instrument, hiring and training field staff, transportation costs etc. taking up a considerable amount of resources. Hence, if the researcher does not have enough time and funds available, he will settle for a smaller sample.
Degree of Precision and Accuracy Required - It has become clear from our earlier discussion that precision, which is measured by the standard error, will be high only if the S.E. is small or the sample size is large.

Also, to get a high level of precision a larger sample is required. Other than these subjective considerations, sample size can also be determined mathematically.

Mathematical Approach to Sample Size Determination

In the mathematical approach to sample size determination, the precision of estimate required is stated first and then the sample size is worked out. The precision can be specified as ${\pm}$ 1 of the true mean with a 99% confidence level. This means that if the sample mean is 200, then the true value of the mean is expected, with 99% confidence, to lie between 199 and 201. This level of precision is denoted by the term 'c'.

Sample Size Determination for Means

The confidence interval for the universe mean is given by

${\bar x \pm Z\frac{\sigma_p}{\sqrt{n}}}$ or ${\bar x \pm e}$

Where −

${\bar x}$ = Sample mean
${e}$ = Acceptable error
${Z}$ = Value of standard normal variate at a given confidence level
${\sigma_p}$ = Standard deviation of the population
${n}$ = Size of the sample
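
The value of Z for a given confidence level is usually read from a standard normal table. As a minimal sketch, assuming a two-sided interval and Python's standard library, it can also be computed directly. The function name `z_for_confidence` is hypothetical and is used only for illustration.

```python
from statistics import NormalDist


def z_for_confidence(confidence: float) -> float:
    """Two-sided standard normal critical value for a confidence level in (0, 1)."""
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    alpha = 1.0 - confidence
    # Half of alpha sits in each tail, so look up the 1 - alpha/2 quantile.
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence -> Z = {z_for_confidence(level):.3f}")
```

For 95% and 99% confidence this gives the familiar values of about 1.96 and 2.576.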

The acceptable error 'e', i.e. the difference between ${\mu}$ and ${\bar x}$, is given by

${e = Z\frac{\sigma_p}{\sqrt{n}}}$

Thus, the size of the sample is:

${n = \frac{Z^2 \sigma_p^2}{e^2}}$
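
The sample size for a mean, n = Z²σ_p²/e², can be sketched in a few lines of Python. This is an illustration only: the function name `sample_size_for_mean` is hypothetical, the population standard deviation of 10 is an assumed figure that does not come from the text, and the result is rounded up because a sample cannot contain a fraction of a unit.

```python
import math


def sample_size_for_mean(z: float, sigma_p: float, e: float) -> int:
    """Sample size n = Z^2 * sigma_p^2 / e^2, rounded up to a whole unit."""
    if sigma_p < 0:
        raise ValueError("sigma_p must be non-negative")
    if e <= 0:
        raise ValueError("e must be positive")
    n = (z * sigma_p / e) ** 2
    return math.ceil(n)


# Precision of +/- 1 at 99% confidence (Z about 2.576),
# with an ASSUMED population standard deviation of 10.
print(sample_size_for_mean(z=2.576, sigma_p=10, e=1))
```

Note that halving the acceptable error e quadruples the required sample size, since e enters the formula squared.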

If the sample size is significant vis-a-vis the population size, the above formula is corrected by the finite population multiplier:

${n = \frac{Z^2 N \sigma_p^2}{(N-1)e^2 + Z^2 \sigma_p^2}}$

Where −

${N}$ = size of the population
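
A rough sketch of the finite-population version for means, n = Z²Nσ_p² / ((N−1)e² + Z²σ_p²), is shown below. The function name and the figures used (standard deviation 10, population of 2000) are assumptions chosen for illustration and are not taken from the text.

```python
import math


def sample_size_for_mean_finite(z: float, sigma_p: float, e: float,
                                population_size: int) -> int:
    """Sample size for a mean, corrected by the finite population multiplier."""
    if e <= 0 or population_size < 1:
        raise ValueError("e must be positive and population_size at least 1")
    z_sigma_sq = (z * sigma_p) ** 2
    n = z_sigma_sq * population_size / ((population_size - 1) * e ** 2 + z_sigma_sq)
    return math.ceil(n)


# ASSUMED figures: Z = 2.576 (99%), sigma_p = 10, e = 1, N = 2000.
print(sample_size_for_mean_finite(2.576, 10, 1, 2000))
# With a very large population the correction has almost no effect.
print(sample_size_for_mean_finite(2.576, 10, 1, 10 ** 9))
```

As N grows, the corrected value approaches the uncorrected Z²σ_p²/e², which is why the multiplier only matters when the sample is a sizeable share of the population.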

Sample Size Determination for Proportions

The method for determining the sample size when estimating a proportion remains the same as the method for estimating the mean. The confidence interval for the universe proportion ${\hat p}$ is given by

${p \pm Z\sqrt{\frac{pq}{n}}}$

Where −

${p}$ = sample proportion
${q = (1 - p)}$
${Z}$ = Value of standard normal variate for a sample proportion
${n}$ = Size of the sample

Since ${\hat p}$ is the quantity to be estimated, the value of p can be set to p = 0.5, an acceptable value that gives a conservative sample size. The other option is to estimate the value of p either through a pilot study or on the basis of personal judgement. Given the value of p, the acceptable error 'e' is given by

${e = Z\sqrt{\frac{pq}{n}}}$

so that the sample size is

${n = \frac{Z^2 p q}{e^2}}$

If the population is finite, the above formula is corrected by the finite population multiplier:

${n = \frac{Z^2 p q N}{e^2(N-1) + Z^2 p q}}$
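
The proportion case can be sketched the same way. The snippet below is a minimal illustration: the function name `sample_size_for_proportion` and its optional `population_size` argument are hypothetical, and results are rounded up to a whole unit. It uses n = Z²pq/e² when no population size is given, and the finite-population form n = Z²pqN / (e²(N−1) + Z²pq) otherwise.

```python
import math
from typing import Optional


def sample_size_for_proportion(z: float, p: float, e: float,
                               population_size: Optional[int] = None) -> int:
    """Sample size for estimating a proportion, rounded up to a whole unit."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie between 0 and 1")
    if e <= 0:
        raise ValueError("e must be positive")
    q = 1.0 - p
    zpq = z ** 2 * p * q
    if population_size is None:
        n = zpq / e ** 2
    else:
        # Finite population multiplier applied.
        n = zpq * population_size / (e ** 2 * (population_size - 1) + zpq)
    return math.ceil(n)


# Conservative choice p = 0.5 at 95% confidence (Z about 1.96) and e = 0.05.
print(sample_size_for_proportion(z=1.96, p=0.5, e=0.05))
```

Because the product pq is largest at p = 0.5, any other value of p gives a sample size no larger than this conservative figure.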

Example

Problem Statement:

A shopping store is interested in estimating the proportion of households possessing the store's Privilege Membership card. Previous studies have shown that 59% of households had a store credit card. Determine the required sample size at a 95% confidence level with a tolerable error level of 0.05.

Solution:

The store has the following information:

${N = 1000}$
${Z = 1.96}$ (95% confidence level)
${p = 0.59}$
${q = 1 - p = 0.41}$
${e = 0.05}$

The sample size can be determined by applying the following formula:

${n = \frac{Z^2 p q}{e^2} = \frac{(1.96)^2 (0.59)(0.41)}{(0.05)^2} \approx 371.7}$

Hence, rounding up, a sample of 372 households is sufficient to conduct the study.

Since the population, i.e. the target households, is known to be 1000 and the above sample is a significant proportion of the total population, the corrected formula which includes the finite population multiplier is used:

${n = \frac{Z^2 p q N}{e^2(N-1) + Z^2 p q} = \frac{(1.96)^2 (0.59)(0.41)(1000)}{(0.05)^2 (999) + (1.96)^2 (0.59)(0.41)} \approx 271.2}$

Thus, if the population is a finite one with 1000 households, the sample size required to conduct the study is 272.

It is evident from this illustration that when the population is known to be finite, the required sample size is smaller.
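
The arithmetic of the store example can be checked with a short script. This is only a sketch: it assumes Z = 1.96 for the 95% confidence level and reads the tolerable error as 0.05, and it prints the values before and after rounding up so the rounding convention stays visible. The variable names are chosen for illustration.

```python
import math

# Figures from the store example; Z = 1.96 is ASSUMED for 95% confidence.
z, p, e, population_size = 1.96, 0.59, 0.05, 1000
q = 1 - p
zpq = z ** 2 * p * q

# Without and with the finite population multiplier.
n_unlimited = zpq / e ** 2
n_finite = zpq * population_size / (e ** 2 * (population_size - 1) + zpq)

print(f"no correction: {n_unlimited:.2f} -> {math.ceil(n_unlimited)}")
print(f"with finite population multiplier: {n_finite:.2f} -> {math.ceil(n_finite)}")
```

The corrected figure is always the smaller of the two, and the gap widens as the uncorrected sample becomes a larger share of the population.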