In survey research, statistics are applied to samples that have been generated using conventional strategies for randomization. These statistics represent the degree to which a researcher can be confident that the study sample is reasonably valid and reliable, that is, that its results approximate those of the target population.
What is a Confidence Interval?
A confidence interval, often called the margin of error, is the range around a sample result within which the answer would be expected to fall if the researcher could ask the same research question of every member of the target population. For example, if the researcher used a confidence interval of 4 and 60% of the participants in the survey sample answered "Would recommend to friends," he or she could be reasonably sure that between 56% and 64% of the members of the entire target population would also say "Would recommend to friends" when asked the same question. The confidence interval, in this case, is +/- 4 percentage points.
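The arithmetic behind the example can be sketched with the usual normal-approximation formula for a proportion. The sample size of 576 is a hypothetical value chosen so that the margin comes out near 4 points; it does not come from the text, and the function name `margin_of_error` is only illustrative.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Normal-approximation margin of error for a sample proportion.

    p: share of the sample giving the answer (0 to 1)
    n: sample size
    z: critical value; 1.96 corresponds to a 95% confidence level
    """
    return z * math.sqrt(p * (1 - p) / n)

p_hat = 0.60   # 60% answered "Would recommend to friends"
n = 576        # hypothetical sample size (assumption, not from the text)

moe = margin_of_error(p_hat, n)
low, high = p_hat - moe, p_hat + moe

print(f"margin: +/- {moe * 100:.1f} points")
print(f"interval: {low * 100:.1f}% to {high * 100:.1f}%")
```

Under these assumptions the interval runs from about 56% to 64%, matching the +/- 4 in the example.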
What is a Confidence Level?
A confidence level is an expression of how sure a researcher can be that the true value for the target population lies within the confidence interval. Confidence levels are expressed as a percentage and indicate how often, if the survey were repeated with new samples drawn the same way, the percentage for the target population would lie within the confidence interval. The most commonly used confidence level is 95%. A related concept is called statistical significance.
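One way to see what "95%" means is to simulate many surveys. The sketch below assumes a population in which the true answer rate is known to be 60% (something a real researcher would not know) and a hypothetical sample size of 576, then counts how often a sample lands within 4 points of the truth. The function name `coverage` and all parameter values are assumptions made for illustration.

```python
import random

def coverage(true_p, n, margin, trials, seed=0):
    """Share of simulated samples whose result is within `margin` of true_p."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        # Draw n respondents; each says "yes" with probability true_p.
        yes = sum(rng.random() < true_p for _ in range(n))
        if abs(yes / n - true_p) <= margin:
            hits += 1
    return hits / trials

share = coverage(true_p=0.60, n=576, margin=0.04, trials=2000)
print(f"{share:.1%} of simulated samples fell within 4 points of the true value")
```

With these settings the share should come out close to 95%, which is the sense in which a +/- 4 interval carries a 95% confidence level.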
A researcher's confidence that a sample is truly representative of the target population, together with an awareness of the limitations of the study design and implementation, rests largely on three important variables: sample size, frequency of response, and population size. Researchers have long agreed that these variables must be carefully considered during the research planning phase.
- Sample Size: Generally speaking, larger samples deliver data that more closely reflect the target population. A wide confidence interval is indicative of less confidence in the data because there is a greater margin for error. A wide confidence interval is like hedging your bets. There is a relationship between confidence interval and sample size, but it is not a linear relationship. A researcher cannot cut a confidence interval in half by doubling the sample size; halving it takes roughly four times as many respondents.
- Frequency of response: The accuracy with which sample data reflect the target population also depends on the percentage of respondents who gave a particular answer or responded in a specific way. The more lopsided the result, say 99% of respondents answering "Very happy," the more sure the researcher can be of that response. Percentages in the middle of the range show the most variability. That is, if 50% of the sample gave a particular answer, the percentage of the target population giving that answer is likely to vary more from that 50% level (a wider confidence interval) than it would from a result near 0% or 100%.
It is good to remember that outliers (data on the far ends, or tails, of the normal curve) tend to occur at about the same rate in the population as they do in a sample; there is less variability here because the frequency is lower. (Consider how the balls in a Galton Box tend to stack up in the middle at the Science Center exhibit. Only a few balls bounce off into the tails.) For this reason, it is easier to be confident of the frequency of extreme answers.
- Population Size is not an important factor in sample size unless a researcher is working with a population that is very small and known to him or her (e.g., small enough so that all the members of the population can be identified by the researcher).
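The effect of the first two variables can be illustrated with a short sketch. It assumes the standard normal-approximation formula for a proportion at a 95% confidence level (z = 1.96); the sample sizes and percentages are arbitrary illustrative values, not figures from any study.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Normal-approximation margin of error for a sample proportion."""
    return z * math.sqrt(p * (1 - p) / n)

# Sample size: the margin shrinks with the square root of n,
# so doubling n does not halve it, but quadrupling n does.
for n in (500, 1000, 2000):
    print(f"n={n:5d}  response=50%  margin=+/-{margin_of_error(0.50, n) * 100:.2f}")

# Frequency of response: at a fixed n, the margin is widest at 50%
# and narrows as the answer becomes more lopsided.
for p in (0.50, 0.80, 0.95, 0.99):
    print(f"n=  500  response={p:.0%}  margin=+/-{margin_of_error(p, 500) * 100:.2f}")
```

Going from 500 to 1,000 respondents trims the margin by a factor of about 1.4, and only at 2,000 respondents is it cut in half.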
Creative Research Systems points out that:
The mathematics of probability proves the size of the population is irrelevant unless the size of the sample exceeds a few percent of the total population you are examining. This means that a sample of 500 people is equally useful in examining the opinions of a state of 15,000,000 as it would be a city of 100,000.
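The claim in the quotation can be checked with the finite population correction, a standard adjustment for sampling without replacement. The sketch below is an illustration under that assumption, not a method attributed to Creative Research Systems; the function name `margin_with_fpc` and the town of 1,000 are hypothetical additions.

```python
import math

def margin_with_fpc(p, n, population, z=1.96):
    """Margin of error with the finite population correction applied."""
    base = z * math.sqrt(p * (1 - p) / n)
    # The correction approaches 1 when the sample is a tiny share of the population.
    fpc = math.sqrt((population - n) / (population - 1))
    return base * fpc

n = 500
m_state = margin_with_fpc(0.5, n, 15_000_000)  # state from the quotation
m_city = margin_with_fpc(0.5, n, 100_000)      # city from the quotation
m_town = margin_with_fpc(0.5, n, 1_000)        # hypothetical small population

for label, m in (("state", m_state), ("city", m_city), ("town", m_town)):
    print(f"{label:5s}  margin=+/-{m * 100:.2f}")
```

The state and the city differ by only about a hundredth of a point, while the small town, where the sample is half the population, shows a clearly narrower margin.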
Generating a representative sample can be a costly and time-consuming process. Researchers always face a trade-off between the confidence level they would like to obtain (or the degree of accuracy they need to achieve) and the confidence level they can afford.
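The trade-off can be made concrete by estimating how many respondents a given target would require. This is a minimal sketch assuming a large population, simple random sampling, and the conservative worst case of a 50% response; the function name `required_sample_size` and the targets shown are illustrative assumptions.

```python
import math
from statistics import NormalDist

def required_sample_size(margin, confidence=0.95, p=0.5):
    """Respondents needed for a target margin of error (as a fraction)."""
    # Two-sided critical value for the chosen confidence level.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)

for confidence in (0.90, 0.95, 0.99):
    for margin in (0.04, 0.02):
        n = required_sample_size(margin, confidence)
        print(f"confidence={confidence:.0%}  margin=+/-{margin * 100:.0f}  n={n}")
```

Under these assumptions, tightening the margin from 4 points to 2 at a 95% confidence level raises the required sample from roughly 600 to roughly 2,400, which is where budget usually enters the decision.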