Sampling Methods

Random Sampling

Using a statistic (a number derived from a sample) to estimate or generalize to a parameter (a number that represents an entire group or population) requires the sample to be representative of, or look like, the larger population. There are different ways of selecting samples to ensure that the sample is not systematically different from the rest of the population. One of the most effective, but in some scenarios perhaps the most difficult, ways of ensuring that the sample data represent the population is random sampling. Once again, the jargon of statistics contrasts with a more common usage of the word "random." In statistics, the "random" in "random sampling" does not mean disorganized, erratic, or haphazard. Instead, statisticians use "random" to mean that every case or member of a population has an equal chance of being selected for the sample.

To use the example in the previous section, for a sample of a particular model of microwave to be representative of the entire set of microwaves of that model, each individual microwave would need to have an equal likelihood of being chosen for the sample. That is to say, microwaves manufactured early in the production cycle would be just as likely to be selected as those manufactured late in the production cycle. Defective units or cases (objects or data points of interest to the researcher) would have the same chance of selection as non-defective units or cases. No characteristic of the microwaves would influence the likelihood of being selected.

There are many techniques used to select random samples, most of which involve computer-aided random selection. These techniques share a core characteristic illustrated in the classic lottery method. First, each unit or case of the population is assigned a unique identifier. Second, each identifier is placed in a figurative "bowl" and mixed thoroughly. Third, a blindfolded researcher picks identifiers one at a time from the bowl. In this basic way, no unit or case is favored for selection. Each case has an equal chance of being selected.
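
The lottery method can be sketched in a few lines of Python. This is only an illustration, not a prescribed procedure: the function name lottery_sample, the microwave labels, and the sample size of 50 are all hypothetical, and the standard library's random.sample stands in for the bowl and the blindfolded researcher.

```python
import random

def lottery_sample(population, sample_size, seed=None):
    """Draw a simple random sample without replacement."""
    rng = random.Random(seed)  # a seed only makes the draw repeatable
    # Step 1: give every unit a unique identifier (here, its position).
    identifiers = list(range(len(population)))
    # Steps 2 and 3: "mix the bowl" and draw identifiers blindly.
    drawn = rng.sample(identifiers, sample_size)
    return [population[i] for i in drawn]

# Hypothetical population of 1,000 microwaves of one model.
microwaves = [f"unit-{n:04d}" for n in range(1, 1001)]
sample = lottery_sample(microwaves, 50, seed=42)
```

Because the draw depends only on the identifiers, nothing about a microwave (when it was built, whether it is defective) can affect its chance of being picked.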

The major benefit of random sampling is that any differences between the sample and the population from which the sample was selected will not be systematic. You won't be able to pick and choose which microwaves to keep track of. Although in reality, samples with random selection may differ from the larger population (the likelihood of significant differences increases with a smaller sample size), these differences are due to chance rather than to a systematic bias in the selection process. It's better to accidentally get your sample wrong (through random selection) than it is to pick a bad sample on purpose (through biased selection).
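
To see how chance differences shrink as the sample grows, the following sketch repeatedly draws random samples from a made-up population and measures how far the sample defect rate lands from the true rate. The 5 percent defect rate, the population of 10,000 units, and the two sample sizes are assumptions chosen only for illustration.

```python
import random

def mean_abs_error(population, sample_size, trials, rng):
    """Average gap between the sample defect rate and the true rate."""
    true_rate = sum(population) / len(population)
    total_gap = 0.0
    for _ in range(trials):
        sample = rng.sample(population, sample_size)
        total_gap += abs(sum(sample) / sample_size - true_rate)
    return total_gap / trials

rng = random.Random(0)
# Hypothetical population: 1 marks a defective unit, 0 a good one.
population = [1] * 500 + [0] * 9500

small_error = mean_abs_error(population, 20, 200, rng)
large_error = mean_abs_error(population, 2000, 200, rng)
```

In this setup the average gap for samples of 20 should come out noticeably larger than for samples of 2,000, yet in neither case does the gap lean in a consistent direction, which is the sense in which the differences are due to chance rather than bias.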

Representative Sampling

Another way of selecting cases for a sample is representative sampling. This general term is used to describe a wide variety of sampling methods that share the characteristic of intentional (not random) selection of cases to ensure that the sample matches the larger population, taking specific characteristics of the population into account.

For example, let's say we wanted to know how much our customers spent on average in the last quarter, and we wanted a sample of our customers to be representative of the whole of our customers. In this case, we may be concerned about a number of naturally occurring differences among our customers, such as their age, gender, ethnicity, work status, and so on. Customers with these differences might be expected to have different expenditures, and so to obtain an accurate picture of the overall population of customers during a given quarter, we would want to select a sample that represents each important difference in the population.

Accordingly, we would aim to match the percentage of each important characteristic in our sample to its percentage in our population. For example, if 10 percent of our customer population during that quarter was unemployed, we would select participants in a manner that resulted in 10 percent, and only 10 percent, of our sample being unemployed customers. If 60 percent of our customer population was female, then 60 percent of our sample should be female, and so on. This helps account for differences in purchasing habits between these groups.
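
One way to match sample percentages to population percentages is sketched below. The customer records, the work_status field, and both function names are hypothetical. The sketch also makes one assumption the text does not: within each group it fills the quota by drawing at random, which is closer to what is usually called stratified sampling; any other way of filling the quotas would leave the proportions the same.

```python
import random
from collections import defaultdict

def proportional_quotas(group_sizes, sample_size):
    """Split sample_size across groups in proportion to group size."""
    total = sum(group_sizes.values())
    exact = {g: sample_size * n / total for g, n in group_sizes.items()}
    quotas = {g: int(x) for g, x in exact.items()}
    # Hand out any seats lost to rounding, largest remainder first.
    leftover = sample_size - sum(quotas.values())
    by_remainder = sorted(exact, key=lambda g: exact[g] - quotas[g], reverse=True)
    for g in by_remainder[:leftover]:
        quotas[g] += 1
    return quotas

def quota_sample(customers, key, sample_size, seed=None):
    """Select customers so each group's share matches the population."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for customer in customers:
        groups[customer[key]].append(customer)
    sizes = {g: len(members) for g, members in groups.items()}
    quotas = proportional_quotas(sizes, sample_size)
    sample = []
    for g, members in groups.items():
        sample.extend(rng.sample(members, quotas[g]))  # assumed fill rule
    return sample

# Hypothetical customer list: 10 percent unemployed, 90 percent employed.
customers = (
    [{"id": i, "work_status": "unemployed"} for i in range(100)]
    + [{"id": i, "work_status": "employed"} for i in range(100, 1000)]
)
sample = quota_sample(customers, "work_status", 50, seed=1)
unemployed_share = sum(c["work_status"] == "unemployed" for c in sample) / len(sample)
```

With these made-up numbers the sample of 50 contains exactly 5 unemployed customers, so the 10 percent share in the population is reproduced in the sample.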

Representative sampling techniques tend to be costly and time-consuming, but they increase our chances of being able to generalize the results from our sample of customers to our population of customers, particularly with respect to pre-existing differences among our customers. In fact, when individual differences within our population are judged to be important, a representative sampling technique is preferred over random sampling, because with random sampling there is no assurance that our sample will look similar to the population on each of those important variables (characteristics of interest for the cases/units in the population).

To illustrate, imagine you want to survey potential customers in all 50 states of the United States because you had about equal sales in each state. Since California has a population of almost 40 million and Montana has a population of about 1 million, a random sample of residents would make it far more likely (about 40 times more likely, in fact) that any given respondent is a Californian rather than someone from Montana. Representative sampling would let you give each state the equal representation that matches its share of your sales.
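
The arithmetic behind the California and Montana comparison can be checked with a short sketch. It uses the rounded population figures from the text, restricts the example to those two states for brevity, and assumes a hypothetical sample of 410 respondents; the function names are illustrative.

```python
def random_sample_expectation(populations, sample_size):
    """Expected respondents per state when residents are drawn at random."""
    total = sum(populations.values())
    return {state: sample_size * count / total
            for state, count in populations.items()}

def equal_allocation(populations, sample_size):
    """Respondents per state when every state gets the same quota."""
    quota = sample_size // len(populations)
    return {state: quota for state in populations}

# Rounded figures from the text: about 40 million and about 1 million.
populations = {"California": 40_000_000, "Montana": 1_000_000}

expected = random_sample_expectation(populations, 410)
ratio = expected["California"] / expected["Montana"]
balanced = equal_allocation(populations, 410)
```

Under random selection of residents, the expected split of the 410 respondents is 400 Californians to 10 Montanans, a ratio of 40 to 1. Under equal allocation, each state contributes 205 respondents.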

Convenience Sampling

Convenience sampling is a common method of selecting a sample for research involving human respondents, such as surveys or questionnaires. A core characteristic of the various techniques of convenience sampling is the selection of participants on the basis of availability and willingness to participate. For example, if you want to know whether your new store hours are more convenient for customers, you might place comment cards by the door or print a survey code at the bottom of your receipts. The people most likely to respond, however, are customers who feel very strongly in one direction or the other, or who simply enjoy filling out surveys. Most of your customers will not participate, and you will never know how they feel about the changes you have put in place.

Bias

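Bias, or systematic error, occurs when a sampling method favors some parts of the population over others. For example, people with strong opinions, more free time, and access to the internet are more likely to take online surveys, so a convenience sample drawn from an online survey tends to overrepresent them.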

However, if a convenience sample does not differ from a population of interest in ways that influence the outcome of the study, then it is a perfectly acceptable method of selecting a sample.
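
The difference between a harmful and a harmless convenience sample can be illustrated with a small simulation. Everything in it is invented for the purpose: the 1 to 5 satisfaction scores, their mix in the population, and the response probabilities are assumptions, not data.

```python
import random

def survey_mean(population, response_prob, rng):
    """Mean score among customers who choose to respond."""
    responses = [score for score in population
                 if rng.random() < response_prob(score)]
    return sum(responses) / len(responses)

rng = random.Random(7)
# Hypothetical scores: 1 = very unhappy ... 5 = very happy.
population = rng.choices([1, 2, 3, 4, 5], weights=[15, 10, 30, 40, 5], k=20_000)
true_mean = sum(population) / len(population)

# Case 1: customers with strong opinions (1 or 5) respond far more often.
biased_mean = survey_mean(
    population, lambda score: 0.5 if score in (1, 5) else 0.05, rng)

# Case 2: willingness to respond is unrelated to the score.
unrelated_mean = survey_mean(population, lambda score: 0.14, rng)
```

In the first case the respondents' average should fall well below the population average, because very unhappy customers are overrepresented. In the second case the two averages should be close, since responding has nothing to do with the outcome being measured.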
