Now that you have your survey and design all lined up, it’s time to invite respondents to your survey. This may seem like the easiest step in research. How hard can gathering data be? Let me burst your bubble: this is one of the most important aspects of survey research and can determine the fate of the entire research endeavor. No pressure.
You need to make sure that you target the right respondents so that they accurately represent the group you wish to study. But how do you ensure accurate targeting without introducing a source of bias into your research? Bias is the recurring evil of all research, so here are four things to consider:
1. The what and the who: research interest and population size
When determining who should answer your survey and how they get hold of it, you first need to figure out what your research interest is. It doesn’t make sense to survey all students at the University of Oxford if you’re trying to figure out how satisfied students are with the dorms at the university. For this, you need to define your population, which is the entirety of your research subjects. In this example, the population is all students living in the University of Oxford’s dorms. However, it’s often impossible to survey an entire population due to time and cost constraints. Luckily, most of the time, surveying the entire population isn’t necessary. Drawing inferences from a sample often gets you pretty close to the true picture of the population; just be mindful of the inherent uncertainty a sample carries.
2. Random sampling
To achieve a result that comes as close to the truth as possible, you need to sample your respondents carefully. A sampling frame is a list containing the entire population, from which a random sample is drawn. Continuing with the above example, this would be a list of all students living in the dorms, from which random names are selected to participate in the survey. You need to make sure that each student has the same chance of being selected. That keeps the selection itself from skewing the results, so the only gap left between sample and population is the sampling error (the natural deviation between the two that arises by chance). Sampling in which every member has a known chance of selection is called probability sampling; giving everyone an equal chance is its simplest form, simple random sampling.
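As a minimal sketch of drawing such a sample, the snippet below picks names from a list so that every entry has the same chance of selection. The student IDs, the frame of 10,000 residents, and the sample of 200 are hypothetical values chosen only for illustration.

```python
import random


def draw_simple_random_sample(sampling_frame, sample_size, seed=None):
    """Pick `sample_size` distinct units, each with an equal chance of selection."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return rng.sample(sampling_frame, sample_size)


# Hypothetical sampling frame: one ID per student living in the dorms.
sampling_frame = [f"student_{i:05d}" for i in range(1, 10_001)]

# Hypothetical sample size, chosen only for this illustration.
sample = draw_simple_random_sample(sampling_frame, 200, seed=42)
print(len(sample), "students selected, e.g.", sample[:3])
```

Sampling without replacement, as `random.sample` does, ensures that no student is invited twice.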
3. Sample size
We know you’ve been waiting for this one: the answer to the question, how many respondents do I need? Well… that depends on many factors, but don’t worry, we have a formula for you.
But let’s not get ahead of ourselves; let’s take this one step at a time. First, you have to set the maximum error you’re willing to accept in your survey. When doing this, you should be aware of the following two parameters: margin of error and confidence level.
The margin of error is the range around your survey result within which you expect to find the true population value. For instance, let’s say you wish to determine what proportion of the 10,000 students in the dorms are exchange students. If your survey suggests that 1,000 are exchange students (10% of the population) with a 5% margin of error (that is, 5 percentage points), it really means the true figure is somewhere between 500 (5%) and 1,500 (15%).
The confidence level expresses how confident you can be that the true value lies within the margin of error. For example, in the previous case, if you choose a 95% confidence level, you could say that the percentage of students in the dorms who are exchange students is between 5% and 15% in 95% of cases. In other words, if you repeated the survey 100 times, the proportion you’re looking for would fall within the interval in about 95 of them and outside it in the remaining 5.
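The "repeat the survey many times" idea can be checked with a small simulation. The sketch below builds a made-up population of 10,000 students of whom 1,000 are exchange students, runs many random surveys, and counts how often the estimate lands within 5 percentage points of the true 10%. The sample size of 137 per survey and the 2,000 repetitions are assumptions made for this illustration, not values from the text.

```python
import random

POPULATION_SIZE = 10_000
EXCHANGE_STUDENTS = 1_000   # true proportion: 10%
MARGIN = 0.05               # +/- 5 percentage points
SAMPLE_SIZE = 137           # assumed number of respondents per survey
REPEATS = 2_000             # assumed number of simulated surveys


def simulate_coverage(seed=0):
    """Return the share of simulated surveys whose estimate is within MARGIN."""
    rng = random.Random(seed)
    # 1 marks an exchange student, 0 marks everyone else.
    population = [1] * EXCHANGE_STUDENTS + [0] * (POPULATION_SIZE - EXCHANGE_STUDENTS)
    true_proportion = EXCHANGE_STUDENTS / POPULATION_SIZE
    hits = 0
    for _ in range(REPEATS):
        sample = rng.sample(population, SAMPLE_SIZE)
        estimate = sum(sample) / SAMPLE_SIZE
        if abs(estimate - true_proportion) <= MARGIN:
            hits += 1
    return hits / REPEATS


coverage = simulate_coverage()
print(f"{coverage:.1%} of simulated surveys were within 5 points of the true 10%")
```

With these assumed settings the share should come out close to 95%, which is what a 95% confidence level is meant to describe.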
The margin of error, confidence level, and sample size are always linked and co-dependent. Modifying any of these values will change the others:
- Reducing the margin of error will require a bigger sample size.
- Increasing the confidence level will require a bigger sample size.
So, once you’ve decided on the margin of error and the confidence level you can use the formula to determine your sample size!
Pro tip: As a reference, the margin of error in political polls is usually 3%.
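The formula itself is not reproduced in this text. A commonly used version for estimating a proportion in a finite population, which uses the same symbols as the list that follows, is shown here. Treat it as an assumption about which formula is meant, not as a quote from the original:

```latex
\[
n = \frac{N \, Z^{2} \, p \, (1 - p)}{(N - 1) \, e^{2} + Z^{2} \, p \, (1 - p)}
\]
```

For a 95% confidence level, Z is approximately 1.96.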
- n: sample size to be calculated
- N: size of the population (e.g. 1,000 students)
- Z: refers to the confidence level and is derived from a statistical distribution
- e: maximum margin of error I tolerate (e.g. 5%)
- p: proportion we expect to find. As a general rule, if we don’t have any information about the value we expect to find, we use p = 50%.
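To make the calculation concrete, here is a minimal Python sketch that assumes the standard finite-population formula for a proportion, n = N·Z²·p·(1−p) / ((N−1)·e² + Z²·p·(1−p)). Both that choice of formula and the function names are assumptions made for illustration; Z is taken from the standard normal distribution.

```python
import math
from statistics import NormalDist


def z_score(confidence_level):
    """Two-sided critical value Z of the standard normal distribution."""
    return NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)


def sample_size(population_size, margin_of_error, confidence_level=0.95, proportion=0.5):
    """Required sample size n for population N, margin e, and expected proportion p.

    Assumes the finite-population formula for a proportion; the result is
    rounded up so the requested margin of error is not exceeded.
    """
    z = z_score(confidence_level)
    spread = z**2 * proportion * (1 - proportion)  # Z^2 * p * (1 - p)
    n = (population_size * spread) / (
        (population_size - 1) * margin_of_error**2 + spread
    )
    return math.ceil(n)


print(sample_size(10_000, 0.05))                         # 370
print(sample_size(1_000, 0.05))                          # 278
print(sample_size(10_000, 0.03))                         # smaller margin, larger sample
print(sample_size(10_000, 0.05, confidence_level=0.99))  # higher confidence, larger sample
```

The last two calls show the link between the three values: tightening the margin of error or raising the confidence level both push the required sample size up.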
4. Non-response and response bias
Once you’ve determined your ideal sample size, add a couple extra! Why, you ask? Because there will most likely be some who don’t want to answer your survey. To counter the effects of people not responding, you may want to increase your sample size by the expected non-response rate. So, why should you care?
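One common way to account for non-response is to divide the number of answers you need by the share of people you expect to respond. The sketch below does that; the function name, the 370 required responses, and the 40% non-response rate are hypothetical values used only for illustration.

```python
import math


def invitations_needed(required_responses, expected_nonresponse_rate):
    """Number of people to invite so that enough of them are expected to answer."""
    if not 0 <= expected_nonresponse_rate < 1:
        raise ValueError("expected_nonresponse_rate must be in the range [0, 1)")
    expected_response_rate = 1 - expected_nonresponse_rate
    return math.ceil(required_responses / expected_response_rate)


# Hypothetical inputs: 370 completed surveys needed, 40% expected not to respond.
print(invitations_needed(370, 0.40))  # 617
```

Dividing by the response rate, rather than simply adding the non-response percentage on top, ensures the expected number of completed surveys still reaches the target.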
This has to do with two more biases related to your respondents that may influence your data. Response bias lies mainly on the side of the respondent, who may misunderstand the question or answer it untruthfully. You can counter this by making sure your questions are properly phrased and that respondents trust that their answers are anonymous and/or confidential.