Every measurement is subject to some uncertainty, but researchers sometimes forget this. A common mistake when interpreting results is ignoring the uncertainty of samples, which leads to decisions based on misread data. To make sure we’re all on the same page, let’s start with the basics.
What are samples and what do we use them for?
Market researchers and analysts are usually interested in learning something about a certain population, e.g. all employees in an organization. Getting data from the entire population would be ideal. However, this is often impossible for various reasons, the most common being time and money. Instead, researchers use a sample of that population. The common approach is to run statistics on the sample and use the results as “estimates” for the entire population.
Now that we’ve got that covered, let’s move on to an example.
Pure Digital is a marketing agency with a customer base of 10,000 customers. They want their customers to rate their satisfaction with the marketing services Pure Digital provides. To do so, they create a one-question survey and send it to a subset of 300 customers on a yearly basis.
Based on the data collected from these 300 customers, Pure Digital calculates an average satisfaction score for each year:
Here’s where the common mistake happens. Researchers and analysts tend to look at these yearly averages (4.2 in previous years, 3.8 this year) and conclude that customer satisfaction is deteriorating. But is it? Not necessarily.
The problem
This conclusion is based on the assumption that 3.8 in the sample represents 3.8 in the total population (and likewise for the 4.2 average of the previous years). This is not correct! If a different sample had been taken, the average satisfaction might have been the same or entirely different. In the above example, Pure Digital may have drawn, entirely by chance, somewhat more dissatisfied customers into this year’s sample, and they pulled the average rating down. The sample score is a good indication of how satisfied the 300 sampled customers are, but not necessarily of how satisfied all 10,000 customers are. What the market researcher didn’t do is take into account the inherent uncertainty in the satisfaction scores.
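To see how much a sample average can move by chance alone, here is a minimal simulation sketch. Everything in it is an assumption made for illustration: the population of 10,000 ratings is randomly generated (it is not Pure Digital’s data), and the rating weights and the helper name `simulate_sample_means` are hypothetical.

```python
import random
import statistics


def simulate_sample_means(population, sample_size, n_samples, seed=0):
    """Draw several random samples (without replacement) and return each sample's mean."""
    rng = random.Random(seed)
    return [
        statistics.mean(rng.sample(population, sample_size))
        for _ in range(n_samples)
    ]


# Hypothetical population: 10,000 ratings on a 1-5 scale (made-up weights).
population_rng = random.Random(42)
population = population_rng.choices(
    [1, 2, 3, 4, 5], weights=[5, 10, 15, 30, 40], k=10_000
)

# Five different samples of 300 customers from the same population.
sample_means = simulate_sample_means(population, sample_size=300, n_samples=5)

print("Population average:", round(statistics.mean(population), 2))
print("Sample averages:   ", [round(m, 2) for m in sample_means])
```

The population never changes between draws, yet the five sample averages will generally differ from each other and from the population average. That spread is exactly the uncertainty the point estimate hides.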
The consequences
If you don’t consider this uncertainty, you might end up overreacting or under-reacting. For example, let’s assume that the average satisfaction across all 10,000 customers is 4.2 (while the sample tells us 3.8). What would we conclude? We would mistakenly conclude that our company is not performing well when in fact it is. However, if the average across all customers is 3.6 (and the sample still says 3.8), we might think that we’re not doing as badly as we actually are.
In short, if we assume that a statistic such as an average from a sample is the same in the total population, we make mistakes. Mistakes that can potentially be costly and time-consuming.
The solution
In statistics, the average of a sample would be referred to as a point estimate. A point estimate by itself might be a good start but it doesn’t provide any information about how “good” this estimate is – it doesn’t take into account the uncertainty.
To get an idea of the error we might be making because we have a sample and not the total population, we can use confidence intervals, also known as range estimates. In contrast to a point estimate, a range estimate provides a whole range of plausible values for the population average.
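As a rough illustration of how such a range estimate can be computed, the sketch below builds a 95% confidence interval for an average using the normal approximation (sample mean plus or minus z times the standard error). The function name `mean_confidence_interval` and the 300 ratings are made up for this example and are not Pure Digital’s data. For small samples a t-distribution would be the more appropriate choice.

```python
import math
import statistics


def mean_confidence_interval(ratings, confidence=0.95):
    """Return (low, high) for the population mean, using the normal approximation."""
    n = len(ratings)
    mean = statistics.mean(ratings)
    sd = statistics.stdev(ratings)  # sample standard deviation (divides by n - 1)
    standard_error = sd / math.sqrt(n)
    # z-value for a two-sided interval, e.g. about 1.96 for 95% confidence
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * standard_error
    return mean - margin, mean + margin


# Hypothetical sample of 300 ratings on a 1-5 scale (made-up counts).
ratings = [1] * 10 + [2] * 20 + [3] * 40 + [4] * 120 + [5] * 110

low, high = mean_confidence_interval(ratings)
print(f"Point estimate: {statistics.mean(ratings):.2f}")
print(f"95% confidence interval: {low:.2f} to {high:.2f}")
```

The width of the interval depends on how spread out the ratings are and on the sample size: more varied answers or fewer respondents give a wider, less certain range.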
The correct interpretation of data
For the example above, instead of assuming that the 3.8 average of the sample can be generalized to the total population, Pure Digital should compute the confidence interval and base their decision-making on a statement that says “we can be 95% confident that the true population average lies between 3.8 and 4.2.”
We moved from a simple point estimate (satisfaction of all customers is 3.8) to a range estimate (it is quite likely, at 95% confidence, that satisfaction lies between 3.8 and 4.2). The difference is vital because it directly affects decisions. In this case, we could conclude that the difference between 4.2 and this year’s quite likely 4.0 is not big enough for Pure Digital to start redesigning the marketing services they offer.
In conclusion, by taking random samples and computing range estimates instead of point estimates, we acknowledge that our estimate of the population is to some degree uncertain and we are better equipped to avoid costly under- or overreactions.