How to Determine a Statistically Valid Sample Size
If you’re running a survey or a test, how many responses do you need for a “statistically valid sample size”? It’s often a difficult goal to achieve, but without valid data, you can’t trust your test results.
Why is this important? Much of the surprise over the 2016 U.S. presidential election result can be traced to poorly conducted polling research. On November 8, 2016, Princeton had Mrs. Clinton as a “stone-cold lock to win” at 99%.
As marketers, we typically don’t face anything as complex as predicting the outcome of something as polarizing as the 2016 election, but gathering accurate data is important for the key marketing decisions that come out of activities like:
- Conducting a brand audit
- Measuring customer loyalty
- Surveying the market to evaluate a new brand name
Statistically Valid Sample Size Criteria
When you’re determining the statistical validity of your data, there are four criteria to consider.
- Population: The reach or total number of people to whom you want to apply the data. The size of your population will depend on your resources, budget and survey method.
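- Probability or percentage: The percentage of people you expect to give a particular response to your survey or campaign. If you have no estimate, 50% is the conservative choice because it produces the largest required sample size.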
- Confidence: How confident you need to be that your data is accurate. Expressed as a percentage, the typical value is 95% or 0.95.
- Margin of Error or Confidence Interval: The amount of sway or potential error you will accept. It’s the “+/-” value you see in media polls. The smaller the percentage, the larger your sample size will need to be.
For example, if 45% of your survey respondents choose a particular answer and you have a 5% (+/- 5) margin of error, then you can conclude, at your chosen confidence level, that between 40% and 50% of the entire population would choose the same answer.
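To make the relationship between the four criteria concrete, here is a minimal sketch in Python. It assumes the common textbook approach (Cochran's formula with a finite population correction), which may differ from what any particular calculator uses. The function names `sample_size` and `margin_of_error_for` and their parameters are illustrative, not taken from any tool.

```python
import math
from statistics import NormalDist


def sample_size(population, confidence=0.95, margin=0.05, proportion=0.5):
    """Estimate how many responses are needed (assumed: Cochran's formula)."""
    # z-score for a two-sided confidence level (about 1.96 for 95%)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Required sample for a very large population
    n0 = (z ** 2) * proportion * (1 - proportion) / margin ** 2
    # Finite population correction: small populations need fewer responses
    n = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n)


def margin_of_error_for(sample, confidence=0.95, proportion=0.5):
    """Approximate margin of error for a given number of responses."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * math.sqrt(proportion * (1 - proportion) / sample)


if __name__ == "__main__":
    print(sample_size(10_000))               # 95% confidence, +/- 5%
    print(sample_size(10_000, margin=0.03))  # tighter margin, larger sample
    print(round(margin_of_error_for(385), 3))
```

With these assumptions, a population of 10,000 at 95% confidence and a +/- 5% margin of error needs roughly 370 responses, and tightening the margin to +/- 3% pushes the requirement above 900. This illustrates why a smaller margin of error demands a larger sample.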
Sample Size Calculators
If you’re looking to determine how many participants you need in an A/B test, check out this sample size tool that will tell you how many visitors you need at various conversion rates for different desired confidence levels.
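For a rough idea of what such a tool computes, the sketch below estimates the visitors needed per variant in an A/B test. It assumes the standard two-proportion formula and a statistical power of 80%. Power is not discussed above and is an assumption here, as are the function name `visitors_per_variant` and its parameters. A specific calculator may use different defaults.

```python
import math
from statistics import NormalDist


def visitors_per_variant(baseline_rate, expected_rate, confidence=0.95, power=0.80):
    """Estimate visitors needed in EACH variant (assumed: two-proportion z-test)."""
    # z-scores for the two-sided confidence level and for the assumed power
    z_alpha = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    z_beta = NormalDist().inv_cdf(power)
    # Average conversion rate across both variants
    pooled = (baseline_rate + expected_rate) / 2
    spread_null = math.sqrt(2 * pooled * (1 - pooled))
    spread_alt = math.sqrt(
        baseline_rate * (1 - baseline_rate) + expected_rate * (1 - expected_rate)
    )
    numerator = (z_alpha * spread_null + z_beta * spread_alt) ** 2
    # The smaller the difference you want to detect, the more visitors you need
    return math.ceil(numerator / (expected_rate - baseline_rate) ** 2)


if __name__ == "__main__":
    # Hypothetical example: 10% baseline conversion, hoping to detect a lift to 12%
    print(visitors_per_variant(0.10, 0.12))
```

Under these assumptions, detecting a lift from a 10% to a 12% conversion rate takes a few thousand visitors per variant, and the requirement grows quickly as the expected lift shrinks or the desired confidence level rises.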