Testing the reliability of data distribution in survey analysis
What was the experiment?
Assessing how well synthetic data can align with real survey responses.
How did we do it?
We conducted a study with 7,681 respondents across seven international markets, swapping portions of the real survey data with synthetic data to see how this would affect the results. We replaced different percentages of the original responses with synthetic ones, starting at 10% and increasing up to 50%, and compared how answer distributions changed across 57 survey questions.
What did we find out?
Even when half of the data was synthetic, the difference from the original remained around just 1%. Notably, this pattern held across the markets studied: Australia, India, Japan, Mexico, Singapore, the UK, and the US. That consistency reinforces the potential for synthetic data to support global research without compromising quality.
TL;DR
Synthetic data aligned closely with real data in this study, so it can be a reliable way to supplement real survey responses while maintaining accuracy.
EXPERIMENT #1