📍To business decision-makers: As a data scientist who taught at university and wrote two textbooks in the field, I would like to share my knowledge in bite-sized articles to help you navigate the world of data and AI with confidence and clarity.
ℹ️ This symbol means you can click to learn more
Since this is a bite-sized article, I will stick to the storyline and cover the essentials in the main text. But if you are keen to learn more or go deeper, further explanations are available under ℹ️
🔗To my fellow data specialists: Alongside each article, I’ll share the full code, often packaged as handy helper functions you can easily integrate into your own workflow.
Imagine this: you’re the CEO of a retail chain with two department stores, A and B. You’re reviewing the quarterly report, where a bar chart shows that Store A scores 80 out of 100 in customer satisfaction while Store B scores 75. Should you replicate Store A’s practices and invest in improving Store B?
What if I told you that in one scenario, this action could cost your company millions, while in another scenario, it’s exactly the right move?
The difference between the two scenarios isn’t in the numbers you see—it’s in the numbers you don’t.
🎯In the next 10 minutes, you’ll learn:
- How very different business realities can hide behind the same bar chart
- Three practical steps to uncover the full story and avoid costly misinterpretations
The Problem with Summaries
Business decisions often rely on simple summaries shown in bar or line charts:
- ratings across products
- customer satisfaction across stores
- employee engagement across teams
But summaries like this hide critical details—the very details that can make or break your next strategic move.
Let’s go back to the store example. When you imagine the chart comparing Store A and Store B, what do you see? Likely something like below: two bars, one a little taller than the other.
Here’s the twist: three distinct business scenarios—each requiring a different decision—could produce the exact same bar chart 🤯.
🔎Ready to see what your data isn’t telling you?
What the Bar Chart Hides – The Rest of the Story
Let’s look at three very different business realities that can hide behind the same bar chart.
Scenario 1: Small Sample, Small Variance
In Scenario 1, both stores have relatively small sample sizes (n = 50) and low variance (standard deviation = 5).
ℹ️ Variance and standard deviation (std) measure how spread out the data is from the average.
- Variance is the average of the squared differences from the mean. It gives a sense of the overall spread of data points, but its unit is squared, which makes it less intuitive.
- Standard deviation (std) is the square root of variance. Because it’s in the same unit as the data (e.g., satisfaction points), it’s much easier to interpret directly. For example, with an std of 5, roughly two-thirds of customer satisfaction scores fall within about 5 points above or below the average.
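To make these two definitions concrete, here is a minimal sketch using NumPy. The satisfaction scores are hypothetical, invented purely for illustration:

```python
import numpy as np

# Hypothetical satisfaction scores for one store (out of 100)
scores = np.array([78, 82, 75, 80, 85, 79, 81, 77, 83, 80])

mean = scores.mean()
variance = scores.var(ddof=1)  # sample variance: average squared distance from the mean
std = scores.std(ddof=1)       # standard deviation: square root of variance, in score points

print(f"mean = {mean:.1f}, variance = {variance:.2f}, std = {std:.2f}")
```

Note that the std (about 2.9 here) is directly comparable to the scores themselves, while the variance is in squared points.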
These details are invisible in the bar chart. But when we switch to an alternative graph—the box-scatter plot—you can see each customer’s score as a point, and you can also see the statistical test result displayed in the corner.

The graph above tells us:
- Customer scores are tightly clustered around each store’s mean.
- The 5-point gap between stores is consistently visible.
- Statistical testing (ANOVA) confirms the difference is real, not just chance.
💡Key insight: In this scenario, you would be right to replicate Store A’s practice and invest in Store B’s improvement.
ℹ️ Think of ANOVA as a referee: it checks whether the difference between groups is big enough that it’s unlikely to be random noise.
- ANOVA (Analysis of Variance): Compares the averages of two or more groups and asks, “Is this gap larger than what random chance would usually create?” If yes, we say the difference is statistically significant.
- Other common tests include
- T-test: Compares the means of two groups.
- Welch’s t-test: A variant of the t-test that handles groups with unequal variances.
- Kruskal-Wallis test: Similar to ANOVA, but for data that isn’t normally distributed; it compares the rankings of the groups rather than their averages.
- Reading p-values (practical guide for business):
- The p-value tells you how likely the observed difference is due to random chance.
- Smaller p-values mean the difference is less likely to be random:
- p < 0.05 → reasonably confident the difference is real
- p < 0.01 → very confident the difference is real
- p < 0.001 → extremely confident the difference is real
- If a statistical test is not significant (i.e., p > 0.05), it doesn’t mean there is no difference between the groups. It just means that, given the sample size and variability, we cannot confidently say the difference is real—the observed gap could be due to random noise.
- Tip for business decision-makers: Choosing the right statistical test depends on your data type, sample size, and distribution. It’s always wise to consult your data specialist to ensure that both the test and the interpretation of its results match your scenario.
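For readers who want to see these tests in action, here is a minimal sketch using SciPy on simulated scores. The store parameters (mean 80 vs. 75, std 5, n = 50) mirror Scenario 1, but the data itself is randomly generated for illustration:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
store_a = rng.normal(loc=80, scale=5, size=50)  # simulated: mean 80, std 5
store_b = rng.normal(loc=75, scale=5, size=50)  # simulated: mean 75, std 5

# ANOVA: is the gap between group averages larger than chance would create?
f_stat, p_anova = stats.f_oneway(store_a, store_b)

# t-test (assumes equal variances) and Welch's t-test (allows unequal variances)
t_stat, p_ttest = stats.ttest_ind(store_a, store_b)
w_stat, p_welch = stats.ttest_ind(store_a, store_b, equal_var=False)

# Kruskal-Wallis: rank-based alternative for non-normally distributed data
h_stat, p_kruskal = stats.kruskal(store_a, store_b)

print(f"ANOVA p = {p_anova:.4f}, t-test p = {p_ttest:.4f}")
```

With tight data like this, all four tests agree that the 5-point gap is statistically significant.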
📦Tip for fellow data specialists: The graph above is easy to make with the code below. In addition to customising its appearance, you can also choose a statistical test suited to your data. Please check out the MLarena docs on GitHub for details.
from mlarena.utils.plot_utils import plot_box_scatter

# scenario_a is a DataFrame with 'store' and 'satisfaction' columns;
# colors is your chosen palette
fig, ax = plot_box_scatter(
    scenario_a,
    x='store',
    y='satisfaction',
    show_stat_test=True,
    stat_test='anova',
    palette=colors,
)
Scenario 2: Small Sample, Large Variance
In Scenario 2, both stores still have small sample sizes (n = 50) and the same mean scores (80 for Store A, 75 for Store B). But now, customer satisfaction scores have high variance. This changes the story dramatically:

- While the bar chart looks exactly the same for the two scenarios, the box-scatter plot above shows that the data points in Scenario 2 are much more widely scattered.
- The difference between the two stores is now hard to distinguish from random noise.
- Consistent with this visual intuition, statistical analysis shows the difference is not statistically significant.
- Even though the means are identical to Scenario 1, we cannot confidently conclude that Store A truly outperforms Store B.
💡Key insight: The same mean difference can tell completely different stories depending on data variability.
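This key insight can be verified with a quick simulation: the same 5-point gap at the same sample size, once with low variance (std = 5, as in Scenario 1) and once with high variance (std = 20, a hypothetical value for Scenario 2). The data is randomly generated, so exact p-values will vary from run to run:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 50

# Scenario 1: 5-point gap, low variance (std = 5)
a1 = rng.normal(80, 5, n)
b1 = rng.normal(75, 5, n)

# Scenario 2: same 5-point gap, same n, high variance (std = 20)
a2 = rng.normal(80, 20, n)
b2 = rng.normal(75, 20, n)

_, p1 = stats.f_oneway(a1, b1)  # expected: clearly significant
_, p2 = stats.f_oneway(a2, b2)  # expected: much weaker evidence

print(f"Scenario 1 p = {p1:.4f}  |  Scenario 2 p = {p2:.4f}")
```

The identical mean gap produces a far smaller p-value when the data is tightly clustered: noise dilutes the evidence.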
What To Do With Noisy Data?
How do you make data-driven decisions then, when your data is noisy (i.e., has high variance)? Scenario 3 provides the answer.
In Scenario 3, we maintain the same high variance as Scenario 2 but dramatically increase the sample size. This demonstrates the power of larger datasets:

- Data points remain widely scattered (same high variance as Scenario 2)
- However, the larger sample size provides much more statistical power
- With more data points, we can now distinguish the signal from the noise: Statistical analysis shows the difference IS statistically significant despite the high variance
- The larger sample gives us confidence that Store A truly outperforms Store B
💡Key insight: When variance is high, larger sample sizes can increase our ability to detect a real difference.
ℹ️ Statistical power is the ability of a test to detect a difference when one actually exists.
- Low power (small, noisy samples): Even if a real difference exists, the test may fail to detect it — like trying to spot a faint signal on a fuzzy radio
- Power and sample size: One of the most practical ways to increase power is to collect more data. For example, in Scenario 3, we kept the same high variance as Scenario 2 but increased the sample size tenfold. That extra data gave us the statistical power to separate signal from noise and confidently conclude that Store A outperformed Store B.
- How big is big enough? Great question. The answer depends on the variability in your data and the size of the difference you care about. Stay tuned, in the next bite-sized article, I’ll share a practical guide for business decision-makers on power and sample size so you know when you have “enough data” to act with confidence.
📦Tip for fellow data specialists: I will introduce easy-to-use functions on power and sensitivity analysis in a future bite-sized article.
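As a preview, statistical power can be estimated directly by simulation: repeat the Scenario 2 experiment many times and count how often the test detects the true 5-point gap, at n = 50 versus n = 500. The parameters are illustrative, and the simulation-based approach here is a generic sketch, not the helper functions mentioned above:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

def estimated_power(n, n_sims=500, mean_gap=5, std=20, alpha=0.05):
    """Fraction of simulated experiments in which the test detects the gap."""
    hits = 0
    for _ in range(n_sims):
        a = rng.normal(80, std, n)
        b = rng.normal(80 - mean_gap, std, n)
        _, p = stats.f_oneway(a, b)
        if p < alpha:
            hits += 1
    return hits / n_sims

power_small = estimated_power(n=50)   # Scenario 2: small, noisy sample
power_large = estimated_power(n=500)  # Scenario 3: same noise, 10x the data

print(f"power at n=50: {power_small:.2f}  |  power at n=500: {power_large:.2f}")
```

With the same high variance, tenfold more data lifts the detection rate from a coin-flip-or-worse to near certainty.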
When a Significant Result Isn’t a Big Deal
Comparing Scenario 1 and Scenario 3, would you say that since both show 5-point differences that are statistically significant, the two scenarios are essentially the same?
The answer is a big NO ⛔
- Scenario 1:
- The 5-point difference represents 100% of the standard deviation — a very strong effect.
- 👉 Suggests a major operational difference worth immediate replication.
- Scenario 3:
- The same 5-point difference is only 25% of the standard deviation — a small effect.
- 👉 Indicates only a modest advantage that may not justify large-scale changes.
💡 Key insight: Statistical significance tells you whether a difference is real. Effect size tells you whether that difference is big enough to matter for business.
ℹ️ Effect size measures the magnitude of the difference, not just whether it exists.
- It puts the difference in context of the variability in your data (e.g., a 5-point gap can look huge if your data is tightly clustered, or tiny if your data is very spread out).
- Different measures exist (Cohen’s d, Pearson’s r, odds ratios, etc.), but the core idea is the same: how big is the impact?
- For business, effect size helps decide whether a result is worth acting on — not just whether it passes a statistical test.
- I will explain effect size more in a future bite-sized article.
📦Tip for fellow data specialists: You guessed it, I have easy-to-use functions on effect-size to share with you too in a future article.
💡Key insight: Don’t assume all statistically significant results deserve the same response—the size of the effect matters for resource allocation.
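For a concrete sense of effect size, here is a sketch of Cohen’s d, the standardised mean difference (the gap between means divided by the pooled standard deviation). The groups are simulated with illustrative parameters matching Scenarios 1 and 3:

```python
import numpy as np

def cohens_d(group_a, group_b):
    """Cohen's d: mean gap divided by the pooled standard deviation."""
    a, b = np.asarray(group_a), np.asarray(group_b)
    n_a, n_b = len(a), len(b)
    pooled_var = ((n_a - 1) * a.var(ddof=1) + (n_b - 1) * b.var(ddof=1)) / (n_a + n_b - 2)
    return (a.mean() - b.mean()) / np.sqrt(pooled_var)

rng = np.random.default_rng(1)

# Scenario 1 style: 5-point gap, std 5 -> d around 1.0 (a large effect)
d_tight = cohens_d(rng.normal(80, 5, 50), rng.normal(75, 5, 50))

# Scenario 3 style: same 5-point gap, std 20 -> d around 0.25 (a small effect)
d_noisy = cohens_d(rng.normal(80, 20, 500), rng.normal(75, 20, 500))

print(f"tight data d = {d_tight:.2f}, noisy data d = {d_noisy:.2f}")
```

Both scenarios pass the significance test, yet the effect sizes differ roughly fourfold, which is exactly what should drive the resource-allocation decision.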
Put It All Together
Key takeaways and actionable steps for business decision makers:
🚫 What NOT to do:
- Don’t make decisions based solely on mean differences
- Don’t assume identical means represent identical business situations
✅ What TO do:
- Always request distribution information alongside means (e.g., box plots, scatter plots, or variance metrics such as standard deviation)
- Ask for statistical significance testing before concluding that observed differences are actionable
- Ask for effect size to understand whether statistically significant differences justify the cost of action
🎁 Bonus point: When results are inconclusive due to high variance, consider collecting larger samples to increase statistical power and bring clarity.
🎯 Bottom line: The same 5-point mean difference can justify immediate action (Scenario 1), require more data collection (Scenario 2), or confirm action with high confidence but modest impact (Scenario 3). Understanding data variability, statistical significance, and effect size prevents costly misinterpretations of your business metrics.
🔮 What’s next: I’ll write more bite-sized articles illustrating key concepts in Data and AI for business decision-making. Effect size, statistical tests and statistical power which we touched on in this article are all on the list. Let me know what else you’d like to see next 🤗
I write about data, ML, and AI for problem-solving. You can also find me on 💼LinkedIn | 😺GitHub | 🕊️Twitter
Unless otherwise noted, all images are by the author.
