Sheet Smart
Posts
Crunching Numbers Like a Superhero: 3 Data Formulas You Should Know

Crunching Numbers Like a Superhero: 3 Data Formulas You Should Know

From Averages to Standard Deviation and Standard Error—Learn How to Crunch Superhero Stats Like a Data Pro!

Steven Bradburn
February 18, 2025 • Estimated Reading Time: 6 minutes

Hey guys! 👋

When it comes to data analysis in spreadsheets, there are 3 essential statistics you should learn.

And when I say learn, I mean know how to calculate them and understand what they represent (do you know all 3?):

Average (or mean)
Standard deviation (SD)
Standard error (SE)

Let’s break it down—Marvel style!

Giphy

Imagine you randomly pick 10 Marvel superheroes and count how many villains each has defeated.

The numbers vary—Iron Man wipes out armies, while Ant-Man takes down a few at a time.

But how do we make sense of this data?

P.S. If you want to download the dataset and follow along, then grab a copy of the spreadsheet below.

1. Average

Let’s start with the easiest one - the average (or mean).

The average is calculated by summing all the values and then dividing by the number of values you have.

Fortunately, there’s a function to calculate the average from a dataset quickly.

=AVERAGE(number1, [number2], ...)

Where:

Number1: The first number, cell reference, or range to average
Number2: Additional numbers, cell references, or ranges to average

❗ Note, the function is AVERAGE and not MEAN.

Here’s how we calculate the average for our example:

=AVERAGE(B2:B11)

The average gives us a single number representing the typical villain count per hero in the sample. In this case, it’s 546.

So, on average, my collection of superheroes defeated 546 villains.

But — it’s just a summary!

It doesn’t tell you how spread out the numbers are (that’s what SD is for) or how much it might change with another sample (that’s what SE is for).

2. Standard deviation (SD)

Some heroes—like Iron Man or Black Panther—take down tons of villains, while others—like Ant-Man—defeat way fewer.

SD tells you how spread out these numbers are.

Small SD = Most heroes defeat a similar number of villains. The data are clustered.
Large SD = Some heroes are wiping out armies while others barely fight, making the numbers wildly different.

The SD would be low if everyone defeated around the same number of villains.

However, SD will be high if your sample includes Ant-Man and Iron Man because their numbers are far apart.

So, SD measures how much individual heroes’ scores vary from the average!

You can use two functions to calculate the SD, and the approach you pick will depend on what your dataset represents - is it a population or a sample?

Population = The entire group that you are interested in
Sample = A subset of the population

In our example, I have a sample of 10 Marvel superheroes. If I had a sheet listing all the Marvel superheroes, I would have a population instead.

The function to calculate the SD is:

Population: =STDEV.P(number1, [number2], ...)
Sample: =STDEV.S(number1, [number2], ...)

❗ Note, for a sample, you can also use a more straightforward function by dropping the S; =STDEV(number1, [number2], ...)

Where:

Number1: The first number or cell reference
Number2: Additional numbers or cell references

For our example, the complete function is:

=STDEV.S(B2:B11)

We get an SD of 267 villains defeated.

Since 267 is almost half of the average (546), it means some heroes are much higher or lower than the average, likely due to extreme cases.

3. Standard error (SE)

Even after calculating an average, how confident can we be that it truly represents all Marvel superheroes?

That’s where SE comes in.

SE tells us how much our sample’s average might change if we picked a different sample of heroes.

A small SE means our sample’s average is pretty stable—if we chose another set of 10 superheroes, the average wouldn’t change much.
A large SE means our sample’s average could vary a lot—meaning we might need a bigger sample for a more reliable estimate.

👉 Key takeaway: The smaller the SE, the more we can trust our sample’s average. Want to shrink SE? Increase the sample size!

Unfortunately, there isn’t an SE function we can easily use in Excel or Sheets.

The SE is calculated by dividing the SD by the square root of the sample size.

So, we can use a simple formula to work this out:

=STDEV.S(B2:B11)/SQRT(COUNT(B2:B11))

Our SE comes to 85.

Since SE is much smaller than SD, our sample average is a reasonably good estimate of the true average for all Marvel superheroes, but there's still some uncertainty.

If we wanted to be even more precise, we could increase the sample size, which would shrink the SE further.

🎓 Want to learn more?

Check out my video tutorial below for a walkthrough of this example by using Google Sheets.

If you’re an Excel user, don’t worry the steps are exactly the same!

I hope you found this article useful!

Until next time.

Stay sheet smart!

Steven

Reply

or to participate.