Excel continues to be THE major tool in analytics.

It shouldn’t be.

Excel:

  • Does not scale beyond a single computer, and frequently fails to load with very large data sets
  • Does not contain separate model, controller, and viewer modules unless completely forced
  • Allows human beings to make too many big mistakes; too much error

On the other hand, Excel:

  • Is easy to use
  • Creates pretty charts that, with effort, can be dragged into PowerPoint presentations
  • Is shareable
  • Is cheap
  • Is fast (compared to building something accurate or scalable)

There are broader problems with Excel, namely:

  • They’re prone to complexity creep
  • Engenders disrespect (After all, it’s just pizza and spreadsheets, derp)
  • Are generally not easily importable into statistical software for analysis

The Excel / SPSS / Excel / PPT Shuffle:

  • Extract the data from disparate platforms, typically in Excel format.
  • Load into SPSS / R
  • Clean it
  • Analyze it (real analytics, hypothesis based statistical tests)
  • Consult with others, looking for spurious variables and potential actionable angles
  • Write out your story
  • Export the data from SPSS / R into Excel
  • Create pretty charts
  • Load the charts into PPT
  • Frig around with the damned image settings for too long
  • Edit up your story; tuck and tails
  • Communicate
  • Action it
  • ?????
  • Profit!

This workflow is full on, pants on head, stupid. Concretely:

  • It’s slow
  • It’s prone to a large amount of error

Previous attempts to change this script:

2008, SPSS viewer format output for your report.
Advantage: It’s fast, more accurate, and you see what we see.
Disadvantage: it’s ugly.
Feedback: Oh god, my eyes, they burn they burn. Go back to Excel. Go back to Excel. (pained sobs).

2009, SPSS viewer format output for your report.
Learned a few new prettification techniques to make it a bit more attractive.
Advantage: It’s fast, more accurate, and you see what we see.
Disadvantage: it’s ugly.
Result: Oh god, my eyes, they burn they burn. Stop it. Go back to PPT. Go back to PPT. (sobs).

2010, SaaS format for your report.
Advantage: It’s pretty, near real time, and has a memory.
Disadvantage: It’s taking too long.
Result: Oh god, my budget, kill it! Kill it with fire. (sobs).

2011, Advanced dashboard SaaS format for your report
Advantage: It’s far more functional, is much faster, and has a data dictionary.

Result: The SaaS format is tractable.

I have a huge problem with Excel

Do you see as many problems with it as I?

***

I’m Christopher Berry.
I tweet about analytics @cjpberry
I write at christopherberry.ca

One thought on “The Problem With Excel

  1. Tim Wilson says:

    I fall more in the “Excel’s fine for a lot of legitimate reporting and analysis.” If the premise is that, without applying advanced statistical techniques, then no good can come from the data…then I’d agree. But, I think that’s a flawed premise.

    Portability (and cost)? You can’t beat it — everyone has Excel. $0 incremental cost.

    Data visualization? Excel has more flexibility than many analytics platforms (less than pure play data viz platforms…but see ‘portability’ above).

    Complexity creep? Oh. YEAH. THAT’s a risk. I draw the line at including macros that are required (I’ll use them for one-time cleanups and such). But, formulas get bloated in a hurry, and that’s a definite risk, because they can’t be documented and they’re just one long-ass text string.

    Just like with web analytics platforms, it’s easy to get really comfortable with a single platform, when, in reality, there IS NO “one single platform.” They all overlap capabilities / usage-wise (which is better than having gaps!), and they can all be made to do things where other platforms would be much more appropriate.

    It’s not black-and-white.

Comments are closed.