The Problem With Excel
Excel continues to be THE major tool in analytics.
It shouldn’t be.
Excel:
- Does not scale beyond a single computer, and frequently fails to load with very large data sets
- Does not contain separate model, controller, and viewer modules unless completely forced
- Allows human beings to make too many big mistakes; too much error
On the other hand, Excel:
- Is easy to use
- Creates pretty charts that, with effort, can be dragged into PowerPoint presentations
- Is shareable
- Is cheap
- Is fast (compared to building something accurate or scalable)
There are broader problems with Excel, namely:
- They’re prone to complexity creep
- Engenders disrespect (After all, it’s just pizza and spreadsheets, derp)
- Are generally not easily importable into statistical software for analysis
The Excel / SPSS / Excel / PPT Shuffle:
- Extract the data from disparate platforms, typically in Excel format.
- Load into SPSS / R
- Clean it
- Analyze it (real analytics, hypothesis based statistical tests)
- Consult with others, looking for spurious variables and potential actionable angles
- Write out your story
- Export the data from SPSS / R into Excel
- Create pretty charts
- Load the charts into PPT
- Frig around with the damned image settings for too long
- Edit up your story; tuck and tails
- Communicate
- Action it
- ?????
- Profit!
This workflow is full on, pants on head, stupid. Concretely:
- It’s slow
- It’s prone to a large amount of error
Previous attempts to change this script:
2008, SPSS viewer format output for your report.
Advantage: It’s fast, more accurate, and you see what we see.
Disadvantage: it’s ugly.
Feedback: Oh god, my eyes, they burn they burn. Go back to Excel. Go back to Excel. (pained sobs).
2009, SPSS viewer format output for your report.
Learned a few new prettification techniques to make it a bit more attractive.
Advantage: It’s fast, more accurate, and you see what we see.
Disadvantage: it’s ugly.
Result: Oh god, my eyes, they burn they burn. Stop it. Go back to PPT. Go back to PPT. (sobs).
2010, SaaS format for your report.
Advantage: It’s pretty, near real time, and has a memory.
Disadvantage: It’s taking too long.
Result: Oh god, my budget, kill it! Kill it with fire. (sobs).
2011, Advanced dashboard SaaS format for your report
Advantage: It’s far more functional, is much faster, and has a data dictionary.
Result: The SaaS format is tractable.
I have a huge problem with Excel
Do you see as many problems with it as I?
***
I’m Christopher Berry.
I tweet about analytics @cjpberry
I write at christopherberry.ca