There are two great stacks in modern analytics: the technology stack and the social stack.

The technology stack includes all the tools you’re going to invest in to shape the experiences you (and your team) will have.

The social stack includes all the tools and people you’re going to invest in to shape the experiences you (and your team) will have.

Constructing either stack has benefits and drawbacks. Both always have consequences and always carry diseases with them. Both are subject to their own peculiar form of lock-in that’s far more resistant to change than you might suspect.

The good people at Keplar LLC have put together a neat take on the web analytics stack.

Quoting from them:

“What is SnowPlow?

SnowPlow does three things:

  1. Identifies users, and tracks the way they engage with one or more websites
  2. Stores the associated data in a scalable “clickstream” data warehouse
  3. Makes it possible to [use] a big data toolset (e.g. Hadoop, Pig, Hive) to analyse that data”
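To make the idea of a “clickstream” a little more concrete, here is a minimal sketch in Python of what analysing that kind of data might look like. It is not SnowPlow's code or schema: the `PageView` record, its fields, the `sessionize` function, and the 30-minute inactivity gap are all assumptions made up for illustration. It simply groups each user's page views into sessions.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class PageView:
    """A hypothetical clickstream event (not SnowPlow's actual schema)."""
    user_id: str    # assumed identifier, e.g. derived from a cookie
    url: str
    timestamp: int  # seconds since the epoch


def sessionize(events, gap_seconds=1800):
    """Group each user's page views into sessions.

    A new session starts whenever the gap between two consecutive
    page views by the same user exceeds gap_seconds (an assumed
    30-minute default).
    """
    # Collect events per user in chronological order.
    by_user = defaultdict(list)
    for event in sorted(events, key=lambda e: e.timestamp):
        by_user[event.user_id].append(event)

    sessions = {}
    for user_id, views in by_user.items():
        user_sessions = [[views[0]]]
        for previous, current in zip(views, views[1:]):
            if current.timestamp - previous.timestamp > gap_seconds:
                user_sessions.append([current])   # inactivity gap: new session
            else:
                user_sessions[-1].append(current)  # same session continues
        sessions[user_id] = user_sessions
    return sessions
```

Calling `sessionize` on a list of `PageView` records returns a dictionary that maps each user to a list of sessions, each session being a list of page views. At real clickstream volumes this kind of grouping would more likely run in something like Hive or Pig, as the quote suggests, than in a single Python process.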

It’s open sourced, but, until it’s on GitHub, you’ll have to contact them.

Why SnowPlow?

“We developed SnowPlow out of frustration with the limitations of web analytics solutions today. We believe that there are many things that are wrong with today’s web analytics packages.” (Source)

Hopes for SnowPlow:

  • That it will expose more web analysts to machine learning experiences with Weka, Mahout, R, or Python
  • That it will reduce both reporting effort and data latency
  • That it will drive better predictions and stronger optimization routines

The implications for the social stack follow tomorrow.


I’m Christopher Berry.
I tweet about analytics @cjpberry
I write at