Raw Data is a commodity.
That’s the overwhelming conclusion I’m running into – that’s the direction of my thinking over the past 4 weeks.
I had an excellent talk today with Jennifer Day. It was a ‘catalytic’ talk. She called me to ask about a tweet on the pre-click analytics side, and she listened very patiently as I went into great detail about the procedures involved and the value of that type of analytics. Somehow I spun off into a rant about data. (Hard to imagine.)
I said, in effect:
“See, the problem with the web analytics vendors today is that they’re caught in a trap. They are only as successful as the people who use their tools. And so many of the people who use their tools are not statisticians. There’s a huge amount of processing power that goes into real analytics. We’re moving large amounts of data around. If you’re offering a distributed computing service, like web analytics, you want to minimize those processing cycles. But in the meantime, you force people like me to extract, transform, and load the data into a desktop client and run the heavy-lifting analyses there. It would be comparatively easy for you guys just to do that heavy lifting within your web analytics framework, but the incentive to do so isn’t obvious.”
I won’t quote Jennifer Day here. But I went on to say:
“Now you have this great ‘data visualization’ movement. You have Google pushing it very hard with that entire flashterbation circle API thingy. Google is trying to make it easy for people to do ‘analysis’ by way of visualization, but notice how they’re trying to move the processing power behind it away from their servers, and do it on the client side – in the browser. And I don’t forecast browsers being able to handle the kind of heavy lifting required anytime soon. They’re certainly trying to get the browser to perform calculations faster and faster – but I don’t know if you could do real mining that way. At least not for another 3 or 7 years.”
Jennifer said a few brilliant things, which led me to say:
“Data is cheap. Data is to a business as fresh water is to a Canadian. A Canadian will go to the bathroom, deposit 15 millilitres of urine into 8 litres of fresh water, and flush it. Data is ridiculously cheap. Businesses are drowning in it, and there’s nobody who can swim, let alone detect which way the current is going.”
Data is like water.
In hindsight, though, when I take the analogy a bit further – it’s not pure Canadian toilet water.
It’s salty. Like seawater.
Data is like seawater.
In order to get it into something consumable, it actually takes a lot of energy. It takes filtering out the fish and seaweed at minimum. But then you actually have to distill it so that what you get in the end is something consumable. There’s extracting, transforming, and loading – all before analysis.
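To make the desalination step concrete, here is a minimal sketch of extract, transform, and load in Python, using only the standard library. Everything in it is an assumption for illustration: the raw export, the column names, and the `visits` table are made up, and a real web analytics export would be far messier.

```python
import csv
import io
import sqlite3

# Hypothetical raw export (the "seawater"). Columns and values are invented.
RAW_LOG = """timestamp,page,visitor_id,duration_sec
2000-01-01T10:00:00,/home,a1,12
2000-01-01T10:00:05,/pricing,a1,
not-a-row
2000-01-01T10:01:00,/home,b2,40
"""


def extract(text):
    """Extract: read the raw export into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: filter out the fish and seaweed, then cast types."""
    clean = []
    for row in rows:
        duration = row.get("duration_sec")
        # Skip rows with a missing or non-numeric duration.
        if not duration or not duration.isdigit():
            continue
        clean.append((row["page"], row["visitor_id"], int(duration)))
    return clean


def load(rows, conn):
    """Load: write the cleaned rows into a queryable table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS visits "
        "(page TEXT, visitor_id TEXT, duration_sec INTEGER)"
    )
    conn.executemany("INSERT INTO visits VALUES (?, ?, ?)", rows)
    conn.commit()


def average_duration_by_page(conn):
    """Analysis: only possible once the three steps above are done."""
    cur = conn.execute(
        "SELECT page, AVG(duration_sec) FROM visits GROUP BY page"
    )
    return dict(cur.fetchall())


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    load(transform(extract(RAW_LOG)), conn)
    print(average_duration_by_page(conn))
```

Notice that the analysis itself is one line of SQL. Almost all of the code, and in real life almost all of the energy, goes into getting the water drinkable first.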
I’ll be really clear on what I need to drive business value:
I really need for technologists to make the process of pointing a hose at the ocean easy. I need a desalination plant that works well. I need piping.
I need that big old Cloud Computer to actually want to crunch through my data. I need to have vendors *want* to provide me with analytical utilities.
So my note to vendors is pretty simple:
Data is a Commodity. Stop thinking that you’re selling me water, because you’re not. Help me move it and use it.