Web analytics uses clickstream data.
It’s data that is:
- Generally Anonymous
- Generally Aggregated
- Heavily Abstracted
Most commercial web analytics software abstracts away the raw data with fairly usable interfaces. You’ll be hard-pressed to find many people these days who know how to work with server log data. Yet, it’s still possible to segment a population of browsers based on the characteristics of the browser, computer, and reverse geographic lookup.
That is to say, I can query, through the software, IE browsers in Toronto that arrived from Reddit and, say, Chrome browsers in New York that arrived from search. And then I can compare the differences between them. If it’s an eCommerce site, I may even be able to compare the differences in purchase value.
Most websites are not eCommerce enabled.
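As a rough sketch of the kind of segment comparison described above: the field names, the sample rows, and the `summarize` helper are all invented for illustration and are not taken from any particular analytics product. The rows stand in for anonymous sessions, and the output is a pair of aggregates.

```python
from statistics import mean

# Hypothetical, anonymous session rows: browser traits only, no identities.
sessions = [
    {"browser": "IE", "city": "Toronto", "source": "reddit", "pages": 3, "revenue": 0.0},
    {"browser": "IE", "city": "Toronto", "source": "reddit", "pages": 5, "revenue": 20.0},
    {"browser": "Chrome", "city": "New York", "source": "search", "pages": 8, "revenue": 40.0},
    {"browser": "Chrome", "city": "New York", "source": "search", "pages": 4, "revenue": 0.0},
    {"browser": "Firefox", "city": "Toronto", "source": "direct", "pages": 2, "revenue": 0.0},
]


def summarize(rows, **traits):
    """Aggregate the sessions whose fields match every given trait."""
    matched = [r for r in rows if all(r[k] == v for k, v in traits.items())]
    if not matched:
        return {"sessions": 0, "avg_pages": 0.0, "avg_revenue": 0.0}
    return {
        "sessions": len(matched),
        "avg_pages": mean(r["pages"] for r in matched),
        "avg_revenue": mean(r["revenue"] for r in matched),
    }


toronto_ie = summarize(sessions, browser="IE", city="Toronto", source="reddit")
ny_chrome = summarize(sessions, browser="Chrome", city="New York", source="search")

# The comparison is between two aggregates, never between two people.
difference = {key: ny_chrome[key] - toronto_ie[key] for key in toronto_ie}
print(toronto_ie)
print(ny_chrome)
print(difference)
```

On a site with no eCommerce, the `revenue` column simply would not exist, and the comparison would stop at behavioural measures such as pages per session.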
Most database analysts deal with customer transaction data.
It’s data that is:
- Personally Identifiable
- Generally Unaggregated
These tend to be very large databases with very poor interfaces. And, generally speaking, people are segmented based on what they purchase and the marketing treatments they have received.
The three key letters to understand here are RFM:
- Recency
- Frequency
- Monetary value
That is to say, I can query, through specialized software, customers in Toronto who originated from direct mail and, say, customers who recently changed their address to New York and originated from a campaign in 2002, and compare the differences between them. And, applying some more analytical magic, and this part is key, when nobody is interfering, I can work out lifetime value (LTV).
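For the RFM part, here is a minimal sketch, assuming a flat table of transactions keyed by a customer ID. The customer IDs, dates, amounts, and the `rfm` function are hypothetical. It computes only the three raw measures (days since last purchase, number of purchases, total spend) and leaves scoring and LTV modelling out.

```python
from datetime import date

# Hypothetical transaction rows, each tied to an identifiable customer.
transactions = [
    {"customer": "C001", "date": date(2010, 1, 5), "amount": 120.0},
    {"customer": "C001", "date": date(2010, 3, 20), "amount": 80.0},
    {"customer": "C002", "date": date(2009, 6, 1), "amount": 40.0},
    {"customer": "C003", "date": date(2010, 3, 28), "amount": 15.0},
    {"customer": "C003", "date": date(2010, 2, 14), "amount": 25.0},
    {"customer": "C003", "date": date(2010, 1, 2), "amount": 10.0},
]


def rfm(rows, as_of):
    """Return {customer: (recency_days, frequency, monetary)}."""
    table = {}
    for row in rows:
        last, count, total = table.get(row["customer"], (None, 0, 0.0))
        # Track the most recent purchase date seen for this customer.
        if last is None or row["date"] > last:
            last = row["date"]
        table[row["customer"]] = (last, count + 1, total + row["amount"])
    return {
        customer: ((as_of - last).days, count, total)
        for customer, (last, count, total) in table.items()
    }


scores = rfm(transactions, as_of=date(2010, 4, 1))
for customer, (recency, frequency, monetary) in sorted(scores.items()):
    print(customer, recency, frequency, monetary)
```

The grouping key here is a person, which is exactly what clickstream data generally lacks.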
With clickstream data, you’re examining a superset of traffic, and you’re examining browsers. You’re not really looking at people across multiple sessions, either. At least, not in the sense that others have been led to believe.
Rudimentary RFM analysis has been attempted at the clickstream level, with mixed results. It’s possible to do. But I’d wager it’s actively practiced in fewer than 5% of all web analytics practices out there.
There is a third category of data that has since been lumped in, called ‘VOC’, or ‘Voice of Customer’, data. It consists of market research surveys run on the website. This data, again, is not linked in any way to identifiable customers. And it also covers every other population found on the website, not just customers.
Traditional customer analytics / database mining on customer data is extremely narrow, and web analytics is extremely broad.
I know it’s popular to say that clickstream data is customer intelligence, but knowing what we know about the data itself, that it contains more than just customers in the aggregate, do we really believe that?
I’m Christopher Berry.
I tweet about analytics @cjpberry
I write at christopherberry.ca