It was six months ago that I called on data miners to really unite with web analysts on the next generation of Internet marketing and web analytics.
I’ll extend an invite to members of the Toronto Data Mining Forum to come out for a Web Analytics Wednesday in November.
The reasons are multiple.
Last week, Google announced a whole bunch of new data exploration functionality at eMetrics. I must say, much of that functionality is beautiful, and it certainly entices people to explore.
It also struck a chord with me about just how much more rigorous we need to get in applying the scientific method to web analytics data.
Data exploration has its function – ideally, an individual would be able to formulate a hypothesis, and then test it very rapidly. Some of Google’s new functionality enables that kind of rapid hypothesis formation.
But how many people are going to screw around with the data before getting frustrated, and then disengaging?
Hypotheses are important when it comes to searching for business insight in web analytics data, as aptly demonstrated by Peter C. Austin’s study. The deeper you look for random insights, without any sort of hypothesis, the greater the chance you’re going to stumble upon Type I (false positive) errors. In Austin’s case, the false positives suggested that various zodiac signs cause various health conditions to emerge in Ontario populations. (Perhaps date of birth could be related to certain long-term health conditions through climate and the diet of the mother; however, I see no plausible hypothesis for linking the breaking of bones to zodiac sign!) The longer you look, the more likely you are to find something that is really random but appears to be significant.
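To put a rough number on that, here is a minimal sketch of how the odds of a false positive grow with the number of unplanned comparisons. It assumes every test is independent and run at the same significance level, which is a simplification (real web analytics metrics are usually correlated). The function names are my own, purely for illustration, and the Bonferroni correction is just one common remedy, not something taken from Austin’s study.

```python
def familywise_error_rate(alpha: float, num_tests: int) -> float:
    """Chance of at least one false positive across independent tests.

    Assumes each test is independent and uses the same significance level.
    """
    return 1.0 - (1.0 - alpha) ** num_tests


def bonferroni_threshold(alpha: float, num_tests: int) -> float:
    """Per-test significance level that keeps the overall rate at or below alpha."""
    return alpha / num_tests


alpha = 0.05
for num_tests in (1, 5, 20, 100):
    uncorrected = familywise_error_rate(alpha, num_tests)
    corrected = familywise_error_rate(
        bonferroni_threshold(alpha, num_tests), num_tests
    )
    print(
        f"{num_tests:>3} tests: "
        f"uncorrected {uncorrected:.1%}, with Bonferroni {corrected:.1%}"
    )
```

Under these assumptions, 20 unplanned looks at the 0.05 level already carry roughly a 64% chance of at least one spurious "insight". A focused hypothesis keeps the number of tests, and with it that risk, small.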
Focused hypotheses are also time-saving devices. Vital time-saving devices. So often, people head out into the data without a concrete idea of what they’re looking to achieve.
Data miners, at least in my perception, are more careful with their hypothesis formation because the costs of analysis are that much higher. Data mining has also been around longer as a discipline than web analytics, and as a result, web analysts can learn a lot from its approaches.
Just as web analysts stand to benefit from the rigor of data miners, data miners too stand to benefit. There is an entire world of relevant analytics that data miners might be very interested in knowing about. The data also happens to be dirtier in web analytics, so there could be important challenges that are new to the data mining world. Their insights would be actionable.
In sum: unite!