Effective analytics is disruptive because being smarter leads to smarter actions. Organizations do not, and probably cannot, change as rapidly as the intelligence suggests. This alone can be a massive source of frustration, both for analytics professionals and for other areas of management within the organization. Three key questions to consider: Small failure is likely and common within most organizations – are you comfortable with those failures getting surfaced? Small success is likely and common within most organizations – are you more concerned with sharing the resulting insights than with assigning credit? Do you have a system for change management and updating strategy? Analytics shines a harsh light on previously dark corners. And yet, knowing what you don’t[…]

No fewer than four companies in Toronto are looking to build analytics departments. I’m excited for them. A few points of advice as they move forward: If you don’t like the truth, you’re not going to like analytics. Effective analytics is disruptive and prompts change. If you’re not open to changing, then there’s no point in being smarter. You’re better off being dumb. You’ll hit a trough of disillusionment, usually because too many of the wrong people in the organization are looking for too many of the wrong numbers, getting frustrated that nothing is telling them anything (that they want to hear or see), and that there isn’t a transformation. Some organizations never get out of that trough and give[…]

Marketing Science isn’t Physics. One of the great things about the Marketing Science community is the sane approach to assumptions. Unlike economics, marketing science aims to make reliable predictions about the world, just like the other grown-up sciences. Consider: Physics is just called physics. Chemistry is just called chemistry. Biology is just called biology. None of these three fields uses the description ‘science’ to affirm that it is a science. They just are ‘science’. Next, consider: Marketing. Politics. These are two sciences that are very young – especially as compared to physics. Marketing Science is really only 50 years old. Marketing Science has special problems that are created by the subject matter itself. Consider: The same laws of motion that put a[…]

I believe this classification was first enunciated by Alex Langhshur at eMetrics Toronto (2008). It’s worth expanding upon. Consider the following classification for web analytics metrics: Pre-Click Metrics, On-Site Metrics, and Post-Click Metrics. To unpack that: Pre-Click Metrics refer to all activities that led to a visit to an owned digital property (e.g., paid search keywords, referring domain, any traditional spend). On-Site Metrics refer to all the activities that can be observed on the site (e.g., visits to specific pages, graph analysis / path analysis, time spent on site, and a host of very specific things like the nebulous world of engagement). Post-Click Metrics refer to all the activities that occur after the visit. (E.g., money getting transferred to your bank[…]

I can tell you right now how I’m behaving in a multi-medium media world. I have several tabs open as I write this: Facebook – which contains a newsticker stream of activity. RDIO – which is playing music, piped right into my ear. Blogger – where I’m typing this. A half dozen tabs I haven’t visited for a few hours, or have yet to actively click upon. I also have an Adobe Air App loaded: Tweetdeck. That’s just one device – my MacBook. I attended an INFORMS conference in 2008 where a particularly bright professor presented findings from a new type of diary. His findings suggested that multi-medium media consumption was the norm. I.e., many people reported listening to[…]

SPSS, R, and Python (matplotlib) have very functional visualization libraries because seeing the data is vital, even when armed with statistical methods. The chart below, called Anscombe’s Quartet, illustrates why: All four data sets return the same summary statistics: The mean of x is 9 in every set. The correlation between x and y is 0.816 in every set. Each can be described by the best-fit linear regression equation y = 3 + 0.5x. It’s important to visualize the data, even when relatively powerful summary statistics are available, because: Outliers are common in most data, deserve special attention, and can cause very large skews. You may need something a bit heavier than linear regression to predict the relationship between x and y. Summary statistics sacrifice[…]
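To check the Anscombe’s Quartet claim by hand, here is a minimal Python sketch. It is my own illustration, not code from the post: the data values are the ones published by Anscombe (1973), and the names `QUARTET` and `summarize` are hypothetical helpers. It uses only the standard library and computes the mean, the Pearson correlation, and the least-squares line for each of the four sets.

```python
from math import sqrt

# Anscombe's quartet (Anscombe, 1973). Sets I-III share the same x values.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96,
                 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10,
                  6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84,
                   6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04,
            5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarize(x, y):
    """Return means, Pearson r, and the least-squares line for (x, y)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squared deviations and cross-deviations from the means.
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return {"mean_x": mean_x, "mean_y": mean_y, "r": r,
            "slope": slope, "intercept": intercept}


if __name__ == "__main__":
    for name, (x, y) in QUARTET.items():
        s = summarize(x, y)
        print(f"{name:>3}: mean_x={s['mean_x']:.2f} mean_y={s['mean_y']:.2f} "
              f"r={s['r']:.3f} y = {s['intercept']:.2f} + {s['slope']:.3f}x")
```

The four rows of output agree to the precision shown, which is exactly the problem: nothing in these numbers tells you that one set is a curve and another is driven by a single outlier. A scatter plot of each set shows it immediately.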

You may have read something about the Samsung 7500 and 8000 series televisions, the ones with a camera installed in them, over the past few days. The tl;dr summary: “For Samsung’s 7500 and 8000 series TVs, all you have to do is say “Hi, TV,” when you walk into a room for the TV to turn on and know who’s there.” “Think of it: The tech means an advertiser or TV programmer could, for the first time, know which members of a Nielsen household are watching a show or an ad. Cisco has even developed a system meant to read facial expressions and determine whether you’re entertained or bored.” “Many people in the living room are multitasking with other devices.[…]

Depending on who you believe and the context, average site eCommerce conversion rates vary between 0% and 12%. That’s not very helpful. In my own experience, defining conversion as the number of completed checkouts divided by the total number of site visitors, that rate varies between 0.20% and 2.00%. That fact has important implications for analysis, bias, and making causal statements about what causes conversion. Specifically: When doing an experiment, the lower the conversion rate, the greater the number of visitors that are required to make a truthful causal statement that something causes conversion. As a consequence, poorly converting sites that could benefit from experimentation the most are the most disadvantaged. Methods that are more common in the machine learning community may[…]
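To put rough numbers on the point about low conversion rates and experiment size, here is a minimal Python sketch. It is my own illustration under stated assumptions, not a method from the post: it uses the textbook normal-approximation formula for a two-sided test comparing two proportions, and the function name `visitors_per_arm`, the 10% relative lift, the 5% significance level, and the 80% power are all assumed values chosen for the example.

```python
from math import ceil
from statistics import NormalDist


def visitors_per_arm(baseline_rate, relative_lift, alpha=0.05, power=0.80):
    """Approximate visitors needed in EACH arm of an A/B test.

    Assumes a two-sided z-test on two proportions (normal approximation):
        n = (z_{alpha/2} + z_{power})^2 * (p1(1-p1) + p2(1-p2)) / (p2 - p1)^2
    """
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_lift)  # rate if the change works
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_power) ** 2 * variance / (p2 - p1) ** 2)


if __name__ == "__main__":
    # The two ends of the 0.20% to 2.00% range, same assumed 10% relative lift.
    for rate in (0.002, 0.02):
        n = visitors_per_arm(rate, relative_lift=0.10)
        print(f"baseline {rate:.2%}: about {n:,} visitors per arm")
```

Under these assumptions, the site converting at 0.20% needs roughly ten times as many visitors per arm as the site converting at 2.00% to detect the same relative lift, because the required sample scales approximately with the inverse of the baseline rate.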

Yesterday, I wrote: “Many [Data Scientists] will find some of their peers co-opted by tools, as it’s far easier to be religious about the merits of a tool over another one than it is to exert any sort of real leadership or independent thought.” To expand on that point: Data science is results-oriented – the tool is the means to the end – it isn’t the end itself. Arguing the merits of Cognos against SAS is akin to the chefs spending an entire episode of Bravo’s Top Chef arguing whether a boning knife or a bird’s beak knife should be used to cut a duck. (It doesn’t make for good TV and it doesn’t matter.) The central tenet of[…]

Steve Miller wrote of Data Science Maturity yesterday. It’s a very good post. To summarize: He attended both Strata, a Data Science (DS) event, and Enzee, a Business Intelligence (BI) event, and noted just how young all the DS kids are, and how old all the BI adults are. The DS kids come out of university armed with open source tools; the BI graybeards are all settled on enterprise tools. He predicts that DS will merge with BI, largely as BI analytic data structures are unified under the BI banner and come to dominate organizations. Editorial: BI defines itself by tools whereas DS defines itself by methods and ends. Many DS’ers will find some of their peers co-opted by tools,[…]