I believe this classification was first enunciated by Alex Langshur at eMetrics Toronto (2008). It’s worth expanding upon. Consider the following classification for web analytics metrics: Pre-Click Metrics, On-Site Metrics, and Post-Click Metrics. To unpack that: Pre-Click Metrics refer to all activities that led to a visit to a digital owned property (e.g., paid search keywords, referring domain, any traditional spend). On-Site Metrics refer to all the activities that can be observed on the site (e.g., visits to specific pages, graph analysis / path analysis, time spent on site, and a host of very specific things like the nebulous world of engagement). Post-Click Metrics refer to all the activities that occur after the visit. (E.g., money getting transferred to your bank[…]

I can tell you right now how I’m behaving in a multi-medium media world. I have several tabs open as I write this: Facebook – which contains a newsticker stream of activity. RDIO – which is playing music, piped right into my ear. Blogger – where I’m typing this. A half dozen tabs I haven’t visited for a few hours or have yet to actively click upon. I also have an Adobe Air App loaded: Tweetdeck. That’s just one device – my MacBook. I attended an INFORMS conference in 2008 where a particularly bright professor presented findings from a new type of diary. His findings suggested that multi-medium media consumption was the norm. I.e., many people reported listening to[…]

SPSS, R, and Python (matplotlib) have very functional visualization libraries because seeing the data is vital, even when armed with statistical methods. The chart below, called Anscombe’s Quartet, illustrates why: All four data sets return the same summary statistics: The mean of x is 9 in each. The correlation between x and y is 0.816 in each. Each can be described by the best-fit linear regression equation y = 3 + 0.5x. It’s important to visualize the data, even when relatively powerful summary statistics are available, because: Outliers are common in most data, deserve special attention, and can cause very large skews. You may need something a bit heavier than linear regression to predict the relationship between x and y. Summary statistics sacrifice[…]
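
To make the point about identical summary statistics concrete, here is a minimal sketch that computes them for all four data sets. The numbers are the standard published values of Anscombe's quartet rather than anything taken from this post, and the `summarize` helper is a hypothetical name introduced only for illustration.

```python
from math import sqrt

# Standard published values of Anscombe's quartet (an assumption of this
# sketch; the post itself only shows the chart).
X_SHARED = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
ANSCOMBE = {
    "I": (X_SHARED,
          [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X_SHARED,
           [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X_SHARED,
            [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarize(xs, ys):
    """Return means, Pearson correlation, and least-squares line for (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return {
        "mean_x": mean_x,
        "mean_y": mean_y,
        "r": sxy / sqrt(sxx * syy),
        "slope": slope,
        "intercept": mean_y - slope * mean_x,
    }


if __name__ == "__main__":
    for name, (xs, ys) in ANSCOMBE.items():
        s = summarize(xs, ys)
        print(f"{name:>3}: mean_x={s['mean_x']:.2f} mean_y={s['mean_y']:.2f} "
              f"r={s['r']:.3f} fit: y = {s['intercept']:.2f} + {s['slope']:.2f}x")
```

The four rows print nearly identical numbers, which is exactly why the summary alone cannot distinguish a clean linear trend from a curve or a single outlier.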

You may have read something about the Samsung 7500 and 8000 series televisions, the ones with a camera installed in them, over the past few days. The tl;dr summary: “For Samsung’s 7500 and 8000 series TVs, all you have to do is say “Hi, TV,” when you walk into a room for the TV to turn on and know who’s there.” “Think of it: The tech means an advertiser or TV programmer could, for the first time, know which members of a Nielsen household are watching a show or an ad. Cisco has even developed a system meant to read facial expressions and determine whether you’re entertained or bored.” “Many people in the living room are multitasking with other devices.[…]

Depending on who you believe and the context, average site eCommerce conversion rates vary between 0% and 12%. That’s not very helpful. In my own experience, defining conversion as the number of completed checkouts divided by the total number of site visitors, that rate varies between 0.20% and 2.00%. That fact has important implications for analysis, bias, and making causal statements about what causes conversion. Specifically: When doing an experiment, the lower the conversion rate, the greater the number of visitors that are required to make a truthful causal statement that something causes conversion. As a consequence, poorly converting sites that could benefit from experimentation the most are the most disadvantaged. Methods that are more common in the machine learning community may[…]
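
As a rough illustration of why low conversion rates demand more visitors, the sketch below uses a common textbook approximation for the sample size of a two-proportion test. It is not necessarily the calculation behind this post. The 20% relative lift, the 5% two-sided significance level, the 80% power, and the function name `visitors_per_arm` are all assumptions chosen for the example.

```python
from math import ceil
from statistics import NormalDist


def visitors_per_arm(baseline, relative_lift=0.20, alpha=0.05, power=0.80):
    """Approximate visitors needed in each arm of an A/B test.

    Uses the normal approximation
        n = (z_{alpha/2} + z_{power})^2 * (p1(1-p1) + p2(1-p2)) / (p2 - p1)^2
    where p1 is the baseline conversion rate and p2 the lifted rate.
    """
    p1 = baseline
    p2 = baseline * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided test
    z_power = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_power) ** 2 * variance / (p2 - p1) ** 2)


if __name__ == "__main__":
    # The two ends of the 0.20% to 2.00% range quoted above.
    for rate in (0.002, 0.02):
        print(f"baseline {rate:.2%}: {visitors_per_arm(rate):,} visitors per arm")
```

Under these assumptions, a site converting at 0.20% needs roughly ten times as many visitors per arm as one converting at 2.00% to detect the same relative lift.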

Yesterday, I wrote: “Many [Data Scientists] will find some of their peers co-opted by tools, as it’s far easier to be religious about the merits of a tool over another one than it is to exert any sort of real leadership or independent thought.” To expand on that point: Data science is results-oriented – the tool is the means to the end – it isn’t the end itself. Arguing the merits of Cognos against SAS is akin to the chefs spending an entire episode of Bravo’s Top Chef arguing whether a boning knife or a bird’s beak knife should be used to cut a duck. (It doesn’t make for good TV and it doesn’t matter.) The central tenet of[…]

Steve Miller wrote of Data Science Maturity yesterday. It’s a very good post. To summarize: He attended both Strata, a Data Science (DS) event, and Enzee, a Business Intelligence (BI) event, and noted just how young all the DS kids are, and how old all the BI adults are. The DS kids come out of university armed with open source tools; the BI graybeards are all settled on enterprise tools. He predicts that DS will merge with BI, largely as BI analytic data structures are unified under the BI banner and come to dominate organizations. Editorial: BI defines itself by tools whereas DS defines itself by methods and ends. Many DS’ers will find some of their peers co-opted by tools,[…]

There are three broad categories of reasons why people ask for figures. They know what they know and need evidence to support what they know. (Convenient Reasoning) They know what they don’t know, and genuinely need objective evidence in support or against somebody or something. (Decision Support) They don’t know what they don’t know, and are looking for somebody to tell them what they should know, or what they should do. (Exploration) There are three questions an analyst should ask whenever they get an incomplete request for data: What problem are you trying to solve? Who are you trying to convince? What are you going to do differently if you had the evidence? Editorial: Not everybody who engages in convenient[…]

A post made the rounds on HN over the weekend about a DIY kit that enables plants to tweet when they’re thirsty. You can read the post for yourself here. Summary: Order a kit for $99.95, do some soldering, and you’ll get a sensor in the shape of a leaf. The sensor detects the relative dryness of the soil. The sensor will send a signal that ultimately causes the plant to tweet you when it’s thirsty. Editorial: It’s a novel application of sensor technology + internet of things (IoT) + a social broadcasting utility. It solves a practical problem that typically is solved by Outlook reminders or having a Twilio App text you every two weeks. It’s another case study[…]

Yesterday, Jeremiah Owyang, with Andrew Jones and Christine Tran of the Altimeter Group, published “A Strategy for Managing Social Media Proliferation”. It’s so huge and so packed with goodness about Social Media Management Systems (SMMS) that I couldn’t do it justice in three bullet points. I extracted three choice quotes below. Choice quotes: “Corporations should not jump into social media without a clear goal…Instead align goals towards business objectives.” (p. 23) “Unlike one-way marketing of yester-year, company must be ready to engage with negative conversations, often taking them head-on.” (p. 23) “New media efforts will be scrutinized by management as budgets shift. Be prepared to measure.” (p. 23) Editorial: They’re correct in emphasizing the importance of strategy and process[…]