I learned quite a few things this week thanks to our many Twitter exchanges. Thank you. Collectively, digital analysts do not: Have a standardized method to express causality. Have a standardized method to limit R^2 inflation caused by collinearity. Have a standardized method to express either in a clear, simple, and concise way. A set of preferred solutions: We should use conceptual frameworks, causal diagrams, or Ishikawa diagrams to express the relationships among variables. We should check the variance inflation factor (VIF) and communicate that figure when reporting R^2. I’m a long way from being able to be really brief with respect to this problem set. What do you think? *** I’m Christopher Berry. I tweet about analytics @cjpberry. I write at christopherberry.ca
Author: Christopher Berry
The objective of the series on marketing attribution was to demonstrate how constraints, caused by humans and nature themselves, generate enormous issues in the marketing sciences. Sometimes such issues are trivialized away. “After all, this isn’t exactly rocket science.” Indeed it’s not. Marketing science is harder. Konstantin Tsiolkovsky published the basic physics of how a rocket escapes Earth’s gravity in 1903. Those laws of physics applied in 1957 when Sputnik was launched. They’ll apply the same way again when/if, in 2012, 55 years later, North Korea gets something out into orbit. While the math looks intimidating, it’s Newton, some systems thinking and some calculus. There are engineering difficulties with respect to stress, force, and materials sciences that are not trivial.[…]
On Monday we set up a model relating foot traffic to patio attendance and beer revenue for a pub on Toronto’s Peter Street. On Tuesday, we expanded the model to include weather. All equations are fake and are for illustration purposes only. A Concrete Example. Assume: X1 is the number of people walking past a patio on Peter Street. X2 is the number of people who are sitting on the patio, drinking a beer, on Peter Street. Y is beer sales for that pub operating the patio on Peter Street. Y = 1250 + 0.05 * X1 + 18.22 * X2. W1 = ((c0 (temperature) + c1 (humidex) + c2 (sky) + c3 (precipitation)) / (clout denominator)) * 100. Y = 1115[…]
Yesterday, we did some work on Peter Street. We related foot traffic to patio use, all to predict pub revenue. A Concrete Example. Assume: X1 is the number of people walking past a patio on Peter Street. X2 is the number of people who are sitting on the patio, drinking a beer, on Peter Street. Y is beer sales for that pub operating the patio on Peter Street. Assume a dataset and a traditional linear regression, and get the equation (it’s for illustrative purposes only; it’s not real): Y = 1250 + 0.05 * X1 + 18.22 * X2. To which a good friend remarked: “Ha! I got you! I finally got you! What about weather?! You can’t[…]
Check out the Digital Analytics Association’s (DAA) industry compensation scan. The answer to the question “How much do digital analysts make?” is “It depends”. Real data, provided by IQ Workforce, shows a fairly wide distribution in salaries across cities. The authors even took cost of living into account to derive a top ten list. The best average salary with cost of living factored in? Atlanta. Brian Thopsey and Casper Blicher Olsen worked very hard on this research committee project, with Amanda Watlington initiating the project and iterating upon it. I thank them for their effort. It looks great!
A whole range of statistical methods, both traditional and those found in machine learning, assume independence among independent variables. That assumption is pretty important when interpreting the contribution of each variable to the dependent variable (which we call Y). To unpack: We say there is a high degree of collinearity between X1 and X2 if X1 is highly correlated with X2. It doesn’t matter if X1 causes X2. Or if X2 causes X1. The fact would remain that a change in X1 would lead to a predictable change in X2. And, that a change in X2 would lead to a predictable change in X1. A Concrete Example. Assume: X1 is the number of people walking past a patio on Peter[…]
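To put a number on collinearity, here is a minimal Python sketch. It is not from the original post. It assumes exactly two predictors, in which case the R^2 from regressing X1 on X2 equals their squared Pearson correlation, so VIF = 1 / (1 - r^2). The foot traffic and patio figures are invented for illustration, and `pearson_r` and `vif_two_predictors` are hypothetical helper names.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def vif_two_predictors(x1, x2):
    """VIF shared by both predictors when the model has exactly two of them.

    Assumes the predictors are not perfectly correlated (r^2 < 1).
    """
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r ** 2)


# Invented data: patio guests (X2) rise almost in lockstep with foot traffic (X1).
foot_traffic = [800, 950, 1100, 1300, 1500, 1700, 1900, 2100]
patio_guests = [30, 41, 44, 55, 58, 70, 74, 86]

r = pearson_r(foot_traffic, patio_guests)
vif = vif_two_predictors(foot_traffic, patio_guests)
print(f"r = {r:.3f}, VIF = {vif:.1f}")
```

A common rule of thumb treats a VIF above roughly 5 to 10 as a warning sign. With more than two predictors, each variable has to be regressed on all of the others, so this shortcut no longer applies.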
How do ranking algorithms work? At the highest level: A machine accepts a series of independent variables. A machine interprets those variables. A machine produces an output that is, ideally, predictive of a dependent variable. The usefulness of an algorithm is in just how predictive the output is of a dependent variable. For instance: The usefulness of the Google Search algorithm depends on how relevant the results are in relation to the query. The usefulness of the Facebook GraphRank algorithm depends on how relevant the results in the news feed are in relationship to the user. The usefulness of the Netflix algorithm(s) depends on how relevant and divergent the recommendations are in relationship to the household. All three companies use[…]
One of my favourite evening reading sites is TV By The Numbers and Bill Gorman’s Renew/Cancel Index. Bill uses ratings to build a predictive index, all based on the insight that you don’t have to outrun the cancellation bear, you just have to outrun the other guy! By comparing a set of ratings against the network average, Bill can deduce which shows are likely to be renewed, on the bubble, or cancelled. The usefulness of a predictive algorithm is in how accurate the predictions are. And I’d say that his index is pretty predictive. Excluding renewals and taking his index value of 0.90 as the cutoff for renewal or cancellation, his model yields 95% precision and 77% recall. The[…]
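For readers who want to see how precision and recall fall out of a cutoff, here is a minimal Python sketch. The show data is invented and is not Bill Gorman’s. It also assumes that "cancelled" is the positive class and that an index value below the 0.90 cutoff counts as a cancellation prediction. The excerpt does not spell out either choice.

```python
def precision_recall(predicted, actual):
    """Return (precision, recall) for two parallel lists of booleans."""
    pairs = list(zip(predicted, actual))
    true_pos = sum(1 for p, a in pairs if p and a)
    false_pos = sum(1 for p, a in pairs if p and not a)
    false_neg = sum(1 for p, a in pairs if not p and a)
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    return precision, recall


CUTOFF = 0.90  # index value used as the renew/cancel threshold

# Invented (index value, was actually cancelled) pairs, for illustration only.
shows = [
    (0.55, True), (0.70, True), (0.85, True), (0.88, False),
    (0.95, True), (1.10, False), (1.30, False), (0.60, True),
]

predicted_cancelled = [index < CUTOFF for index, _ in shows]
actually_cancelled = [cancelled for _, cancelled in shows]

precision, recall = precision_recall(predicted_cancelled, actually_cancelled)
print(f"precision = {precision:.2f}, recall = {recall:.2f}")
```

Precision answers "of the shows the index flagged, how many were really cancelled?", while recall answers "of the shows that were really cancelled, how many did the index flag?".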
Two intersecting themes for you today: attribution and decision making. This paper from Google Analytics and eMarketer really got me started, and you can download it here. It’s a survey of marketers and agencies (n=179) gauging attitudes, expectations, and objectives in attribution. Which is so hot right now. Thank you, Google. Great stuff. There’s a big difference between satisficing decision making behavior and optimizing decision making behavior. Satisficing decision making behavior is characterized by: Good enough because it’s good enough. If it ain’t broke, don’t fix it. We only really have to do the minimum to satisfy our expectations. Optimizing decision making behavior is characterized by: Searching for better. Thinking forward and thinking backwards. Seeking to maximize an objective. People[…]
Many of my good friends and successor staff are off to build their own analytics or marketing science practices abroad. Some are going technology side. Two are going agency side. I’m happy for all of them. Those companies chose wisely because they’re all starting off with a solid keel. Here’s what I tell them: On Hiring. Your first hire is your most critical and strategic one. Choose somebody with strengths that complement your weaknesses. You’re building an orchestra: each one’s strengths will have to incrementally complement the others’ weaknesses. When you get to a team of 4 or 5, hiring strengths to reinforce existing strengths really pays off. (Architect your orchestra.) If you can’t eat with them or have a beer with[…]