The dependent variable is the one that matters. You can’t explain why without Y. Last week, I broadcast 10 questions on Twitter and Facebook asking people what they thought of dependent variables. This is what 18 smart managers, researchers, developers, strategists, planners, data scientists, executives, eComm marketers, direct marketers, and department heads thought. I thank them for their time. Thank you. Take the n and the instrument for what they are – a dipstick check on just how aligned we are. Here’s what was found. H1: >=80% will state that Conversion is the Dependent Variable. Confirmed. Conversion won every matchup it was in. Number of conversions won head to head against impressions and engagement; engagement and unique visitors; impressions and unique impressions.[…]
Author: Christopher Berry
Here’s what you need to know about automated statistical analysis: 1. Automated statistical analysis is not a substitute for good judgement. Statistical tests are tools. They help us understand why nature is the way that it is. Nature resists being known. But, she is knowable. Statistical tests themselves are part of nature. The tests were never meant to be substitutes for good judgement. That belief, that tests could replace people, has only led to the accumulation of some pretty outrageous assumptions over the years. Just because there is a significant correlation between Magnum Ice Cream sales and Piracy in the Indian Ocean doesn’t mean that it’s causal. Statements of causality require judgement. Automated statistical analysis is not[…]
Let’s take a look at what 16-bit interfaces could do. A great simulation game begins with just a handful of degrees of freedom and explodes from there. Behold the grandeur that is SimCity for the Super Nintendo. If you’re familiar with SimCity (1991), skip ahead to Data Exploration, below. On a flat plane of pixels, you have the choice to: Bulldoze a feature. Build a road. Build a mass transit unit. Build a power line. Build a park. Build a residential zone. Build a commercial zone. Build an industrial zone. Build a police department. Build a fire department. Build a stadium. Build a port. Build a coal plant. Build a nuclear plant. Build an airport. Build a special reward building.[…]
Discovering truth in data always begins with you, and your judgement. Assume that you have some idea about the world. Something that you believe is true, and you want to discover if you’re right. Here’s how I draw that out. It becomes a matter of organizing a dataset along those thoughts. I call causal variables X1, X2, X3… I call the single variable that I’m trying to explain the Y variable. There can be only one Y variable. For your own sanity, there can only be one Y variable at a time. There are a large number of tasks involved in figuring out if X1, X2, X3 cause Y. One of them is to run any one of the many[…]
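The dataset organization described in this excerpt can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper name `split_xy`, the column labels, and every number below are hypothetical and do not come from the post. It shows nothing more than separating one Y column from the candidate causal X columns before any statistical test is run.

```python
from typing import Dict, List, Sequence, Tuple

Row = Dict[str, float]


def split_xy(
    rows: Sequence[Row], y_name: str
) -> Tuple[List[str], List[List[float]], List[float]]:
    """Separate the single Y column from the candidate causal X columns.

    Hypothetical helper for illustration; not the author's code.
    """
    if not rows:
        raise ValueError("dataset is empty")
    columns = list(rows[0].keys())
    if y_name not in columns:
        raise KeyError(f"{y_name!r} is not a column in the dataset")
    # Everything that is not Y is treated as a candidate X (X1, X2, X3, ...).
    x_names = [name for name in columns if name != y_name]
    X = [[row[name] for name in x_names] for row in rows]
    y = [row[y_name] for row in rows]
    return x_names, X, y


# Made-up observations, one dict per row; not data from the post.
observations = [
    {"X1": 1.0, "X2": 0.0, "X3": 3.5, "Y": 10.0},
    {"X1": 2.0, "X2": 1.0, "X3": 2.5, "Y": 14.0},
    {"X1": 3.0, "X2": 0.0, "X3": 1.5, "Y": 15.0},
]

x_names, X, y = split_xy(observations, y_name="Y")
print(x_names)  # names of the candidate causal variables
print(y)        # the one variable being explained
```

Taking `y_name` as a single string rather than a list is a deliberate choice here: it keeps the "one Y variable at a time" rule in the function signature.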
The full New York Times Innovation Report was leaked last week. It’s worth reading if only because it lets you look at a paradigm – an entire way of thinking, laden with its own explanations of culture, causal factors, jargon, assumptions, myths, systems, and heretics. It enumerates the preferences and aspirations of a small group of people (including their preferred org-chart re-org!) and highlights a long-standing tension between technologists and journalists. It may also serve as a wake-up call that continuous improvement and scientific management are already a reality at several disrupting media startups. Let’s begin. Summary if you didn’t read it (and won’t): The document contains 97 pages. The term “Competitor” is mentioned on 39 of those pages. Analytics[…]
It’s a big week for analytics in Toronto. There’s a growing industry of digital intelligence / analytics professionals in southern Ontario. It’s a brilliant and welcoming industry. This is the week when we get together, share knowledge, and welcome newcomers. The eMetrics Summit, the conference of the Digital Analytics Association (use the promo code BERSPK for a discount to the summit), will also mark the second major Southern Ontario Chapter meeting. There will be case studies from TD, CBC, Bombardier, Intuit, The New York Times, TVO, Hyatt, and Maple Leaf Entertainment. Zoe Morawetz (TD) is showing us how they execute digital segmentation. Gareth Cull (Mozilla), Mark Dykeman (BMO) and Tim Ashby (CM) will be sharing which technical traps to avoid, Greg[…]
This piece from McKinsey highlighted the inflated expectations of big data analytics – “…expectations of senior management are a real issue…but too often senior leaders’ hopes for benefits are divorced from the realities of frontline application. That leaves them ill prepared for the challenges that inevitably arise and quickly breed skepticism.” The listicle (et tu, McKinsey?), summarized below, is somewhat related to that concern: 1. Data and analytics aren’t overhyped—but they’re oversimplified 2. Privacy concerns must be addressed—and giving consumers control can help 3. Talent challenges are stimulating innovative approaches—but more is needed 4. You need a center of excellence—and it needs to evolve 5. Two paths to spur adoption—and both require investment (automation and training) In a fit of[…]
There are varying concerns about what constitutes a causal model, the degree to which data is biased, certainty that the model is predictive about the future, and that the model itself is a truthful depiction of nature. Over the course of the past two weeks I’ve talked with many people about their perspectives – data scientists, developers, technologists, product managers, brand managers, statisticians, consultants, professors, executive producers, and founders. We’ve talked about everything from why analysts and their customers won’t accept narrow models, why it’s far easier to summarize data than it is to describe the relationships in it, and the intractable differences between what is performance reporting and what constitutes an insight. The verdict is not in. There are varying beliefs[…]
This is a lot of inside baseball. The motivation is to share information while acknowledging that it’s wildly anecdotal. It’s directed at data scientists thinking about business. The Facts Andrew and I founded Authintic in late 2012. We landed three great customers. We met between 1,600 and 1,900 well-wishers, competitors and prospective customers. Five major market hypotheses were tested. Revenue was earned and value was generated. Authintic was acquired by 500px in early 2014. The Feels Thrilled. Very excited. And a tad skeptical about the lessons learned. People are terrible at extracting causal factors from an experience. I’m people. So I reckon that applies to me too. A sample size of 1 isn’t authoritative. It doesn’t constitute proof, or evidence[…]
The listicle is an amazing communication device. A listicle is a schema for communication – always in the form of a list. Sometimes that list is random, but often it is ordered. I continue to be in awe of the ongoing effectiveness of the listicle. Lists are effective communication devices in analytics. Why not listicles? Lists Effective analytics dashboards are filled with lists. “The top 10 performing landing pages” “The top 5 posts” “The top 7 competitor ads…they don’t want you to know about!” Lists are visually compact and editorially appropriate. An executive might scan a list for the top performers and the bottom performers. An analytics executive might scan a list for the top 20% and verify that it accounts for 80% of[…]
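The top-20% scan mentioned in the excerpt above is easy to sketch. This is a rough illustration, not the author's method: the function name `top_share` and the page-level numbers are invented, and because the excerpt is cut off before naming the metric, the code just computes the share of the total of whatever the list ranks.

```python
from typing import Sequence


def top_share(values: Sequence[float], fraction: float = 0.2) -> float:
    """Return the share of the total held by the top `fraction` of items.

    Hypothetical helper for illustration only.
    """
    if not values:
        raise ValueError("need at least one value")
    total = sum(values)
    if total == 0:
        raise ValueError("total is zero, so a share is undefined")
    # Number of items in the top slice; always keep at least one.
    k = max(1, round(len(values) * fraction))
    ranked = sorted(values, reverse=True)
    return sum(ranked[:k]) / total


# Made-up metric values for ten items in a ranked list.
page_values = [50, 30, 5, 5, 3, 3, 2, 1, 1, 0]

share = top_share(page_values, fraction=0.2)
print(f"Top 20% of items hold {share:.0%} of the total")
```

With these invented numbers the top two items hold 80 of the 100 units. Real lists will rarely split that cleanly, so the result is best read as a quick check, not a rule.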