The data is imperfect. Judgement is imperfect. Decisions are imperfect. The question isn’t about perfection. It’s about progression. What becomes true if we focus on progression? Credit goes to Matt Gershoff for inspiring this post. A remark he made at a recent #miToronto grabbed me. To paraphrase, he said that when you stop obsessing over which model is right or wrong, because all models are wrong by definition, and start focusing on just making it better, you get a lot further. He used the term liberating. And it is. The Data Reality is flawed, and as a direct result, it generates data that is flawed. The machine housed in your skull is a serious piece of technology, but[…]

A great mind in public policy told me, just this last September, that people are really bad at judging the rate of technological change and when it’ll affect them. It’s like standing on a railway. You can see the train out there. Some people assume that the train is going to hit them very soon. They get off the tracks. Then, when the train is getting very close, others misjudge the speed and assume that it’s still a long way off. And then they get hit. It’s a great analogy because it combines prediction with decision. The rate of technological change is actually quite difficult to predict. If it were easy, there’d be a lot more successful startups. One Heuristic Start[…]

A score serves as an ultimate abstraction or summary. That’s especially true in sport. “Who won?” “The Blue Jays. 11 to 5.” The Blue Jays won because they moved men across one specific plate more often than the other team. This is all very American. A brief period of action. Collect statistics about that brief period. ???. Profit. And it’s easy. Baseball is nice for the 1 to 1 correspondence of points to a single event. American football and basketball are spicier. Cricket, with all due respect to my antipodean friends, is ridiculous. There’s so much more to the performance of The Blue Jays or the Australian National Cricket Team. But the score is the ultimate summary. There’s[…]

This series originally appeared in Eyes on Analytics on April 16, 2012. The City of Edmonton posted a pretty interesting position last month. The description is so good that it bears repeating in this space. Bolding is my emphasis. Traffic Safety Predictive Analyst Put your superior analytical skills to work in North America’s first and only municipal Office of Traffic Safety. You will be joining the rapidly growing field of urban traffic safety where the application of statistics and predictive analysis is becoming a vital decision support tool in reducing motor vehicle collisions. Your responsibilities will be: Provide short, medium, and long-term predictions of collisions and/or speeding by considering current and historical traffic safety related data as well as other[…]

Predictive analytics is somewhat mysterious. So, let’s shed some light on it. (Note that I’m simplifying this quite a bit to be accessible.) The first step in predictive analytics is to understand what you’re predicting. We’ll call this the Y variable. In this instance, ‘how many visits from Boston can I expect on a given day’. My Y will be ‘Visits’. I’m curious about it. Have some discipline. I see way too many analysts change the Y variable before their investigation is through. The second step is to identify all the variables that might be associated with a variation in Y. These might include factors like paid media, search, new visits, returning visits – and date. Then there are paid[…]
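The first two steps described in this excerpt (fix a Y, then list the variables that might move with it) can be roughed out in a few lines of Python. This is only a sketch under loud assumptions: the daily numbers are made up, the column names (`visits`, `paid_media`, `search`, `returning_visits`) are hypothetical labels for the factors mentioned, and ranking candidates by Pearson correlation is just one simple screening choice, not necessarily the method the post goes on to use.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def rank_candidates(rows, y_name, x_names):
    """Rank candidate variables by the strength of their correlation with Y."""
    y = [row[y_name] for row in rows]
    scores = {name: pearson([row[name] for row in rows], y) for name in x_names}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Step 1: pick the Y variable and stick with it.
Y = "visits"

# Step 2: list the variables that might be associated with variation in Y.
CANDIDATES = ["paid_media", "search", "returning_visits"]

# Made-up daily figures, purely illustrative (not real Boston traffic).
DAYS = [
    {"visits": 120, "paid_media": 10, "search": 60, "returning_visits": 45},
    {"visits": 150, "paid_media": 25, "search": 70, "returning_visits": 40},
    {"visits": 110, "paid_media": 5, "search": 58, "returning_visits": 50},
    {"visits": 180, "paid_media": 40, "search": 82, "returning_visits": 42},
    {"visits": 135, "paid_media": 18, "search": 66, "returning_visits": 47},
]

if __name__ == "__main__":
    for name, r in rank_candidates(DAYS, Y, CANDIDATES):
        print(f"{name}: r = {r:+.2f}")
```

A strong correlation here only flags a variable as worth a closer look. It says nothing about cause, and five invented days are nowhere near enough data for a real prediction.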