This is part five in a five-part series on Analytics and GIS. Part one looked at a job posting in Edmonton, part two at scoring, part three at Model Builder, and part four at discrimination. I’m excited for Edmonton and Edmontonians. The decision to hire a predictive analyst for road safety is an awesome one, and one that ought to generate real results well into the future. There will be no real way for any individual Edmontonian to know if their life was saved by the recommendations that come out of this program. In aggregate and over time, however, fewer fatalities and serious injuries should occur. I’d like to see a long term[…]
This is part four in a five-part series on Analytics and GIS. We’ve previously seen a position for a Road Safety Predictive Analyst, that roads can be scored, and that hazard can be modeled with existing tools. Today, we hit ethics. Discrimination You may have heard about Microsoft’s GPS patent, the ghetto avoidance algorithm. It caused a bit of a furor. Concretely: Microsoft filed a patent for a GPS-guided walking app. It has an algorithm allowing the user to avoid a neighborhood if its crime rate exceeds a threshold. Some people say that it’s racist and prejudicial. Other people say that it has nothing to do with race or prejudice. Like People Clump Alike People who are alike tend to live[…]
This is part three in a five part series on Analytics and GIS. Part one focused on a road safety predictive analyst position, and part two focused on scoring algorithms. Tool Time ESRI produces a tool called ArcGIS. It uses open source mapping data, and has a fairly sophisticated set of functions. ArcGIS has an optional tool called Model Builder. GIS analysts use it to score environments. Or, in this specific case, road segments. Modelling Yesterday, I enumerated a sequence of attributes that a road has, ranging from speed limit, neighborhood, size, intersection and crossings. There are also attributes like weather and time of day, which are hiding yet more spurious variables like number of suspended drivers on the road[…]
This is part two in a five part series on Analytics and GIS. Yesterday, we looked at a job posting for a road safety predictive analyst. Scoring Algorithms Collisions, injury, serious injuries, and fatalities happen in a time and in a space. Both the attributes of the space, and the attributes of the space at the time can be recorded and understood. It is always preferable, when executing any optimization program, to optimize for a single variable. Scoring is one way that we take a whole bunch of factors and derive a single figure. Road segments have attributes. Is it an intersection? Is it two lanes? Is there parking? Is there a bike lane? Does it twist? Does it have[…]
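A minimal sketch of the scoring idea, assuming a plain weighted sum over yes/no road segment attributes. The attribute names and weights below are hypothetical placeholders chosen for illustration; they are not values from the post, and a real program would have to estimate them from collision data.

```python
# Hypothetical attribute weights: illustrative assumptions only,
# not figures taken from the post or from any real road dataset.
WEIGHTS = {
    "is_intersection": 3.0,
    "two_lanes": 1.0,
    "has_parking": 1.5,
    "has_bike_lane": 0.5,
    "twists": 2.0,
}


def score_segment(attributes):
    """Collapse a segment's yes/no attributes into a single weighted score.

    Attributes missing from the dict are treated as False.
    """
    return sum(
        weight
        for name, weight in WEIGHTS.items()
        if attributes.get(name, False)
    )


# A made-up segment: an intersection with street parking.
segment = {"is_intersection": True, "has_parking": True}
print(score_segment(segment))  # 4.5
```

The point of the single figure is that every segment can then be ranked or optimized against one number, rather than against many attributes at once.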
The City of Edmonton posted a pretty interesting position last month. The description is so good that it bears repeating in this space. Bolding is my emphasis. Traffic Safety Predictive Analyst Put your superior analytical skills to work in North America’s first and only municipal Office of Traffic Safety. You will be joining the rapidly growing field of urban traffic safety where the application of statistics and predictive analysis is becoming a vital decision support tool in reducing motor vehicle collisions. Your responsibilities will be: Provide short, medium, and long-term predictions of collisions and/or speeding by considering current and historical traffic safety related data as well as other influential factors, including weather and demographic data Identify, generate and monitor[…]
Post frequency on the analytics-focused blog, Eyes on Analytics, has increased to daily. In part, this is to solidify the understanding of the frequency-reach curve in blogging, and in part, it’s an attempt to understand where the broader market is. I’m testing three themes: How to fight nature’s pesky way of inhibiting our ability to make clean causal statements. The importance of imagination in identifying independent variables. The role of evidence in decision making. Simplification of a message is not pandering. However, many pandering statements are deliberate simplifications. If your optimization objective is to gain followers: Post often. Post simply. Post what people want to hear. I’m choosing simplification while avoiding pandering. Let’s see how that unfolds over[…]
I learned quite a few things this week thanks to a lot of our Twitter exchanges. Thank you. Collectively, digital analysts do not: Have a standardized method to express causality. Have a standardized method to limit R^2 inflation as a result of collinearity. Have a standardized method to express either in a clear, simple, and concise way. A set of preferred solutions: We should use conceptual frameworks, causal diagrams, or Ishikawa diagrams, to express the relationship among variables. We should check VIF and communicate that figure when reporting R^2. I’m a long way from being able to be really brief with respect to this problem set. What do you think? *** I’m Christopher Berry. I tweet about analytics @cjpberry. I write at christopherberry.ca
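As a rough sketch of what "check VIF" can look like: with exactly two predictors, the variance inflation factor reduces to 1 / (1 - r^2), where r is the Pearson correlation between them. The function names and the data below are made up for illustration; with more than two predictors, each predictor would instead be regressed on all the others.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def vif_two_predictors(x1, x2):
    """VIF for a model with exactly two predictors: 1 / (1 - r^2).

    Raises ZeroDivisionError when the predictors are perfectly collinear.
    """
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r ** 2)


# Made-up predictor values, for illustration only (r = 0.8 here).
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 5]
print(f"VIF = {vif_two_predictors(x1, x2):.2f}")  # VIF = 2.78
```

Reporting that figure next to R^2 tells the reader how much of the fit may be leaning on predictors that overlap with each other.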
The objective of the series on marketing attribution was to demonstrate how constraints, caused by humans and nature themselves, generate enormous issues in the marketing sciences. Sometimes such issues are trivialized away. “After all, this isn’t exactly rocket science.” Indeed it’s not. Marketing science is harder. Konstantin Tsiolkovsky published the basic physics of how a rocket escapes Earth’s gravity in 1903. Those laws of physics applied in 1957 when Sputnik was launched. They’ll apply the same way again when/if, in 2012, 55 years later, North Korea gets something out into orbit. While the math looks intimidating, it’s Newton, some systems thinking and some calculus. There are engineering difficulties with respect to stress, force, and materials sciences that are not trivial.[…]
On Monday we set up a model relating foot traffic to patio attendance and beer revenue for a pub on Toronto’s Peter Street. On Tuesday, we expanded the model to include weather. All equations are fake and are for illustration purposes only. A Concrete Example Assume: X1 is the number of people walking past a patio on Peter Street. X2 is the number of people who are sitting on the patio, drinking a beer, on Peter Street. Y is beer sales for that pub operating the patio on Peter Street. Y = 1250 + 0.05 * X1 + 18.22 * X2 W1 = ((c0 (temperature) + c1 (humidex) + c2 (sky) + c3 (precipitation)) / (clout denominator))*100 Y = 1115[…]
Yesterday, we did some work on Peter Street. We related foot traffic to patio use, all to predict pub revenue. A Concrete Example Assume: X1 is the number of people walking past a patio on Peter Street. X2 is the number of people who are sitting on the patio, drinking a beer, on Peter Street. Y is beer sales for that pub operating the patio on Peter Street. Assume a dataset and a traditional linear regression – and get the equation (it’s for illustrative purposes only – it’s not real): Y = 1250 + 0.05 * X1 + 18.22 * X2 To which a good friend remarked: “Ha! I got you! I finally got you! What about weather?! You can’t[…]
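To make the illustrative equation concrete, here is a small sketch that evaluates Y = 1250 + 0.05 * X1 + 18.22 * X2. The coefficients are the fake ones from the post; the function name and the input values (1,000 passers-by, 40 patio patrons) are assumptions made up for this example.

```python
def predict_beer_sales(foot_traffic, patio_patrons):
    """Evaluate the post's illustrative (not real) regression equation.

    foot_traffic  -> X1, people walking past the patio
    patio_patrons -> X2, people sitting on the patio drinking a beer
    """
    return 1250 + 0.05 * foot_traffic + 18.22 * patio_patrons


# Hypothetical inputs: 1,000 people walk past, 40 are on the patio.
print(f"{predict_beer_sales(1000, 40):.2f}")  # 2028.80
```

In this fake model each extra patio patron is worth 18.22 in predicted beer sales, while each extra passer-by is worth only 0.05.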