Sucharita Mulpuru is among my favorite people at Forrester. She’s pragmatic, technical, and, in my view, a brilliant forecaster – three key traits of an effective strategist. Last month she wrote “Why Facebook Is Still A Tough Sell For Retailers”. She was called out as a hater. I don’t think that was fair. The crystallizing quote comes from an interview two weeks later – and you may have seen this article in Bloomberg on February 22. TL;DR: “There was a lot of anticipation that Facebook would turn into a new destination, a store, a place where people would shop,” Mulpuru said in a telephone interview. “But it was like trying to sell stuff to people while they’re[…]
Web analytics uses clickstream data. It’s data that is generally anonymous, generally aggregated, and heavily abstracted. Most commercial web analytics software abstracts away the raw data with fairly usable interfaces. You’ll be hard pressed to find many people these days who know how to work with server log data. Yet, it’s still possible to segment a population of browsers based on the characteristics of the browser, computer, and reverse geographic lookup. That is to say, I can query, through the software, IE browsers in Toronto originating from Reddit and, say, Chrome browsers in New York originating from search. And then I can compare the differences between them. If it’s an eCommerce site, I may even be able to[…]
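A minimal sketch of that kind of segment comparison, in Python. The hit records, the field names, and the pageviews metric are all made up for illustration. Commercial tools do this behind their interfaces, and this is not how any particular product works.

```python
from collections import defaultdict

# Hypothetical, hand-made visits. The field names are assumptions,
# not a real log format or a vendor schema.
visits = [
    {"browser": "IE", "city": "Toronto", "source": "reddit", "pageviews": 2},
    {"browser": "IE", "city": "Toronto", "source": "reddit", "pageviews": 4},
    {"browser": "Chrome", "city": "New York", "source": "search", "pageviews": 5},
    {"browser": "Chrome", "city": "New York", "source": "search", "pageviews": 7},
    {"browser": "Chrome", "city": "Toronto", "source": "search", "pageviews": 1},
]


def segment_averages(rows):
    """Group visits by (browser, city, source); return visit count and mean pageviews."""
    totals = defaultdict(lambda: [0, 0])  # segment -> [visit count, pageview sum]
    for row in rows:
        key = (row["browser"], row["city"], row["source"])
        totals[key][0] += 1
        totals[key][1] += row["pageviews"]
    return {
        key: {"visits": count, "avg_pageviews": total / count}
        for key, (count, total) in totals.items()
    }


averages = segment_averages(visits)
ie_toronto_reddit = averages[("IE", "Toronto", "reddit")]
chrome_ny_search = averages[("Chrome", "New York", "search")]

# The comparison between the two segments, on the hypothetical metric.
difference = chrome_ny_search["avg_pageviews"] - ie_toronto_reddit["avg_pageviews"]
print(ie_toronto_reddit, chrome_ny_search, difference)
```

The segments stay anonymous and aggregated: the key is a browser, a city, and a source, never a person.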
I’m a reader of Theory of Reddit. This thread, entitled “Who’s manipulating Reddit and how? Who’s buying votes.” is very interesting. Farshad signed up for a program. He upvoted a link. He’s eligible to get paid 8 cents. Read on. This is all unverified. I’m inclined to believe that Farshad is genuine, owing to the fact that the account has 2 years of tenure and has reddit gold. And so, if it’s a troll, this would be a pretty damn esoteric one. It stands to reason that somebody is executing the experiment. To the author’s credit, he proposes a few ways that a machine could detect paid upvoting, including comment quality and concurrency among recently created accounts. It[…]
Metaphor: A putter is PowerPoint / Excel. A 3 Iron is SPSS or R. A 1 Wood is Python or Octave. Here we go: The putter is used when you’re on the green and trying to get it into the hole. Lots of nuance on the slope. Lots of finesse with arms and angles. High variability. Little repeatability. It doesn’t scale, won’t scale, and doesn’t need to. The 3 Iron has a bit more power. It’s really great for getting onto the green from certain approaches. The wood has a lot of power. It’s really great for driving right down the range. It’s essential on the par 5s. Knowing which tool to use when is important, right? Unless you’re playing mini putt,[…]
A few highlights from eMetrics SF 2012: The Web Analytics Association is now the Digital Analytics Association. There are now several shareable end-to-end B2B tracking case studies that are finally out, including Intuit and Symantec. Special call-out to Michael Parker, who spoke clearly and persuasively during his keynote. Prezi makes for some pretty engaging presentations. Digital Analytics is now a young adult. Advice for when you’re back in the office: You’re standing in front of a golf ball. Every millimeter of where you strike that ball makes a big difference months out. Tee up, and make it a really great swing. Take thirty minutes to integrate your notes into your work checklist / burn list. If you were inspired[…]
There’s a good discussion going on within data science circles. First, a brief background: It has been observed that complex models, generated by Machine Learning, are frequently more predictive (and developed faster) than the alternative approach of Domain Experts (Subject Matter Experts) generating a model and deploying it in the field. This observation forms a major fault line in science more generally. What happened: An Oxford-Style debate was held at Strata, pitting Machine Learning against Domain Expertise. Machine Learning won. If the problem to be solved is well framed, the machine trumps domain. However, the problem has to be well framed. What it means: Alistair Croll crystallizes the implication in this piece. And rightly eggs us on to go forth[…]
I’m working on another 5-part blog post series on “How individuals decide”. I’ve hit a snag. And it’s a bad one. It has to do with triggers of search. There are reasons why people ask for evidence the way they do, and for their subsequent reactions to follow-up questions. For instance: Questioner: “I need to know how, of how many people visited the Vegan Microsite, who also saw my tweet about Chicken two months later.” Alright. So, there’s obviously a reason why the questioner is asking the question. And it’s a pretty strange one from the outset. What do they mean by ‘saw’? Two months from when? Cause and effect appear to be really messed up from the way I[…]
First, a thread of thought. Second, a brief exhortation. Summary: Chris Broadfoot showed some pretty amazing visualizations he had created using some open data. Mark Hahnel showed Figshare, which aims to help academics make their data open and available. He’s a big part of the open data movement. Flip Kromer, CTO of Infochimps, built the core technology that made sharing that data set possible, earlier on in the day. (And he kicked my ass in a German board game). Why I’m optimistic: What I see in Chris’ work is the opportunity for the public and decision makers to make very well informed decisions about transportation policy. Relevant. What I see in Mark’s work is the opportunity for others to, with greater ease, replicate[…]
eMetrics San Francisco is this week, and #measure can expect the usual volume of hashtags and quotes. For those of us at home or in the office, the flow can be pretty annoying. That torrent causes a fairly warped view of what’s really going on. eMetrics is far more than the witty one-liners delivered in a BIG way in REAL TIME. There’s a lot of substantive material. A few questions to ask yourself: What is the definition of Big Data? What is the definition of Real Time? Can either help me win? Analysts aren’t alone in feeling like there’s too much data coming at them. Is more really better? More data might not be the right answer 80% of[…]
You may have recently clicked a link leading to this paper by Robert Ghrist on Barcodes. You may have also read a previous post about MINE. And finally, this month I talked about histograms and proceeded to subject you to the importance of seeing the data, again and again and again. TL;DR: Seeing the data helps analysts understand the data. Showing the data alone isn’t explaining the data. The first question, in response to seeing a line on a chart, is “why?” Sure, if the line is going up, I caused that. If it’s going down, that’s the weather’s fault. Fine. Those are great, convenient guesses. It’s much harder to assert that a relationship between two things really exists.[…]