Little Things that Make Big Impacts

The cleanest way I could explain the Butterfly Effect was to say:

“Let’s say my shoe is loose. So I decide to bend down and tie it really tight, inadvertently creating a knot. Let’s say the next morning, I have a hard time getting my shoe on – for let’s say, four minutes. Then let’s say that I miss my bus by just one minute. And the bus has a frequency of thirty minutes. Well then – one seemingly unrelated decision, made 16 hours before and taking all of 2 minutes to execute, has a 30 minute tardiness impact 16 hours later. That’s pretty much like the Butterfly Effect. Writ Small. And Mundane. Without bad acting.”
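The arithmetic in that story can be sketched in a few lines of Python. This is only an illustration: the function name `minutes_late` and the three-minute slack (implied by a four-minute delay that misses the bus by one minute) are assumptions, not anything measured.

```python
import math


def minutes_late(delay_min: float, slack_min: float, headway_min: float) -> float:
    """Minutes of lateness caused by a delay on the way out the door.

    delay_min:   time lost (e.g. fighting a knotted shoelace)
    slack_min:   how early you would normally reach the stop (assumed)
    headway_min: minutes between buses
    """
    overshoot = delay_min - slack_min  # minutes after departure that you arrive
    if overshoot <= 0:
        return 0.0  # the planned bus is still there
    # You wait for the next bus that leaves at or after your arrival.
    buses_waited = math.ceil(overshoot / headway_min)
    return float(buses_waited * headway_min)


# Four-minute delay, three minutes of slack, a bus every thirty minutes.
knot_case = minutes_late(delay_min=4, slack_min=3, headway_min=30)

# Same knot, but arriving ten minutes early absorbs the delay entirely.
early_case = minutes_late(delay_min=4, slack_min=10, headway_min=30)
```

Under these assumptions `knot_case` is 30 minutes and `early_case` is 0: the same small cause has no effect at all once there is enough slack in the system.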

The Star Trek: TNG way of saying it would be “There’s a cascade failure in the warp core”. But enough of the Laforging.

Cause and Effect dynamics are devilish. After all, my lateness could have been chalked up to not being ten minutes early as I normally am. Or it could be chalked up to the bus being on time, which is unusual. I like to think of the world as a whole bunch of cones converging on a single point. Taken from this point of view, there are as many explanations for something happening as there are people. We all have our perception and are all entitled to our own opinions. Though, we’re not entitled to our own facts. (wink).

It’s just a matter of which model has the greatest predictive strength. Normally I’d head down the rabbit hole into a bias about multiple regression…but no. This isn’t going to be a statistical rant. No. I have something far funner to read. (I hope).

And what of the implications for the social systems we create?

Twitter is an excellent laboratory to study for that.

And that’s where we’re going to get into a lot of trouble with each other, as social media scientists.

‘How one seemingly innocuous tweet could cause a cascade failure in the warp core’ will be one of those great analyses someday. And it will be contested. Loudly. By very educated and sinecured analysts.

It won’t necessarily be because they won’t accept that little things can make such big impacts. I’ll be referring them on back to this post at that point. And surely, every very educated analyst should be familiar with, and indeed should have experienced, such dynamics in their own lives so as to be able to relate. The Butterfly is in the Sky.

Rather, the debate might be how much causality to attribute to the originating tweet, and how much causality to attribute to the reinforcing effects. And indeed, this sub-branch of analytics, of reinforcement-attribution theory, is still very young in marketing science literature. (I salute those of you who have made contributions. It’s just that I wish we had a unified language to describe it.). Someday I’d like to be able to say: “Take a look. It’s in a book.”

How we understand cause, intervening variables, and effect – and how much we decide to respect where each other is coming from – is by and large going to color future debates. I’m optimistic that there will exist one school of social media measurement practitioners that will rely on evidence to make assessments. And I’d like to be in that school. I’m certain that we can go twice as high.

There was a little theme running throughout the post.

That’s how little things can make big impacts. And how something little will make something big.

eScience

I’ve just learned of eScience as a result of a book entitled “The Fourth Paradigm”.

While I don’t have that much to say about the essence of the Fourth Paradigm yet, I have to admit that I feel immediately at home with this group within eScience. One of the best quotes in the book is:

“Need driven versus curiosity driven. Basic science is question driven; in contrast, the new applications science is guided more by societal needs than scientific curiosity. Rather than seeking answers to questions, it focuses on creating the ability to seek courses of action and determine their consequences.”

Substitute ‘societal needs’ with ‘business needs’, and I have myself a nice bridge between eScience and commercial eScience. I suppose that’s been one of the fundamental misunderstandings about the Scientist-Practitioner: that they were only poking about out of curiosity. Science for the sake of science.

What if we were transparent about the intent to use science for purely commercial gain? Sounds Edisonian I suppose?

Much of the literature seems to be about very huge computing problems, like analyzing the data from the LHC. I’m not necessarily as concerned with problems of that order of magnitude. In fact, most business problems are fairly modest by comparison. What will, however, hold back commercial eScience, are the same forces that will hold back eScience. That is to say, the lack of unification among the fundamental tools.

At any rate – this field looks attractive.

Complexity

I’ve spent a lot of time this week managing complexity.

And it’s gone well.

I think looking for simple and remembering the end goal are two key ingredients. Backcasting happens a lot. Expecting exogenous shocks instead of being all outraged when they happen is another.

That’s all that’s really on the mind.

That and how much code I have left to write. :)

The Seven Axioms and Predictive Validity

I published seven axioms over the past week – in a not so humble fashion. I’m taking the James Burke line to heart and just putting it out there.

The Seven Axioms are:

1. The purpose of analytics is to derive competitive advantage for the organization / firm / entity.
2. Data alone does not yield competitive advantage.
3. A sequence of progressive hypothesis testing is the most efficient and effective method to derive competitive advantage from data.
4. Predicting the future requires an understanding of cause and effect.
5. Correlation is not always Causality.
6. Accuracy over Precision.
7. It is possible for there to be two optimal, equally true, answers to a problem. (And Sometimes More!) (x^2 = 4, so x = -2 or x = 2).

They might appear to be fairly straight-forward. And they are. In my opinion.

A statement like Accuracy over Precision was certain to cause problems. And it has.
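One way to see the distinction behind Axiom 6 is a small Python sketch. The readings below are made up for illustration, and `TRUE_VALUE` is a hypothetical quantity being measured; nothing here comes from real data.

```python
from statistics import mean, pstdev

TRUE_VALUE = 100.0  # hypothetical true value being measured

# Invented readings from two hypothetical measuring sticks.
precise_but_biased = [120.1, 119.9, 120.0, 120.2, 119.8]  # tight, but off
accurate_but_noisy = [92.0, 108.0, 97.0, 104.0, 99.0]     # loose, but centred


def describe(readings):
    """Return (bias, spread) for a list of readings.

    bias   = mean reading minus the true value (accuracy)
    spread = population standard deviation of the readings (precision)
    """
    return mean(readings) - TRUE_VALUE, pstdev(readings)


biased_bias, biased_spread = describe(precise_but_biased)
noisy_bias, noisy_spread = describe(accurate_but_noisy)
```

The first stick agrees with itself almost perfectly and is still about twenty units wrong; the second scatters more but averages out to the truth. Preferring the second is what "Accuracy over Precision" means in this sketch.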

If you look at the language around cause and effect, causality, and there being many correct answers to the same problem: you get the point. It follows from the Axioms that, to derive competitive advantage, you need to be able to make predictions about the future, and the only way to really get there is through progressive hypothesis testing with accurate data, and understanding both complexity and causation.

Champagne Dreams on a Beer Bottle Budget

I’m reading Sam Ladner’s thesis.

It’s strong work, and quite possibly one of the best reading experiences I’ve had since “Reading Virtual Minds”.

On Page 149, there’s a quote explaining why ‘fires’ so commonly occur as a result of low-ball estimation:

Curt: Why do they have the fires?

Sam: Yes

Curt: There could be a million different reasons if you think about it, I mean, clients coming in with aggressive timelines period or everybody will come in with big dreams, right?…Like you never lose the champagne dream even if you’ve got a beer bottle budget, right? You always dream big but you might not be, like, okay…”

And I’m in awe.

What a gem.

And I ask myself: how can we optimize and predict dreams? How do we rationalize the denominator here?

What a fascinating business problem.

Analytics and Inside Pool

You may or may not have been hearing about a debate going on in web analytics.

To most, it might seem like a lot of inside pool. And I suppose most of these things are.

I want to talk a little bit about some of that inside pool.

Over the course of my WAA Research Committee work last week, I stumbled upon a paper entitled “Assumptions, Explanations, and Prediction in Marketing Science: ‘It’s the Findings, Stupid, Not the Assumptions’” by Eric W. K. Tsang.

In it, he replies to a debate that’s been going on for a long time, but one that natural scientists had settled a hundred years ago. Richard Staelin back in 1998 said that there’d always be debates about whether analytical models needed to have realistic assumptions or not. Shugan came out in 2007 and argued that it wasn’t about the assumptions. I can remember reading that paper back then. It had an effect on me. Let’s fast forward to 2009.

I don’t quite know how it happened, but I ended up sitting at a table with the megastars of Marketing Science research at an INFORMS conference. Dr. Lehmann was there – as was Dr. James Lattin. From what I gather – they’re pretty distinguished researchers. I didn’t know it at the time, and I doubt that it would have changed my behavior much. Maybe only outliers would ever dare sit with that group. That’s how I met Alex.

Two outliers at a table of high insiders.

Alex is an economist out of France. I won’t go so far as to call him a French Economist, but, regardless, there it is. :) We got into a discussion about how irritated I was with stupid assumptions. I understood that without invalid assumptions the math wouldn’t work: but maybe there’s no value in math that only works that way. That unless I could use the model to understand the world, or at very maximum: predict the future in some way – it wasn’t of any use to a practitioner or to a scientist. Alex explained to me that the Math unto itself could help science chip away at the edges of complexity – and if something adds understanding, then it is of value: but maybe not to a practitioner.

I still accept where Alex comes from. I think there’s a role for trying to understand complexity by way of deliberate simplification. How those assumptions get selected still bothered me, and I continued to want to shout down anybody who had selected, in my judgement, a stupid assumption for such little gained value.

Back to Tsang, in his paper, where he takes Shugan on. Apparently there’s an entire school of thought that dismisses my belief that science should have at least a goal of making accurate predictions about the future. Tsang carefully deconstructs Shugan’s 2007 argument, and in the end concludes that “although Shugan (2007) rightly stresses that it is inappropriate to dismiss a model or a theory based only on the realism of its assumptions, realism does matter, and it matters a great deal for model building and theory development.”

And I happen to agree with Tsang. He’s helped me immensely in being able to reconcile some of that inside pool.

A lot of the inside pool going on right now in Web Analytics is very similar to Tsang-Shugan and Christopher-Alex. There are huge disconnects between what many web analytics practitioners want analytics to be, what some of the industry titans want it to be, what customers of web analytics outputs want it to be, and even what people within the broader analytics community (data miners, revenue managers, and market researchers are in the same neighborhood) want it to be.

All this – within an industry that couldn’t possibly employ more than 50,000 people in total.

Inside pool is important because it’s about values and refining the definitions that are in use by a community.

The Strength of Weak Ties

A tight group of friends will tend to overlap in terms of product adoption and preferences. Like people clump together.

I hypothesize that the social graph is partially-fractal. I use the word ‘hypothesize’ because I don’t have the technology to prove it. Moreover, at this point, I don’t think I could write the proof to prove that it’s partially-fractal.

By fractal, I mean that at the most basic level, the individual with a circle of friends, the members are all alike. If you zoom out, treating each group as though it’s a person, the groups are all linked together in a similar way, and if you zoom out again, treating each group of groups as a person…the structure is the same. In other words, however far you zoom out, the same essential pattern bears out. (I could see Mavens clumping together in some way, even though Mavens might organize in groups of acquaintances – and it’s that pattern that replicates.)

There are times when ‘forward to a friend’ actions are important: intensity plays are one example. If a group of people enjoy wines, frequent talking about wine (and brands) will bring ideas to the front of mind, and I hypothesize that you’ll have a higher intensity of use.

There are times when ‘forward to an acquaintance’ actions are important. It might very well be that you’ve achieved 90% penetration within one set of social groups, and you need to leap out.

In a way, the same rules that apply at the micro-level should also apply at the macro-level. I suspect that there’s a law in there: perhaps a predictable step-function, that could be used to predict market penetration. I wonder if it’s really been embedded all along in our traditional S curves.
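To make the step-function hunch concrete, here is a minimal Python sketch. It is not a proof and not an established law: it simply assumes that each tight group adopts along its own logistic S curve, and that a leap across weak ties starts the next group some time later. The midpoints, the rate, and the function names are all hypothetical.

```python
import math


def logistic(t, midpoint, rate=1.0):
    """Share of one group that has adopted by time t (a standard S curve)."""
    return 1.0 / (1.0 + math.exp(-rate * (t - midpoint)))


def market_penetration(t, group_midpoints, rate=1.0):
    """Average adoption across equally sized groups (an assumption).

    Each midpoint is the time at which that group is half adopted;
    widely spaced midpoints stand in for slow leaps across weak ties.
    """
    shares = [logistic(t, m, rate) for m in group_midpoints]
    return sum(shares) / len(shares)


# Three hypothetical groups, reached one after another.
midpoints = [5, 20, 35]
curve = [market_penetration(t, midpoints) for t in range(0, 41)]
```

With the groups spaced this far apart, `curve` climbs quickly, stalls near one third, climbs again, stalls near two thirds, and so on: a staircase of small S curves. Move the midpoints closer together and the steps blur into a single familiar S curve, which is one way the pattern could be hiding inside traditional adoption curves.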

The takeaway from all this is that it’s worth considering which behaviours you want to encourage at which times in your customer lifecycle.

Social Media Measurement

It’s been a busy week in the world of social media measurement, or social analytics, as I like to call it.

Anna O’Brien, Marketing Science analyst extraordinaire, wrote a very good post on the topic. Her primary point, enough with the phony people, is polarizing and necessary. The secondary point, that social monitoring is not social measuring, is also apt and important.

My interests lie in the measurement side: content analytics and metric analytics. There’s a lot of utility there.

A few months ago Joseph Carrabis did a very interesting sentiment analysis on Zappos’ twitter stream. “Tone optimization” will no doubt end up being a major offering sooner rather than later. Let me explain.

Optimizing a web campaign can be very hard. It’s hard because our institutions make it hard. Social media marketing strategy can be built in such a way that it can be optimized. Every response can be improved. “Learning” is possible in an accelerated way.

To that end, tone is an important variable. It’s a vital variable in web copy just as it is in social media marketing. Language matters. (I’ve been reading “Language and Human Behavior” – sometimes the paragraphs drip with frustration). Sometimes what you say isn’t nearly as important as how you say it.

Knowing how many people are saying ‘positive’ and ‘negative’ things is one thing – but what about what your staff is putting out there? What’s the tone? What’s the effect? How can it be changed to improve the outcome and hit your goal?

This is the promise of social media measurement.

In sum, it’s been a busy week for social media measurement.

Social Analytics and Sentiment Analysis

There are major problems with the way that sentiment and intent are presently being measured and reported: you need only scratch the surface a little bit to uncover the grim truth.

The business problem that sentiment analysis solves is informing a manager, at a glance, not only of the tone and vibe that his own employees are sending out there, but also of how the public is responding to the policies and practices of the company in question.

Can’t you do this qualitatively? Well sure – if you didn’t have the anchor-and-adjust function in your head, it would be just fine. And ‘normally’ functioning humans all suffer from the curse of anchor-and-adjust.

The second business problem that effective sentiment analysis solves is the measurement-optimization problem in social analytics. The Coles Notes goes like this: “how can you possibly optimize if your measuring stick isn’t consistently accurate (or consistently inaccurate)?”. Because of anchor-and-adjust and normal human bias, how can human coding ‘correct’ anything consistently? Of course, it isn’t a business problem that anybody is really aware of – even to this day, most web analysts are unaware that a shifting cookie retention curve is messing with their KPIs.
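The "consistently inaccurate" point can be illustrated with a toy Python sketch. All numbers are invented, and the two coders are hypothetical stand-ins: one is always ten points low, the other drifts between readings the way an anchored human coder might.

```python
def measured_lift(measure, before, after):
    """Change in a metric as seen through a given measuring stick."""
    first = measure(before)   # reading taken before the change
    second = measure(after)   # reading taken after the change
    return second - first


# Hypothetical true share of positive mentions, before and after a change.
TRUE_BEFORE, TRUE_AFTER = 0.40, 0.55


def consistent_coder(value):
    """Always reads ten points low: wrong, but wrong the same way."""
    return value - 0.10


def make_drifting_coder(offsets):
    """Build a coder whose error changes from one reading to the next."""
    remaining = iter(offsets)
    return lambda value: value + next(remaining)


drifting_coder = make_drifting_coder([+0.10, -0.10])

consistent_lift = measured_lift(consistent_coder, TRUE_BEFORE, TRUE_AFTER)
drifting_lift = measured_lift(drifting_coder, TRUE_BEFORE, TRUE_AFTER)
```

In this sketch the consistent coder reports the true fifteen-point improvement even though every individual reading is wrong, while the drifting coder reports a decline. Optimization only needs the stick to be wrong in the same way each time.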

Anyway, there are real business problems to be solved there.

Anatomy of a Clusterf**k V

Clusterfucks will happen, and nobody ever really walks away from one a winner.

A clusterfuck can be turned around by either boosting trust, hitting ‘reset’ when it comes to definitions, deliberately seeking out extra understanding, or, if there’s a hollow core of authority – electing a leviathan to run the group.

Clusterfuck avoidance is going to be a major social technology as knowledge worker teams become increasingly interdisciplinary. More problems are bound to happen because the complexity in terms of communication and the specifics of professional norms scales. Just as an example, if a chemist tells the engineer that temperatures from the mix could trough at -200 °C, and asks the engineer if the structure could be designed to handle that – the engineer could choose to take the question badly. The engineer could take the question as a professional affront: of course she’d check the temperature as part of the normal procedure of being an engineer, and she resents the implication that she’s incompetent. Or, the engineer might appreciate the question, the -200 °C might raise important points, and the engineer might use the opportunity to ask more questions about the nature of the resulting liquid.

The engineer has the choice to respond in a positive way and to propagate good will and further trust, or, the engineer has the choice to hold a grudge and start clamming up.

We can architect teams and tailor our cultural norms to avoid clusterfucks, and it would be well worth the effort.

Next week I’ll move onto another topic series.