Opinion Mining the INFORMS Marketing Science Conference 2010
The Marketing Science conference was really quite inspiring and informative.
The conference generally followed a formula.
I spent much of my time alternating between Rooms 2, 14, and 25 – so I can only comment on about 1/12th of the conference. I'm not representing the diversity of the conference. A lot was said; I'm saying what I saw. (A preemptive apology, as an aside, to all those who spoke whose work I'm ignorant of, okay?)
Opinion mining and sentiment analysis brought out the biggest flood of insults. Though image macros were sorely lacking, it really did seem like insults were being thrown.
The presenter would put forward a model and test it. Most of them are engaged in creating predictive models. It’s what they do. Don’t question it – go with it.
In a few instances (I counted 8), they were correlating 'sentiment' to some dependent variable – stock prices, box office sales, revenue, and so on. They'd hold most of the time series 'in sample' and then test their model against the 'out of sample' portion to gauge its predictive accuracy. If the model successfully predicts some of the variation, then they can say that they've contributed to the understanding of the world.
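The in-sample/out-of-sample procedure can be sketched in a few lines. Everything below is invented for illustration – the data, the split point, and the simple linear model are mine, not anything presented at the conference:

```python
# Minimal sketch of the in-sample / out-of-sample test described above.
# The sentiment and sales series here are made up; a real study would use,
# say, daily aggregate sentiment scores and box office revenue.

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    a = my - b * mx
    return a, b

# Hypothetical series: a sentiment score and next-day sales.
sentiment = [0.1, 0.3, 0.2, 0.5, 0.4, 0.6, 0.7, 0.5, 0.8, 0.9]
sales     = [10,  14,  12,  19,  17,  22,  25,  20,  28,  31]

split = 8  # hold the first 8 observations 'in sample'
a, b = fit_line(sentiment[:split], sales[:split])

# Test the fitted model against the held-out 'out of sample' tail.
predictions = [a + b * x for x in sentiment[split:]]
errors = [abs(p - y) for p, y in zip(predictions, sales[split:])]
```

If the out-of-sample errors are small relative to the variation in the series, the presenter claims the model captures something real.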
The trouble came with the operationalization of 'sentiment' – for which there are a bunch of machines. I've talked about such machines and dimensions in this space – NSSA, for instance. The Syncapse Measurement Science team has also published on the reasons for error in this space.
Digression. Image macros. See below.
The most famous example of this is lolcats – born out of Caturday (Saturday). We've had a wave of new image macros – thousands created every single day – in an effort to create forced memes. It's not just that I love image macros for their own sake (I do); it's also the way they impact language.
Language has always evolved. What's especially recent – in the past 20 years or so – is how quickly. (Or at least my perception of how quickly it has been evolving, in the past 10 years in particular.)
For instance, so much of what researchers call 'froth', 'foam', or 'gibberish' on Twitter is really, in part, just the transference of language from one environment to another. "Epin", in an English tweet, is not a typo. It's an annoying, cancerous image macro that was mercifully killed off three months ago and has since dried up. There's an entire lolcat bible, and indeed, such language makes its way into regular speech. Multiple alternative dictionaries for words exist, replete with their own nuances. SMS and MMS brought on their own variances, co-inspired with online gaming (pre-mic). I happen to watch image macros because I have a passion for them, and they evolve really quickly.
Next related point:
Many of the existing sentiment machines are trained on, or based on, older and mis-purposed datasets. They've been trained on 30 years of Time Magazine and media analysis.
How would such a machine, trained on Time Magazine, ever know what a ceiling cat is? What ‘lol’ means? In general, people don’t talk like Time Magazine. Or the New York Times. Or European Union policy statements and research documents. A casual glance at image macros, twitter, and your own SMS messages would tell you that, too.
Naturally, such machines are going to generate a huge amount of error. Many of the freeware and academic machines are rooted in a different era, and that just isn't how people actually communicate with each other.
They’re perhaps designed to code mass media, but they’re not designed to code for communication by the masses.
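The error mode is easy to demonstrate with a toy lexicon-based scorer. Both lexicons below are tiny, invented examples – real sentiment machines are far more elaborate – but the register mismatch works the same way:

```python
# Sketch of why a formal-register lexicon generates error on social text.
# Both word lists are invented stand-ins: one trained on news-like prose,
# one on internet vernacular.
formal_lexicon = {"excellent": 1, "strong": 1, "decline": -1, "crisis": -1}
slang_lexicon  = {"lol": 1, "epic": 1, "win": 1, "fail": -1, "meh": -1}

def score(text, lexicon):
    """Sum the valence of every word the lexicon recognizes."""
    return sum(lexicon.get(w, 0) for w in text.lower().split())

tweet = "lol epic win"
print(score(tweet, formal_lexicon))  # 0 -> the formal lexicon sees nothing
print(score(tweet, slang_lexicon))   # 3 -> clearly positive
```

A machine built on the formal lexicon doesn't score the tweet wrong so much as it scores it blank – and blanks at scale look like 'froth'.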
The hardest part for the commercial industry as a whole, I think, will be embracing the dynamism of language. Any machine algorithm we embrace (and I tell you that we will, as an industry, use machine algorithms) must embrace this dynamic aspect. And manage it.
And that’s it.
I thank all those I met and who spent time talking with me. It was all very inspiring.
2 thoughts on “Opinion Mining the INFORMS Marketing Science Conference 2010”
Maybe some systems are trained on old news feeds or similar text collections. But I don't believe that's the approach of most recent commercial systems, particularly those specialized in monitoring social media.
If one wants to conduct sentiment analysis and monitoring of a brand, one should train not only for the language commonly used in the communication channel being monitored (your Twitter example is a very good one), but also for the language used to talk about that particular brand (the words used to talk about Toyota are not the same as those used to talk about BP, and they don't necessarily convey the same sentiments).
That being said, properly guessing the sentiment of as informal and living a channel as Twitter will always be much harder, and the accuracy will always be lower, than for formal, written, news-like mentions.
I ask – what can the variation in those words tell us about the relative standing of a brand in the public’s stream of consciousness?
Can real investigation (analytics) into the variation in those words lead to actionable insights that result in some sort of advantage?