Topic Bearing WOM
I’m increasingly disturbed by the inaccuracy of Topic Bearing Word of Mouth (WOM) algorithms.
A previous study, published in this space, expressed dissatisfaction with standard sentiment analysis. My mind has since turned to the difficulty of distilling massive amounts of WOM into simple metrics that are actionable and decomposable.
So let’s just go beyond the realm of evidence-based pre-optimization of marketing messages, and set the entire area of sentiment-bearing word polarity aside for a while. It’s relevant and important. Just not the focus tonight.
Let’s turn to topic bearing WOM.
Imagine you could listen to the world, and assume that Burke’s reality is now…a reality.
If you haven’t seen the video from my ‘about’ section – here it is again. It literally is what I’m going on about.
How would you be able to make sense of the world? How would you, as a person, listen to and understand all of that material, when the world is constantly changing and is, quite literally, whatever people say it is?
Well indeed. So what are people saying? How do you aggregate all of that information into a format that’s understandable to mere mortals?
How could you possibly? To use a web analytics analogy – it’s akin to reading server log files manually, one at a time, for want of a log-file reader. Or at least, for want of a log-file reader you actually trust.
The initial reaction is to do what marketing statisticians were trained to do prior to 2004: use sample statistics. I have got to ask: why use sample statistics when you have the whole data mine right there? Isn’t the only reason sample statistics exist the want of a complete database? (And nobody truly knows the overall population they’re trying to project against. For many topics, the n is extremely small. For others, it’s effectively undefined until the semantic web comes along.)
We have a massive database.
The idea of taking 1,000 log files, reading them manually, and then declaring those 1,000 log files representative of the whole isn’t psychologically acceptable to most marketers. That +/- 3.1% sampling error is reinforcing your 15 to 20% interpretation error, and you’re looking at a pretty dense ROE. ROE is generally not psychologically acceptable. Shows are still canceled on the basis of statistical error, for want of understanding, and we’re 80 years into that methodology (consider radio – yes, it goes back that far).

And yet, even if you were to pitch that sampling approach and the ROE were acceptable, it still wouldn’t gel, because of the expectation of drillability and a broader expectation about the granularity of the data. That drillability expectation is also vital to solving the Integral Problem. If you’re a web analyst reading this, it’s just implicit within your paradigm – the way you’ve been brought up with the data – to expect that you can drill into anything. It’s a bias that’s always been there.
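(For the curious: that +/- 3.1% is just the standard 95% margin of error for a simple random sample of 1,000, at the worst-case proportion p = 0.5. A quick sketch in Python – my own illustration, not anything from the original analysis:)

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """95% confidence margin of error for a sample proportion.

    Worst case is p = 0.5, which maximizes p * (1 - p).
    """
    return z * math.sqrt(p * (1 - p) / n)

# 1,000 manually-read log files, worst case:
print(f"{margin_of_error(1000):.3f}")  # ~0.031, i.e. +/- 3.1%
```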
If you’re a digital marketer or a UX strategist – you probably won’t even question that relative availability of incredibly granular data. It’s like a can opener. You just assume it. Take that away and the beans just won’t taste the same.
The big n, the overwhelming amount of data, demands a data mining approach. It demands a machine algorithm. It also demands a statistical methodology that is scalable. This heads into a domain that lies at the intersection of data mining and computability. It’s just awesome. There are many solutions, but very few that will actually produce timely intelligence.
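To make “scalable” a little more concrete, here’s a deliberately naive sketch of machine categorization at its crudest – one pass over the stream, counting topic-term hits, memory bounded by the number of topics. The lexicon is invented for illustration; real systems use far richer models than keyword matching:

```python
from collections import Counter

# Invented toy lexicon: topic -> trigger terms. Purely illustrative.
TOPIC_TERMS = {
    "pricing": {"price", "cost", "expensive", "cheap"},
    "support": {"support", "helpdesk", "agent", "ticket"},
}

def categorize_stream(posts):
    """Single pass over a stream of posts; memory stays O(#topics)."""
    counts = Counter()
    for post in posts:
        tokens = set(post.lower().split())
        for topic, terms in TOPIC_TERMS.items():
            if tokens & terms:
                counts[topic] += 1
    return counts

posts = ["The price is outrageous", "Their support agent was great"]
print(categorize_stream(posts))  # Counter({'pricing': 1, 'support': 1})
```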
Topic Bearing WOM and its categorization should be, on the surface, a much easier nut to crack than sentiment-polarity, which is intensely subjective. But it’s not. If you ask 100 marketers to write a one-paragraph summary of a 600-word blog post, you’ll get a diversity of opinion about what the post was actually about. Unanimity on what the topic was is extremely difficult to achieve. Not convinced? Consider the diversity of opinion about what the topic of s. 11 of the Canadian Charter of Rights and Freedoms is. In fact, this is a very deep problem that has been struggled with for the better part of the last decade. It’s no easier.
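If you want to put a number on that lack of unanimity, the computational linguistics crowd measures inter-annotator agreement rather than hoping for consensus. A toy sketch, with invented labels standing in for those 100 marketers:

```python
from itertools import combinations

# Invented example: five annotators each label the same four documents.
labels = [
    ["pricing", "support", "pricing", "launch"],
    ["pricing", "support", "launch",  "launch"],
    ["brand",   "support", "pricing", "launch"],
    ["pricing", "support", "pricing", "brand"],
    ["pricing", "brand",   "pricing", "launch"],
]

def pairwise_agreement(labels):
    """Mean fraction of documents on which a pair of annotators agree."""
    pairs = list(combinations(labels, 2))
    per_pair = [
        sum(x == y for x, y in zip(a, b)) / len(a) for a, b in pairs
    ]
    return sum(per_pair) / len(per_pair)

print(f"{pairwise_agreement(labels):.2f}")  # 0.60 - well short of unanimity
```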
In the coming days, many pixels will be spent writing about the categorization of topic bearing word of mouth. There’s just a confluence of news and opinion. We might see a resurgence of opinion-mining and, in an experiment I’m doing on you – the word-of-mouth/social nexus.
So I’ll say this:
People will write. I welcome that.
Many will claim that it’s so simple. It’s not. This 892-word post has been a hike for you.
Awesome minds have been working this problem for at least 31 years, and have been really serious about it for the past six. 100% accuracy is not probable (in your lifetime). Statistical sampling is not a panacea. And even with a unified corpus, the best analysts are going to have a tough time with it. (Though unified corpora are great.)
Topic Bearing WOM poses a huge opportunity, and a huge challenge. It should be tackled with the same amount of care that we take at Syncapse.
So enjoy.
My point stands. I’m dissatisfied with the existing algorithms to summarize topic bearing WOM. And you should be too.
3 thoughts on “Topic Bearing WOM”
I love reading your blog at 5am in the middle of my end-of-term all-nighter review session. It’s like having another Diet Coke. I needed that break.
Why can’t I find anything on Google about “Topic Bearing Word-of-mouth algorithms” besides your blog? Are you inventing a buzzword?
So, I was reading (with passion) your white paper on sentiment analysis, and this raised a few questions:
– Is there something out there that could answer a particular framed question rather than giving sentiment? For instance, “I want to know how *shocking* North American 65+ Christians find that brand/topic/concept”, or “How *surprised* is the user/buyer/payer of my product by our last product launch. Were they *anticipating* a product launch?”… Indeed, *sentiment* is pretty useless.
– As for simple metrics, can we use a few simple emotions/moods/attitudes/values typologies and generate some good-looking matrix? I mean, there’s a ton of that stuff out there; what’s stopping us from using it?
– When talking about analysis of the real data, don’t you think you’re limited by the simple fact that talk is cheap? When researching motivation, values, attitudes, you can’t just take what people *say* for granted, you need to use disguised techniques (I’m not even talking about qualitative research) and dig further, right?
Now, let me have a try at challenging your paper. I just passed the exam for my marketing research class a few days ago and did a project with SPSS, so I’m an enthusiastic noob :)
Hypothesis 1, you are testing this:
“There is no unanimous agreement among all respondents on the sentiment score of a single response to any question within the survey.”
I’m shocked! You’re testing for absolute 0% variation in answers. How significant is that?
I’m looking at another insight. I wanted to know if responses were *consistent*; I don’t mind if 90% said positive, 5% neutral and 5% negative. For me, that’s acceptable, and it shows human appreciation is somewhat consistent. I don’t understand your six-sigma obsession with sentiment analysis :)
To back up your argumentation, you said that because the range was 2, it means there is a dichotomy in every set of answers (“For every question, there was somebody who thought the statement was positive and somebody else who thought the statement was negative.”) Again, you weren’t interested in testing consistency, but simply looking for the least inconsistency, which is unfair in my opinion.
Well, let’s have some fun and look at the distribution, mister Berry!
My God, it’s 5:41 am. I’m waiting for SPSS to download on this computer.
–**–
Yeah so I still have to work my way through stats. Here is what I came up with, simple percentage distribution bar charts.
http://docs.google.com/View?id=dcjz9dh2_10gj6gznc5
It shows that cases where humans are randomly or equally distributed between negative/neutral/positive are rare. Rather, answers tend to be skewed toward one sentiment (i.e., a very few negative, some neutral, the majority positive), and rarely spread across the extrema. So without going into chi-square tests and all of that fanciness, it looks pretty consistent; even if you can’t get a clean trinomial categorization, you still have a “sentiment”. Now, how this is helping me get laid is another question!
– Why can’t I find anything on Google about “Topic Bearing Word-of-mouth algorithms” besides your blog? Are you inventing a buzzword?
Nope. The term ‘topic bearing’ originates from the computational linguistics field. ‘Word-of-Mouth’ originates from the marketing science field (or so I believe). The combination of both terms is unique to a person sitting at that nexus. The term is distinguished from ‘sentiment bearing’ or ‘opinion bearing’. It’s not a buzzword, though. Both terms exist in their respective communities. It’s a mashup.
– Is there something out there that could answer a particular framed question rather than giving sentiment? For instance, “I want to know how *shocking* North American 65+ Christians find that brand/topic/concept”, or “How *surprised* is the user/buyer/payer of my product by our last product launch? Were they *anticipating* a product launch?”… Indeed, *sentiment* is pretty useless.
Not to my knowledge. Not with any degree of reliability or validity yet. Such specific databases might exist somewhere, but not in a format that you or I can use. Subjective subjectivities are subjective.
– As for simple metrics, can we use a few simple emotions/moods/attitudes/values typologies and generate some good-looking matrix? I mean, there’s a ton of that stuff out there; what’s stopping us from using it?
Have at ’er.
– When talking about analysis of the real data, don’t you think you’re limited by the simple fact that talk is cheap? When researching motivation, values, attitudes, you can’t just take what people *say* for granted, you need to use disguised techniques (I’m not even talking about qualitative research) and dig further, right?
Why do people say anything online at all? What motivates somebody to generate a piece of content? Is it effective? Those are the right questions to be asking at this juncture.
– Hypothesis 1, you are testing this:
“There is no unanimous agreement among all respondents on the sentiment score of a single response to any question within the survey.”
Indeed. No two people agreed on the whole sequence of questions. I legitimately didn’t know whether two people would match exactly. So we tested for it. Whereas a few people agreed on the Total Score, the underlying compositions of answers by which they arrived at it were all different. In effect, there was underlying error masking the overall score. Tricky to show in SPSS, but that was the clearest way.
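For anyone who wants to replay those two checks outside SPSS, here’s a minimal sketch in Python. The score matrix is invented, but the logic is the same: test each question for unanimity, then show that matching totals can hide different answer sequences:

```python
# Invented data: rows are respondents, columns are survey questions,
# scores are -1 (negative), 0 (neutral), +1 (positive).
scores = [
    [ 1, 0,  1, -1],
    [ 1, 1,  0, -1],
    [ 0, 1,  1, -1],
]

# (a) Unanimity per question: did every respondent give the same score?
for q in range(len(scores[0])):
    column = [row[q] for row in scores]
    print(f"Q{q + 1} unanimous: {len(set(column)) == 1}")

# (b) Same total, different composition: totals can match while the
# underlying answer sequences differ, masking the disagreement.
totals = [sum(row) for row in scores]
print("totals:", totals)                                  # [1, 1, 1]
print("distinct sequences:", len({tuple(r) for r in scores}))  # 3
```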
– It shows that cases where humans are randomly or equally distributed between negative/neutral/positive are rare. Rather, answers tend to be skewed toward one sentiment (i.e., a very few negative, some neutral, the majority positive), and rarely spread across the extrema. So without going into chi-square tests and all of that fanciness, it looks pretty consistent; even if you can’t get a clean trinomial categorization, you still have a “sentiment”. Now, how this is helping me get laid is another question!
Excellent! I’m really happy that you’re engaging with the data! That said – yes, the distribution is pretty normal. There’s an average there. That’s the average sentiment score. The underlying instability among the respondents is pretty interesting, though. While there might be an average of averages, if you were to drill down, you’d get a distribution of responses. If your personal point of view diverges from what the plurality says – isn’t that going to generate dissatisfaction? Moreover – there’s fuzziness there.
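That average-of-averages point is easy to demonstrate. A small sketch with invented scores: two questions with identical means, where only the drill-down into the distribution reveals that one room agrees and the other is split:

```python
from statistics import mean, stdev

# Invented scores (-1/0/+1) for two questions with the SAME mean.
consensual = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]      # mild, consistent
polarized  = [1, 1, 1, 1, 1, 1, -1, -1, -1, -1]  # same mean, split room

for name, xs in [("consensual", consensual), ("polarized", polarized)]:
    print(f"{name}: mean={mean(xs):+.1f} stdev={stdev(xs):.2f}")
# Identical averages (+0.2); only the spread tells them apart.
```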
We’re not going to eliminate all that error within our lifetimes. It’s going to persist. At least by dragging it out from the shadows and saying “look. there. now can we go on”, we can get into some solid actionable insight. That’s my hope at any rate.