Deriving structure from the subjectively unstructured isn’t easy, but the results can be extremely useful and rewarding.

In particular, deficits in our own knowledge can produce particularly bad biases. Traditional content analysis corrects for these biases: a small group of people train each other until they come to an agreement. That correction doesn't happen in larger groups.

Yesterday, you heard about “Unskilled and Unaware of It: How Difficulties in Recognizing One’s Own Incompetence Lead to Inflated Self-Assessments” (Kruger and Dunning, Journal of Personality and Social Psychology, 1999, 77(6), pp. 1121-1134), which you should absolutely read.

They converted ratings of unstructured jokes into a dependent variable, and used it to validate four predictions:

  • “Incompetent individuals, compared with their more competent peers, will dramatically overestimate their ability and performance relative to objective criteria.
  • Incompetent individuals will suffer from deficient metacognitive skills, in that they will be less able than their more competent peers to recognize competence when they see it – be it their own or anyone else’s. 
  • Incompetent individuals will be less able than their more competent peers to gain insight into their true level of performance by means of social comparison information. In particular, because of their difficulty recognizing competence in others, incompetent individuals will be unable to use information about the choices and performances of others to form more accurate impressions of their own ability.
  • The incompetent can gain insight about their shortcomings, but this comes (paradoxically) by making them more competent, thus providing them the metacognitive skills necessary to be able to realize that they have performed poorly.” (p. 1122)

The authors, in their first test, used ‘funny’ as the benchmark of competence. A group of people rated jokes on a scale of 1 to 11, and were then asked how well they thought they did relative to their peers, in percentiles. The average participant placed their ability to recognize what’s funny in the 66th percentile. The true average, by definition, is the 50th. That is to say, as a group, their collective self-assessment overshot the actual result by 16 points.

The principal source of the overshoot was the incompetent. “Whereas their actual performance fell in the 12th percentile, they put themselves in the 58th percentile…That is, even participants in the bottom quarter of the distribution tended to feel that they were better than average.” (p. 1123).
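The arithmetic behind that gap is worth making concrete. A minimal sketch, in Python, of comparing self-reported percentiles against true percentile ranks; the scores and self-reports below are hypothetical stand-ins for the paper's data, but the true mean percentile rank is 50 by construction, so any positive gap is overestimation:

```python
def percentile_ranks(scores):
    """True percentile rank of each score within its own group.

    Uses the midrank convention (ties count half), so the group's
    mean percentile rank is exactly 50 by construction.
    """
    n = len(scores)
    return [
        100.0 * (sum(1 for o in scores if o < s)
                 + 0.5 * sum(1 for o in scores if o == s)) / n
        for s in scores
    ]

def mean(xs):
    return sum(xs) / len(xs)

# Hypothetical data: actual test scores, and each participant's
# self-placed percentile ("how well did you do relative to peers?").
scores = [3, 5, 6, 8, 9, 9, 12, 14]
self_reported = [55, 60, 58, 62, 70, 66, 72, 85]

actual = percentile_ranks(scores)
overshoot = mean(self_reported) - mean(actual)  # → 16.0, mirroring the 66 vs. 50 gap
```

Splitting `scores` into quartiles and computing the same gap per quartile would reproduce the paper's sharper finding: the bottom quartile overshoots the most.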

They ran three more tests: one on logic, one on grammar, and finally a second logic test with training. Their four predictions stood up pretty well.


The paper demonstrates how a deficit in knowledge can reinforce itself. That effect is too big to ignore when we’re trying to bring structure to subjective data.

Traditionally, we lock five coders in a room with a coding book. They’ll get together, notice the areas of disagreement, refine their collective understanding of what a concept means, and resume coding. Their agreement is a proxy for objective measurement.

You can get a much larger sample using Amazon’s Mechanical Turk, a favourite amongst data scientists for coding features. However, the shared understanding of the coding guide won’t be there. You’ll get a more popular opinion from a far larger number of coders, and the competency of the coding will vary significantly.
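That variance in competency can at least be measured. A common chance-corrected agreement statistic for many coders is Fleiss’ kappa; a minimal sketch, assuming every item is coded by the same number of raters (the category labels and ratings here are hypothetical):

```python
def fleiss_kappa(ratings, categories):
    """Fleiss' kappa: chance-corrected agreement among raters.

    ratings: list of per-item lists of category labels, with the same
    number of raters for every item. 1.0 = perfect agreement,
    0.0 = chance-level, negative = worse than chance.
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    # counts[i][j] = number of raters who put item i in category j
    counts = [[row.count(c) for c in categories] for row in ratings]
    # Observed agreement: pairwise agreement per item, averaged over items
    p_bar = sum(
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ) / n_items
    # Chance agreement from the overall category proportions
    props = [sum(row[j] for row in counts) / (n_items * n_raters)
             for j in range(len(categories))]
    p_e = sum(p * p for p in props)
    return (p_bar - p_e) / (1 - p_e)

# Hypothetical: three Turk coders labelling two items as funny/unfunny
kappa = fleiss_kappa([['funny', 'funny', 'funny'],
                      ['unfunny', 'unfunny', 'funny']],
                     ['funny', 'unfunny'])
```

A low kappa from a Turk pool is exactly the symptom the paper predicts: without shared training, coders cannot recognize the concept, or each other’s competence, consistently.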

That is to say, a small group of people can train each other to correct for incompetence (if prediction 4 continues to hold up). A large group is far more expensive to train, and shows far greater variance in competency as a result.

It’s a concern, yet a manageable one.


I’m Christopher Berry.
I tweet about analytics @cjpberry
I write at