There’s a good discussion going on within data science circles.

First, a brief background:

It has been observed that complex models generated by Machine Learning are frequently more predictive, and faster to develop, than the alternative approach: Domain Experts (Subject Matter Experts) building a model and deploying it in the field.

This observation marks a major fault line in science more generally.

What happened:

  • An Oxford-Style debate was held at Strata, pitting Machine Learning against Domain Expertise.
  • Machine Learning won.
  • If the problem to be solved is well framed, the machine trumps the domain expert. However, the problem has to be well framed.

What it means:

Alistair Croll crystallizes the implication in this piece, and rightly eggs us on to go forth and disrupt. I want to take what he said and elaborate just a little bit.

Kuhn, in a wonderful book called The Structure of Scientific Revolutions, observes that evidence against a dominant theory will build up for a long time, until the weight of new evidence is too great to ignore. In other words, knowledge isn’t really as fluid as scientists make it out to be.

Knowledge is inertia.

What makes Machine Learning so disruptive is that a generalist with a well-framed question can go in and really mess with a domain. They can really go in and wreck the place up. The superiority of their predictions is force.

Are domain experts really the problem?

Domain experts in themselves are not problems. Their collective behavior is. It’s been that way since at least Copernicus.

Can they use Machine Learning to challenge their own assumptions? Or will they allow ML’ers to come in?


I’m Christopher Berry.
I tweet about analytics @cjpberry
I write at

See this paper, on two cultures in statistical modeling, as one of the branches on this tree of thought.