You can’t keep everything in the model. Elements have to go.

Context is important.

Modeling does not happen in a vacuum. Most analytical models must be actionable and inform a decision. As a result, measurability and controllability is important. A concrete example will help.

Suppose the purpose of a model is to understand the key levers of customer retention for a Software-as-a-Service (SaaS) firm. Specifically, the CEO wants to understand how he can increase his customer base by retaining existing customers.

A creative modeler has derived eighty salient variables – gathered from the marketing sciences, OR data stores like FogBugz, web analytics, and SalesForce. Nobody, not even a very good modeler, is going to be able to explain 80 variables, even if were possible to create a mega-table containing all of these disparate sources of data. Choice must be made.

  • Which variables are accurately measurable?
  • Which variables correspond to levers that a CEO can pull?
  • Which variables are accessible? (Time horizon, inter-organizational political BS around silos, etc).

And so, a list of 80 ideas can be whittled down to 20 for statistical testing, and perhaps as many as 3 different models tested. Selecting the variables with an actionable end in mind is the editorial exercise.