There appears to be a belief that facts have two sides to them.

It makes the:

  • marketing scientist in me smile
  • public policy quant in me rage
  • scientist in me flip the table

Jimmies status: rustled.

  • Stories may have two sides
  • Models may have two sides
  • Ideologies have two sides

Facts do not have two sides.

And yet, there have been a few folks coming out of the woodwork lately:

  • Unskewed Polls in the Presidential Race
  • And many, many others.

A fact is defined as something that actually happened, exists, or is reality.

We can experience facts first hand, or observe them through instrumentation. In analytics, there are multiple types of instrumentation that lie on a continuum from ‘sampled’ to ‘total’.

At the extreme edge of sampled is the survey. A researcher constructs a survey (an instrument) and randomly samples a population to generate observations. If the sample is truly random, the observations can be said to be representative of the entire population within a degree of certainty. That degree of certainty is expressed as a confidence interval associated with a confidence level. Although they contain an expression of confidence, they are facts.
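To make the survey case concrete, here is a minimal sketch of how a confidence interval attaches to a sampled observation. It assumes a simple random sample and uses the normal approximation for a proportion; the function name and the 520-of-1,000 figures are made up for illustration.

```python
import math
from statistics import NormalDist


def proportion_confidence_interval(successes, n, confidence=0.95):
    """Normal-approximation interval for a proportion from a simple random sample.

    Illustrative only: assumes truly random sampling and a sample large
    enough for the normal approximation to be reasonable.
    """
    if n <= 0:
        raise ValueError("sample size must be positive")
    if not 0 < confidence < 1:
        raise ValueError("confidence must be between 0 and 1")

    p_hat = successes / n
    # Two-sided critical value for the chosen confidence level.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    # Clamp to the valid range for a proportion.
    return max(0.0, p_hat - margin), min(1.0, p_hat + margin)


# Hypothetical survey: 520 of 1,000 respondents answered "yes".
low, high = proportion_confidence_interval(520, 1000)
print(f"95% interval: {low:.3f} to {high:.3f}")
```

The observation (520 of 1,000) is the fact. The interval says how far the whole population might plausibly sit from it at the stated confidence level.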

At the other extreme edge are machine-produced sensor and transactional data. For example: at 06:33:21 a credit card registered to Jane Smythe was used to spend $11.44 at the Gulp-n-Blow located at 43.649899, -79.380722.

It’s a continuum though.

Take, for instance, how facts about a website are generated.

Javascript that is used for tracking is typically fired last upon a page load. A page load also generates a whole bunch of server logs. The javascript doesn’t always fire, in part because of page load times, and in part because some devices do not run javascript at all. Most bots (like Google’s) do not fire javascript. However, bots do generate server logs.

For these reasons, javascript has much more in common with survey methodologies. Server logs are generally not samples of what happens on the website, whereas javascript almost always returns a sample[1].
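One way to see how much of a sample the javascript returns is to compare it against the server logs. The sketch below is only an illustration under loud assumptions: it pretends each page load carries a request id that appears in both sources (real setups often lack one), and it uses a crude, hypothetical user-agent check to set bots aside. The function names and sample records are invented.

```python
# Hypothetical, crude markers for spotting bots by user agent.
BOT_MARKERS = ("bot", "crawler", "spider")


def is_probable_bot(user_agent):
    """Rough heuristic: flag user agents containing a known bot marker."""
    ua = user_agent.lower()
    return any(marker in ua for marker in BOT_MARKERS)


def javascript_coverage(server_log, js_hits):
    """Share of non-bot page loads in the server log that the javascript also saw.

    server_log: iterable of (request_id, user_agent) pairs (assumed format).
    js_hits: iterable of request ids reported by the javascript tag (assumed).
    """
    human_ids = {rid for rid, ua in server_log if not is_probable_bot(ua)}
    if not human_ids:
        raise ValueError("no non-bot requests in the server log")
    captured = human_ids & set(js_hits)
    return len(captured) / len(human_ids)


# Made-up records: five logged page loads, one of them a bot.
server_log = [
    ("r1", "Mozilla/5.0 (Windows NT 10.0)"),
    ("r2", "Googlebot/2.1"),
    ("r3", "Mozilla/5.0 (iPhone)"),
    ("r4", "Mozilla/5.0 (X11; Linux)"),
    ("r5", "Mozilla/5.0 (Macintosh)"),
]
js_hits = ["r1", "r3", "r4"]  # the tag never fired for r5

coverage = javascript_coverage(server_log, js_hits)
print(f"javascript saw {coverage:.0%} of non-bot page loads")
```

The page loads the javascript misses are not necessarily missed at random (slow pages and script-less devices drop out more often), so a coverage figure like this describes the sample rather than guaranteeing it is representative.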

Both types of instrumentation generate facts.

Whether or not you find the facts useful depends on what it is that you want to understand, the argument that you want to make, or the ideology you’re trying to defend.

If you’re generating an argument about website technical performance, the server logs are a great source. If you’re generating an argument about website business performance, the javascript is the way to go.

Likewise, if I’m generating an argument that the economy isn’t really all that good, I’m going to base my argument on the participation rate, and I’m going to ignore the unemployment rate. It’s a fact that the participation rate has fallen, and it’s a fact that the unemployment rate has fallen. The onus is on Jack to make his argument on the facts as presented. Attacking the instrumentation, in this case the BLS survey, is ineffective.

Instrumentation really matters[2]. But a fact can’t be manipulated to have a different side once the instrumentation is settled. A fact actually happened. Facts don’t have two sides.

[1] Web analysts are generally not trained to think of confidence intervals or levels when dealing with javascript instrumentation. It is possible, however.
[2] The claim that a particular bacterium uses arsenic in its DNA, which was caused by issues with cleaning and confidence intervals, is one such example. Facts can be refuted if the instrumentation falters.


I’m Christopher Berry
I’m at Authintic.