Intelligence means selective ignorance.
Imagine how intelligent and ignorant we used to be as a people, just 120 years ago.
Some of the first uses of sampling techniques in quantitative methods centered on the use of alcohol in society. There wasn’t much machine-readable data back then (the first use was for the 1890 US Census), so the practices of data mining weren’t possible. Sampling, and sample statistics, existed precisely because no machines could be used to quantify the entire population against some policy question.
You try calculating Chi Square on a very large dataset without a calculator or a spreadsheet!
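To give a sense of the arithmetic involved, here is a minimal sketch of Pearson's Chi Square statistic for a small contingency table, in plain Python. The function name and the counts are made up for illustration; they are not historical data.

```python
def chi_square_statistic(observed):
    """Pearson chi-square statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            # Expected count if rows and columns were independent
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (count - expected) ** 2 / expected
    return statistic


# Hypothetical 2x2 table, invented for illustration only:
# rows are two groups of households, columns are two outcomes.
table = [[30, 70],
         [50, 50]]

stat = chi_square_statistic(table)
print(round(stat, 2))  # 8.33
```

With one degree of freedom, the usual 5% critical value is 3.841, so this made-up table would count as a significant difference. Four cells are manageable by hand. Thousands of records are not.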
Indeed, sampling continues to be used to this day as a cost-effective way of asking the world intelligent questions.
Abolitionists, or ‘Smashers’ as they were called in my home province, were looking for evidence to prove their point of view. They were a bunch of convenient reasoners. For instance, they sought to prove that a child’s height and weight were inversely correlated with household alcohol consumption, and they wanted quantitative evidence to back it up. (Inb4 correlation isn’t causality.)
Yet, fundamentally, the idea of making public policy decisions based on evidence, to produce better outcomes for all of society, is a good one. Upon seeing the awesome power of applied statistics to do tremendous evil (second year university), I imagined the tremendous power for good. What if we didn’t make policy decisions based on gut? What if we didn’t leave things to chance? What then?
Indeed, the abolitionists were right about a few things. After all, we later identified fetal alcohol syndrome using evidence-based medicine. Public policy officials then took remedial action, exhorting the population as a whole not to encourage pregnant women to drink. To be sure, some women continue to drink and do harm to their children, yet the risks have been mitigated for so many.
Ordered, machine-readable information used to be rare. It was very difficult for Ontario proponents of abolition to rally evidence to support their position. It’s different now.
Today, a database contains the medical records of 10.6 million people (people die and do leave the province…). It’s easier to understand, with a very high degree of certainty, the impact of alcohol, and of many types of disease for that matter.
There are pitfalls, to be sure. In one of my favourite studies, Peter Austin showed just how easy it is to be fooled by false positives when working with evidence.
We are drowning in data. And it’s easy to get fooled by randomness because of it.
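A rough simulation shows how this happens. The sketch below is not the method of any particular study. It assumes a simple two-sided z-test with known variance, and every name and parameter in it is made up for illustration. It compares two groups drawn from the same distribution, over and over, and counts how often the difference looks "significant" at the 5% level even though nothing real is going on.

```python
import math
import random


def count_false_positives(num_tests=1000, group_size=50, seed=0):
    """Count 'significant' differences between groups that are truly identical."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    # Standard error of the difference of two means (known variance of 1)
    standard_error = math.sqrt(2 / group_size)

    false_positives = 0
    for _ in range(num_tests):
        # Both groups come from the same distribution: there is no real effect
        group_a = [rng.gauss(0, 1) for _ in range(group_size)]
        group_b = [rng.gauss(0, 1) for _ in range(group_size)]
        mean_a = sum(group_a) / group_size
        mean_b = sum(group_b) / group_size
        z = (mean_a - mean_b) / standard_error
        if abs(z) > 1.96:  # two-sided test at the 5% level
            false_positives += 1
    return false_positives


false_positives = count_false_positives()
print(false_positives, "of 1000 comparisons look significant by chance alone")
```

On average, about 5% of the comparisons will clear the bar. Ask a thousand questions of cheap data without a focus and you should expect roughly fifty "discoveries" that are pure noise.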
The Smashers, in spite of their ill-guided attempts to restrict the range of freedom within society, knew what they were trying to prove. They knew what they were trying to build, and knew what sort of evidence was needed. They weren’t all evil people bent on telling others how to live their own lives. Some of them were genuinely good people who wanted a better outcome for all.
They had a focus.
The danger of so much cheap data is that it’s very easy to lose our own focus.
What do you think? Is it more important to quantify as much as possible, to mitigate the risk of not being able to know something? Or is it more important to focus on an interesting (profitable) problem and examine additional questions deliberately, mitigating the risk of getting fooled by false positives?
Why know so little about so much when you can know so much about what likely matters?