How knowledge, and data, is(n’t) shared
Einstein had it wrong.
The only thing that moves faster than light through a vacuum is bullshit through a social medium.
Marketing scientists in academia communicate with each other through journals and presentations. The peer-reviewed journal innovation was necessitated by Sir Isaac Newton, who somehow had a drawer that was filled with the ideas of others. “Oh, that’s a neat idea, too bad you didn’t come up with it first. I had written about it five years before you, see, it was in my desk all along.” He wasn’t very popular. Academic findings are loosely peer reviewed (see: biology). There’s a form of moderation there. It varies, but it’s there.
Industry marketing scientists talk over beer, blogs, conferences, PDFs and the odd SlideShare. There is little open data, and even less public peer review. Industry findings, released into the wild, are rarely peer reviewed. There’s a soft form of moderation in the comments, but it’s significantly more subtle than a rejection from a journal. As such, short-form knowledge transmits rapidly.
It can be tough to tell what’s good knowledge and what’s bad knowledge. How can you tell?
Surprising findings may be true, even after a good amount of skepticism is applied, but how do you really tell?
Usually, you go to the data. Trust but verify.
Both academics and industry professionals have a common problem in that little of the underlying data is shared.
The underlying data that would tell you is rarely available.
There are good reasons for that:
- For one, there is no professional standard of attribution in industry analytics. Plagiarism happens.
- For two, many data scientists are engaged in an arms race, and in some instances cooperation is zero-sum.
- For three, those who have figured out a strategic reason for sharing, and a business model that stores, codes, and transmits that data, are still cutting through some pretty thick brush.
- For four, good, clean data is expensive to produce, and generally, we don’t all work for free.
- For five, we’re dealing with huge volumes of data that can’t be easily shared by way of FTP.
Some data is power. The ability to enrich that data requires a lot of power.
There are companies, startups like Figshare, on the academic side, that are making it way easier for researchers to share information. They’re right on trend. The United Kingdom has decreed that all publicly funded academic research must be shared publicly by 2014.
The problem is generated by people using machines, so it can be solved by people using machines. It’ll happen.
And we may be able to slow the proliferation of bad knowledge, even if it’s back to just the speed of light.
***
I’m Christopher Berry.
Follow me @cjpberry
I blog at christopherberry.ca