On the challenges involved in open data
The original intent of the D-LID project, the Design Lab for Interpreted Data, was to generate facts about how different people interpret digital analytics data. It was to be a website presenting a few treatments of the same dataset. Participants would be observed to see which treatments they found useful. With some help from Bayes, we’d put hard facts on the table about data design for different audiences. We’d make the data available to members of the DAA for a year, then open it up to the public.
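The kind of Bayesian comparison we had in mind can be sketched roughly like this: a minimal Beta-Binomial model estimating how likely it is that one treatment is genuinely more useful than another. Everything here is illustrative, since the study was never run; the treatment names and counts are made up.

```python
import random

random.seed(0)

# Hypothetical counts: how many participants found each treatment useful.
# These numbers are illustrative -- the D-LID study never collected them.
useful = {"treatment_a": 62, "treatment_b": 45}
shown = {"treatment_a": 100, "treatment_b": 100}

def posterior_samples(successes, trials, n=10_000):
    """Draw from a Beta(1 + successes, 1 + failures) posterior,
    i.e. a uniform prior on the 'found it useful' rate."""
    return [random.betavariate(1 + successes, 1 + trials - successes)
            for _ in range(n)]

a = posterior_samples(useful["treatment_a"], shown["treatment_a"])
b = posterior_samples(useful["treatment_b"], shown["treatment_b"])

# Probability that treatment A is genuinely more useful than B.
p_a_better = sum(sa > sb for sa, sb in zip(a, b)) / len(a)
print(f"P(A more useful than B) ~ {p_a_better:.2f}")
```

With enough responses, a probability like this is the "hard fact" the project was after; the catch, as the rest of this post explains, is the cost of collecting those responses in the first place.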
After it was all scoped out, the median estimated price tag was too high. Talking to partners brought that figure down to a more manageable range, but it was still too much: around $10 per response to collect the information and assemble the output. Too much to bear.
Generating a large volume of good, reliable, relevant data is still expensive. There’s plenty of general open data out there, and it’s useful in certain creative contexts. But no existing dataset answers this particular question.
You can certainly approach partners for access to secondary datasets, and that approach has really paid off: we now have a huge volume of facts on the table from secondary sources.
The cost of collecting relevant observational data will come down, and that’s something to look forward to.
For the time being, though, cost remains a challenge.