This is the first in a series of five posts about Reddit and Analytics. The complete thread will be posted at the end of the week. So who keeps on downvoting you on Reddit? We’ll find out. But first – three notes: You may be familiar with Reddit. If you’re not – you can read this explanation about what Reddit is. To answer that question, I downloaded a dataset that was built in early 2011 or very late 2010. The dataset is a 29MB gzip-compressed file containing 7,405,561 votes from 31,927 users over 2,046,401 links. You can read about the methodology here. The file contains three columns – a vote, a userid, and a link. Only people who had[…]
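A minimal sketch of how a three-column file like this might be tallied to see who downvotes most. The excerpt does not state the delimiter or how votes are encoded, so the tab separator, the 1 / -1 vote values, and the function name `downvotes_per_user` are all assumptions made for illustration.

```python
from collections import Counter


def downvotes_per_user(rows):
    """Tally downvotes by userid from 'vote<TAB>userid<TAB>link' lines.

    Assumptions (not confirmed by the post): columns are tab-separated,
    in the order vote, userid, link, and a downvote is a negative integer.
    """
    tally = Counter()
    for line in rows:
        vote, userid, link = line.rstrip("\n").split("\t")
        if int(vote) < 0:
            tally[userid] += 1
    return tally


# Hypothetical sample rows, not taken from the real dataset.
sample = ["1\tu1\tl1", "-1\tu2\tl1", "-1\tu2\tl2", "-1\tu3\tl2"]
top = downvotes_per_user(sample).most_common(1)
print(top)
```

For the real file, the standard library's `gzip.open(path, "rt")` returns an iterable of text lines that could be passed in place of `sample`, provided the layout assumptions above hold.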
Month: February 2012
Kurt wrote an excellent post about building a data science team. It’s worth reading. To expand on his points: The first 90 days provide fuel for the subsequent 180. The 180 days after are far muddier, because what was scaling in very unsophisticated interfaces requires a lot more work to become an elegant solution. Data scientists should evangelize evidence and do what they can to develop interfaces that democratize the data. The math is a means to the end. My own reflections: I’m extremely thankful for my years of experience with Information Architects and Designers – because now, when I go into a room and they’re not around, I actively think about that end state. I’m glad I’ve[…]
A fellow data scientist and I were debating how to answer a very specific question that others ask all the time. How would we answer it? I grabbed a piece of paper and drew a histogram. A histogram plots a single variable along the X-axis and the occurrence, or frequency, of that variable along the Y-axis. Statisticians and analysts use it to understand the frequency distribution of a given variable. I said: “This is how I would want to see the data. This is how I answer the question today. This is what I would want to compare.” Then paused. Reflected. And added, “I am not the end user.” The end user isn’t a statistician, marketing[…]
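For readers who want the histogram idea in concrete terms, here is a minimal sketch of the counting step behind it: group the values of one variable into fixed-width bins (the X-axis) and count how many values land in each (the Y-axis). The function name `histogram`, the bin width, and the sample data are hypothetical and are not from the original discussion.

```python
from collections import Counter


def histogram(values, bin_width):
    """Return {bin_start: count} for fixed-width bins.

    Each key is the left edge of a bin (the X-axis position);
    each value is how many inputs fell in that bin (the Y-axis height).
    """
    counts = Counter()
    for v in values:
        bin_start = (v // bin_width) * bin_width
        counts[bin_start] += 1
    return dict(sorted(counts.items()))


# Hypothetical data: one variable, twelve observations.
observations = [1, 2, 2, 3, 5, 7, 8, 8, 9, 12, 14, 21]
for bin_start, count in histogram(observations, bin_width=5).items():
    print(bin_start, count)
```

A plotting library would draw these pairs as bars. The counting step is what turns a raw column of numbers into a frequency distribution.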
“Don’t Make Me Think” by Steve Krug is one of my favourite books. I strongly recommend it to web analysts and data scientists. In that spirit – here are a few of my favourite interfaces: pinterest.com, rdio.com, and imgur.com. Commonalities: Real choices were made about what to put in and what to leave out – in other words, they are designed. They were not assembled. Not every surface is crammed with stuff. Just because nature abhors a vacuum doesn’t mean you need to cram something into every pixel. It’s obvious what everything does. Simple can be functional. What are your nominations? *** I’m Christopher Berry. I tweet about analytics @cjpberry. I write at christopherberry.ca