Two pieces of information and one editorial to share coming out of NY Strata. The next release of ggplot2 will be done in D3 and called r2d3 (thanks to Hadley Wickham and team). A few data scientists argued about things that didn’t matter to anybody. Privacy Analytics won big at Strata. Check them out. The ggplot news is great. It’ll help us produce nicer graphics faster. Privacy Analytics is a great concept. As for the debates – some were intensely important and very relevant. Others were just fights about words and didn’t matter. There was much more good than bad at this one. Thanks to the organizers for doing such a great job.[…]
Author: Christopher Berry
There’s going to be a lot more rhetoric next year about decisions, big data decisions, and decision automation. I think we’re going to hear a lot, and that it’ll fall into three big categories of rhetoric. The first is to argue ‘this-changes-everything!’ Panderers are gonna pander. And it’s a really great line to say that it’s a paradigm shift. You know, in ten years, farmers aren’t going to farm anymore, they’ll have just fields and fields of big data servers and fields and fields of crops. A farmer isn’t really going to be a farmer as we think of them today, but they’ll be much more like a data manager. Every technological disruption brings in a crop of these folk, and that’s[…]
Have you read about Gartner’s 232 billion Big Data prediction? Wow. Equally amazing – the 4.4 billion dollar social media analytics prediction, by 2016. Wow. Gartner is projecting a 45% per year average growth rate for social media, social network analysis and content analysis from 2011 to 2016. Hype? Unlikely. This is revenue growth in spite of a massive trough of disillusionment coming up. A majority of this growth is in the post-trough slope of enlightenment. That’s big growth. You have a choice. You can stand on the sidelines barking that there’s no such thing as a contagion effect. You can choose to ignore the most recent findings in the August edition of ‘Science’ on network homophily. That’s your choice. Or you can put on[…]
Sarah Lacy wrote a paragraph that resonated: “One of our most popular stories all week has been David Holmes’s report about how Tumblr wants to pay for journalism. And not just cat pictures, re-written press releases, or 300 word snark-fests by junior reporters paid $12 a post. This isn’t another content farm. They want real, actual New Yorker-style long form journalism.” Then she asks: that’s great, but who’s going to write it? She describes a talent cliff coming in a few years. Our society isn’t generating people capable of long form journalism or storytelling. It’s a great article, and really worth reading. There’s tremendous incentive, during this era, to communicate in extremely tiny units. If a thought can’t be expressed in[…]
Regular readers of this space know about the BLS and all the neat nuances that go along with the data. I wrote a five-part series in July on How Americans Live using Bureau of Labor Statistics data. (It wasn’t a very popular series because it wasn’t a popular topic. A few of you liked it.) And then Jack Welch stepped in it this week. He made news. Whether you’re a Tastycrat or a Fingerlican, as analytics folks, you have to be intrigued by what Jack Welch is saying and how he’s thinking. His second tweet on the topic appeared to suggest that somehow Obama had something to do with modifying the BLS Unemployment report, so as to make the[…]
Models are amazing. And they’re really necessary if you’re going to excel in Analytics. Consider the model below. It’s how I think as a marketer and a marketing scientist: I have one major goal this quarter. It’s the # of Qualified Leads. This is my optimization objective. It doesn’t have to be yours. Yours will vary. However, for this quarter, I’m all about the # of Qualified Leads. I get qualified leads when people complete goals on the site. And I get people to the site from Paid and Earned sources. And the two primary levers I have to pull are paid budget and owned budget. Let’s unpack. There are channels that are owned and channels that are paid. If[…]
Facebook hit 10^9, or one billion monthly active users, this morning. Pundits are gonna pun and haters are gonna hate. I’m still impressed. Facebook manages a massive, sparse graph. It’s a massive amount of data, using technology that didn’t even exist just five years ago. They have data scientists inventing products that have yet to exist. And these products have to scale to one billion MAUs. What do you do that scales to one billion? Cynicism about the trough aside – that’s a huge figure. In many ways, to scale from 10^6 to 10^9 was the hardest part. To scale from 10^9 to a multiple of that, say, 2 billion, isn’t, comparatively, as hard. Congrats to all the folks who[…]
Reddit is the successor of Digg. It’s a crowdsourced news aggregation utility with commenting. It contains communities of sorts, hidden deep inside. This is the useful stuff. Reddit has a few pretty good utilities that are written on top of their API. One of them is stattit. Another one is the Reddit Enhancement Suite (RES). As with any site, there’s the head, and then there’s the long tail. And the long tail is long on Reddit. Based on figures from stattit, the largest subreddit (sub) is r/funny, a forum dedicated to things that are funny. It peaks at ~27,000 online users per hour, with some ~2.5 million subscribers. The tail is very long. The 5000th sub is r/realtimestrategy, with 3[…]
You may have heard of App.net. It’s like Twitter, only that you own your data and it’s dev friendly. And, because the product isn’t free, you aren’t the product. It costs money to join. (Shocker?) 100 bucks a year if you’re developing on it. 50 bucks for mortals. Why mention it now? The brightest devs I know have hopped on board and are developing for it. When I see innovators doing real things on the platform, especially that cynical bunch, I know there’s optimism. As for me? Soon. *** I’m Christopher Berry. I’m at Authintic.
Andrew and I wanted to harmonize all our fonts across all our marketing collateral. We were uncertain if the font we wanted to use was supported by all browsers. I googled it and learned that this one font wasn’t supported by Safari on iOS. I opened up Google Analytics and learned that 20% of the traffic to the site was from Safari on iOS. We decided to leave the font as is. The entire experience, from preference to evidence to decision, took four minutes. Not a day goes by without us using data to inform small decisions as well as big decisions. We take it for granted that we can complete extremely simple inquiries on our own. It’s efficient because[…]