The ability of statistics to accurately represent the world is declining. In its wake, a new age of big data controlled by private companies is taking over – and putting democracy in peril.
begins William Davies's tale of woe in the Guardian. Unfortunately, he confuses credible statistics with modern state-istics*, and seems impervious to the idea that Joe Sixpack has wised up to the fact that there are “lies, damned lies, and statistics,” and that most of these are peddled by the Leviathan State and its corporate cronies. Usually to Joe’s detriment.
Statistics in industry and scientific research is doing quite well, thank you. The Big Data movement is still immature and riddled with snake-oil salesmen; it will eventually spot them, possibly by applying its methodologies reflexively.
Tip from that same O’Reilly Newsletter. Finally, I got on a sucker list that’s interesting!
Wonderful article here about the Mosteller and Wallace analysis of the twelve Federalist Papers, the ones of disputed authorship–was it Madison or Hamilton who wrote them? With a nice, easy-to-understand explanation of the Bayesian methodology they used.
…can be done in 10 minutes or less, using the Jadad score. There’s a full explanation in the original paper, but suffice it to say, it’s pretty easy to identify sketchy studies using this method. Aaron Carroll, writing in the New York Times, shows how this affects the credibility of nutrition research. For those who want to try this at home, here’s the scorecard from the paper:
1. Was the study described as randomized? (YES/NO)
2. Was the study described as double blind? (YES/NO)
3. Was there a description of withdrawals and dropouts? (YES/NO)
Give 1 point for each YES, and 0 points for each NO, with no partial credit. Then assess these two adjustments:
For question 1, GIVE 1 additional point if the method to generate the sequence of randomization was described and it was appropriate (table of random numbers, computer generated, etc.). Otherwise, DEDUCT 1 point if the method to generate the sequence of randomization was described and it was inappropriate (patients were allocated alternately, or according to date of birth, hospital number, etc.).
For question 2, GIVE 1 additional point if the method of double blinding was described and it was appropriate (identical placebo, active placebo, dummy, etc.). Otherwise, DEDUCT 1 point if the study was described as double blind but the method of blinding was inappropriate (e.g., comparison of tablet vs. injection with no double dummy).
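For anyone who would rather tally this in code than on paper, here is a minimal Python sketch of the scorecard as described above. The function name `jadad_score` and its parameter names are my own inventions, not anything from the paper. I am also assuming that the bonus or penalty for a question only applies when that question was answered YES, and that "method not described" (represented here as `None`) means no adjustment either way.

```python
from typing import Optional


def jadad_score(
    randomized: bool,
    double_blind: bool,
    withdrawals_described: bool,
    randomization_method_ok: Optional[bool] = None,
    blinding_method_ok: Optional[bool] = None,
) -> int:
    """Tally the scorecard: 1 point per YES, then the two adjustments.

    For the two *_method_ok arguments (hypothetical names):
      True  -> method described and appropriate   (+1)
      False -> method described but inappropriate (-1)
      None  -> method not described               (no change)
    """
    # Base score: one point for each YES, no partial credit.
    score = int(randomized) + int(double_blind) + int(withdrawals_described)

    # Adjustment for question 1 (assumed to apply only if answered YES).
    if randomized and randomization_method_ok is not None:
        score += 1 if randomization_method_ok else -1

    # Adjustment for question 2 (assumed to apply only if answered YES).
    if double_blind and blinding_method_ok is not None:
        score += 1 if blinding_method_ok else -1

    return score


if __name__ == "__main__":
    # A study described as randomized and double blind, with dropouts
    # reported, an appropriate randomization method, and no description
    # of how blinding was done.
    print(jadad_score(True, True, True, randomization_method_ok=True))
```

Under these assumptions the total always lands between 0 and 5, since a penalty can only cancel a point that the same question already earned.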
Update: Dirk Eddelbuettel just released tint 0.0.3 (tint is not Tufte) with some nifty examples. I wanted to try it out, so I’ve updated the example using tint and added two margin plots to illustrate the Simpson’s Paradox situation. Tip from R Bloggers.