
Science is getting exciting!

Five very interesting articles recently popped up on the web, suggesting that current science is far more exciting than the average Joe might think:

    • At FiveThirtyEight*, Christie Aschwanden’s Science Isn’t Broken gives a great exposition on scientific fraud, p-hacking, and why science is much more difficult than most folks realize.
    • Robert Matthews, writing in UAE’s The National, says Lone researchers with radical ideas may hold the keys to science’s unanswered questions.  One of those “loners” is Eleonora Troja, an astronomer at NASA’s Goddard Space Flight Center who studies X-rays; she “had hoped for years to detect the light from a neutron-star merger, but many people thought she was dreaming.”
    • FiveThirtyEight’s Rebecca Boyle, in Two Stars Slammed Into Each Other And Solved Half Of Astronomy’s Problems. What Comes Next?, describes that dream coming true and a revolution in astronomy that occurred in just 3 weeks this past August.
    • In The Serial-Killer Detector, the New Yorker’s Alec Wilkinson tells the story of Thomas Hargrove’s one-man Big Data project to categorize and analyze murders in the United States (751,785 since 1976) with the goal of tracking down serial killers.  From the description, it appears Hargrove has done yeoman’s work combining Small N and Big Data techniques with great success. “Hargrove thinks … that there are probably around two thousand serial killers at large in the U.S.”  Yikes!
    • Want to get in on the action?  At ScienceAlert.com, Mike McRae tells how Now You Can Build Your Very Own Muon Detector For Less Than $100, and possibly contribute to a Big Data project supporting stellar astronomy.

*ESPN’s website that analyzes sports statistics, election polling, and (apparently) anything else that catches its analysts’ eyes.

 


When all you have is a hammer…

…everything looks like a nail.

Daniel Lakens, the 20% Statistician, takes a rare but easy shot at statisticians and null hypothesis significance testing.

Our statistics education turns a blind eye to training people how to ask a good question. After a brief explanation of what a mean is, and a pit-stop at the normal distribution, we jump through as many tests as we can fit in the number of weeks we are teaching. We are training students to perform tests, but not to ask questions.

He defines

…the Statisticians’ Fallacy: Statisticians who tell you ‘what you really want to know’, instead of explaining how to ask one specific kind of question from your data.

My favorite is the two-tailed test of the difference of two means, which can provide evidence that the two are different, but not that they are (nearly) the same; the sketch below makes the distinction concrete.  My runners-up are goodness-of-fit tests, which do no such thing.  Sometimes I feel like I’m selling the researcher’s version of Snake Oil, rather than teaching sound data analysis and interpretation.
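To see the difference in code, here is a minimal sketch (my own illustration, not Lakens’s; it assumes SciPy ≥ 1.6 and an equivalence margin delta that the researcher must choose).  A two-tailed t-test asks “is there evidence the means differ?”, while the two one-sided tests (TOST) procedure asks the question researchers often actually have: “is there evidence the means are within ±delta of each other?”

```python
# Difference vs. (near-)equivalence: an illustrative sketch.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2017)
x = rng.normal(loc=10.0, scale=2.0, size=50)
y = rng.normal(loc=10.1, scale=2.0, size=50)

# Two-tailed test of H0: mean(x) == mean(y).
# A large p-value here is NOT evidence that the means are the same.
_, p_two_tailed = stats.ttest_ind(x, y)
print(f"two-tailed p = {p_two_tailed:.3f}")

# TOST: reject both one-sided nulls to conclude the true difference
# lies inside (-delta, +delta).  delta is a judgment call, not a statistic.
delta = 0.5
_, p_lower = stats.ttest_ind(x + delta, y, alternative="greater")
_, p_upper = stats.ttest_ind(x - delta, y, alternative="less")
p_tost = max(p_lower, p_upper)
print(f"TOST p = {p_tost:.3f}  (small => evidence of near-equality)")
```

(Lakens himself has championed equivalence testing elsewhere; the point here is only that “are they different?” and “are they about the same?” are different questions, answered by different procedures.)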

Lakens closes with an excellent addendum, a reference to David Hand’s Deconstructing Statistical Questions, which goes into much more detail.

Seven Pillars

Wisdom hath built her house, she hath hewn out her seven pillars.  –Proverbs 9:1

I just finished Stephen Stigler’s The Seven Pillars of Statistical Wisdom, and I’m daunted (and embarrassed that I waited so long to read it).  Stigler gives us a structure and taxonomy for statistical thinking* that conveys the “big picture” of statistics.

[Image: Stigler’s seven pillars of statistical wisdom]

Quite a difference from the descriptives-to-inference-to-models approach that most textbook authors follow.  This is making me rethink how I approach my introductory courses, especially those for statistics majors.  I’m starting with a baby step: adding the (inexpensive, paperbound) book as a required reading in my statistical research methods class.

*the 7 pillars: aggregation, information, likelihood, intercomparison, regression, design, and residual (and that’s just the table of contents!)

Stanford Invents AI Gaydar, Flubs Write-Up

Yilun Wang and Michal Kosinski, researchers at Stanford’s Graduate School of Business, have developed a neural-net classifier that purportedly detects sexual orientation (in Caucasians).

[Image: facial-recognition classifier illustration]

The authors report an avalanche of experimental results, and claim the classifier can “correctly distinguish between gay and straight men 81% of the time, and 74% for women.”  OK, that’s the sensitivity of the gadget.  What about specificity, i.e., how well does it correctly distinguish folks who are not-so-gay?  Without that second number (as well as an estimate of prevalence), it’s not possible to estimate the false positive and false negative rates for this thing.  Very important, if some of the more Orwellian applications mentioned by the authors come to pass.
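To see why those missing numbers matter, here is a back-of-the-envelope sketch; the sensitivity is the paper’s reported figure for men, but the specificity and prevalence below are assumptions for illustration only, not numbers from the paper.

```python
# Bayes' rule with reported sensitivity and ASSUMED specificity and
# prevalence -- the write-up supplies neither of the latter two numbers.
sensitivity = 0.81   # reported for men
specificity = 0.81   # assumption, for illustration only
prevalence  = 0.05   # assumed base rate in the screened population

# P(flagged) = P(flag | trait) P(trait) + P(flag | no trait) P(no trait)
p_flagged = sensitivity * prevalence + (1 - specificity) * (1 - prevalence)

# Positive predictive value: P(trait | flagged), via Bayes' rule
ppv = sensitivity * prevalence / p_flagged
print(f"P(trait | flagged)            = {ppv:.2f}")      # ~0.18 here
print(f"false positives among flagged = {1 - ppv:.2f}")  # ~0.82 here
```

Under those (made-up) numbers, roughly four of every five people the classifier flags would be flagged wrongly.  That is exactly why the Orwellian scenarios can’t be assessed without the specificity and prevalence the write-up omits.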
I give the authors a “C,” for incomplete work.

Update: Dan Simpson, writing at the Andrew Gelman blog, offers a rambling, fascinating takedown of this “research,” from both the scientific and MSM points of view.  Based on the statistical problems alone, I’m changing the grade to a “D-.”

Reports of its death are greatly exaggerated

The ability of statistics to accurately represent the world is declining. In its wake, a new age of big data controlled by private companies is taking over – and putting democracy in peril.

begins William Davies’ tale of woe in the Guardian.  Unfortunately, he confuses credible statistics with modern state-istics*, and seems impervious to the idea that Joe Sixpack has wised up to the fact that there are “lies, damned lies, and statistics,” and that most of these are peddled by the Leviathan State and its corporate cronies.  Usually to Joe’s detriment.

Statistics in industry and scientific research is doing quite well, thank you.  The Big Data movement is still immature and riddled with snake-oil salesmen; it will eventually spot them, possibly by applying its methodologies reflexively.

Tip from that same O’Reilly Newsletter.  Finally, I got on a sucker list that’s interesting!

*Where did you think the word came from?

Update:  Briggsy holds much the same opinion as I do, but expresses it more eloquently.