Our National Blind Spot

Want to save the planet?  How about starting by saving the birds?  Here’s a Pareto graph that gives a strong hint of where to start:


That’s right, get the cat population under control.  Eradicate feral cat colonies, and euthanize cat collections (oh, and institutionalize obsessive cat ladies).  The whole country needs to grow up and get that “cute little kitty” lie out of their heads, and replace it with something more realistic, like “bird murderer.”

Tip from Bird Note, by way of Sarah Hoyt at the Instapundit.


Science is getting exciting!

Five very interesting articles recently popped up on the web, suggesting that current science is much more interesting than the average Joe might think:

    • At FiveThirtyEight*, Christie Aschwanden’s Science Isn’t Broken gives a great exposition on scientific fraud, p-hacking, and why science is much more difficult than most folks realize.
    • Robert Matthews, writing in UAE’s The National, says Lone researchers with radical ideas may hold the keys to science’s unanswered questions.  One of those “loners” is “Eleonora Troja, an astronomer at NASA’s Goddard Space Flight Center who studies X-rays, had hoped for years to detect the light from a neutron-star merger, but many people thought she was dreaming.”  
    • FiveThirtyEight’s Rebecca Boyle, in Two Stars Slammed Into Each Other And Solved Half Of Astronomy’s Problems. What Comes Next?, describes that dream coming true and a revolution in astronomy that occurred in just 3 weeks this past August.
    • In The Serial-Killer Detector, the New Yorker’s Alec Wilkinson tells the story of Thomas Hargrove’s one-man Big Data project to categorize and analyze murders in the United States (751,785 since 1976) with the goal of tracking down serial killers.  From the description, it appears Hargrove has done yeoman’s work combining Small N and Big Data techniques with great success. “Hargrove thinks … that there are probably around two thousand serial killers at large in the U.S.”  Yikes!
    • Want to get in on the action?  At ScienceAlert.com, Mike McRae tells how Now You Can Build Your Very Own Muon Detector For Less Than $100, and possibly contribute to a Big Data project supporting stellar astronomy.

*ESPN’s website that analyzes sports statistics, election polling, and (apparently) anything else that catches its analysts’ eyes.


When all you have is a hammer…

…everything looks like a nail.

Daniel Lakens, the 20% Statistician, takes a rare but easy shot at statisticians and null hypothesis significance testing.

Our statistics education turns a blind eye to training people how to ask a good question. After a brief explanation of what a mean is, and a pit-stop at the normal distribution, we jump through as many tests as we can fit in the number of weeks we are teaching. We are training students to perform tests, but not to ask questions.

He defines

…the Statisticians’ Fallacy: Statisticians who tell you ‘what you really want to know’, instead of explaining how to ask one specific kind of question from your data.

My favorite example is the two-tailed test of the difference of two means, which can provide evidence that the two are different, but not that they are (nearly) the same.  My runners-up are goodness-of-fit tests, which do no such thing.  Sometimes I feel like I’m selling the researcher’s version of Snake Oil, rather than teaching sound data analysis and interpretation.
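
To make the two-means point concrete, here is a minimal R sketch that contrasts the usual two-tailed test with a pair of one-sided tests (often called TOST) aimed at the "nearly the same" question. The data are simulated, and the names g1, g2, and delta, along with the margin of 1 unit, are assumptions made up for illustration; only base R's t.test() is used.

```r
set.seed(1)                              # simulated data, for illustration only
g1 <- rnorm(50, mean = 10.0, sd = 2)
g2 <- rnorm(50, mean = 10.2, sd = 2)

# Hypothetical equivalence margin: differences smaller than 1 unit
# are treated as "nearly the same"
delta <- 1

# Usual two-tailed test: the null hypothesis is "no difference"
two_tailed <- t.test(g1, g2)

# Two one-sided tests: the null hypothesis is
# "the difference is at least delta in size"
lower <- t.test(g1, g2, mu = -delta, alternative = "greater")
upper <- t.test(g1, g2, mu =  delta, alternative = "less")
tost_p <- max(lower$p.value, upper$p.value)

two_tailed$p.value   # a large value here is NOT evidence of equivalence
tost_p               # a small value here supports equivalence within +/- delta
```

The choice of delta is a subject-matter judgment, not a statistical one, which is exactly the kind of question the researcher has to ask before picking a test.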

Lakens closes with an excellent addendum, a reference to David Hand’s Deconstructing Statistical Questions, which goes into much more detail.

Seven Pillars

Wisdom hath built her house, she hath hewn out her seven pillars.  –Proverbs 9:1

I just finished Stephen Stigler’s The Seven Pillars of Statistical Wisdom, and I’m daunted, and embarrassed that I waited so long to read it.  Stigler offers a structure and taxonomy for statistical thinking* that conveys the “big picture” of statistics.


Quite a difference from the descriptives-to-inference-to-models approach that most textbook authors follow.  This is making me rethink how I approach my introductory courses, especially those for statistics majors.  I’m starting with a baby step: adding the (inexpensive, paperbound) book as required reading in my statistical research methods class.

*the 7 pillars: aggregation, information, likelihood, intercomparison, regression, design, and residual (and that’s just the table of contents!)

Houston, We Have a Solution


Long-time south Texas residents swear by the H-E-B grocery chain for value, selection, quality, and always being well-stocked.  These guys are supply-chain ninjas; we see groceries, they see a logistics network.  And they always step up in emergencies; Houston may be their finest hour to date.

Tip from American Digest.

Stanford Invents AI Gaydar, Flubs Write-Up

Yilun Wang and Michal Kosinski, researchers at Stanford’s Graduate School of Business, have developed a neural-net classifier that purportedly detects sexual orientation (in Caucasians).

The authors report an avalanche of experimental results, and claim the classifier can “correctly distinguish between gay and straight men 81% of the time, and 74% for women.”  OK, that’s the sensitivity of the gadget.  What about specificity, i.e. how well does it correctly identify folks who are not-so-gay?  Without that second number (as well as an estimate of prevalence), it’s not possible to estimate what share of this thing’s positive and negative calls would be wrong.  Very important, if some of the more Orwellian applications mentioned by the authors come to pass.

I give the authors a “C,” for incomplete work.

Update: Dan Simmons, at the Andrew Gelman blog, has written a rambling, fascinating takedown of this “research,” from both the scientific and MSM points of view.  Based on just the statistical problems, I’m changing the grade to a “D-.”
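
To see why the missing numbers matter, here is a small R sketch of the Bayes' rule arithmetic. Only the 0.81 echoes the figure quoted above (read as a sensitivity, as this post does); the specificity and prevalence values are assumptions invented for illustration, not numbers from the paper.

```r
# Hypothetical inputs: only sens echoes the quoted 81%
sens <- 0.81   # P(flagged | gay), reading the quoted 81% as sensitivity
spec <- 0.81   # P(not flagged | straight): NOT reported, assumed here
prev <- 0.05   # assumed prevalence in the population being screened

# Bayes' rule: what share of each kind of call is correct?
ppv <- sens * prev / (sens * prev + (1 - spec) * (1 - prev))
npv <- spec * (1 - prev) / (spec * (1 - prev) + (1 - sens) * prev)

ppv       # share of positive calls that are correct
1 - ppv   # share of positive calls that are wrong
npv       # share of negative calls that are correct
```

With these made-up inputs, ppv comes out near 0.18, so more than four of every five positive calls would be wrong. Change spec or prev and the answer moves a lot, which is why both numbers need to be reported.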

R Tutorial: the non-linear equation solver

Need a numerical solution to simultaneous non-linear equations?  The nleqslv package is just what you’re looking for!  The coding required is minimal: just define the equations you want solved in a function, set some initial values, and let ’er rip.

Here’s an example that uses the method of moments to estimate the parameters of a beta-binomial distribution.
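
What follows is a minimal sketch of such a fit, not the original example from this post. It assumes the number of trials n is known and equates the beta-binomial mean, n·a/(a+b), and variance, n·a·b·(a+b+n)/((a+b)²(a+b+1)), to the sample mean and variance. The data vector x, the starting values, and the function name mom_eqs are made up for illustration.

```r
# install.packages("nleqslv")   # one-time install, if needed
library(nleqslv)

n <- 10   # trials per observation (assumed known)

# Hypothetical counts of successes out of n trials; not data from this post
x <- c(2, 5, 7, 1, 4, 9, 3, 6, 0, 8, 5, 2, 7, 4, 10)

m <- mean(x)   # sample mean
v <- var(x)    # sample variance

# Moment equations: each element is (theoretical moment - sample moment).
# The solver searches for the (a, b) that drives both elements to zero.
mom_eqs <- function(par) {
  a <- par[1]
  b <- par[2]
  s <- a + b
  c(n * a / s - m,
    n * a * b * (s + n) / (s^2 * (s + 1)) - v)
}

start <- c(1, 1)                  # initial guesses for a and b
fit <- nleqslv(start, mom_eqs)

fit$x        # estimates of a and b
fit$fvec     # values of the two equations at the solution; should be near zero
fit$termcd   # termination code; 1 means the function criterion was met
```

Always check fit$termcd and fit$fvec before trusting fit$x. A moment fit like this only has a sensible (positive) solution when the sample variance exceeds the plain binomial variance, and poor starting values can send the solver somewhere unhelpful.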