Houston, We Have a Solution


Long-time south Texas residents swear by the H-E-B grocery chain for value, selection, quality, and always being well-stocked.  These guys are supply-chain ninjas; we see groceries, they see a logistics network.  And they always step up in emergencies; Houston may be their finest hour to date.

Tip from American Digest.


Stanford Invents AI Gaydar, Flubs Write-Up

Yilun Wang and Michal Kosinski, researchers at Stanford’s Graduate School of Business, have developed a neural-net classifier that purportedly detects sexual orientation (in Caucasians).
The authors report an avalanche of experimental results, and claim the classifier can “correctly distinguish between gay and straight men 81% of the time, and 74% for women.”  OK, that’s the sensitivity of the gadget.  What about specificity, i.e. how well does it correctly distinguish folks who are not-so-gay?  Without that second number (as well as an estimate of prevalence), it’s not possible to estimate the false positive and false negative rates for this thing.  Very important, if some of the more Orwellian applications mentioned by the authors come to pass.
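To see why the missing numbers matter, here is a minimal sketch in R of the Bayes' rule arithmetic.  Only the 81% comes from the write-up (read, as above, as sensitivity); the specificity and prevalence are placeholders I made up for illustration, not figures from the paper.

```r
# Positive predictive value: P(truly positive | classifier says positive),
# computed with Bayes' rule from sensitivity, specificity, and prevalence.
ppv <- function(sens, spec, prev) {
  sens * prev / (sens * prev + (1 - spec) * (1 - prev))
}

sens <- 0.81  # the figure quoted above, treated as sensitivity
spec <- 0.81  # ASSUMPTION: not reported, chosen only for illustration
prev <- 0.05  # ASSUMPTION: hypothetical prevalence

ppv(sens, spec, prev)  # about 0.18 under these made-up inputs
```

Under those assumed inputs, roughly four out of five positive calls would be false positives.  Change the two assumed numbers and the answer moves a lot, which is the point.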
I give the authors a “C,” for incomplete work.
Update: Dan Simmons, at the Andrew Gelman blog, has written a rambling, fascinating takedown of this “research,” from both the scientific and MSM points of view.  Based on just the statistical problems, I’m changing the grade to a “D-.”

R Tutorial: the non-linear equation solver

Need a numerical solution to simultaneous non-linear equations?  The nleqslv package is just what you’re looking for!  The coding required is minimal; just define the equations you want solved in a function, set some initial values, and let ‘er rip.

Here’s an example that uses the method of moments to estimate the parameters of a beta-binomial distribution.
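A minimal sketch of what such a fit might look like follows.  The number of trials and the sample moments are hypothetical values picked for illustration (they happen to be exactly consistent with alpha = 2, beta = 3), and solving on the log scale is my own choice to keep both parameters positive.

```r
# install.packages("nleqslv")  # if the package is not already installed
library(nleqslv)

n <- 10  # trials per observation (hypothetical)
m <- 4   # sample mean of the counts (hypothetical)
v <- 6   # sample variance of the counts (hypothetical)

# Moment equations written as residuals: a root of this function is a
# method-of-moments estimate.  x holds log(alpha) and log(beta).
bb_moments <- function(x) {
  a <- exp(x[1])
  b <- exp(x[2])
  s <- a + b
  c(n * a / s - m,                                # mean equation
    n * a * b * (s + n) / (s^2 * (s + 1)) - v)    # variance equation
}

sol <- nleqslv(c(0, 0), bb_moments)  # initial values: alpha = beta = 1
est <- exp(sol$x)
names(est) <- c("alpha", "beta")

sol$termcd  # 1 means the solver reports convergence
est
```

Always check the termination code before trusting the estimates; a poor starting point can leave the solver stuck short of a root.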

Reports of its death are greatly exaggerated

The ability of statistics to accurately represent the world is declining. In its wake, a new age of big data controlled by private companies is taking over – and putting democracy in peril.

begins William Davies’s tale of woe in the Guardian.  Unfortunately, he confuses credible statistics with modern state-istics*, and seems impervious to the idea that Joe Sixpack has wised up to the fact that there are “lies, damned lies, and statistics,” and that most of these are peddled by the Leviathan State and its corporate cronies.  Usually to Joe’s detriment.

Statistics in industry and scientific research is doing quite well, thank you.  The Big Data movement is still immature and riddled with snake-oil salesmen; it will eventually spot them, possibly by applying its methodologies reflexively.

Tip from that same O’Reilly Newsletter.  Finally, I got on a sucker list that’s interesting!

*Where did you think the word came from?

Update:  Briggsy holds much the same opinion as I do, but expresses it more eloquently.

Multiple Comparisons, Made Easy

Adrian Colyer, at the morning paper, takes a stab at explaining the problem with p-values and multiple comparisons.  He shoots!  He scores!  The crowd* goes wild!
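For the impatient, a rough sketch of the core problem in R.  This is my own illustration, not Colyer's, and it assumes independent tests; the p-values are made up.

```r
# Chance of at least one false positive across m independent tests,
# each run at significance level alpha, when every null is true.
fwer <- function(m, alpha = 0.05) 1 - (1 - alpha)^m

fwer(c(1, 5, 20))  # grows quickly with the number of tests

# One simple (and conservative) fix: the Bonferroni adjustment.
p <- c(0.003, 0.02, 0.04, 0.20)  # hypothetical raw p-values
p.adjust(p, method = "bonferroni")
```

With twenty tests at the 5% level, the chance of at least one spurious "discovery" is already about 64%.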


Tip from an O’Reilly Daily Newsletter, which I found languishing in Clutter purgatory.

*OK, the crowd of two or three statistics lecturers who struggle to explain the multiple comparison problem.

When Bayesian Statistics Broke into History

Wonderful article here about the Mosteller and Wallace analysis of the twelve Federalist Papers, the ones of disputed authorship–was it Madison or Hamilton who wrote them?  With a nice, easy-to-understand explanation of the Bayesian methodology they used.

Aaron Burr ensured that Hamilton took the secret to his grave.

Tip from Real Clear Life.

An end run around an impossible integral

Ever-insightful polymath John Cook shows how to integrate the Gaussian PDF, in less time than it takes to make breakfast.  The trick?  Coordinate transformations and the Jacobian are your friends.
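For reference, the standard polar-coordinate argument runs roughly as follows.  This is the textbook version, which may differ in detail from Cook's presentation.

```latex
I = \int_{-\infty}^{\infty} e^{-x^2/2}\,dx
\quad\Longrightarrow\quad
I^2 = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{-(x^2+y^2)/2}\,dx\,dy .

% Change to polar coordinates: x = r\cos\theta, y = r\sin\theta,
% with Jacobian determinant r, so dx\,dy = r\,dr\,d\theta.
I^2 = \int_{0}^{2\pi}\int_{0}^{\infty} e^{-r^2/2}\, r\,dr\,d\theta
    = 2\pi \left[ -e^{-r^2/2} \right]_{0}^{\infty}
    = 2\pi .

% Hence I = \sqrt{2\pi}, and the standard normal PDF integrates to one:
\int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}\,dx = 1 .
```

The extra factor of r supplied by the Jacobian is what turns an integrand with no elementary antiderivative into one that integrates by inspection.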


A suitably-embellished version of Cook’s post will appear in my lecture notes in the Spring semester.  Thanks, J.C.