Data to Viz

For data analysts using R, this is huge. Find out how to generate the graph you need for the data you have with just a few clicks. Yes, you’ll find some fine print explaining that the site is not comprehensive. BUT it still has a trove of graph types and accompanying R and Python code to generate them. Tip from R Bloggers. Update: from the O’Reilly Data Science Newsletter, I learn that the Data to Viz site has a CAVEATS page, showing many of the most common “worst practices” of data visualization, whether confusing, misleading, or downright deceptive. I’m … Continue reading Data to Viz

Our National Blind Spot

Want to save the planet? How about starting by saving the birds? Here’s a Pareto graph that gives a strong hint of where to start: That’s right, get the cat population under control. Eradicate feral cat colonies, and euthanize cat collections (oh, and institutionalize obsessive cat ladies). The whole country needs to grow up and get that “cute little kitty” lie out of their heads, and replace it with something more realistic, like “bird murderer.” Tip from Bird Note, by way of Sarah Hoyt at the Instapundit. Update: One Dallas suburb is infested with feral cats, protected by a well-connected … Continue reading Our National Blind Spot

R Tutorial: Correlation

Fisher’s iris dataset is the basis for this extended example in the calculation and visualization of correlations. The ggpairs() function gives an impressive coded scatterplot matrix. And an old friend makes a last-minute cameo appearance. Update: Dirk Eddelbuettel just released tint 0.0.3 (tint is not Tufte) with some nifty examples. I wanted to try it out, so I’ve updated the example using tint and added two margin plots to illustrate the Simpson’s Paradox situation. Tip from R Bloggers. Continue reading R Tutorial: Correlation
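For readers who have not met Simpson’s Paradox, here is a minimal sketch of the effect in Python (the tutorial itself uses R). The numbers are made up purely for illustration and are NOT iris measurements; the group names and the `pearson` helper are hypothetical. Within each group the correlation is perfectly positive, yet pooling the groups turns it negative.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Synthetic, illustrative data only (not taken from the iris dataset).
group_a = ([1, 2, 3], [6, 7, 8])   # small x, large y
group_b = ([5, 6, 7], [1, 2, 3])   # large x, small y

r_a = pearson(*group_a)
r_b = pearson(*group_b)

# Pool both groups and ignore the grouping variable.
pooled_x = group_a[0] + group_b[0]
pooled_y = group_a[1] + group_b[1]
r_pooled = pearson(pooled_x, pooled_y)

print(f"group A: r = {r_a:+.3f}")
print(f"group B: r = {r_b:+.3f}")
print(f"pooled:  r = {r_pooled:+.3f}")
```

The sign flip comes entirely from the offset between the two groups, which is why a scatterplot coded by group makes the situation easy to spot.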

Love the message, hate the graphic

Meg McLain tells a great story about the relative risk of being killed by terrorists in the US. Unfortunately, she comes up with this baffling graphic, which appears to use the sort of number scales beloved of President Obama’s budget speechwriters: Sure, there’s a scale problem when the multipliers range from 6 to 17,600, but generations of scientists and engineers have handled that with a logarithmic scale: This still doesn’t give the compressed range that MM’s chart shows. Aha! Perhaps she’s using the little-known log-log scale (beloved by statisticians who deal with generalized linear models); let’s see: Pretty close. But how … Continue reading Love the message, hate the graphic
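To put rough numbers on the compression, here is a small Python sketch using only the two extremes quoted above (6 and 17,600). It assumes that “log-log” means taking the base-10 logarithm twice; that reading is a guess, and the excerpt does not say which transform the original charts used.

```python
import math

# The two extremes quoted in the post: multipliers range from 6 to 17,600.
low, high = 6, 17_600

def log_scale(x):
    """Ordinary base-10 logarithmic scale."""
    return math.log10(x)

def log_log_scale(x):
    """Assumed meaning of 'log-log': the logarithm applied twice."""
    return math.log10(math.log10(x))

transforms = {
    "linear": float,
    "log10": log_scale,
    "log10(log10)": log_log_scale,
}

# Width of the axis needed to show both extremes under each transform.
spans = {name: f(high) - f(low) for name, f in transforms.items()}

for name, span in spans.items():
    print(f"{name:>13}: span = {span:.3f}")
```

Each extra logarithm squeezes the gap between the smallest and largest bar, which fits the remark that a single log scale still leaves the range wider than the one in the chart.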

Bump charts get renamed as SLOPEGRAPHS

Charlie Park has a nice post describing Tufte’s slopegraphs (old chart, new name). Kaiser Fung likes these a lot; he’s been calling them Bump charts. I introduce these to my undergrads when we discuss the paired t test. Tip from Update (16 July).  James Kierstead publishes an R implementation. Continue reading Bump charts get renamed as SLOPEGRAPHS