Fisher’s iris dataset is the basis for this extended example in the calculation and visualization of correlations. The ggpairs() function gives an impressive coded scatterplot matrix. And an old friend makes a last-minute cameo appearance.
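For readers who want a feel for the approach before opening the tutorial, here is a minimal sketch in R. It is not the tutorial's own code. It assumes the built-in `iris` data frame and, for the plot, that the GGally and ggplot2 packages are installed; colouring by `Species` is an illustrative choice.

```r
# Correlation matrix for the four numeric iris measurements (base R)
num_cols <- iris[, 1:4]
cor_mat  <- cor(num_cols)
print(round(cor_mat, 2))

# Scatterplot matrix coloured by species; skipped if the packages are missing
if (requireNamespace("GGally", quietly = TRUE) &&
    requireNamespace("ggplot2", quietly = TRUE)) {
  p <- GGally::ggpairs(iris, mapping = ggplot2::aes(colour = Species))
  print(p)
}
```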
Update: Dirk Eddelbuettel just released tint 0.0.3 (tint is not Tufte) with some nifty examples. I wanted to try it out, so I’ve updated the example using tint and added two margin plots to illustrate the Simpson’s Paradox situation. Hat tip to R-bloggers.
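The paradox is easy to check numerically. The sketch below assumes the situation in question is the relationship between sepal length and sepal width in `iris`, a common illustration; the tutorial may frame it differently.

```r
# Pooled correlation across all 150 flowers
pooled <- cor(iris$Sepal.Length, iris$Sepal.Width)

# The same correlation computed separately within each species
within <- sapply(split(iris, iris$Species),
                 function(d) cor(d$Sepal.Length, d$Sepal.Width))

print(round(pooled, 2))
print(round(within, 2))
```

The pooled correlation is negative while every within-species correlation is positive, which is the sign reversal that defines Simpson's Paradox.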
This is an old chestnut in Bayesian statistics, using the conjugate beta prior to find a beta posterior distribution for a proportion. If you’re unfamiliar with the calculation of the posterior distribution, there’s a link in the tutorial.
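The conjugate update itself fits in a few lines. In this minimal sketch, the prior parameters and the data are hypothetical values chosen only for illustration; they are not taken from the tutorial. With a Beta(a, b) prior and k successes in n trials, the posterior is Beta(a + k, b + n - k).

```r
# Hypothetical prior and data (assumptions for illustration only)
a <- 2; b <- 2      # Beta(a, b) prior on the proportion
k <- 7; n <- 20     # k successes observed in n trials

# Conjugate update: posterior is Beta(a + k, b + n - k)
a_post <- a + k
b_post <- b + n - k

# Posterior mean and an equal-tailed 95% credible interval
post_mean <- a_post / (a_post + b_post)
cred_int  <- qbeta(c(0.025, 0.975), a_post, b_post)

print(post_mean)
print(cred_int)

# Prior (dashed) and posterior (solid) densities on the same axes
curve(dbeta(x, a_post, b_post), from = 0, to = 1,
      xlab = "proportion", ylab = "density")
curve(dbeta(x, a, b), add = TRUE, lty = 2)
```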
Azzalini and Bowman’s Old Faithful geyser data provides fodder for a lot of data exploration in R (scatterplots, ggplot2, simple regression, k-means clustering, and Markov chain estimation). All the really interesting stuff in the tutorial happens if you click through to Analysis > Models > Standardized Cluster Model. (The standardized clustering approach is not given in the original paper.)
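As a rough idea of what clustering on standardized variables looks like, here is a minimal sketch; it is not the tutorial's model. It assumes R's built-in `faithful` data frame (the version attributed to Azzalini and Bowman ships as `geyser` in the MASS package and has different columns), and the choice of two clusters and the seed are assumptions for illustration.

```r
set.seed(1)  # k-means uses random starts; fix the seed for repeatability

# Standardize both columns so neither dominates the distance calculation
z <- scale(faithful)

# Two clusters is an assumption, matching the two visible groups of eruptions
fit <- kmeans(z, centers = 2, nstart = 20)
print(table(fit$cluster))

# Plot on the original scale, coloured by cluster membership
plot(faithful$eruptions, faithful$waiting, col = fit$cluster, pch = 19,
     xlab = "Eruption duration (min)", ylab = "Waiting time (min)")

# Simple regression of waiting time on eruption duration, for comparison
lm_fit <- lm(waiting ~ eruptions, data = faithful)
print(coef(lm_fit))
```

Standardizing matters here because waiting times are recorded in tens of minutes while eruption durations span only a few minutes, so unscaled Euclidean distances would be driven almost entirely by waiting time.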
After a long, slow start, R is catching on with statisticians and (some) scientists at UTSA. The Biology Department has asked that I use R in teaching biostatistics, and many of the courses for statistics majors are using R rather than SAS (a UTSA tradition). Students have not been idle; the statistics club has asked me to present an occasional series of R tutorials to get their members up to speed. Here are the first two tutorials:
These tutorials are all HTML files, generated with RMarkdown. Students who attend the presentations are also provided with the markdown source files, so they can tweak the code during the presentation.