One of the many nice things about summer is the time and space it allows for blogging. And, after a very stimulating SRTL conference (Statistics Reasoning, Teaching and Learning) in Rotorua, New Zealand, there’s lots to blog about. Let’s begin with a provocative posting by fellow SRTL-er Tim Erickson at his excellent blog A Best Case Scenario. I’ve known Tim for quite a while, and have enjoyed many interesting and challenging discussions.
This last weekend I helped Danny Kaplan and Kathryn Kozak (Coconino Community College) put on a StatPREP workshop. We were also joined by Amelia McNamara (Smith College) and Joe Roith (St. Catherine’s University). The idea behind StatPREP is to work directly with college-level instructors, both online and in community-based workshops, to develop the understanding and skills needed to work and teach with modern data. One of the most interesting aspects of these workshops was the tutorials and exercises that the participants worked on.
Part of the reason why we have been somewhat silent at Citizen Statistician is that it’s DataFest season, and that means a few weeks (months?) of all-consuming organization followed by a weekend of super fun data immersion and exhaustion… Each year that I organize DataFest I tell myself “next year, I’ll do [blah] to make my life easier”. This year I finally did it! Read about how I’ve been streamlining the process of registrations, registration confirmations, and dissemination of information prior to the event in my post titled “Organizing DataFest the tidy way” on the R Views blog.
Last year I was awarded a Project TIER (Teaching Integrity in Empirical Research) fellowship, and last week my work on the fellowship wrapped up with a meeting with the project leads, other fellows from last year, and new fellows for the next year. In a nutshell, Project TIER focuses on reproducibility. Here is a brief summary of the project’s focus from their website: For a number of years, we have been developing a protocol for comprehensively documenting all the steps of data management and analysis that go into an empirical research paper.
A few weeks ago I gave a two-hour Introduction to R workshop for the Master of Engineering Management students at Duke. The session was organized by the student-led Career Development and Alumni Relations committee within this program. The slides for the workshop can be found here and the source code is available on GitHub. Why might this be of interest to you? The materials can give you a sense of what’s feasible to teach in two hours to an audience that is not scared of programming but is new to R.
In one of our previous posts (Halloween: An Excuse for Plotting with Icons), we gave a quick tutorial on how to plot with icons in ggplot. A reader, Dr. D. K. Samuel, asked in a comment how to use multiple icons. His comment read,
...can you make a blog post on using multiple icons for such data
year,crop,yield
1995,Tomato,250
1995,Apple,300
1995,Orange,500
2000,Tomato,600
2000,Apple,800
2000,Orange,900
it will be nice to use icons for each data point.
In my course on the GLM, we are discussing residual plots this week. Given that it is also Halloween this Saturday, it seems like a perfect time to code up a residual plot made of ghosts. The process I used to create this plot is as follows: Find an icon that you want to use in place of the points on your scatterplot (or dot plot). I used a ghost icon (created by Andrea Mazzini) obtained from The Noun Project.