For some reason I was compelled to update my Mac’s OS and R on the same day. (I know…) It didn’t go well on several counts, and I mostly blame Apple. Here are the details: I updated R to version 3.0.2 “Frisbee Sailing” and my OS to 10.9 “Mavericks.” When I went to use R, things were going fine until I mistyped a command.

Continue reading

Daniel Kaplan and Libby Shoop have developed a one-credit class called Data Computation Fundamentals, which was offered this semester at Macalester College. This course is part of a larger research and teaching effort funded by Howard Hughes Medical Institute (HHMI) to help students understand the fundamentals and structures of data, especially big data. [Read more about the project in Macalester Magazine.] The course introduces students to R and covers topics such as merging data sources, data formatting and cleaning, clustering and text mining.

Continue reading

I’m often on the hunt for datasets that will not only work well with the material we’re covering in class, but will (hopefully) pique students' interest. One sure choice is to use data collected from the students, as it is easy to engage them with data about themselves. However, I think it is also important to open their eyes to the vast amount of data collected and made available to the public.

Continue reading

It is time for the NCAA Basketball Tournament. Sixty-four teams dream big (er…I mean 68…well actually by now, 64) and schools like Iona and Florida Gulf Coast University (go Eagles!) are hoping that Robert Morris's astounding victory in the N.I.T. isn’t just a flash in the pan. My favorite part is filling out the bracket–see it below. (Imagine that…a statistician’s favorite part of the whole thing is making predictions.) Even President Obama filled out a bracket [see it here].

Continue reading

ggplot ggoldy

One of my graduate students worked some ggplot magic and created an almost Lite-Brite-esque plot of our very own Goldy Gopher. She also, thoughtfully, published a tutorial on her blog. Read and enjoy! [visit Rita’s blog here]

Continue reading

Happy Birthday Florence Henderson

As a celebration of Florence Henderson’s 79th birthday (on February 14), I have created this scatterplot to use in my regression course. The plot depicts the relationship between time spent on mathematics homework outside of school (expressed as z-scores) and mathematics achievement scores (expressed as T-scores, M=50, SD=10) for 200 8th-graders taken from the 1988 National Education Longitudinal Study. The color–in a display of very poor data science–is just randomly applied to the observations rather than meaning anything substantial.
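
For readers who haven't met the two scales mentioned above, here is a minimal sketch of how they relate. It is written in Python rather than R purely for illustration, the numbers are made up (they are not the NELS:88 data), and the function names `to_z_scores` and `z_to_t` are hypothetical. A z-score is a value's distance from the mean in standard deviation units, and a T-score rescales it as T = 50 + 10z.

```python
from statistics import mean, pstdev


def to_z_scores(values):
    """Standardize values to mean 0 and SD 1 (using the population SD)."""
    m = mean(values)
    s = pstdev(values)
    return [(v - m) / s for v in values]


def z_to_t(z_scores, t_mean=50.0, t_sd=10.0):
    """Rescale z-scores to T-scores: T = t_mean + t_sd * z."""
    return [t_mean + t_sd * z for z in z_scores]


# Made-up raw scores for illustration only (NOT the NELS:88 data).
raw = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

z = to_z_scores(raw)  # mean 0, SD 1
t = z_to_t(z)         # mean 50, SD 10

for r, zi, ti in zip(raw, z, t):
    print(f"raw={r:4.1f}  z={zi:5.2f}  T={ti:5.1f}")
```

Using the population SD here is a choice made for the sketch; standardizing with the sample SD would shift the values slightly.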

Continue reading

ggplot2 Pinterest

I don’t understand the website Pinterest, but it looks pretty (especially on the iPad), and an undergraduate student said it was the greatest thing since Facebook, so I thought I would give it a shot. The idea is that Pinterest “lets you organize and share all the beautiful things you find on the web.” You organize beautiful things by creating a “board” (a page), and then adding “pins” (links to websites).

Continue reading

Citizen Statistician

Learning to swim in the data deluge