With fewer than two weeks left until the US presidential election, motivating class discussion with data related to the candidates, elections, or politics in general is quite easy. So for yesterday’s lab we used data released by the Federal Election Commission on contributions made to the 2012 presidential campaigns. I came across the data last week, via a post on The Guardian Datablog. The post has a nice interactive feature for analyzing data from all contributions.

Continue reading

Citizen Scientists in Space...

The L.A. Times had an interesting article about how a pair of ‘citizen scientists’ discovered a planet with four suns. I would say that a more accurate term for the pair would be ‘citizen data miners’, because essentially the astronomy community crowdsources data mining by providing reams of data for anyone to examine. The article seemed timely to me, following a seminar at the UCLA Center for Applied Statistics by Kiri Wagstaff on automated procedures for discovering interesting features in large data sets.

Continue reading

The Current Population Survey (CPS) is a statistical survey conducted by the United States Census Bureau for the Bureau of Labor Statistics. The data collected are used to provide a monthly report on employment in the United States. Although the CPS data are available, to this point they have really only been easy to deal with for SPSS, Stata, or SAS users. A new blog is now making it easy for R users, too, to obtain and analyze these data.

Continue reading

More Fitbit

Simply Statistics lists some data analysis projects. They skew towards the intermediate rather than the novice student, but are still useful in many ways. And some FitBit ideas, too! http://simplystatistics.org/post/32881133740/statistics-project-ideas-for-students-part-2

Continue reading

TV Show hosts

A little while ago [July 19, 2012, so I’m a little behind], the L.A. Times ran an article about whether TV hosts are pulling their own weight, salary-wise. (What is the real value of TV stars and personalities?) I took their data table, put it in CSV format, and added a column called “epynomious”, which indicates whether the show is named after the host. (This apparently doesn’t explain the salary variation.)

Continue reading

One of the many blogs that I read suggested using Baltimore’s parking citation data to see if some makes/models of cars get citations more than others. Now, parking citations are very near and dear to me, since I get at least one (n ≥ 1) parking citation a year parking near the University of Minnesota, which most often also leads to my car being towed, since you only have so many hours to move your car after they ticket it.

Continue reading

Citizen Statistician

Learning to swim in the data deluge