My Year of Reading in Review

Two years ago, I made a New Year’s Resolution to read more books. At that point I joined GoodReads to hold myself accountable. I read 47 books that year (at least that I recorded). In 2012, I didn’t re-make that resolution, and my reading productivity dropped to 29 (really 26 since I quit reading 3 books). While the number of books is lower, I did some minor analyses on these books based on data I scraped from GoodReads and Amazon.

Big Data for Little People

On New Year’s Eve, All Tech Considered had a segment looking ahead to interesting technology in the coming year. One of the themes was Big Data, but I particularly liked the way they sold it: “Big Data For Little People.” The basic idea is that much of big data is owned by big companies, which crunch the data for their own purposes. But the NPR folks are seeing a trend toward applications that crunch big data and bring the results to your smartphone or other app.

Gun deaths and data

This article at Slate is interesting for a number of reasons. First, it offers a link to a data set listing names and data of the 325 people known to have been killed by guns since December 14, 2012. Slate is to be congratulated for providing data in a format that is easy for statistical software to read. (Still, some cleaning required. For example, ages include a mix of numbers and categorical values.
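
As a rough illustration of the kind of cleaning a mixed age column might need, here is a minimal Python sketch. It assumes ages arrive as text, and the sample values (including the labels "Adult" and "Teen") are invented for illustration rather than taken from the Slate file. The function name `split_age` is likewise hypothetical.

```python
from typing import Optional, Tuple


def split_age(raw: str) -> Tuple[Optional[int], Optional[str]]:
    """Split one raw age entry into (numeric_age, category).

    Exactly one of the two is filled in; both are None for a blank entry.
    """
    text = raw.strip()
    if not text:
        return None, None        # missing value
    if text.isdecimal():
        return int(text), None   # a plain number such as "34"
    return None, text            # anything else is kept as a category label


# Invented sample values, not real records from the data set.
raw_ages = ["34", "Adult", " 17 ", "", "Teen"]
cleaned = [split_age(a) for a in raw_ages]
```

Keeping the number and the label in separate columns means the numeric ages can be summarized directly while the categorical entries remain available for a separate tally.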

Data Privacy for Kids

The L.A. Times ran an interesting article about the new Federal Trade Commission report, “Mobile Apps for Kids: Disclosures Still Not Making the Grade,” which followed up on a February 2012 report and concluded that “Yes, many apps included interactive features or shared kids’ information with third parties without disclosing these practices to parents.” I think this issue is intriguing on many levels, but of central concern is the fact that as we go about our daily business (or play, as the case may be), we leave a data trail, sometimes unwittingly.

One of the themes of this blog is to make statistics relevant and exciting to students by helping them understand the data that’s right under their noses. Or inside their ears. The iTunes library is a great place to start. For a while, iTunes made it easy to get your data onto your hard drive in a convenient, analysis-ready form. Then they made it hard. Then (10.7) they made it easy again.

Big Data and Privacy

The L.A. Times today (Monday, November 19) ran an editorial about the benefits and costs of Big Data. I truly believe that statisticians should teach introductory students (and all students, really) about data privacy. But who feels they have a realistic handle on the nature of these threats and the size of the risk? I know I don’t. Does anyone teach this in their class? Let’s hear about it! In the meantime, you might enjoy reading (or re-reading) a classic on the topic by Latanya Sweeney: k-Anonymity: a model for protecting privacy.
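
For readers who want a concrete feel for the idea in Sweeney's paper, here is a small Python sketch. A table is k-anonymous when every combination of quasi-identifier values is shared by at least k records. The function name `k_anonymity`, the column names, and the records are all made up for illustration. This is one simple way to measure k, not the method from the paper.

```python
from collections import Counter
from typing import Dict, List, Sequence


def k_anonymity(records: List[Dict[str, object]],
                quasi_identifiers: Sequence[str]) -> int:
    """Return the size of the smallest group of records that share the
    same values on the given quasi-identifier columns (0 if no records)."""
    if not records:
        return 0
    groups = Counter(
        tuple(record[col] for col in quasi_identifiers) for record in records
    )
    return min(groups.values())


# Invented records for illustration only.
records = [
    {"zip": "90024", "birth_year": 1990, "sex": "F"},
    {"zip": "90024", "birth_year": 1990, "sex": "F"},
    {"zip": "90025", "birth_year": 1990, "sex": "F"},
    {"zip": "90025", "birth_year": 1985, "sex": "M"},
    {"zip": "90025", "birth_year": 1985, "sex": "M"},
    {"zip": "90025", "birth_year": 1985, "sex": "M"},
]

k_full = k_anonymity(records, ["zip", "birth_year", "sex"])
k_coarse = k_anonymity(records, ["birth_year", "sex"])
```

In this toy table, one person is unique on ZIP code, birth year, and sex together, so `k_full` is 1. Dropping the ZIP code merges the groups and raises `k_coarse` to 3, which shows the usual trade-off between detail and privacy.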

Data Sets: A List in Flux

After my Pinterest post, I got a little bit hooked, mostly because I realized that it was a visual way for me to see my bookmarks. This makes it easier for me to find the information I am looking for quickly. One problem is that it requires an image, so I quickly realized that the links for data sets wouldn’t work so well on Pinterest. Then I remembered that I have used my personal blog as an organized reminder list (see this post where I remind myself how to re-set features on my computer after disaster), and thought I could do the same here, but with data sets that others could also use.

Citizen Statistician

Learning to swim in the data deluge