We’re currently discussing data visualization in my course, and today’s topic was supposed to be mapping. However, late last night I realized I was going to run out of time and decided to postpone the hands-on mapping exercises until a bit later in the course (after we do some data manipulation, which I think will work better). That said, talking about maps seemed timely, especially with Hurricane Irma developing.


Structuring Data in Middle School

Of the many provocative and exciting discussions at this year’s Statistics Research Teaching and Learning conference in Rotorua, NZ, one that has stuck in my mind is from Lucia Zapata-Cardona, from the Universidad de Antioquia in Colombia. Lucia discussed data from her classroom observations of a teacher at a middle school (ages 12-13) in a “Northwest Colombian city”. The class was exciting for many reasons, but the reason I want to write about it here is that the teacher had the students structure and store their own data.


After reading this review of a Theaster Gates show at Regen Projects in L.A., I hurried to see the show before it closed. Inspired by sociologist and civil rights activist W.E.B. Du Bois, Gates created artistic interpretations of statistical graphics that Du Bois had produced for an exhibition in Paris in 1900. Coincidentally, I had heard about these graphics just the previous week at the Data Science Education Technology conference while eavesdropping on a conversation Andy Zieffler was having with someone else.


On the first day of an intro stats or intro data science course I enjoy giving some accessible real data examples, instead of spending the whole time going over the syllabus (which is necessary in my opinion, but somewhat boring nonetheless). One of my favorite examples is How to Tell Someone’s Age When All You Know Is Her Name from FiveThirtyEight. As an added bonus, you can use this example to get to know some students' names.


The ASA’s most recent curriculum guidelines emphasize the increasing importance of data science, real applications, model diversity, and communication / teamwork in undergraduate education. In an effort to highlight recent efforts inspired by these guidelines, I organized a JSM session titled Doing more with data in and outside the undergraduate classroom. This session featured talks on recent curricular and extra-curricular efforts in this vein, with a particular emphasis on challenging students with real and complex data and data analysis.


Ten years after Ioannidis alleged that most scientific findings are false, reproducibility – or lack thereof – has become a full-blown crisis in science. Flagship journals like Nature and Science have published hand-wringing editorials and revised their policies in the hope of raising standards of reproducibility. In the statistical and data sciences, the barriers to reproducibility are far lower, given that our analyses can usually be digitally encoded (e.g., scripts, algorithms, data files, etc.


A few weeks ago I gave a two-hour Introduction to R workshop for the Master of Engineering Management students at Duke. The session was organized by the student-led Career Development and Alumni Relations committee within this program. The slides for the workshop can be found here and the source code is available on GitHub. Why might this be of interest to you? The materials can give you a sense of what’s feasible to teach in two hours to an audience that is not scared of programming but is new to R.



Citizen Statistician

Learning to swim in the data deluge