A little over a year ago, we decided to propose a data visualization course at the first-year level. We had been thinking about this for a while, but never had the time to teach it given our scheduling constraints. When one of the other departments on campus was shut down and its faculty merged into other departments, we felt that the time was ripe to make this proposal.

One of the many nice things about summer is the time and space it allows for blogging. And, after a very stimulating SRTL conference (Statistical Reasoning, Thinking and Literacy) in Rotorua, New Zealand, there’s lots to blog about. Let’s begin with a provocative posting by fellow SRTL-er Tim Erickson at his excellent blog A Best Case Scenario. I’ve known Tim for quite a while, and have enjoyed many interesting and challenging discussions.

Part of the reason we have been somewhat silent at Citizen Statistician is that it’s DataFest season, and that means a few weeks (months?) of all-consuming organization followed by a weekend of super fun data immersion and exhaustion… Each year that I organize DataFest, I tell myself “next year, I’ll do [blah] to make my life easier”. This year I finally did it! Read about how I’ve been streamlining registrations, registration confirmations, and the dissemination of information prior to the event in my post titled “Organizing DataFest the tidy way” on the R Views blog.

Ten years after Ioannidis alleged that most scientific findings are false, reproducibility – or lack thereof – has become a full-blown crisis in science. Flagship journals like Nature and Science have published hand-wringing editorials and revised their policies in the hopes of heightening standards of reproducibility. In the statistical and data sciences, the barriers to reproducibility are far lower, given that our analysis can usually be digitally encoded (e.g., scripts, algorithms, data files, etc.

Last year I was awarded a Project TIER (Teaching Integrity in Empirical Research) fellowship, and last week my work on the fellowship wrapped up with a meeting with the project leads, other fellows from last year, and the new fellows for next year. In a nutshell, Project TIER focuses on reproducibility. Here is a brief summary of the project’s focus from their website: For a number of years, we have been developing a protocol for comprehensively documenting all the steps of data management and analysis that go into an empirical research paper.

Check out my guest post on the Simulation-based statistical inference blog: “Teaching computation as an argument for simulation-based inference”. If you are interested in teaching simulation-based methods, or if you just want to find out more about why others are, I highly recommend the posts on this blog. The page also hosts many other useful resources, as well as information on upcoming workshops.

A few weeks ago I gave a two-hour Introduction to R workshop for the Master of Engineering Management students at Duke. The session was organized by the student-led Career Development and Alumni Relations committee within this program. The slides for the workshop can be found here and the source code is available on GitHub. Why might this be of interest to you? The materials can give you a sense of what’s feasible to teach in two hours to an audience that is not scared of programming but is new to R.

Citizen Statistician

Learning to swim in the data deluge