
Do Political Scientists Care About Effect Sizes: Replication and Type M Errors

Reproducibility has come a long way in political science. Many major journals now require that replication materials be made available, either on their websites or through a service such as the Dataverse Network. Most of the top journals in political science have formally committed to reproducible research best practices by signing up to the Data Access and Research Transparency (DA-RT) Joint Statement.

This is certainly progress. But what are political scientists actually supposed to do with this new information? Data and code availability does help avoid duplicated effort: researchers don't need to gather data or program statistical procedures that have already been gathered or programmed. It promotes better research habits. It definitely provides "procedural oversight". We would be highly suspicious of results from authors who were unable or unwilling to produce their code/data.

However, there are lots of problems that data/code availability requirements do not address. Apart from a few journals like Political Science Research and Methods, most journals have no standing policy to check the replication materials' veracity. Reviewers rarely have access to a manuscript's code/data. Even if they did, few reviewers would be willing or able to undertake the time-consuming task of reviewing this material.

Do political science journals care about coding and data errors?

What do we do if someone replicating published research finds clear data or coding errors that have biased the published estimates?

Note that I'm limiting the discussion here to honest mistakes, not active attempts to deceive. We all make these mistakes. To keep it simple, I'm also only talking about clear, knowable, and non-causal coding and data errors.

Probably the most responsible action a journal could take, when clear-cut coding/data errors are found to have biased the results, would be to attach a note detailing the bias directly to the original article. This way readers will always be aware of the correction and will have the best information possible. This is a more efficient way of getting corrected information out than relying on some probabilistic process where readers may or may not stumble upon the information posted elsewhere.

As far as I know, however, no political science journal has a written procedure (please correct me if I'm wrong) for dealing with this new information. My sense is that there are a series of ad hoc responses that closely correspond to how the bias affects the results:

Statistical significance

The situation where a journal is most likely to do anything is when correcting the bias makes the results no longer statistically significant. This might get a journal to append a note to the original article. But maybe not; they could just ignore it.

Sign

It might be that once the coding/data bias is corrected, the sign of an estimated effect flips. The published estimate would then be an instance of what Andrew Gelman calls a Type S (sign) error. I really have no idea what a journal would do in this situation. They might append a note or maybe not.

Magnitude

Perhaps the most likely outcome of correcting honest coding/data bias is that the effect size changes. The published estimate would then be an instance of what Gelman calls a Type M (magnitude) error. My sense (and experience) is that, in a context where novelty is greatly privileged over facts, journal editors will almost certainly ignore this new information. It will be buried.
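
To make the three cases concrete, here is a minimal sketch in Python of how a replicator might compare a published estimate with a corrected one. Everything in it is an assumption made for illustration: the toy data, the hypothetical helper names `ols_slope` and `compare_estimates`, and the rough 1.96 normal-approximation cutoff for significance. It is not a procedure that any journal uses.

```python
import math


def ols_slope(x, y):
    """Return (slope, standard error) from a bivariate OLS regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares, used for the slope's standard error.
    ssr = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(ssr / (n - 2) / sxx)
    return slope, se


def compare_estimates(published, corrected, critical=1.96):
    """Compare two (estimate, standard error) pairs on the three criteria.

    `critical` is an assumed cutoff (normal approximation), not a rule
    taken from any journal.
    """
    pub_est, pub_se = published
    cor_est, cor_se = corrected
    pub_sig = abs(pub_est) > critical * pub_se
    cor_sig = abs(cor_est) > critical * cor_se
    return {
        "significance_changed": pub_sig != cor_sig,
        "sign_flipped": pub_est * cor_est < 0,
        # Values far from 1 mean the published magnitude was badly off.
        "magnitude_ratio": abs(pub_est) / abs(cor_est) if cor_est != 0 else math.inf,
    }


# Made-up toy data: the "published" analysis contains a data-entry typo
# (10.1 entered as 1.01); the "corrected" analysis fixes it.
x = [1, 2, 3, 4, 5]
y_clean = [2.1, 3.9, 6.2, 7.8, 10.1]
y_typo = [2.1, 3.9, 6.2, 7.8, 1.01]

published = ols_slope(x, y_typo)
corrected = ols_slope(x, y_clean)
print(compare_estimates(published, corrected))
```

In this toy example a single misplaced decimal point shrinks the slope from about 1.99 to about 0.17 without flipping its sign, which is exactly the kind of magnitude-only change that tends to go unreported.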

Do political scientists care about effect size?

Due to the complexity of what political scientists study, we rarely (perhaps with the exception of specific topics like election forecasting) think that we are very close to estimating a given effect's real magnitude. Most researchers are aiming for statistical significance and a sign that matches their theory.

Does this mean that we don't care about trying to estimate magnitudes as closely as possible?

Looking at political science practice pre-publication, there is a lot of evidence that we do care about Type M errors. Considerable effort is given to finding new estimation methods that produce less biased results. Questions of omitted variable bias are very common at research seminars and in journal reviews. Most researchers do carefully build their data sets and code to minimise coding/data bias. Many of these efforts are focused on the headline stuff: whether or not a given effect is significant and what the direction of the effect is. But these efforts are also part of a desire to make the most accurate possible estimate of an effect.

However, the review process and journals' responses to finding Type M errors caused by honest coding/data errors in published findings suggest that perhaps we don't care about effect size. Reviewers almost never look at code and data. Journals (as far as I know, please correct me if I'm wrong) never append information on replications that find Type M errors to original papers.

Prescription

I have a simple prescription for demonstrating that we actually care about estimating accurate effect sizes:

Develop a standard practice of including a short, authored write-up of the data/code bias, with corrected results, in the original article's supplementary materials. Append a notice to the article pointing to this.

Doing this would not only give readers more accurate effect size estimates, but also make replication materials more useful.

Standardising the practice of publishing authored notes will incentivise people to use replication materials, find errors, and publicly correct them.
