For a recent paper on Federal Reserve inflation forecast errors (summary blog post, paper), I wanted an easy way to compare the coefficients for a set of covariates estimated (a) from different types of parametric models and (b) using matched and non-matched data.
I guess the most basic way to do this would be a table with columns showing the point estimates and confidence intervals from each estimation model. But making meaningful comparisons with this type of table would be tedious.
What I ended up doing was creating a kind of stacked caterpillar plot. Here it is:
I think this plot lets you clearly and quickly compare the confidence intervals estimated from the different models. I didn't include the coefficient point estimates because I was most interested in comparing the ranges. The dots added too much clutter.
I have a link to the full replication code at the end of the post, but these are the basic steps:
I used the confint command to find the 95% confidence intervals.
I did some cleaning up and rearranging of the confidence intervals, mostly using Hadley Wickham's melt function in the reshape package. The basic idea is that to create the plots I needed a data set with columns for the coefficient names, the upper and lower confidence interval bounds, what parametric model the estimates are from, and whether the data set was matched or not. I removed the sigma2 estimates for simplicity.
I made the graph using ggplot2. The key aesthetic decisions that I think make it easier to read are: (a) making the lines a bit thicker and (b) making the bands transparent. I liked making the bands transparent and stacking them, rather than showing different lines for each set of estimates, because this halved the number of lines in the plot. That makes it much crisper.
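To give a feel for these steps without the full replication files, here is a minimal sketch. It is not the code from the paper: the data are simulated, simple lm() fits stand in for the actual models, the "matched" data set is just a subset of rows, and names such as tidy_ci and cis are made up for illustration. It also builds the plotting data frame with base R rather than melt, and assumes ggplot2 3.4.0 or later for the linewidth argument.

```r
# Illustrative sketch only: simulated data and lm() fits stand in for the
# paper's models. All object and function names here are hypothetical.
library(ggplot2)

set.seed(42)
n <- 200
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$y <- 0.5 * dat$x1 - 0.3 * dat$x2 + rnorm(n)

# Stand-in for a matched data set: simply the first 100 rows
matched <- dat[1:100, ]

# Step 1 and 2: get 95% confidence intervals with confint() and arrange them
# into one row per coefficient, labelled by model and data type
tidy_ci <- function(fit, model, data_type) {
  ci <- confint(fit, level = 0.95)
  data.frame(
    term = rownames(ci),
    lower = ci[, 1],
    upper = ci[, 2],
    model = model,
    data_type = data_type,
    row.names = NULL
  )
}

cis <- rbind(
  tidy_ci(lm(y ~ x1 + x2, data = dat), "Linear model", "Not matched"),
  tidy_ci(lm(y ~ x1 + x2, data = matched), "Linear model", "Matched")
)

# Drop estimates that are not of interest (here, the intercept)
cis <- cis[cis$term != "(Intercept)", ]

# Step 3: thick, transparent bands drawn at the same position so that the
# matched and non-matched intervals stack on top of each other
p <- ggplot(cis, aes(x = term, ymin = lower, ymax = upper, colour = data_type)) +
  geom_linerange(linewidth = 3, alpha = 0.4) +
  geom_hline(yintercept = 0, linetype = "dashed") +
  facet_grid(. ~ model) +
  coord_flip() +
  labs(x = NULL, y = "95% confidence interval", colour = NULL) +
  theme_bw()

print(p)
```

Intervals from further models could be added as extra rbind rows with a different model label, which would give each model its own panel. Whether to separate models by panel or by colour is a design choice in this sketch, not necessarily what the original figure does.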
The full code for replicating this figure is on GitHub. Note: this code depends on objects created by analyses run using other source code files, which are also on the GitHub site.