Posts

Standardise Country Names For Stata Data

If you regularly put together data sets for cross-country analysis, you probably know that it's a real pain to standardise country names so that you can merge files from different sources. For example, say you want to merge two data sets, A and B. In data set A the country Bosnia and Herzegovina is referred to as "Bosnia-Hertz" and in B it is called "Bosnia-Herzegovina". To merge them into one file that you can use for data analysis you have to find this discrepancy and then change at least one of the names so that both are the same. This is really tedious to do across multiple data sets with tens or hundreds of countries. Over the years I've created a Stata do-file that standardises country names and attaches their IMF country codes. You can find the file here. Obviously, it only standardises country name variations that I've come across. An easy way to check if a country name has not been standardised is to see if the do...
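The idea of mapping name variants onto one standard name can be sketched in a few lines. This is a minimal Python illustration, not the Stata do-file itself: the `VARIANTS` table and the `standardise` function are hypothetical names, the two Bosnia spellings come from the example above, and the IMF code shown should be checked against the IMF's own list before use.

```python
# Hypothetical lookup table: lower-cased name variant -> (standard name, IMF code).
# Entries are illustrative only; verify any code against the IMF's published list.
VARIANTS = {
    "bosnia-hertz": ("Bosnia and Herzegovina", 963),
    "bosnia-herzegovina": ("Bosnia and Herzegovina", 963),
    "bosnia and herzegovina": ("Bosnia and Herzegovina", 963),
}


def standardise(names):
    """Return (rows, unmatched) for an iterable of raw country names.

    Each row is (raw name, standard name, IMF code); names that are not in
    the table get None for both and are also collected in `unmatched`.
    """
    rows, unmatched = [], []
    for raw in names:
        key = " ".join(raw.lower().split())  # normalise case and whitespace
        if key in VARIANTS:
            standard, imf_code = VARIANTS[key]
            rows.append((raw, standard, imf_code))
        else:
            rows.append((raw, None, None))
            unmatched.append(raw)
    return rows, unmatched


rows, unmatched = standardise(["Bosnia-Hertz", "Bosnia-Herzegovina", "Atlantis"])
```

Returning the unmatched names separately is a design choice of this sketch: it makes it easy to spot spellings that the table does not cover yet and add them.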

Reproducibility in Research

This post by Mario Pineda-Krch complains about the woeful lack of reproducibility in the computational sciences. This reminded me of Jake Bowers's good piece in the Political Methodologist from earlier this year about how to do reproducible computational political science. The article actually inspired me to completely switch over all of my new writing to Sweave. Sweave allows you to combine your R code and LaTeX documents. If you make your Sweave document and data available to readers they can completely reproduce everything in your article: the models, the tables, the graphs, everything. RStudio makes using Sweave really easy (though I still use a text editor for writing much of the code since RStudio doesn't do spellcheck). Political economy and political science journals don't seem to have been keeping up with these developments. In fact, poli sci journals often require MS Word documents and don't allow you to submit Sweave documents. Few journ...

Scrappy Scrapers

In an earlier post I presented some R code for a basic way of collecting text from websites. This is a good place to start for collecting text for use in text analysis. However, it clearly has some limitations: (1) You need to have all of the URLs already stored in a .csv file. (2) The method of extracting the text from the downloaded HTML code using gsub is a bit imprecise. It doesn't remove the text from common links such as "Home" or "About". Both of these problems can be solved in R with a bit of work. But I think for bigger scraping projects it is probably a good idea to use other languages such as Python or Ruby. ProPublica has an excellent little series on scraping that covers how to gather data from online databases and PDFs. This is a really good public service and enables something sadly unusual in journalism: reproducibility. Their chapter on using Ruby and Nokogiri for scraping the Pfizer's doctor payments disclos...
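As a rough illustration of the second limitation, here is one way to keep link text such as "Home" or "About" out of the extracted text. It is a sketch in Python using the standard library's `html.parser`, not the R code from the earlier post. The class and function names are made up for this example, and it makes the blunt assumption that all text inside `<a>` tags can be dropped.

```python
from html.parser import HTMLParser


class TextWithoutLinks(HTMLParser):
    """Collect page text, skipping anything inside <a>, <script>, or <style>."""

    SKIP = {"a", "script", "style"}

    def __init__(self):
        super().__init__()
        self.depth = 0    # number of skipped elements we are currently inside
        self.chunks = []  # pieces of text that survive the filter

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth == 0 and data.strip():
            self.chunks.append(data.strip())


def text_without_links(html):
    """Return the visible text of an HTML string with link text removed."""
    parser = TextWithoutLinks()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return " ".join(parser.chunks)
```

The obvious cost of this approach is that links inside the body text lose their words too, so it suits pages where most links are navigation.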

Even More Reason To Pay Attention

A few posts back, I discussed how a number of US financial regulators and the Department of Justice seemed to be credibly committing to bad supervision. This is especially worrying given this recent summary of how Dodd-Frank limits the powers of the Fed/Treasury/FDIC to respond to a financial crisis. Though the idea may be to limit moral hazard by credibly committing to not give 2008-style bailouts, I have a hard time believing in this credibility. My initial thought is that no democratically elected government would actually not respond if their economy was collapsing because of a financial crisis. So, if a major crisis hits, these Dodd-Frank provisions will merely slow down the inevitable bailouts (many of the powers can be enacted with congressional approval). There is still moral hazard feeding potential crises, but crisis responses will be slower. As the Economist rightly points out, regulators have an even greater imperative to prevent a crisis. But to do...

Incredible

Just researching the policymaking behind the Irish 2008 "Guarantee Everything" policy and found this nugget. In the one-page statement announcing the plan they cite "international market" turmoil twice as the cause of the 2008 crisis in Ireland (US subprime induced credit crunch -> tightening liquidity markets, yada yada yada). Not once is the massive domestic real estate bubble mentioned! Sure, this doesn't reveal policymakers' total knowledge (they could just not mention the problem, while knowing it exists), but still.

Automated Academics

This WSJ piece on US income gains over the past decade (summary: unless you have a PhD or MD, you didn't have any income gains) got me thinking. I'm actually pretty cautious about that number. I would be more interested in the range of the distribution, since I think the percent change is being pulled up by all of those physics PhDs who went into finance. Then again, considering that over the past few weeks I've been learning how to automate the collection of data that used to be done by people with master's degrees, maybe PhDs are going to be the ones who automate all of the former undergraduate and master's level work out of existence, keeping the productivity gains for ourselves (conditional on the tax structure). (See also Farhad Manjoo's recent series on this issue in Slate.) One thing I gleaned from a talk given by the Governor of the California Board of Education last night was that academics largely don't even need PhDs (at least at all levels e...

World Bank Visualizations with googleVis

Building on the last post: I just put together a short slideshow explaining how to use R to create Google Motion Charts with World Bank data. It uses the packages googleVis and WDI. It mostly builds on the example from Mage's Blog post. I just simplified it with the WDI package (and used variables related to national finance).

Simple Text Web Crawler

I put together a simple web crawler for R. It's useful if you are doing any text analysis and need to make .txt files from webpages. If you have a data frame of URLs it will cycle through them and grab all the websites. It strips out the HTML code. Then it saves each webpage as an individual text file. Thanks also to Rex Douglass. Enjoy (and please feel free to improve).
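For readers who want to see the shape of the loop, here is a rough Python sketch of the same three steps: fetch each URL, strip the HTML, save one text file per page. It is not the R crawler itself. The function names, the `page_<n>.txt` file naming, and the regular-expression stripping are all assumptions made for this example, and the stripping is deliberately crude.

```python
import re
from html import unescape
from pathlib import Path
from urllib.request import urlopen


def strip_html(html):
    """Crudely remove script/style blocks and tags, then collapse whitespace."""
    html = re.sub(r"(?is)<(script|style)\b.*?</\1\s*>", " ", html)
    text = re.sub(r"(?s)<[^>]+>", " ", html)
    return " ".join(unescape(text).split())


def fetch(url):
    """Download a page and decode it as UTF-8, replacing undecodable bytes."""
    with urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(urls, out_dir, fetch_page=fetch):
    """Save the stripped text of each URL as out_dir/page_<n>.txt."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, url in enumerate(urls, start=1):
        path = out / f"page_{i}.txt"
        path.write_text(strip_html(fetch_page(url)), encoding="utf-8")
        paths.append(path)
    return paths
```

Passing the download function in as `fetch_page` is only there so the loop can be tried out on canned HTML without touching the network.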

Recommended -- Mid-October

Here are three articles that I've found pretty interesting over the past few days: Finance: A fairly insightful blog post about the changing view of management, shareholders, and corporate cash. Journalism: The Guardian sticks it to Murdoch, again. Science: This is a great article on symmetry in physics. The highly speculative ending is at the very least fun. I hadn't really known much about symmetry and Group Theory more broadly until reading Alexander Masters' excellent biography of the eccentric mathematician Simon Norton the other day. Also highly recommended.

Real Inflation? (Part 1)

At a recent lunch the conversation turned to how most Americans' real income hasn't changed since the 1970s when we adjust for inflation (see here for some decent graphs). One of the people at the lunch (a person who has written considerably on monetary policy) contested this. His argument is that we are actually very bad at measuring inflation. Prices may rise, but the quality of the goods that we buy is much better now than it was in the seventies. The iPad I buy now is much better than the 1970s TV or radio or all the other things that it replaced in my life and probably cheaper than all of these things combined. On this line of reasoning, inflation is actually overestimated. There is one obvious flaw with this argument: it misses much of the point. If we were really terrible at measuring inflation in this way, then yes, maybe most people's income has actually increased. But the bigger issue is that the top sliver of the income distribution has made steady gains since...

Berlin Elections Update

A post-election update to my previous post: The Pirate Party did surprisingly well, taking 8.9% of the vote and gaining 15 seats. Die Linke is no longer a contender to be part of the governing coalition. Basically everyone else remained the same from the previous election except the liberal FDP, who were basically wiped out. For full results see this Der Spiegel graphic:

Berlin Election Posters

For my political science readers out there, I've put together a short slide show of election poster photos I've taken recently (scroll down and click). These are for the upcoming Berlin state elections (Battle for the Rotes Rathaus). A few things I thought were interesting: The top poster in the first photo is for current SPD mayor Klaus Wowereit. The SPD is a social democratic party, in Berlin it is in coalition with Die Linke ("The Left" party created from pieces of the old East German Socialist Unity Party), and Berlin votes pretty strongly for left parties, yet they only have a small blob of red on their poster. The major colour is blue? I asked a German coworker (and a researcher for an SPD MP) about this. She explained "they chose blue because they want to look cool". Oddly, the centre-right CDU (bottom picture 3 and 5) use red for their letters, but stick to blue otherwise. The second poster is for a Pirate Party cand...

US Publishing Dominance?

I ran across this data on science publications by country from the World Bank. Some quick thoughts: It seems that the EU, contrary to popular wisdom, has maintained a slight lead over the US as the academic science publishing centre for a bit more than a decade. Of course the US (pop. ~307 million) is still publishing above its population-adjusted weight relative to the EU (pop. ~501 million). However, assuming that universities are places where resources are transferred from teaching (i.e. students) to research and given the incredible rise in US student debt (see my previous post) I would have expected to see a larger increase in US publications because presumably US universities would have more resources. Of course there are many different reasons that student debt can increase without an increase in university resources, but an essentially flat absolute number of publications over the entire period is kind of strange. Finally, what countries are pro...

Bank of Korea MPC Diorama

For anyone interested in central banking, you might find this photo amusing. I recently took it at the Bank of Korea Museum (website includes virtual 3D tour for those not traveling to Seoul anytime soon). It shows the BoK's Monetary Policy Committee. I'm not sure how you would code this for a project on central bank transparency. Also, I'm not sure which of the puppets is supposed to be the Ministry of Strategy and Finance's "observer". Final note: this reminded me of a perhaps excruciating plan for a multi-holiday theme: central bank museum tours. Then again, the museums are usually free, the Bank of England Museum is actually pretty interesting, and it can't be worse than touring all of the major league baseball stadiums.

Bubbles, Bubbles, Bubbles

This graph from The Atlantic's Daniel Indiviglio is pretty astounding (click to enlarge). A good deal has already been written about it, particularly the fact that the growth in student debt exceeded the growth in other debt during what is now known to have been a massive housing bubble. I don't want to double up on too much of what others have written, but I had some thoughts. The graph (and the article) (and current students) make a compelling case that we are in a student loan bubble. So, it might be sensible, given the obviously high cost of the previous bubble, to begin to at least draw down the growth in student debt. This likely means some combination of (a) reducing government support for student loans (removing guarantees, tightening lending standards, etc.), (b) providing grants to students, (c) instituting some sort of price controls on universities (tuition caps). All three of these measures are politically difficult: Reducing Loan Support: Much of the increa...

Rebuilding Haiti?

After some weeks of teaching a summer course at Peking University, I'm back at the blog. (For the curious, virtually all Google-hosted sites are blocked in China, including this one.) I had intended to start off the new series of posts with something on European bank guarantees, perhaps inspired by this FT editorial. However, before getting to that I just wanted to point your attention to Janet Reitman's Rolling Stone article on reconstruction, or the lack thereof, in Haiti. (Coincidentally, this is the second Rolling Stone article this week suggested by longform.org that I've liked. The other was Matt Taibbi's piece on how the SEC disposes of evidence from preliminary investigations. I would blog about that article too but I think I need to just sit down and write the "credibly committing to bad information" paper that I've already mentioned.) I actually don't have too much to add to the discussion, especially considering Felix Salmon's ni...

Who Will Be Telling the Truth: Greece, the ESRB, or National Regulators?

Recently the EU set up the European Systemic Risk Board (ESRB). Ok, this is kind of old news (the enacting legislation went into effect in December 2010). Why am I writing about it now? Well, just the other day EU leaders put together another rescue package that included guarantees and loss-sharing with banks (a partial default). In all of the discussion surrounding the future shape of a sustainable system of EU government financing (here or here for example) there has been little discussion of the need for good information about what is really going on. In research I'm currently putting together (and mentioned in previous posts) I've found that in order for policymakers to actually choose the level of bank (and I suppose government debt) guarantees that they want, they need good information about economic fundamentals (not a huge surprise). But there is a good chance that the ESRB won't be able to give good information. Or more precisely, any information they give wi...

Fake Apple Store, Real Hysteria.

The NY Times website recently published a story about "The Rise of the Fake Apple Store". Um, there are "fake" Apple Stores everywhere, including in the US. There is even a "fake" store up the street from my Dad's house in Erie, Pennsylvania. The real story isn't "Asians are Slavishly Copying American Creativity", but "Local Entrepreneurs Meet Demand for Apple Retail Experience when Apple Doesn't". Basically, even in places where Apple doesn't set up shop, like Erie, PA, Kunming, and Seoul (which I know also has plenty of Apple Store-like stores), there is still a latent demand for well-designed modern places to try and buy Apple products. Lookalike stores are just filling this demand. Since all the ones I've ever been to sell actual Apple products, what is the harm in this? However, the comments on both the NY Times site and at Slate (where it is largely reprinted) have largely picked up the "Slavish Asians...