Using Data, Networks and Complexity to Study Trade, Aid, Economics

I’ve been looking into whether Google’s “search result count” facility can help direct the focus of my research a little, by recording the result count for a series of search strings covering different countries, different years and different development project types (e.g. education, health).

Although the result count is only an estimate (and famously a rather poor one) I think that comparing result counts against each other should be an at-least-reasonable heuristic. I assume that although they’re wrong, they’re not randomly so, and hence a higher result count for one of a set of structurally similar search terms should say something about the number of pages found.

The other assumption that lies behind using result counts in this way is that the amount of ‘stuff’ on the internet is a good measure of how much the English-speaking developed world is interested in a given topic. I assume that if “Kenya development project water” comes up with more results than “Kenya development project malaria” then more English-speaking internet users are ‘interested’ in water projects than in malaria projects. I then make the leap of faith that this implies these projects are happening more. Debatable? Most certainly. I’d be interested to try and defend this against a well-informed doubter. Comments below!

Since this assumption, if true, would hold better in the internet era, I’ve restricted my searching to the years 2000 to 2012. I’ve both included the year in the search term and restricted the results to pages from that year.
Google Custom Range
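The harvest described above amounts to one search string per country, year and project type. Here is a minimal sketch of how that grid of queries could be generated. The template is modelled on the example search term quoted with the graph further down; the function name, the short lists, and the project-type phrases other than “Secondary School” are illustrative assumptions. It only builds the strings and date ranges. Actually fetching the counts from Google is left out.

```python
from itertools import product

# Illustrative inputs: a few of the countries from the post. The project-type
# phrases are assumptions, not necessarily the exact terms used in the harvest.
COUNTRIES = ["Ghana", "Mali", "Mozambique"]
YEARS = range(2000, 2013)  # 2000 to 2012 inclusive
PROJECT_TYPES = ["Secondary School", "Water", "Malaria"]


def build_queries(countries, years, project_types):
    """Yield one record per (country, year, project type) combination."""
    for country, year, project in product(countries, years, project_types):
        yield {
            "country": country,
            "year": year,
            "project_type": project,
            # The year appears in the search string itself...
            "query": f"{country} {year} {project} project development aid",
            # ...and also bounds the custom date range for the results.
            "date_from": f"{year}-01-01",
            "date_to": f"{year}-12-31",
        }


queries = list(build_queries(COUNTRIES, YEARS, PROJECT_TYPES))
print(len(queries))
print(queries[0]["query"])
```

Keeping country, year and project type alongside each query string makes it easy to aggregate the recorded counts later without parsing the string back apart.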
Methodological quibbles (or more) aside, I was impatient to start looking at the results of this “Google-harvest” and have analysed the numbers for a subset of 15 African countries: Burkina Faso, Congo, Egypt, Eritrea, Gambia, Ghana, Guinea, Guinea-Bissau, Liberia, Mali, Mauritania, Morocco, Mozambique, Sierra Leone and Tunisia. (I realise that any search results for Guinea-Bissau will show up in those for Guinea as well. I also realise that “Congo” is two countries. I’ll gloss over these details for now.)

Although this represents only a small number of all the countries in Africa, the results are already worth commenting on. Here’s a graph showing how the total result count across all project types was divided between the project types.

Graph of Google result counts

Graph showing the time trends for Google search results across 15 African countries for different types of development project. Search terms were, for example, “Niger 2001 Secondary School project development aid”
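The shares plotted above are each project type’s result count divided by that year’s total across all project types. A sketch of the calculation follows. The counts are made-up placeholders chosen purely to show the arithmetic (they are not harvested figures), and the function name is my own.

```python
from collections import defaultdict


def shares_by_year(counts):
    """Convert {(year, project_type): result_count} into percentage shares.

    Each share is the project type's count divided by the total count for
    that year across all project types, times 100.
    """
    totals = defaultdict(int)
    for (year, _project), n in counts.items():
        totals[year] += n
    return {
        (year, project): 100.0 * n / totals[year]
        for (year, project), n in counts.items()
        if totals[year] > 0  # skip years with no results at all
    }


# Placeholder counts, NOT real Google results: chosen only to make the
# arithmetic easy to check by eye.
example_counts = {
    (2005, "education"): 200,
    (2005, "water"): 300,
    (2005, "malaria"): 500,
    (2006, "education"): 300,
    (2006, "water"): 300,
    (2006, "malaria"): 400,
}

for key, share in sorted(shares_by_year(example_counts).items()):
    print(key, f"{share:.1f}%")
```

Working in shares cancels out any overall growth in the number of pages from year to year. The flip side is that the shares within a year always sum to 100%, so a rise in one project type must show up as a fall somewhere else, even if the raw counts for the others did not drop.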


The sharp-eyed amongst my readers will have spotted something odd about this graph. In 2006, there are huge spikes in education projects (from 18.3% up to 20.8% of results) and agriculture projects (from 10% to 11.8%) at the apparent expense of water, AIDS and malaria projects. So the mystery is this: what happened in 2006 that led to a huge (but temporary) increase in interest in education and agriculture projects, at the expense of interest in health projects? And will this trend still be visible once all the results are in? Only time will tell…
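To put the 2006 jumps in perspective, it can help to separate the change in percentage points from the relative change. A small sketch using only the two pairs of figures quoted above; the function name is my own.

```python
def share_change(before, after):
    """Return (percentage-point change, relative change in %) between two shares."""
    points = after - before
    relative = 100.0 * points / before
    return points, relative


# Shares quoted in the post for the 2006 spike.
quoted = {"education": (18.3, 20.8), "agriculture": (10.0, 11.8)}

for project, (before, after) in quoted.items():
    points, relative = share_change(before, after)
    print(f"{project}: {points:+.1f} points, {relative:+.1f}% relative")
```

On these figures education gains 2.5 percentage points (about 13.7% in relative terms) and agriculture gains 1.8 points (18% in relative terms), so agriculture’s jump is the larger one relative to where it started.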

p.s. apologies to G.F. Handel for the title of this post.
Interesting econometrics to follow; this is just the before-party.
