After a lot of data cleaning and number crunching, we are able to present the following three maps of the geographies of Wikipedia in the UK using brand new November 2010 data. Looking at the first map (total number of articles in each district), we see some interesting patterns. With a few exceptions, it is rural districts in Scotland, Wales and the North of England that have the highest numbers of articles.
What we're likely picking up on is the fact that large districts simply have more potential stuff to write about. If we normalise the map by area, we see an entirely different pattern. The map below displays the number of articles per square kilometre.
We see that most of the large urban conurbations in the UK are covered by a dense layer of articles. Most sparsely populated areas, in contrast, have a much thinner layer of virtual representation in Wikipedia. There are, however, some notable exceptions. Parts of Cornwall, Somerset and the Isle of Wight all have a denser layer of content than might be expected for such relatively rural parts of the country. On the other hand, one might expect a higher density in the districts surrounding Belfast (in fact, almost all of Northern Ireland is characterised by very low levels of content per square kilometre).
Finally, we can look at the number of articles per person in each district:
Here some more surprising results are visible. All major urban areas have relatively low counts of articles per person (with the exception of central London). In contrast, many rural areas (particularly areas containing national parks) have high counts per person.
There is obviously a range of ways to measure the geographies of Wikipedia in the UK. We see that some areas are blanketed by a highly dense layer of virtual content (e.g. central London and many of the UK's other major conurbations). These maps also highlight the fact that some parts of the UK are characterised by a paucity of content irrespective of the ways in which the data are normalised. Northern Ireland in particular stands out in this respect.
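For readers who want to reproduce this kind of comparison, the three measures behind the maps (total articles, articles per square kilometre, and articles per person) can be sketched roughly as follows. This is only an illustration, not the code we used: the `District` class, the `measures` function and the two example districts are hypothetical, and the figures are made up purely to show how the ranking of a large rural district and a small urban one can flip depending on the normalisation.

```python
from dataclasses import dataclass


@dataclass
class District:
    """Hypothetical record for one district (not the real dataset)."""
    name: str
    articles: int      # number of geotagged articles in the district
    area_km2: float    # district area in square kilometres
    population: int    # resident population


def measures(d: District) -> dict:
    """Return the three measures used in the maps for one district."""
    return {
        "total": d.articles,
        "per_km2": d.articles / d.area_km2,
        "per_1000_people": 1000 * d.articles / d.population,
    }


# Made-up figures, for illustration only.
districts = [
    District("Large rural district", 600, 3000.0, 60_000),
    District("Small urban district", 300, 30.0, 300_000),
]

for d in districts:
    print(d.name, measures(d))
```

With these invented numbers the rural district leads on total articles and on articles per person, while the urban district leads on articles per square kilometre, which is the same kind of reversal visible across the three maps.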
We'll attempt to upload similar analyses of other countries in the next few months. In the meantime, however, we would welcome any thoughts on the uneven amount of virtual representation that blankets the UK.
p.s. many thanks to Adham Tamer for his help with the data extraction.
Very interesting...
Would it be an awkward request to ask whether you could perform the same analysis on France?
Arno,
A French reader