But until #badideaswithfireworks becomes a trending hashtag, we thought we'd use Twitter to explore some of the regional differences that
So in honor of the 4th of July, we selected all geotagged tweets[1] sent within the continental US between June 22 and June 28 (about 10 million in total) and extracted all tweets containing the word "church" (17,686 tweets of which half originated on Sunday) or "beer" (14,405 tweets which are much more evenly distributed throughout the week). See below for more technical details[2] or just go straight to the map below to see the relative distribution of the tweets in the U.S.
Relative Number of Tweets containing the terms "church" or "beer" aggregated to the county level, June 22-28, 2012
Of course, since these are tweets, the content is decidedly less spiritual than one might expect given the focus on beer and church. For example, the most common example of a "church" tweet was simply a report such as "I am at _______ church". More amusing are what we characterize as "competitive church going" when one person replaces another as the Foursquare "mayor" of a church. "I just ousted Jef N. as the mayor of Dallas Bible Church on @foursquare! 4sq.com/5hNW6x"
This of course echoes the Sermon on the Mount and the famous verse, "Blessed are those who check in for they shall inherit the badges of righteousness." Another common category was politically related tweets such as "#ICantDateYou If You Dont Go To Church" or "@____ you're right. It's like separation of church and state. But they really shouldn't be separated. #twitterpolitics".
Given the cultural content of the "church" tweets, the clustering of relatively more "church" than "beer" content in the Southeast relative to the Northeast suggests that this could be a good way to identify the contours of regional difference. To quantify these splits, we ran a Moran's I test for spatial autocorrelation, which proved to be highly significant.[3] Without going into too much detail, this test shows which counties with high numbers of church tweets are surrounded by counties with similar patterns (marked in red) and which counties with many beer tweets are surrounded by like-tweeting counties (marked in blue). Intriguingly, there is a clear regional (largely north-south) split in tweeting topics, which highlights the enduring nature of local cultural practices even when using the latest technologies for communication.
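For readers curious about the mechanics, here is a minimal Python sketch of a global Moran's I with inverse-distance weights and a distance cutoff, plus the quadrant labels (HH, LL, HL, LH) that cluster maps of this kind typically use. It is not the exact procedure behind the maps: the function names, the row-standardized weights, and the toy coordinates and values are all assumptions for illustration, and the significance test is left out.

```python
import math


def idw_weights(points, threshold):
    """Row-standardized inverse-distance weights.

    Pairs farther apart than `threshold` (same units as the coordinates)
    get weight 0. Row-standardization is an assumption of this sketch.
    """
    n = len(points)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            d = math.dist(points[i], points[j])  # Euclidean distance
            if 0 < d <= threshold:
                weights[i][j] = 1.0 / d
        row_sum = sum(weights[i])
        if row_sum > 0:
            weights[i] = [w / row_sum for w in weights[i]]
    return weights


def morans_i(values, weights):
    """Global Moran's I: positive when similar values sit near each other."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # total weight
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(d * d for d in dev)


def quadrant_labels(values, weights):
    """Label each unit by its own value and its neighbors' weighted average.

    HH = high value among high neighbors, LL = low among low,
    HL = high among low, LH = low among high, "--" = no neighbors in range.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    labels = []
    for i in range(n):
        if sum(weights[i]) == 0:
            labels.append("--")
            continue
        lag = sum(weights[i][j] * dev[j] for j in range(n))
        labels.append(("H" if dev[i] > 0 else "L") + ("H" if lag > 0 else "L"))
    return labels


# Toy data: six hypothetical county centroids on a line, with a made-up
# "church share" that is high in the first three and low in the last three.
centroids = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0), (5, 0)]
church_share = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]
w = idw_weights(centroids, threshold=1.5)
print(round(morans_i(church_share, w), 3))
print(quadrant_labels(church_share, w))
```

In this toy setup the high-share units neighbor each other, so the statistic comes out positive; an alternating high/low pattern would push it negative.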
We also note that this map strongly aligns with the famous 'red state'/'blue state' map from the 2000, 2004, and 2008 elections with a strong "religious right" component in the Southeastern United States (see also The Virtual 'Bible Belt') and a more liberal, or at least beer-tweeting, Northeast and upper Midwest (see also The Beer Belly of America).
In any case, happy 4th of July to our American readership. We hope you enjoy your beer in the north, or your church service if you are tweeting from the south.
----------------------
[1] It is important to note that geotagged tweets are somewhat of an oddity among tweets, as only one to three percent of tweets (depending on the country) are geotagged. Still, a small percentage of a very large number (the total number of tweets) results in a LOT of data.
[2] There are a number of technical issues tied to the validity and scale of geography associated with tweets which we won't go into here, but it is worth mentioning that we are NOT using user profile locations. This data is limited to geographic information associated with each tweet, often drawn from a GPS-capable device. While the relevant scale at which analysis can be done differs between tweets, about 90 percent of the tweets in this sample are accurate at the city level or lower, which works well for this analysis.
[3] Based on an IDW matrix for 2.34 decimal degrees (Euclidean distance), this test achieved a z-score of 14.34, implying there is a less than 1% likelihood that this highly clustered pattern could be the result of random chance.
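As a rough check on footnote [3], the sketch below converts a z-score into a two-sided p-value under a standard normal distribution. The two-sided test and the function name are assumptions of this sketch; the post does not say exactly how the software derived its significance level.

```python
import math


def two_sided_p_value(z):
    """Probability of a standard normal value at least as extreme as z."""
    return math.erfc(abs(z) / math.sqrt(2))


# The z-score reported in footnote [3]
p = two_sided_p_value(14.34)
print(p < 0.01)  # → True
```

A z-score of about 2.58 is already enough to cross the 1% line, so 14.34 clears it by a very wide margin.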
1) This is an awesome map
2) Though the data on a regional basis seems to make sense (correlating with red/blue states from elections, the Bible Belt vs. the Beer Belly), on a smaller level it doesn't seem to make much sense. Largely secular places like DC and Philly have "Much more church" whereas my conservative and religious hometown area of Lancaster County, PA has more beer tweets. It just doesn't seem to make sense....
DC and Philly are heavily African-American, are they not? And African-Americans don't tend to be "largely secular" at all...
To follow up on my other reply, it's not just DC and Philly either ... Cook County (Chicago), Cleveland, Cincinnati - all places with a high African-American population and "much more church" tweets.
You can sort of see the "black belt" (running through the Carolinas, central Georgia, central Alabama and curving up into northern Mississippi) show up with a reddish emphasis too.
So that does kind of make sense ... you just have to replace "Beltway pundit" as stereotypical DC resident in your mind's eye with the more demographically correct image of a lower-income black family.
You should also keep in mind that the data reflects those who take to social media. A particular place may be largely secular, or largely religious, but that may not necessarily reflect on those users who are also on twitter, talking about it.
Philly is actually shown there in blue. That big red spot just southwest of Philly is Delaware County. In contrast to Philly, Delco is by vast majority white and a traditional conservative stronghold.
You may be assuming that the population in DC is both homogenous and similar to our stereotype of the DC dweller (beltway types). This group may not be the majority of church-tweeting Twitter users. Additionally, beltway types may self-censor and limit Twitter mentions of beer, while enhancing their public profile with respect to church.
Conversely, in Lancaster County, it may be just the beer types who feel they have to brag about it.
Larry,
But if there is perceived cultural/social pressure regarding the content of tweets, isn't that a pretty good indicator of some kind of cultural gap? Hence it seems your argument reinforces the notions of the cultural contour laid out in the second graph.
Do references to Juke Joints count towards "beer?"
Cool idea, but I've never understood people making maps of things like tweets or internet votes on the US county level. If some of the biggest cities in the US (SF & Dallas) only had ~250 tweets in total, how few did all the hundreds of smaller counties have?? As an example, how many tweets did Allen Parish, Louisiana have? With a population of less than 30,000, I seriously doubt you had enough tweets that you can confidently say that county is predominantly "more beer" than church.
Another similar issue - I don't like quantitative-based choropleth maps of the US by county either. Why should counties that are huge area-wise but tiny population-wise get a bigger representation on the map? For example, why should Loving County, TX, with a population of 82, be shown as 30x as large as New York County (Manhattan), when Manhattan has almost 20,000x the population? I think it would be better to at least show a cartogram as an alternative representation.
Good point about the sample size.
As for the quantitative-choropleth problem, this data is not quantitative, but ratio, as it is number of beer tweets/number of church tweets. Although choropleth maps, as a general rule, are not supposed to be used for count data because they can be misleading exactly for the reason you described (despite the fact that you, an informed viewer of the map, are aware of the problem), this map doesn't fit into that category.
Have a question on the Moran's I test: It seems like the test is fairly useless at the county level once you go west of the Rockies and have counties the size of small states with lower population. Would it be possible to run the same analysis by zip code, or do the tweets only resolve to the county level?
At the zip code level, sample sizes may become too small to statistically infer anything. But I hear you about the Moran's I... the sizes of the counties themselves are spatially autocorrelated. It would be better to use a method that is based on contiguity rather than distances between centroids.
Although it isn't called out in the legend, one might assume the gray color in the Moran's I map is being used for counties with No Data, as in the first map.
However, there seem to be more gray counties in the Moran's map. How come?
What limiting distance did you use for your analysis? If you used ArcMap and left the threshold variable blank, it uses a distance that guarantees each point has a neighbor. This biases the analysis towards clustering (where clustering may not exist).
Data collected at the local (person) level should be analyzed at the local level. Aggregation averages out the variance, which I'm sure you know. If you must aggregate, a smaller areal unit (nearest neighbor distance x nearest neighbor distance?) would provide a better understanding of the clustering.
Also, comparing the tweet locations to all of America, where tweeting doesn't occur, gives a lot of background noise to this analysis. If a tweet never occurs at location X, is it really viable to compare it with locations that do tweet?
Still, I think it's a great concept to identify political/social issues using social media messages and content analysis.
Brilliant post. If I might ask, what program did you use to create the maps? I have some similar data I have been experimenting with mapping on Google Maps, but this format is much more appealing.
How much work at providing context for the quotes was there? Do quotes about Eric Church count? You may have mistaken country music fans for churchgoers.
Hello!
I am the Watercooler editor at Before It's News (beforeitsnews.com). Our site is a rapidly growing people-powered news platform currently serving over 3 million visitors a month. We like to call ourselves the "YouTube of news."
We'd love to republish your RSS feed on our site, with a link back to yours. Our visitors would enjoy your content and getting to know you.
It's a great opportunity to spread the word about your work and reach new fans. Posting on Before It's News is 100% free.
Looking forward to hearing from you!
Best regards,
Sebastian Clouth
SClouth@beforeitsnews.com
Hey, could you guys display this data in an area cartogram?
Hi all,
Thanks for your comments. This is simply an aggregate of tweets that include the words "beer" and "church" normalized to a 0-1 scale without any contextual indicators. So, if somebody tweeted the word "brewski" instead of "beer," the tweet was not counted, and if they used the word "church" to refer to a musician, they were counted for "church." Similarly it's likely that Burlington, Vermont shows more "church" because of tweets referring to the "Church Street Marketplace" in downtown Burlington, having nothing to do with churchgoers.
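To make the counting rule concrete, here is a small Python sketch of context-free keyword matching and a per-county score on a 0-1 scale. The whole-word, case-insensitive match, the choice of church / (church + beer) as the normalization, the function name, and the sample tweets are all assumptions for illustration; the actual processing may have differed.

```python
import re
from collections import Counter

KEYWORDS = ("church", "beer")
# Whole-word, case-insensitive match (an assumption of this sketch)
PATTERNS = {k: re.compile(rf"\b{k}\b", re.IGNORECASE) for k in KEYWORDS}


def church_share_by_county(tweets):
    """Return {county: church / (church + beer)} on a 0-1 scale.

    `tweets` is an iterable of (county, text) pairs. Counties with no
    matching tweets are left out of the result (no data). A tweet that
    contains both words counts once for each.
    """
    counts = {k: Counter() for k in KEYWORDS}
    for county, text in tweets:
        for keyword, pattern in PATTERNS.items():
            if pattern.search(text):
                counts[keyword][county] += 1
    shares = {}
    for county in set(counts["church"]) | set(counts["beer"]):
        church = counts["church"][county]
        beer = counts["beer"][county]
        shares[county] = church / (church + beer)
    return shares


# Made-up tweets: "Church Street" counts as church, "brewski" counts as nothing
sample = [
    ("Chittenden, VT", "Meet me on Church Street"),
    ("Chittenden, VT", "Church Street Marketplace is packed"),
    ("Chittenden, VT", "Great beer at the pub"),
    ("Dallas, TX", "I am at church"),
    ("Travis, TX", "Grabbing a brewski"),
]
print(church_share_by_county(sample))
```

Because nothing here looks at context, a street name or a musician pushes a county toward "church" just as a Sunday service would.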
As for the spatial statistics questions: Moran's I is a less useful measure when there are giant gaps between areal units, so using zip codes would have resulted in many 0-values for zip codes with no tweets. The size of counties is spatially autocorrelated, as is population density in the United States. As there were not very many tweets in the western United States during the one-week sample, limiting analysis to a spatial regime of the east coast would be more explanatory. The Moran's I used a threshold of 2.34 decimal degrees (roughly 75 miles in mid-latitudes) to determine clustering.
ESRI ArcMap, Adobe Illustrator, and GeoDa were used for this analysis.
And yes, we could display this data in a cartogram, thanks for the suggestion @captainentropy.
I hope that helps, please comment if you have more questions!
-Monica
Fantastic stuff, Monica. You and your flock have made some righteous maps.
Good on you for using a week's worth of data. It'd also be interesting to see the temporal component in an animated map. Do Southerners tweet about beer on Friday night, I wonder, or is it all Jack Daniels & coke? Do Bostonians, with so much good beer available, tweet about church, too, on Sundays?
Keep up the good work.
What happened to Hawaii and Alaska?
Twitter is banned there.
So this is up on CNN right now.
http://religion.blogs.cnn.com/2012/07/09/study-people-tweet-more-about-church-than-beer/?hpt=hp_t2
It seems a bit odd that "church," a general term for a place of worship traversing a few religions such that it might be used by non-Christians, is compared to "beer," a specific term for one particular kind of alcohol. Regional variations may persist in types of alcohol and may provide further input in the alcohol versus place of worship mentions - q.v. http://www-958.ibm.com/software/data/cognos/manyeyes/datasets/3871bf4c231d11e095f3000255111976/versions/1 regarding stark differences in beer and wine consumptions by state (of course, consumption may not correlate well with mentions).
Also in some regions brand names for beer may be much higher and may account for some important numbers - e.g., "Bud" may be more common in the south and midwest, for all we know, and might boost "beer" mentions dramatically.
It's a fun little exercise but the terms used are not really directly comparable in their "width" and what we can surmise is therefore rather limited, if at all useful.
Being a lifelong Geographer (U of TN/Knoxville 78) I stumbled across your site and have really enjoyed some of the stuff you have done to make maps fun, which they should be.
What do LL, LH, HL, and HH stand for?
We have a "Churchkey Beer" Microbrewery in Ontario...how would that register?? Cool concept, though...
http://www.churchkeybrewing.com/
The southern beer drinkers must not be using Geotagging because they are afraid of being caught by the church-goers.
► Very interesting study...
We have republished the Merica Dan's post on CNN and redirected this post to our friends. Congratulations!
► http://www.adventistreport.com/2012/07/study-people-tweet-more-about-church.html
As a pastor of a Lutheran "church", and knowing that Lutherans as a whole are OK with beer (hey, Luther's wife Katie ran the brewery), I can't help but wonder...
How many tweets mentioned both church AND beer, if any?
Silly curiosity, at least!
In defense of Dallas....there is a nightclub called the "Church".
The best beers in the world are made by churches!
ReplyDeleteWhat do LL, LH, HL, and HH stand for?
Interesting note: 2 of the most liberal areas in the country on here are red - Burlington, VT area and Boulder, CO area. They both have 'church streets' which are where a ton of bars and restaurants are.
And what about those of us who worship at the church of beer??
I live in Burlington, VT, a town with a lot of college kids and a good number of bars per capita in a state with some good microbrews and a lot of non-religious liberal types. Made me wonder why its county is red.
Then I realized that a lot of these bars in Burlington are on Church Street. ;)
Very nice map and analysis. The only thing I would suggest is that any spatial analysis that uses distance should use a projection with a linear unit (e.g. feet or meters), since, as you note, the ground distance covered by a decimal degree changes with latitude. It probably wouldn't affect these results too much, but it can matter if you are using Euclidean distance in your analysis.
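The latitude effect described here can be illustrated with a short Python sketch that converts a span in decimal degrees into approximate ground distance. It assumes a spherical Earth with a mean radius of 3958.8 miles, and the function name and sample latitudes are made up for illustration; it uses the 2.34-degree threshold quoted in the post.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean radius, spherical approximation


def degree_span_in_miles(span_deg, latitude_deg):
    """Return (north_south, east_west) miles covered by `span_deg` degrees."""
    north_south = math.radians(span_deg) * EARTH_RADIUS_MILES
    # Meridians converge toward the poles, so east-west distance shrinks
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west


for lat in (30, 40, 48):
    ns, ew = degree_span_in_miles(2.34, lat)
    print(f"{lat} deg N: {ns:.0f} mi north-south, {ew:.0f} mi east-west")
```

The north-south figure stays constant while the east-west figure shrinks toward the north, so a fixed threshold in degrees reaches farther in the southern states than along the Canadian border.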
Something interesting...not sure if you have heard this yet, but my county (Chittenden, VT, home of Burlington) has more church mentions because the main drag in town is Church St., not for religious reasons!
I just hope the beer tweets were more for microbrews, IPAs, pilsners, etc rather than the generic types (Bud, Miller, Coors)....that would be progress! - can you do further research and publish that graph please?
I imagine that the results might have been quite different for Napa County, CA if the comparison had been "wine vs. church" instead.