Saturday, May 15, 2010

Kirkuk

Last week I was trolling through the Facebook page of "Geography at West Point" and came across this interesting blog entry from WeatherSealed.com.  The authors took two large samples of Facebook posts, one from the Northeast (New Jersey to Maine) and one from the Deep South (Louisiana to the Carolinas), and extracted the most frequently used words of each sample relative to the other.  The results were then fed into Wordle, a web application that creates word clouds from text samples, to create an image titled "A Tale of Two Regional, Multi-State Areas."  Apparently Yankees like to use the F-bomb and chat about books, while Southerners post about mom, Wal-Mart, and church.
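For anyone curious how "most frequently used words of each sample relative to the other" might be computed, here is a minimal Python sketch. It is only my guess at the general idea, not the WeatherSealed authors' actual method: the function names, the add-one smoothing, and the ratio score are all assumptions made for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def distinctive_words(sample_a, sample_b, top_n=10):
    """Rank words in sample_a by how over-represented they are versus sample_b.

    Assumption: the score is a ratio of smoothed relative frequencies.
    Add-one smoothing keeps words missing from sample_b from dividing by zero.
    """
    counts_a = Counter(tokenize(sample_a))
    counts_b = Counter(tokenize(sample_b))
    total_a = sum(counts_a.values())
    total_b = sum(counts_b.values())
    vocab_size = len(set(counts_a) | set(counts_b))

    scores = {}
    for word, count in counts_a.items():
        rate_a = (count + 1) / (total_a + vocab_size)
        rate_b = (counts_b[word] + 1) / (total_b + vocab_size)
        scores[word] = rate_a / rate_b

    # Highest ratio first; ties are broken alphabetically.
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    return ranked[:top_n]
```

Calling `distinctive_words(northeast_posts, south_posts)` and then swapping the arguments would give one word list per region, each of which could be pasted into a word cloud tool.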

This got me thinking about a similar project I did last year, in which I investigated the international media coverage of the Iraqi city of Kirkuk.  I looked at two sets of newspaper articles, one from USA Today and one from the English-language website of the Turkish newspaper Hurriyet.  I took every on-line article published by the two sources from 2007 to 2008 and analyzed word frequencies using a free piece of software called Yoshikoder, then compared the results.  Going in, I hypothesized that Turkish coverage of Kirkuk would have more negatively connoted words than American coverage, based on the fact that Turkey adamantly opposed the inclusion of Kirkuk in the Kurdish Autonomous Region.  I was surprised to find, however, that USA Today's coverage had fewer positive words and more negative words than the Turkish sample.
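The comparison described above boils down to counting words and checking them against lists of positive and negative terms. Here is a rough Python sketch of that idea. It does not reproduce Yoshikoder or the dictionaries used in the project: the two tiny word lists and the function names are hypothetical, chosen only to show the mechanics.

```python
import re
from collections import Counter

# Hypothetical mini-lexicons, for illustration only. A real analysis
# would use a much larger, validated dictionary.
POSITIVE = {"peace", "agreement", "progress", "stability"}
NEGATIVE = {"attack", "bomb", "violence", "killed"}


def word_frequencies(text):
    """Count how often each lowercase word appears in the text."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def tone_shares(text, positive=POSITIVE, negative=NEGATIVE):
    """Return (positive share, negative share) of all words in the text."""
    freqs = word_frequencies(text)
    total = sum(freqs.values())
    if total == 0:
        return 0.0, 0.0
    pos = sum(freqs[word] for word in positive)
    neg = sum(freqs[word] for word in negative)
    return pos / total, neg / total
```

Running `tone_shares` on the full text of each newspaper's articles gives two pairs of shares that can be compared directly. Using shares rather than raw counts matters because the two samples are unlikely to be the same length.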

Inspired by the WeatherSealed post, I decided to run the data through Wordle.  You can see the results below.  If you're interested in Wordle, you might also have a look at Tagxedo, which allows you to fit your word clouds to predefined shapes.

Word cloud based on USA Today.

Word cloud based on Hurriyet.

1 comment:

soknitpicky said...

Welcome back. That's very cool and a surprising finding on Kirkuk coverage.

Thanks for the link to Tagxedo. I love Wordle and had never seen that one.