A news story that flitted across the headlines earlier this year reported on a study called "The Geography of Happiness," in which researchers in Vermont subjected 10 million geotagged tweets to sentiment analysis (touched on briefly in the Lounge last summer) and correlated their findings successfully with annually surveyed characteristics of people in all 50 US states, including nearly 400 urban populations. Their object was to arrive at a metric for the relative happiness of people in a place. The study is a goldmine for data junkies and I urge anyone with an interest in its findings to look at the whole study (on the link above), not just the digested and spun bits of it that have appeared in the news media. "The Geography of Happiness" breaks new ground in the analysis of digital-age linguistic data, while also raising interesting questions about the limits of obtaining reliable results from algorithm-driven research on big bags of words.

Journalists in search of a story have found news hooks of many shapes and sizes in the study and different aspects of its findings are still being reported in the popular press and mulled over in blogs. Commentators can adjust their focus for close-up or wide-angle because there's material here for everyone. One of the more popular findings of the study was the somewhat coarse-grained happy state/sad state map, reproduced here:

Happier states are shown in red, sadder states in blue, and neutral states in gray. No one is surprised that Hawaii seems to be the happiest state. It also fits a popular stereotype that two states in the Deep South, Louisiana and Mississippi, where quality-of-life indices lag far behind the national averages in many areas, are the least happy.

Anyone who reads the study can nitpick, and if you read pundit blogs or trawl the comments section of any of the online news coverage about the study, this is what you find. Are people in economically depressed Nevada really happy, or are the happy tweets coming from besotted revelers in Las Vegas? Why were tweets in Spanish not surveyed as well as ones in English? Is it really fair to characterize Louisiana as a sad place because of the abundance of profanity that is tweeted there? If wine is considered a happy word, is it any wonder that Napa, California is the happiest city in the country? Does it not imply a strong and unscientific prejudice against fat people that the researchers even looked for a correlation between word use and obesity?

The researchers did not actually "read" 10 million tweets, ponder their meaning, and rank them on a happiness scale. They were looking at words, not sentences. They (or rather, their minions: see below) ranked the happiness quotient of a set of individual words and then measured the relative frequency of happy and sad words across all geographical areas. So, for example, rainbow is a rather happy word and hate is a sad word. The important question that many linguists raise here is this: Is it valid to assess the meanings of words divorced from their context? Breakthrough seems like a pretty happy word on the face of it, but it isn't if you're talking about toilet paper. The researchers' assertion, however, is that when you're looking at millions of words, you can still obtain valid results while ignoring context. Their work follows that of others who have looked to Twitter as the pulse on which to place a finger in order to discover what people are thinking and feeling, as this frequently cited paper from 2010 does.
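For readers who like to see the gears turn, the word-counting idea described above can be sketched in a few lines of Python. This is only an illustration and not the researchers' actual code: the word scores, the 1-to-9 scale, the place names, and the function name `average_happiness` are all assumptions invented for the example. It simply takes a frequency-weighted average of per-word scores and ignores context entirely.

```python
from collections import Counter

# Hypothetical per-word happiness scores on an assumed 1 (sad) to 9 (happy)
# scale. These numbers are invented for illustration, not taken from the study.
WORD_SCORES = {"rainbow": 8.1, "love": 8.4, "beach": 7.9, "hate": 2.3, "never": 3.3}


def average_happiness(text, scores=WORD_SCORES):
    """Return the frequency-weighted mean score of the scored words in `text`.

    Words without a score are ignored, and context plays no part.
    Returns None when the text contains no scored word at all.
    """
    words = text.lower().split()
    counts = Counter(w for w in words if w in scores)
    total = sum(counts.values())
    if total == 0:
        return None
    return sum(scores[w] * n for w, n in counts.items()) / total


# Made-up "tweets" pooled by made-up place.
tweets_by_place = {
    "Place A": "love the beach love the rainbow",
    "Place B": "hate this never again",
}

for place, text in tweets_by_place.items():
    print(place, round(average_happiness(text), 2))
```

Notice that a word like breakthrough would get the same score whether the tweet was about science or toilet paper, which is exactly the objection raised above.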

A core flaw of the happiness study may be its use of Amazon's Mechanical Turk service to arrive at the "average happiness" of each word examined in the study. Mechanical Turk is a massive crowdsourcing/outsourcing labor market in which simple, highly repetitive tasks are performed by Internet-connected workers for small payments. It's interesting to note that a separate, independent study of Mechanical Turk found that

certain homogeneous aspects of the [Mechanical Turk] population, such as education level and nationality, may impose limits on the appropriateness of Turkers as a target community for some interventions or research areas. An awareness of the demographics and behaviors of Mechanical Turk workers is important for understanding the capabilities and potential side effects of using this system.

So in other words, there may be a compromising overlap between the small minority of people who tweet and the even tinier minority of people who work as Mechanical Turks, making "The Geography of Happiness," in a highly reduced view, simply a snapshot of what hip and savvy 20-something Americans think happiness is in relation to themselves and others that they know little about.

One wall that this study comes up against is that happiness is a subjective state, and so attempts to measure it objectively will never rise above all objections. But what must not be overlooked in the study is the huge volume of data that was studied, as well as the great care taken to minimize bias from the peculiarities of measuring characteristics of language when it is separated from almost any meaningful context except its point of origin. The fact that the linguistic analysis correlates so well with survey-based analysis that is also aimed at measuring the elusive quality of happiness is the best validation of the researchers' approach.
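To make the idea of "correlating" the two kinds of measurement concrete, here is a minimal sketch of one common measure, the Pearson correlation coefficient. The choice of Pearson's coefficient, the function name `pearson`, and all of the numbers are assumptions for illustration; the study's own statistics and figures may well differ.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Invented numbers for five imaginary places: a word-based happiness score
# and a survey-based well-being score. Neither column comes from the study.
word_based = [6.2, 6.0, 5.9, 6.1, 5.8]
survey_based = [68.0, 65.5, 64.0, 66.0, 63.5]

print(round(pearson(word_based, survey_based), 3))
```

A value near 1 would mean that places scoring high on one measure tend to score high on the other; a value near 0 would mean the two measures have little to do with each other.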

The study was intriguing to me personally for a couple of reasons. Until very recently I lived for 20 years in Maryland, one of the saddest states according to the study. Now I live in Niwot, Colorado, a very sweet spot that is on an edge of a small triangle formed by three of the happiest cities in the nation, according to "The Geography of Happiness": No. 2 Longmont, No. 9 Lafayette, and No. 12 Boulder, Colorado. This suggests that I should be tripping over my own happy feet every day. Yet more interestingly, my father (may he rest in peace) was born in Shreveport, Louisiana and spent his childhood in Beaumont, Texas: two cities that rank among the five most long-faced in the nation. So of course my first inclination was to consult my own experience to see if it correlated with the findings of the study.

I can report unequivocally that I am happier in Colorado than I was in Maryland, though that is partly due to the fact that I grew up here and now I feel like I am back home after many years away. I can report anecdotally that I am surrounded here by much happier people than I was in Maryland. The default way that you greet a stranger in this part of Colorado is to make eye contact, smile, and say something nice. That rarely happened in Maryland, and it is not the rule in any of the other places I have lived. As for my father's sad legacy: I only visited his neck of the woods once, long after he had died but well within the time frame coincident with the data assessed in "The Geography of Happiness." On leaving there, I will only say that I praised my late grandmother's fortitude and courage for moving her family out of the swamp in the 1940s and bringing it to Colorado!

As a non-tweeter, I have one nagging doubt about "The Geography of Happiness" that I can't address but that I would like to see investigated: how reliably do people actually tweet what they feel? There is an overwhelming pressure in American culture to be upbeat and "accentuate the positive," and I wonder how much this plays into the phenomenon of feelings that get tweeted, as opposed to ones that merely fester.

Orin Hargraves is an independent lexicographer and contributor to numerous dictionaries published in the US, the UK, and Europe. He is also the author of Mighty Fine Words and Smashing Expressions (Oxford), the definitive guide to British and American differences, and Slang Rules! (Merriam-Webster), a practical guide for English learners. In addition to writing the Language Lounge column, Orin also writes for the Macmillan Dictionary Blog.

Comments from our users:

Monday July 1st 2013, 5:26 AM
Comment by: Chandru S. (Chaska, MN)
a more relevant question is 'who' tweets? are the tweeters important to gauge happiness index? do common people tweet? do they have the time or the interest to tweet? do all the regions tweet in equal proportion for a fair study? i feel that a fair amount of subjectivity has gone into the conclusions. in fact this study applied on a global context is bound to throw up much more subjectivity. so, i would take such studies with more than a pinch of salt!
Monday July 1st 2013, 8:19 AM
Comment by: Roberta M. (Redmond, WA)
Grand article, though not in the usual scope of VT offerings.

I wonder about all sorts of things in this. For instance the sentence "The fact that the linguistic analysis correlates so well with survey-based analysis . . . is the best validation of the researchers approach." I'd like to know HOW it correlates and the sourcing of such correlation. Is it known to all in the community but me? I never heard of this stuff before.

The question of the 'averageness' of tweeters is very interesting to me also. I have never tweeted in my life (so far).

I don't trust surveys at all. I NEVER answer them, and there must be many like me. What do we do to the results of a survey? Should we be discounted because we're not 'playing the game', as certain odd results are thrown out of a medical study because they don't reflect the general trend and are seen to skew the result. (Another decision that bothers me.)

Oh well. Much to think about. Thank you, Mr. Hargraves, for letting me know this entire study exists!
Monday July 1st 2013, 9:58 AM
Comment by: laurie M.
Down here in New Orleans, and south of here to the gulf, it wouldn't surprise me if people were unhappy...there have been so many BIG problems that have affected more people per capita, maybe ( my guess) than in many other regions over the last 7 years. Some of us are still trying to get our lives back together after Katrina- yes, the other storm..
Also, to note the demographic changes- we have so many (especially 20-30 something) newcomers to the area who may not be "happy" here due to humidity, termite swarms, flooding streets and all the other everyday and seasonal occurrences we natives take in stride. Oh, and one other idea- who else tweets? Lots of disgruntled high school students- were they still in school when the study was done :) So, some interesting observations- and if "cursing" words counted, I can guarantee that if the New Orleans' Saints football scandal of last season fed into the word list, well, that would be a damn shame.
Monday July 1st 2013, 12:52 PM
Comment by: Susan C.
Very interesting! With a little research, I've learned that the study notes that "Hawaii emerges as the happiest state due to an abundance of relatively happy words such as ‘beach’ and food-related words." Ditto the slant on "wine" as a very happy word and thus skewing the results for Napa Valley. We like our food and libation to come with warm weather and leisure, no doubt.

57% of Mechanical Turkers, who determined the average happiness score for words, live in the US, 32% live in India and the rest are spread out over a handful of other countries. 55% are women vs. 45% men. 40% are ages 18-24; 42% have a bachelor's degree. 27% earn less than $10K USD a year with only 23% earning over $50K/yr. [http://www.ics.uci.edu/~jwross/pubs/SocialCode-2009-01.pdf] So it might be fair to say that a typical Turker might be a young, low-paid female college graduate from the US or India, which definitely affects the happiness word rankings to begin with.

It would be fair to say that many who tweet, including myself, are not reporting on personal states of being at any given moment. Contrary to popular belief, Twitter is not all about what you had for breakfast nor limited to narcissists. Many tweet humour, others news, others report on articles of interest. When you find your tweeps--like-minded people you follow or who follow you--it's like having curators who serve up tailored information based on your personal interests. Sort of like Amazon suggestions or other preference algorithms but you shape your stream yourself based on who you follow.

So, factoring in the "news" element of which there are many accounts and followers who retweet items, I would guess that "earthquake"--which is a maximum unhappiness word--and other natural disasters, which I would put termite swarms into, would play a large part in shaping these happiness profiles.

I also thought this sentence from the study was very interesting:

"Overall, the main factor driving the relative happiness scores for each city appears to be the presence or absence of key words such as ‘lol’, ‘haha’ and its variants, ‘hell’, ‘love’, ‘like’ and the negative words ‘no’, ‘don’t’, ‘never’ and ‘wrong’, as well as profanity."

Again, the presumption of Twitter as a conversation rather than a bulletin board seems to miss how many are using it differently. Of course, hell, I could be wrong for never using LOL in my tweets. Haha.

[Interesting that "hell" is in the happy words and not the unhappy profanity words, don't you think?]
