They Also Serve Who Only Kvetch and Tweet : Language Lounge : Thinkmap Visual Thesaurus

Have you ever sent a really EPIC tweet? There are different ways to answer that question: I'll proceed with one way that probably doesn't occur to you. The EPIC tweets under the microscope here are tweets that are of interest to Project EPIC — that is, Empowering the Public with Information in Crisis. It's a National Science Foundation-funded project, carried out jointly by researchers at the University of Colorado at Boulder (full disclosure: my day-job employer) and the University of California, Irvine.

EPIC is developing computational techniques that use information people generate using social media in disrupted situations. That's not disrupted like when your usual bus is running late this morning, but disrupted like you get when there is a hurricane, tornado, earthquake, explosion, invasion by aliens, or other disruptive emergency.

Ultimately EPIC wants to create a set of applications intended for use by members of the public that will help people pool online information and information sources, whether official ones or unofficial. The best information should enable people to make the best decisions in critical situations, and making optimal decisions for whatever unexpected disaster you find yourself in the middle of might make a big difference — even between life and death.

EPIC depends on two things that are fairly reliable: our communications network, which seems to have enough built-in redundancy to withstand major disruptions; and human nature, which can be depended upon to speak up when something is not right. Twitter, Facebook, and other social media enable everyone to speak with a slightly louder voice than was possible before the Internet.

Lest you start to think that now is the time to raise your hand to be an EPIC first-responder, that job is already taken. It's a job that belongs to everyone — at least, everyone who uses social media — and EPIC is usually not much interested in individual tweets, posts, and status updates; it's interested in dozens or hundreds or thousands of them (geotagged ones are particularly helpful) that can be processed computationally to develop a clear picture of what is actually happening on the ground, drawn by the people who are there. Reliable, peer-generated information sources for EPIC are emergent — you don't know who the EPIC heroes will be until the event happens.

That sounds really cool, you may be thinking, but how is it going to work, and what has it got to do with language? The toolbox for EPIC is mostly a collection of natural language processing (NLP) techniques, some of which we have talked about in the Lounge before. One of the chief tools is sentiment analysis, a widely used NLP technique that attempts to get a grip on what people are thinking and feeling on the basis of what they say on social media. We talked about it a bit in the Lounge from July 2013, in discussing a study that made judgments about the happiest places in the United States.

In a time of crisis or emergency, happy talk will be of little interest to anyone who needs critical information, but the same tools that can detect pockets and patterns of happy talk can also zero in on feelings at the other end of the scale: anxiety, fear, anger, panic, and the like. Every language has an identifiable set of words that people use to express such feelings, and by analyzing the usage patterns of such words (consider, for example, scared, afraid, disaster, catastrophe, danger, or emergency) in geographically identified areas, scientists can develop at least a fuzzy picture of where the most dire things may be happening.
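For the curious, here is a rough idea of what counting distress words by area might look like. This is only a minimal sketch, not EPIC's actual method: the word list is just the handful of examples mentioned above, and the region labels, the sample tweets, and the function name `distress_rate_by_region` are all invented for illustration.

```python
import re
from collections import Counter

# Distress words taken from the examples in the text; a real lexicon
# would be much larger and would handle negation, sarcasm, and so on.
DISTRESS_WORDS = {"scared", "afraid", "disaster", "catastrophe", "danger", "emergency"}

def distress_rate_by_region(tweets):
    """Return {region: fraction of tweets containing at least one distress word}.

    `tweets` is an iterable of (region, text) pairs (a hypothetical format).
    """
    totals = Counter()
    hits = Counter()
    for region, text in tweets:
        tokens = set(re.findall(r"[a-z']+", text.lower()))
        totals[region] += 1
        if tokens & DISTRESS_WORDS:
            hits[region] += 1
    return {region: hits[region] / totals[region] for region in totals}

# Made-up sample data, for illustration only.
sample = [
    ("coast", "So scared right now, water is rising"),
    ("coast", "This is a disaster, power is out"),
    ("coast", "Making coffee"),
    ("inland", "Nice day for a walk"),
    ("inland", "Traffic is slow"),
]
rates = distress_rate_by_region(sample)
```

With this toy data, two of the three "coast" tweets contain a distress word and neither "inland" tweet does, which is the kind of fuzzy geographic contrast described above.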

Another promising area of research for EPIC is event detection, another topic we touched on in a Lounge from last year. People like to talk about events: the ones they plan, the ones that are happening, or the ones that they fear may happen. The poster-child of event detection is the verb. Hardly anything goes down that cannot be talked about with verbs, and researchers can analyze the usage and frequency of the many verbs in tweets that may indicate that something has gone terribly wrong, or is about to.
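A simple way to picture verb-based event detection is to count alarming verbs per hour and flag any hour that stands far above the typical level. The sketch below makes several assumptions that are not from the project itself: the verb list is hypothetical, verbs are matched by plain word lookup rather than real part-of-speech tagging, and the spike rule (more than three times the median hourly count) is an arbitrary choice.

```python
import re
from collections import Counter
from statistics import median

# Hypothetical list of verb forms that may signal trouble.
ALARM_VERBS = {"flooding", "collapsed", "evacuate", "evacuating", "trapped", "burning"}

def alarm_counts_by_hour(tweets):
    """Count alarm-verb occurrences per hour from (hour, text) pairs."""
    counts = Counter()
    for hour, text in tweets:
        tokens = re.findall(r"[a-z']+", text.lower())
        counts[hour] += sum(1 for tok in tokens if tok in ALARM_VERBS)
    return counts

def flag_spikes(counts, factor=3.0):
    """Return the sorted hours whose count exceeds `factor` times the baseline.

    The baseline is the median hourly count, floored at 1 so that a quiet
    stream does not flag every single mention.
    """
    baseline = max(median(counts.values()), 1)
    return sorted(hour for hour, n in counts.items() if n > factor * baseline)

# Made-up sample data, for illustration only.
sample = [
    (0, "Nice morning"),
    (1, "Road is flooding a bit"),
    (2, "Lunch time"),
    (3, "We are trapped, street flooding"),
    (3, "Roof collapsed, evacuating now"),
    (3, "Neighbors evacuating, garage burning"),
]
counts = alarm_counts_by_hour(sample)
spikes = flag_spikes(counts)
```

Here only hour 3 is flagged. A real system would need a proper tagger to find verbs and a much more careful notion of what counts as unusual.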

Humans, like many other organisms, have an instinctive fight-or-flight response. What sets us far apart from other species that enjoy this adaptive behavior for situations of acute stress is that we have language as well, and when we have the opportunity, we are likely to use language to notify others of our flight decisions. Twitter and other social media enable modern hominids to broadcast their decisions like never before, and the collection of such shout-outs is another valuable data source for EPIC.

Long before social media came along, scientists were interested in what motivates people to make a protective decision — that is, a decision to act in order to avoid risk or hazard. Early research in this area had already determined that people consider severity, susceptibility, barriers, cost variables, and quality of information in deciding whether, for example, they should evacuate an area threatened with a hurricane or nuclear disaster. Findings from this early research are being integrated into the corpus of language that the Twitterverse provides in order to zero in on what people think about before deciding whether to stay or go.

When a forewarned disaster looms, an inevitable news story accompanying it is the interview with the stalwarts who choose to ignore evacuation orders. The media has long given sensationalist attention to such individuals, who seem to enjoy the opportunity of a microphone thrust in their faces in order to spout clichés about their defiance. Of much greater interest to authorities are those on the flipside of the media darling: the individuals who respond to an evacuation order by leaving immediately. The developers of EPIC are hopeful that they can develop a predictive model of who will evacuate, who won't, what factors help them to decide, and how long it will be before they make up their minds.

Computers, like people, love to learn by examples; computers in fact are not very good at learning except by examples, and for this, they need training data — that is, data that can be used to discover predictive relationships among discrete data points. Disasters in the modern world come along at a fairly steady clip, and so today, scientists can collect data from recent disasters in which people tweeted and posted their fears, anxieties, and decisions, to build a model of what the relationship is between what people say and what they do, or are likely to do.
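To make "learning by examples" a little more concrete, here is a toy text classifier that learns from a few labeled tweets and then guesses whether a new tweet sounds like someone who left or someone who stayed. It is a sketch under stated assumptions: the technique shown is a small naive Bayes classifier with add-one smoothing, chosen for simplicity and not because the source says EPIC uses it, and the training tweets and the labels "left" and "stayed" are made up.

```python
import math
import re
from collections import Counter, defaultdict

def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())

def train(examples):
    """Collect per-label word counts from (text, label) training pairs."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return word_counts, label_counts, vocab

def predict(model, text):
    """Return the label with the highest naive Bayes log-probability."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        score = math.log(label_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokenize(text):
            if tok in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Made-up training data, for illustration only.
training = [
    ("packing the car and leaving tonight", "left"),
    ("heading inland to my sister's place", "left"),
    ("we are leaving before the storm hits", "left"),
    ("staying put, we have supplies", "stayed"),
    ("not going anywhere, riding it out", "stayed"),
    ("boarded the windows and staying home", "stayed"),
]
model = train(training)
guess_a = predict(model, "leaving now with the kids")
guess_b = predict(model, "staying home tonight")
```

Six examples are far too few to learn anything dependable, of course; the point is only that the model's guesses come entirely from the relationship between words and outcomes in the training data.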

Project EPIC is beginning to look at geo-tagged Twitter data from Hurricane Sandy and trying to identify evacuators based on their location and what language they used to signal their imminent departure. The "tweets before the storm" constitute a goldmine for researchers to discover nuggets of information that they hope will prove useful in the management of future disasters.
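One naive way to combine location with language is to look for users who both say something that sounds like a departure and whose geotags then move a long way. The sketch below is an assumption-laden illustration, not a description of Project EPIC's analysis: the phrase list, the 30 km threshold, the tuple format, and the coordinates are all invented, and simple substring matching will misfire on phrases like "not leaving".

```python
import math

# Hypothetical phrases that may signal an imminent departure.
DEPARTURE_PHRASES = ("evacuating", "leaving", "heading out", "getting out")

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two latitude/longitude points."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def likely_evacuators(tweets, min_move_km=30.0):
    """Return the sorted users who used a departure phrase and whose last
    geotag is at least `min_move_km` from their first.

    `tweets` is a list of (user, timestamp, lat, lon, text) tuples.
    """
    by_user = {}
    for user, ts, lat, lon, text in tweets:
        by_user.setdefault(user, []).append((ts, lat, lon, text))
    result = []
    for user, rows in by_user.items():
        rows.sort()  # chronological order by timestamp
        signaled = any(
            phrase in text.lower()
            for _, _, _, text in rows
            for phrase in DEPARTURE_PHRASES
        )
        _, lat0, lon0, _ = rows[0]
        _, lat1, lon1, _ = rows[-1]
        if signaled and haversine_km(lat0, lon0, lat1, lon1) >= min_move_km:
            result.append(user)
    return sorted(result)

# Made-up sample data, for illustration only.
sample = [
    ("a", 1, 40.58, -73.95, "Evacuating now"),
    ("a", 2, 41.31, -72.92, "Made it, safe and dry"),
    ("b", 1, 40.58, -73.95, "Staying put"),
    ("b", 2, 40.59, -73.96, "Still here"),
    ("c", 1, 40.58, -73.95, "Long day"),
    ("c", 2, 41.31, -72.92, "Visiting friends"),
]
evacuators = likely_evacuators(sample)
```

In this toy data only user "a" both announces a departure and moves far; "b" stays in place and "c" travels without saying anything about leaving.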

Orin Hargraves is an independent lexicographer and contributor to numerous dictionaries published in the US, the UK, and Europe. He is also the author of Mighty Fine Words and Smashing Expressions (Oxford), the definitive guide to British and American differences, and Slang Rules! (Merriam-Webster), a practical guide for English learners. In addition to writing the Language Lounge column, Orin also writes for the Macmillan Dictionary Blog.