Some months ago in the Lounge (last August, to be exact) we talked about the results of a project on collocations that yielded up some interesting features of English. Collocations are hotter than ever in English and other languages; they're one of the most promising avenues for producing data that can be used in a number of natural language processing (NLP) applications, and reams of data about them are being generated in the hope that computers might someday read, comprehend, summarize, and even translate language as efficiently and accurately as the good old human brain does.

At the moment we're pursuing a fairly extensive survey of English collocations and we've been struck by an odd phenomenon: there are quite a large number of collocations in English that would not be statistically significant were it not for their appearance in fiction. In other words, certain fairly natural word combinations do not appear with sufficient frequency in print (or in pixels) to register on collocational radar if their appearances in fiction are discounted.
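To make "discounting fiction" concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the counts, the genre sizes, and the cutoff are invented rather than taken from the survey, and a bare frequency cutoff stands in for the association scores that real collocation tools use.

```python
# Toy illustration only: every number here is invented, not survey data.
# Occurrences of one collocation per genre.
TOY_COUNTS = {"fiction": 900, "journalism": 12, "academic": 3, "blogs": 10}
# Size of each genre sub-corpus, in words.
TOY_GENRE_SIZES = {
    "fiction": 20_000_000,
    "journalism": 40_000_000,
    "academic": 30_000_000,
    "blogs": 10_000_000,
}
THRESHOLD = 1.0  # hypothetical cutoff, in occurrences per million words


def per_million(count, size):
    """Normalize a raw count to occurrences per million words."""
    return count / size * 1_000_000


def overall_rate(counts, sizes, exclude=()):
    """Per-million rate across all genres, optionally leaving some out."""
    kept = [genre for genre in counts if genre not in exclude]
    total_count = sum(counts[genre] for genre in kept)
    total_size = sum(sizes[genre] for genre in kept)
    return per_million(total_count, total_size)


def registers(rate):
    """True if the rate clears the hypothetical cutoff."""
    return rate >= THRESHOLD


with_fiction = overall_rate(TOY_COUNTS, TOY_GENRE_SIZES)
without_fiction = overall_rate(TOY_COUNTS, TOY_GENRE_SIZES, exclude=("fiction",))

print(f"with fiction:    {with_fiction:.4f} per million")
print(f"without fiction: {without_fiction:.4f} per million")
print("registers with fiction:   ", registers(with_fiction))
print("registers without fiction:", registers(without_fiction))
```

With these made-up numbers the collocation clears the cutoff only while the fiction counts are included, which is the pattern described above.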

The reason this phenomenon strikes us as odd is this: isn't a great deal of fiction supposed to be a reflection of what really goes down? Does art imitate life, or doesn't it? Of course it is not surprising that particular collocations should appear mainly in genre fiction: "warp speed" and "cryogenically frozen," for example, immediately conjure science fiction, and collocations like "lace bodice" and "fiery passion" more or less define the romance genre. But the startling finding is that a number of ordinary collocations seem to find a home far more frequently in fiction generally than in any other genre of writing. We'll look at a few examples from the letter B, the spot in the alphabet where we were recently parked for a couple of weeks.

The verb brush, as its wordmap illustrates, is an extremely busy one in English. It's probably safe to say that all of its many verbal uses are rooted in the original noun sense, that is, an implement with bristles set in a handle. Brush as a transitive verb can accommodate a wide range of objects. The top five in terms of salience (that is, roughly speaking, in terms of statistical significance) are: teeth, hair, strand, lock, and lip. Three of these collocates, notice, are about hair. Two of them (teeth and hair) exemplify the sense of brush that means "apply a brush to"; three of them (hair again, as well as strand and lock) exemplify the sense of brush that means "remove by sweeping with the hand," as in "He brushed a lock of his brown hair aside." The last collocate, lip, exemplifies the sense that means "touch gently": "Armand lightly brushed her lips with his."

Now, here's the curious thing: were it not for their appearances in fiction, the only one of these collocations of brush that hits the billboards for English generally is teeth. Doctors, journalists, bloggers, scientists and others (as well as fiction writers) all write about brushing teeth with some regularity. But the other main collocates of brush — hair, strand, lock, and lip — appear overwhelmingly in fiction — up to 150 times more frequently than in any other genre. The frequency of these collocations in fiction is so great that it completely skews the statistics for brush generally. Why is this?

Lips is the standalone and probably the easiest one to decipher: the frequency of "brush lips" in fiction may simply be a reflection of two things:

  1. sex sells, and a modern novel without someone's lips brushing up against someone else's (or up against someone else's cheek, ear, shoulder, etc.) is a dry novel indeed.
  2. while brushing of lips may be quite a frequent human behavior, it is mainly associated with a sort of intimacy that is not widely written about in factual genres.

For the hair-related collocations, the evidence points to a different, irrefutable conclusion: fictional characters cannot stop playing with their hair. Fictional heroines constantly brush theirs, often as a backdrop to pondering weighty questions or holding tête-à-têtes with their confidantes. Fictional personages of both sexes seem obliged to spend a great deal of time getting hair out of the way, whether it's their own or others': they brush it back, they brush auburn locks off their foreheads, they brush blond-streaked strands out of their eyes.

The other conclusions that might be drawn from this depend on your view of fiction. It's possible that people are, in fact, always playing with their hair, and that it only ever gets written about in fiction. This conclusion is consistent with the "art imitates life" thesis, and is certainly supported by the fact that grooming of self and others is a basic mammalian behavior: have hair, will groom. On the other hand, you may conclude that hair manipulation is merely a device, a convention that fiction writers use to represent, emblematically, any number of motives on the part of their characters. Finally, and less charitably, you might conclude that all this stuff about hair is mere fictional cliché: something fiction writers throw in because Fiction 101 says the characters must be doing something, and so much of the time, when they're nattering on or laying the internal-dialog groundwork for some startling insight, there really isn't anything else for them to do.

It's rather early days in our study of collocations (the rest of the alphabet still lies waiting), and we expect that by the end of it we'll have a more definite view of the rather skewed patterns of English that are represented in fiction. For now, we offer for your perusal some other collocations of words beginning with b that we found appear mainly — sometimes overwhelmingly — in fiction. We have taken the liberty of arranging them in an impromptu narrative. The frequent (in fiction) collocations are in italics:

A powerfully built man knocked and entered the basement. It was Jayde's new bodyguard! I sat bolt upright. Hadn't I bolted the door?
"Just checking up on you," he said, blowing a cloud of smoke in my direction. I noticed a bruise on his bronzed skin.
"What's the fruit basket for?" I blared loudly. I blamed myself. I bit my lip. I drew a breath. Would he think that I was growing bold?
"Just thought we might grab a bite," he blurted out.

This book chapter gives a good background in collocations:
http://tangra.si.umich.edu/~radev/papers/handbook00.pdf

There's a nice PowerPoint presentation here, summarizing approaches to the study of collocations and why they're important:
www.cs.tau.ac.il/~nachumd/NLP/Collocations.ppt



Orin Hargraves is an independent lexicographer and contributor to numerous dictionaries published in the US, the UK, and Europe. He is also the author of Mighty Fine Words and Smashing Expressions (Oxford), the definitive guide to British and American differences, and Slang Rules! (Merriam-Webster), a practical guide for English learners. In addition to writing the Language Lounge column, Orin also writes for the Macmillan Dictionary Blog.

Comments from our users:

Thursday May 1st 2008, 8:02 AM
Comment by: valeria B.
Thank you for the very interesting article. Are you going to publish the final results and an extended description of the research process?
Thank you for the presentation too!
Thursday May 1st 2008, 9:49 AM
Comment by: Thomas S.
How about "Brush with death" and "brushed him off" and "bristling with hatred?" I think I read too much mystery fiction. Really interesting article. I've got to brush up my sensitivity to the word.
Thursday May 1st 2008, 10:57 AM
Comment by: Mary Lee M.
How about brushed nickel? I am surprised that did not show up in the non-fiction arena, since it is a common finish for things like door knobs and lighting fixtures. It's also used to refer to finishes in jewelry.
Thursday May 1st 2008, 11:20 AM
Comment by: Maria D.
Does art imitate life, or doesn't it? Art is for sure related to life, and it mirrors all the different thoughts, ideas and actions of human existence. Art is thoughtfully created and life is simply lived, so maybe here lies the difference. Words in a piece of fiction writing are layered; they are heavy in meaning, unlike everyday spoken words, which are superficial.
It is so difficult for me to imagine that one day "computers might someday read, comprehend, summarize, and even translate language as efficiently and accurately as the good old human brain does." Language understanding requires feelings and instinct. Unless human essence becomes tangible and "transplanted" into computers.
I have to say that I lovveeee visual thesaurus. It is a wonderful tool to explore the "world of the words."
Thursday May 1st 2008, 12:52 PM
Comment by: Mary S.
After stumbling upon this website about two months ago - - -I'm HOOKED
Thursday May 1st 2008, 5:16 PM
Comment by: Anne G.
The comment "doesn't art imitate life" is as old as criticism (Aristotle?). It's that idea of the mirror and the lamp: does it reflect life or does it, instead, light the path for us? M. H. Abrams wrote the seminal book, "The Mirror and the Lamp," on these distinctions back in 1953.
Friday May 2nd 2008, 1:08 AM
Comment by: Mattie D.
As a former English as a Foreign Language teacher I am aware of the difficulty involved in translating collocations, but more surprising is how easily understood they are by students of the language.
Friday May 2nd 2008, 9:50 PM
Comment by: Josefina B.
I think Art can camouflage what's real in life and it can also show the truth that life wants to hide. Wonderful wonderful article. Why am I not a lexicographer? Where do you order a book that lightly storytells about words? yummy!
Saturday May 3rd 2008, 7:18 AM
Comment by: Orin Hargraves (CO), Visual Thesaurus Contributor
Answers and comments:

Thomas S: "brush with death" is actually most common in journalism, but fiction holds the second-place spot. You're right that "brush him (or her/me/you/etc.) off" is more frequent in fiction than other genres. "Bristle with hatred" turns out to be not all that frequent: top honors go to "bristle with indignation/energy/confidence."

Mary Lee M: "brushed nickel" is just frequent enough to make the charts and turns up mainly in "lifestyle" journalism.

Valeria B: It's likely that quite a lot of literature will appear in forthcoming years about collocations in English (and other languages) as the tools to analyze them become more sophisticated. All of the data we work with in the Lounge is proprietary but if we can engage the sympathies of its owners we will likely publish something. As for understanding the "research process": it's pretty straightforward and actually mind-numbing at times -- just looking at screens and screens of data and drawing conclusions about it. Our chief tool is Adam Kilgarriff's Word Sketch, which you can read about here: http://trac.sketchengine.co.uk/wiki/SkE/DocsIndex

Thanks to all for your comments, which are always much appreciated.

Monday May 5th 2008, 9:37 PM
Comment by: valeria B.
Thank you once again. I really don't know where my interest in language wants to take me. I'm a sort of psychologist who works for an IT Company, and I find it quite a stimulating mix!
Disclaimer for bad english: I'm Italian! :-)
Monday May 12th 2008, 2:06 PM
Comment by: Karin E.
we could just brush it off . . .
Thursday June 5th 2008, 9:06 PM
Comment by: Wood F.
This is really interesting. I have a hunch that many collocations appear in fiction more than anywhere else because they have become cliches that automatically come to mind when writers are trying to think of a way to describe something, but don't want to work too hard. It is so easy for overused phrases to sneak into fiction writing unnoticed and, unfortunately, unedited. The more frequently they appear, the more likely they are to be used again by lazy writers, and soon you have an overwhelming preponderance of these phrases in fiction as opposed to anywhere else.

Although I have to say I'm puzzled by 'fruit basket.'
Monday September 1st 2008, 11:25 AM
Comment by: Sheree C.
This is hysterical.
Friday May 1st 2009, 8:25 AM
Comment by: David S. (Teaneck, NJ)
Why fiction...

Because these are textual memes that are representing visual content...

We don't say "fruit basket", we point to it...

Nobody says "Why do you always brush your hair back?"... they say "Why do you do that?" and point...

In fiction, words have to paint the picture that normally we just "see"

-David S.
Thursday June 2nd 2011, 8:38 AM
Comment by: Jan T. (Cincinnati, OH)
cool
