While tirelessly displaying wordmaps for inquiring minds around the world, the Visual Thesaurus has another, equally demanding life that takes place behind the scenes. The dictionary that underlies the VT is widely used in linguistic research, mainly as a tool in various applications of natural language processing (NLP): the art and science of getting computers to deal effectively and efficiently with huge bucketfuls of human language.
In this connection, this Loungeur has been beavering away in a quiet corner over the last few months on a project for the Linguistic Computing Laboratory. The task was to map the words in a large database of common English collocations to word senses in the VT in order to discover and document patterns. For example: if you see the verb-noun combination snap fingers or the verb-noun combination snap photo, your knowledge of English tells you that two different meanings of snap are in use. Similarly, when you hear fumbled snap and brandy snap, you recognize that two different noun senses of snap are being used.
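For the programmatically inclined, the mapping idea can be sketched in a few lines of Python. This is only a toy illustration, not the method used in the project: the table `VERB_SENSES`, the function `verb_sense`, and the sense glosses are all invented for the example and are not the VT's actual sense inventory.

```python
# Hypothetical sense glosses, invented for this sketch;
# they are NOT taken from the VT's sense inventory.
VERB_SENSES = {
    ("snap", "finger"): "make a sharp cracking sound",
    ("snap", "photo"): "take a photograph",
}


def verb_sense(verb, noun):
    """Return the verb sense suggested by the noun collocate, or None if unmapped."""
    return VERB_SENSES.get((verb, noun))


print(verb_sense("snap", "finger"))
print(verb_sense("snap", "photo"))
print(verb_sense("snap", "twig"))  # a pair the toy table does not cover
```

The point of the sketch is simply that the collocate, rather than the verb alone, selects the sense; the hard part in practice is filling in such a table for many thousands of pairs.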
Elementary as this may seem, teaching a computer to recognize distinctions such as these in a natural context is a time-consuming and laborious process - and that process is a growth industry these days, with researchers around the world hammering away at English, coaxing it to yield up its secrets so that computers can process terabytes of text at lightning speed.
An interesting sidelight of looking at collocations by the cartload is that it easily identifies clichés and idioms. If you hear the noun-verb combination wheel grind, you probably mentally supply "to a halt" before you hear the words. Collocations like wheel grind that occur with some frequency in a database can be the basis for teaching a computer to recognize chunks of text at a glance, the way humans do: it is statistically safe to bet that when wheel grind (in any of its various inflections) is encountered in text, it is an instance of this idiom, whether literally (i.e., to describe a vehicle coming to rest) or figuratively (to describe a process abruptly ending, as in "the wheels of justice ground to a halt"). In other words, the wheel in wheel grind nearly always denotes the wheel of a (real or imaginary) vehicle and not, e.g., the wheel of a mill or a steering wheel; and the grind in wheel grind has a meaning that is more or less limited to this idiom and does not mean, for example, "reduce to powder" or "dance by rotating the pelvis." (Again, elementary, but how would a computer know this?)
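To make the "in any of its various inflections" point concrete, here is a minimal Python sketch of how a program might spot wheel grind in running text. It rests on loud assumptions: the lemma table `LEMMAS`, the function `find_collocation`, and the four-token window are all hypothetical choices made for illustration, and a real system would use a proper lemmatizer and part-of-speech tagger.

```python
import re

# Hypothetical, hand-made lemma table covering only the two words of interest.
# A real system would use a full lemmatizer instead.
LEMMAS = {
    "wheel": "wheel", "wheels": "wheel",
    "grind": "grind", "grinds": "grind",
    "grinding": "grind", "ground": "grind",
}


def find_collocation(text, noun="wheel", verb="grind", window=4):
    """Return (noun_index, verb_index) token pairs where the verb lemma
    follows the noun lemma within `window` tokens."""
    tokens = re.findall(r"[a-z']+", text.lower())
    lemmas = [LEMMAS.get(token, token) for token in tokens]
    hits = []
    for i, lemma in enumerate(lemmas):
        if lemma != noun:
            continue
        # Look ahead a few tokens for the verb, stopping at the first match.
        for j in range(i + 1, min(i + 1 + window, len(lemmas))):
            if lemmas[j] == verb:
                hits.append((i, j))
                break
    return hits


print(find_collocation("The wheels of justice ground to a halt."))
print(find_collocation("She ground the coffee this morning."))
```

Because the sketch has no notion of part of speech, it would also fire on the noun in "the wheel fell to the ground," which is exactly the kind of distinction that is elementary for a human and laborious for a machine.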
For ordinary speakers and writers of English, the idioms and clichés embodied in a database of collocations are both a gold mine and a minefield. You have the opportunity to check your words against those of the thousands who have gone before you, to see how your words stack up. If you are a learner of English, your question may simply be "can you actually say that in English?" If you are a native speaker, your question may be "is this such a tired and worn cliché that using it is going to mark me as an unimaginative hack?"
Take, for example, the noun-verb collocation note creep. At first glance, it did not occur to us that these words ever belonged together in English. But as it turns out, notes are constantly creeping: into people's voices, that is. Here are some examples:
Another collocation that stymied us at first sight was skill desert. But in fact, people's skills desert them all the time, particularly, one assumes, when this is least desirable:
In both of the above cases, there is no question that you can use the noted words in combination: the question is, do you want to? Is it going to make you sound like a cheesy second-rate wannabe Hemingway, or is there enough life left in the expression that it will do the job you intend?
When we saw the collocation face float we were immediately transported to the finale of the great Busby Berkeley musical "The Gang's All Here," in which the singing heads of the film's main characters float around the silver screen on Technicolor backgrounds. But it turns out that for many writers, a floating face is merely a literary (or hackneyed -- you decide!) device to suggest that the thought of one person has entered the mind of another -- often in an admonitory fit of conscience:
The noun face figures pretty high in the collocation stakes and it is nearly always the sense that first comes to your mind: the expressive front of a person's head. We found, for example, that not only do faces float: they also flame, twist, and harden, though these events are hardly ever a Good Thing:
Writers have firmly established that when something untoward has happened, you can signal it with a face (or a smile) that freezes:
In all of these cases, the verb freeze can be safely nailed as the sense that means "become immobilized" rather than the "change to ice" sense, although the notion of coldness that lives inherently in freeze is certainly present in the case of frozen smiles. Devotees of English literature will be pleased to see that the first quote in the group above is from no less a figure than Charlotte Brontë (the line is from Jane Eyre), and that by using this figure of speech, you will be standing on the shoulders of a giant. On the other hand: after 160 years of exercise, is it time to put this old chestnut out to pasture?
At the other end of the temperature scale, we found that collocations suggestive of higher temperatures usually indicate that either love or anger is splashing onto the scene. Metaphors equating heat with love or anger are probably as old as language itself. It's interesting, however, that collocations around the verb melt -- something that you expect to happen when temperatures rise -- are only about love and passion. Who knew, for example, that both flesh and bones are subject to melting when a young man's (or woman's) ardorometer goes off the charts?
Usages such as these tend to make our flesh crawl rather than melt since we are not normally readers of this genre (the cites are all from romantic fiction). A group of such examples, however, tells you as a writer all you need to know about the aptness of a particular expression and whether it is likely to be just the right thing, or horribly out of place, in whatever it is you're writing.
We recommend the study of collocations to all: they are a great place to observe many important patterns of English that go largely unnoticed in dictionaries. The data we used in our research project is based on the British National Corpus. It is accessible at a website called "Just the Word":
The Linguistic Computing Laboratory is a delightful haven for big-brained word wizards who toil tirelessly at the rockface between language and computers:
The tricky business of acquainting computers with the wiles of polysemous words comes under the heading of Word Sense Disambiguation (WSD in the trade), which you can read about here:
Finally, if you have never seen "The Gang's All Here" and its spectacular finale, the "Polka-Dot Polka," it's never too late. Faces don't float until rather late in the number, but the ride there is enjoyable. We should warn that there is the potential for serious time-wasting here and that you should only click the following link if you have an office with a door that closes!