The index asserts nothing; it only says "There!" It takes hold of our eyes, as it were, and forcibly directs them to a particular object, and there it stops.
— C. S. Peirce

A new plague of time-squandering has descended on the Lounge; it has easily pushed out earlier rivals, like playing Scrabble with friends on Facebook and watching old Patsy Cline videos on YouTube. Now the Loungeurs are in the grip of Google Image Labeler. The devil himself could not have devised a better method of timesucking: you can easily incorporate it into any multitasking scheme; it asks, nominally, only two minutes out of your day; and it's all about words!

The initial thrill of playing the game was the challenge of coming up with perfect labels: words that most succinctly describe a small image appearing on your screen. This thrill quickly disappears, however, because you find that perfect labels don't win points, and points are what it's about: not only because everyone likes scoring, but because you cannot move on to a new image until you and a perfect stranger who is your playing partner in cyberspace land on the same label for the current picture (and thus each score). So you evolve very quickly from trying to find the perfect label (an activity that you could almost justify to yourself since it exercises the depth and breadth of your imagination and vocabulary) to trying to find the lowest common denominator label: the word most likely to come into the mind of your partner, of whom you develop a composite picture after a few games. In our view, the typical Google Image Labeler player is a twenty-something male, steeped in contemporary pop culture but perhaps not much else, who is probably multislacking while at work, or perhaps taking a break from gaming online. Is this who should be entrusted with the monumental task of indexing images on the Internet?
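For the programmatically inclined, the agreement rule described above can be sketched in a few lines of Python. This is only our own illustration: the function names, the turn-by-turn interleaving of guesses, and the lowercasing of labels are all assumptions, since we don't know how Google actually matches labels. The guess lists are hypothetical players of our own invention.

```python
from itertools import zip_longest
from typing import Iterable, Optional


def normalize(label: str) -> str:
    """Lowercase a label and collapse whitespace (an assumed matching rule)."""
    return " ".join(label.lower().split())


def first_agreed_label(guesses_a: Iterable[str], guesses_b: Iterable[str]) -> Optional[str]:
    """Return the first label that both players have typed, or None.

    Guesses are processed one per player per turn, which is a simplification
    of two people typing at the same time.
    """
    seen_a: set[str] = set()
    seen_b: set[str] = set()
    for a, b in zip_longest(guesses_a, guesses_b):
        if a is not None:
            a = normalize(a)
            seen_a.add(a)
            if a in seen_b:
                return a
        if b is not None:
            b = normalize(b)
            seen_b.add(b)
            if b in seen_a:
                return b
    return None


# Hypothetical players: one aims for precise labels, the other for obvious ones.
careful = ["Brandenburg Gate", "Berlin", "European architecture", "sky"]
casual = ["stone", "sky", "people", "tourists"]
agreed = first_agreed_label(careful, casual)
```

In this sketch the careful player's precise labels never score; the pair moves on only once the careful player gives in and types a generic word the casual player has already offered.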

Google Image Labeler will (according to Google) help "improve the relevance of image search for users like yourself." While in the thrall of it we are often put in mind of Charles Sanders Peirce: he's a late-19th/early-20th century American philosopher, much beloved in the Lounge not only for his clear thought but because he was also a lexicographer: he contributed more than 5,000 definitions to the Century Dictionary. Peirce devoted a lot of thought to signs, by which he meant something intermediary between an object and the mind. His most famous classification of signs is threefold: the ikon, the index, and the symbol. Ikons resemble their objects; any photograph, image, or realistic drawing or painting is an example of an ikon. Indices bear a real relationship to their objects, such that a change in the object would be reflected in a change in the index: an index is an indication of another thing, in the most literal sense. Symbols bear an arbitrary relationship to their objects, and are connected to them only by virtue of usage and convention; most words are symbols.

In Google Image Labeler, players examine ikons. They assign symbols to them (words, or "labels"), with the ostensible view of generating an index. This would be an index in two senses: the conventional one, being an organized list of items with systematic reference to another thing; and a Peircean index, in that the index as a whole would bear a real relationship to the set of images labeled.

Here's the problem: Google Image Labeler is currently designed to attract a maximum of inferior symbols (labels, in Google's terminology), and a minimum of good ones. A couple of examples:

Sample image*: Brandenburg Gate
Labels that would helpfully index the image: Brandenburg Gate, Brandenburg Tor, Berlin, Germany, European Architecture
Labels typically rewarded in Google Image Labeler: stone, sky, people, tourists, arches, blue, summer, tourist, walking, outside, sun

Sample image*: Brigitte Bardot
Labels that would helpfully index the image: Brigitte Bardot, Brigitte, Bardot, French actress, actress, portrait, pose, glossy photo, 20th century figures
Labels typically rewarded in Google Image Labeler: blonde, lips, hair, blue, skin, dress, hot, sexy, woman, girl, babe, chick, face, eyes


The second picture here would also attract (and Google would reward) two other labels that we have not listed, referring to features of Ms. Bardot's anatomy — that's the typical level of cognition where minds meet in Google Image Labeler.

The value of an index, as any user of a reference book can attest, lies in the thoroughness, aptness, and granularity of its contents. Indexers probably work in even greater obscurity than lexicographers, but the service they provide is immeasurable: they enable us to find a needle in a haystack. A good index effectively imposes a numbered, three-dimensional grid on the haystack and tells us which numbered box to look in.

Indexing a book, though it is a highly specialized skill, has an inbuilt simplicity: it equates like with like, that is, words with words. Words found in an index are overwhelmingly also found in their reference, and thus a book index actually bears an ikonic relationship with its object. Indexing something other than words (images, smells, sounds, and so forth) is inherently more complex: it requires the assignment of words to things that are not words, and mainly do not contain words. This relationship can be symbolic only. Surely, then, this is a job that, to be done properly, requires even more specialized skill. So it seems a pretty far stretch to think that Google is going to succeed in generating a valuable word index by crowdsourcing the job to anyone willing to have a crack at it: their labeler seems better suited to collecting noise than signal, and everyone who uses search engines can attest that excess noise is already a big part of the problem in trying to find information online.

The proviso in our observation is that we don't know what Google is going to do with the data it generates, and of course it is possible, if not likely, that the marvelous minds there have anticipated or discovered the shortcomings we note and have found a way to deal with them. One thing is clear: they will have no shortage of data, because they have cleverly entrained an army of volunteers who will feed their datastream 24/7. 

Some good introductions to Peirce's semiotics, which we recommend as being just as stimulating and much more edifying than labeling images, can be found here:

http://www.helsinki.fi/science/commens/dictionary.html

http://plato.stanford.edu/entries/peirce-semiotics/#Int

Google Image Labeler is an example of a gwap: a "game with a purpose." If you're into that sort of thing, you will be able to waste (or put to good use! it all depends on your view) a considerable amount of time here:

http://www.gwap.com/gwap/

Luis von Ahn, Assistant Professor at Carnegie Mellon University, developed a game he called the ESP Game, which is the basis for Google Image Labeler. He gave a fascinating talk about his work to Google employees, in which he addresses some of the questions we raise, while leaving others a bit dangling:

http://video.google.com/videoplay?docid=-8246463980976635143&hl=en

* Images from Wikimedia Commons, but typical in size and resolution of those that appear in Google Image Labeler.


Orin Hargraves is an independent lexicographer and contributor to numerous dictionaries published in the US, the UK, and Europe. He is also the author of Mighty Fine Words and Smashing Expressions (Oxford), the definitive guide to British and American differences, and Slang Rules! (Merriam-Webster), a practical guide for English learners. In addition to writing the Language Lounge column, Orin also writes for the Macmillan Dictionary Blog.
