I've been in the technical-writing world long enough to remember when graphical user interfaces — that is, the Macintosh and Windows — arrived, and when we had to learn how to describe this interesting new way of interacting with computers. We don't think about it much now, but there was a time when point, drag, click, double-click, and maximizing a window were all new terms and concepts.
We're undergoing a similar terminological evolution again, this time to help us describe the latest shift in how we interact with our devices: using touch. For most people, this style of interaction is probably familiar from smartphones like the iPhone and from the new generation of tablets like the iPad. But what the Microsoft style guide refers to as contact gestures have been sneaking into our computers for some time in the form of touchpads on laptop computers, and Microsoft recently introduced a Touch Mouse that lets people sitting at ordinary desktop computers use their fingers to interact with the screen.
As the technology evolves, so does the vocabulary for it. For example, you don't click something on a phone or tablet or touchpad; you tap it. Likewise, you double-tap it instead of double-clicking it. With a mouse, you can click something, drag it, and then drop it. With your fingers, though, you only drag things; there's no more dropping, since there's no longer a mouse button to let go of in order to finish.
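The distinctions among tap, double-tap, and drag come down to how far a finger travels and how quickly touches follow one another. Here's a minimal Python sketch of how a recognizer might tell them apart — the thresholds and the `Touch` structure are invented for illustration, not taken from any real toolkit:

```python
# Illustrative sketch (not a real platform API): classify a touch
# sequence as a tap, double-tap, or drag from where the finger
# lands, where it lifts, and how soon the next touch follows.
from dataclasses import dataclass
from math import hypot

# Hypothetical thresholds; real toolkits tune these per device.
TAP_MAX_MOVEMENT = 10.0    # pixels a finger may drift and still count as a tap
DOUBLE_TAP_MAX_GAP = 0.3   # seconds allowed between the two taps

@dataclass
class Touch:
    down: tuple[float, float]   # (x, y) where the finger lands
    up: tuple[float, float]     # (x, y) where it lifts
    gap: float                  # seconds since the previous touch ended

def classify(touches: list[Touch]) -> str:
    first = touches[0]
    moved = hypot(first.up[0] - first.down[0], first.up[1] - first.down[1])
    if moved > TAP_MAX_MOVEMENT:
        return "drag"           # the finger travelled: a drag, with no "drop"
    if len(touches) > 1 and touches[1].gap <= DOUBLE_TAP_MAX_GAP:
        return "double-tap"
    return "tap"

print(classify([Touch((5, 5), (6, 5), 0.0)]))      # tap
print(classify([Touch((5, 5), (120, 80), 0.0)]))   # drag
```

Note that the "drag" branch needs no end condition beyond the finger lifting — which is exactly why there's no "drop" in the touch vocabulary.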
One of the most intuitive touch gestures is to flick, which both the Microsoft style guide and Apple style guide (PDF) are careful to distinguish from scrolling. Flicking to skip through options or to "turn" the "page" of an ebook is so natural a gesture that it's hard to imagine a touch-based interface without it. An important part of the flick gesture is that it's inertial — the response to your flicking gesture involves virtual physics, and the momentum of the thing you're flicking lets it continue to move even after you lift your finger.
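That "virtual physics" can be pictured as simple friction: when the finger lifts, the content keeps the release velocity and bleeds it off a little each frame until it stops. A minimal sketch, with an invented decay constant (neither style guide specifies one):

```python
# Illustrative sketch of inertial flicking: after the finger lifts,
# the content keeps moving with its release velocity, which decays
# by a friction factor each frame until it effectively stops.
def flick_positions(release_velocity: float,
                    friction: float = 0.9,     # invented decay per frame
                    min_velocity: float = 1.0) -> list[float]:
    position, velocity = 0.0, release_velocity
    positions = []
    while abs(velocity) >= min_velocity:
        position += velocity
        positions.append(round(position, 2))
        velocity *= friction       # momentum bleeds off frame by frame
    return positions

path = flick_positions(50.0)
print(len(path))     # the content coasts for dozens of frames after release
print(path[:3])      # and moves fastest right after the finger lifts
```

The gentle tail of that decay is what makes a flicked page feel like it has weight rather than stopping dead the instant you let go.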
A distinct term is to swipe, defined as sliding fingers across the screen from item to item. In contrast to flicking, swiping is implicitly more deliberate — in fact, much the way you swipe a debit card through a reader. (I personally use swiping most often when I play "Boggle"-inspired games on the phone, where you connect tiles to make words.)
Using a finger, you can also pan, in which you hold your finger on the screen and then move it to drag the screen around. Panning became an indispensable gesture to move around in online maps, and is now available for all sorts of elements on phones and tablets.
Particularly interesting are the terms that don't have real analogues among mouse-based gestures. To shrink something on the screen, you can pinch it. To expand it (that is, to zoom it), you can use those same two fingers to stretch it (Microsoft) or pinch it open (Apple). Pinching and stretching have already become gestures that I miss sorely when I want to resize an image and I'm not using a touch-enabled device.
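Under the hood, pinching and stretching typically map the change in distance between two fingers to a zoom factor. A hedged sketch of that geometry — the function names here are invented for illustration:

```python
# Illustrative sketch: the zoom factor for a pinch/stretch gesture is
# the ratio of the current distance between two fingers to their
# distance when the gesture began.
from math import hypot

Point = tuple[float, float]

def finger_distance(a: Point, b: Point) -> float:
    return hypot(a[0] - b[0], a[1] - b[1])

def zoom_factor(start: tuple[Point, Point], now: tuple[Point, Point]) -> float:
    """> 1.0 means the fingers spread apart (stretch); < 1.0 means pinch."""
    return finger_distance(*now) / finger_distance(*start)

# Fingers start 100 px apart and spread to 200 px: a 2x stretch.
print(zoom_factor(((0, 0), (100, 0)), ((0, 0), (200, 0))))   # 2.0
# Fingers close from 100 px to 50 px: the image shrinks to half size.
print(zoom_factor(((0, 0), (100, 0)), ((0, 0), (50, 0))))    # 0.5
```

One number, driven directly by the fingers — which is why the gesture feels so natural, and why it's so missed on devices without touch.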
Illustration of pinching and stretching from the Microsoft style guide.
There is a kind of full-circle aspect to the gestures we use on phones and tablets. When the windows metaphor was devised, it included virtual representations of physical controls like buttons. In earlier days, we couldn't interact directly with these controls, so we used a mouse and pointer. But now that we have touch technology, we're returning to a more natural way to tell the computer what to do. For example, even though buttons on a screen are virtual, you can press them with your finger, and as I say, few things are as intuitive on a touch-based computer as flicking through pages in an ebook.
A great thing about this terminology is that from the user's point of view, it doesn't actually need much defining. We don't have to have a glossary (although we do) that explains exactly what we mean by tap and swipe and pinch. In an app that follows the guidelines for a touch-based user interface, users will intuitively understand how to manipulate the app with their fingers. I'm sure countless programmer hours were expended in designing this touch-based interface, and many more editorial hours were spent figuring out how to refer to the gestures. (In the computer world, it's a truism that the easier something is to use, the harder it was to design.)
The possibilities for new ways to interact with computers keep growing. At the moment most people still prefer a physical keyboard to using the virtual keyboards that are displayed on the screen, but there might come a day when tapping away at a QWERTY device will seem quaint. The Microsoft Kinect doesn't even require you to physically touch anything — you can control the device just by moving your hands in the air, which might (who knows?) spawn technical definitions for terms like wave and jump. And how many of us are waiting for the computer industry to perfect voice commands?
When these new forms of interaction become common, those of us who write about computers will need to think carefully about what terms to use and how to define them. But if we do our work right, users won't have to think about the gestures at all.