To observe interactive technology use in the wild, I staked out a pair of document scanning stations in the NYU library. Each station pairs a scanner with a specialized computer kiosk and can scan books, chapters, articles, notes, or any of the other paper information one finds in a library. I assumed that the users (a sample of university students and faculty) would not be experts, but would be more computer literate than the general public. I also assumed the system would be used mostly for books and multi-page scans, and that it would support various digital file transfer options (given that it wasn’t a photocopier).
Having used many frustrating scanners in the past, I didn’t have high expectations for usability, so I was pleased that most of the people I observed could operate the machine with relative ease. One tap on the touch screen, and you’re prompted to pick a delivery mechanism (email, Google Drive, etc.). Another tap and you’re faced with a page of scan settings. Here users diverged: some immediately accepted the defaults, while others scrutinized their options at length. The deliberators were often those scanning large documents who wanted to perfect the settings at the outset (a tricky task with no visual indication of what effect a given setting would have). Once the settings were approved, the user pressed one more button and the machine scanned the page, automatically cropped and rotated it, and displayed it for further editing. At one extreme, I saw a user approach and successfully scan a single-page document in under 90 seconds.
At this point, I noticed one major problem with the workflow. With a scanned page open, it was unclear to many people whether the “scan” button in the corner (which they’d initially pressed) would replace the current scan with a new one or append the new scan as a second page. Assuming the former, multiple people instead pushed the “Next” button, only to backtrack when they were confronted with an interface for emailing their scanned document. In fact, the workflow for multiple pages was quite tidy: keep hitting scan on the same screen, and it keeps adding pages. Once they settled into this rhythm, users generally took no more than ten seconds per page, but the interface did little to suggest that this flow was possible.
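The append-not-replace behavior can be sketched as a tiny state model. This is only an illustration of the workflow as I observed it; all names here are hypothetical, and the kiosk’s actual software is unknown.

```python
# Hypothetical sketch of the kiosk's multi-page scan workflow.
class ScanSession:
    def __init__(self):
        self.pages = []

    def scan(self):
        # "Scan" appends a new page; it does NOT replace the current one,
        # which is what several users assumed it would do.
        self.pages.append(f"page-{len(self.pages) + 1}")

    def next(self):
        # "Next" leaves the scanning loop and moves on to delivery (email, etc.).
        return {"action": "deliver", "pages": list(self.pages)}

session = ScanSession()
session.scan()          # first page
session.scan()          # same button again -- a second page is added
result = session.next()
print(result["pages"])  # ['page-1', 'page-2']
```

Nothing on the screen distinguished these two roles of the same “scan” button, which is why users guessed wrong.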
This process constitutes a very clear interaction by Crawford’s definition: the user instructs the computer to scan a page, the computer processes and outputs an image of the scanned page, and the user evaluates this image and responds accordingly. At the large scale, all the users seemed to understand the interaction. At the small scale, though, certain details caused problems. In particular, I think the touchscreen interface is lacking. It worked well early in the process, when one had only to choose one large button from among four or six. Inevitably, though, the user wishes to actually save the scanned file, and that means typing an email address, password, or file name on a vertically mounted touchscreen keyboard. The users I observed were typing maybe ten words per minute, making this one of the most inefficient parts of the whole process. Often, a key wouldn’t even register the tap. Whatever visual feedback was supposed to confirm a press had failed, leaving the user to wait (in case the computer was merely lagging) before trying the key again. Without the physical affordances of real buttons, these virtual keys gave little indication of whether they’d been pressed, let alone what angle or force was required to activate them.