Rethinking user testing

David Gillis
Jan 3

Expanding the horizons and expanding the parameters,
Expanding the rhymes of sucker MC amateurs

~ The Beastie Boys, The Sounds of Science, 1989

I’ve always thought it’d be cool to be a scientist—a real scientist, with the lab coat and the beakers and whatnot. You could win friends and influence people (and pwn enemies) anytime, anywhere.

Sometimes I get the impression that I share this secret ambition with web and UI designers at large. After all, making design decisions that are “only” based in a team’s collective experience, thoughtfulness, observation, trial and error, etc. leaves those decisions open to critique. But grounding/couching your work in some sort of rigorous-sounding, quantifiable, testable result: that’s science…you can’t beat that!

Now, don’t get me wrong. Testing your design in appropriate ways can be invaluable (I’ll talk more about this in another post). But I really take issue with the idea that User Testing, per se, leads to great design. I’ve seen just the opposite happen.

I think this is the case because we’re trying to appropriate a tool that loses its power and actually becomes counter-productive when used outside of the context it was designed for. We’re borrowing from the experimental design paradigm in cognitive science, which is a scholarly discipline closely related to the applied field of HCI. But have you ever seen an actual experiment in cog sci? When I ran a few of these back in school, they usually worked something like this:

  • Sit someone down in front of a computer in a small room (maybe there’s a video camera or some kind of monitoring equipment set up)
  • Have them stare at a dot in the middle of the computer monitor and hit A, B, or C as soon as they recognize some sort of visual stimulus presented on the screen
  • Record the timing and number of errors
  • Repeat the experiment with a bunch of different people, changing up one or two “explanatory variables” that you’ve guessed will have an impact on performance.
  • Run statistical analyses on the results and use these to draw conclusions.
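To make the last two steps concrete, here is a minimal Python sketch of what "run statistical analyses" might look like for a single explanatory variable. Everything in it is an assumption for illustration: the reaction times are simulated rather than measured, the condition names and numbers are made up, and Welch's t statistic is just one common choice for comparing two groups, not necessarily what any particular lab would use.

```python
import random
import statistics
from math import sqrt


def simulate_reaction_times(n_participants, mean_ms, sd_ms, rng):
    """Draw one hypothetical reaction time (in ms) per participant."""
    return [rng.gauss(mean_ms, sd_ms) for _ in range(n_participants)]


def welch_t(sample_a, sample_b):
    """Welch's t statistic for two independent samples."""
    var_a = statistics.variance(sample_a)
    var_b = statistics.variance(sample_b)
    standard_error = sqrt(var_a / len(sample_a) + var_b / len(sample_b))
    return (statistics.mean(sample_a) - statistics.mean(sample_b)) / standard_error


# Hypothetical explanatory variable: stimulus size (made-up means and spread).
rng = random.Random(42)  # fixed seed so the sketch is repeatable
small_stimulus = simulate_reaction_times(30, mean_ms=520, sd_ms=60, rng=rng)
large_stimulus = simulate_reaction_times(30, mean_ms=450, sd_ms=60, rng=rng)

t = welch_t(small_stimulus, large_stimulus)
print(f"mean (small): {statistics.mean(small_stimulus):.0f} ms")
print(f"mean (large): {statistics.mean(large_stimulus):.0f} ms")
print(f"Welch's t:    {t:.2f}")
```

The point of the sketch is how little is being measured: one variable, one number per person, one comparison.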

Now this kind of experiment is obviously very narrowly focused in its scope, and necessarily so. There are a bunch of reasons why, but the two main ones are these: reliability and validity. For an experiment to be reliable means that repeating it over and over again yields the same result. For an experiment to be valid means that it gives cogent answers (even if they’re only partial answers) to the questions you asked in the first place.

Interactive experiences like websites are complex phenomena. They don’t naturally lend themselves to the kind of experimental protocol described above because user performance varies greatly from person to person and from session to session. There are so many potentially confounding variables in play that reliability suffers. (The model experiment above tries to eliminate this problem by paring down the user’s task to a few basic actions.) You’re measuring 10th-order effects and it becomes nearly impossible to establish causal connections between design characteristics and user performance.
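A rough way to see the reliability problem is to simulate the same study many times and watch how much the answer moves around. The Python sketch below is a toy model with made-up numbers, not data from any real test: it assumes a fixed "true" difference between two hypothetical designs and lumps all the confounding variables into a single noise term.

```python
import random
import statistics


def observed_difference(effect_ms, noise_sd_ms, n_users, rng):
    """Run one simulated study; return the measured difference in mean task time."""
    design_a = [rng.gauss(1000, noise_sd_ms) for _ in range(n_users)]
    design_b = [rng.gauss(1000 - effect_ms, noise_sd_ms) for _ in range(n_users)]
    return statistics.mean(design_a) - statistics.mean(design_b)


def spread_across_replications(effect_ms, noise_sd_ms, n_users, n_studies, seed):
    """Repeat the study n_studies times; return the std dev of the measured differences."""
    rng = random.Random(seed)
    differences = [
        observed_difference(effect_ms, noise_sd_ms, n_users, rng)
        for _ in range(n_studies)
    ]
    return statistics.stdev(differences)


# Same assumed 70 ms effect in both cases; only the person-to-person noise changes.
narrow_task = spread_across_replications(70, noise_sd_ms=60, n_users=30, n_studies=200, seed=1)
full_website = spread_across_replications(70, noise_sd_ms=600, n_users=30, n_studies=200, seed=1)

print(f"spread of results, narrow lab task: {narrow_task:.0f} ms")
print(f"spread of results, full website:    {full_website:.0f} ms")
```

When the spread between repeated studies is larger than the effect you are hunting for, any single study can point in either direction, which is what poor reliability means in practice.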

If we do streamline things so that we’re just measuring one or two explanatory variables vs. 50 (say, by temporarily removing elements from the design), the experiment becomes more reliable, but less valid. That is to say, the results—while repeatable—can’t really be generalized to answer the type of questions that we want to ask in the first place (questions like ‘is this design easy to understand and use’), because we’re not truly testing the design.

Sometimes usability researchers will employ something called a “talk aloud protocol” to try and tap into the cognitive processes underlying user performance in a given scenario. This involves asking users who are testing a given design to explain what’s going through their heads as they move through some sort of task flow.

Again, I have real problems with this pseudo-scientific approach to evidence-based design. For one, the act of talking about what you’re doing changes the nature of that experience. But more importantly, most people can’t accurately report on why they do what they do. That’s why there are such fields of inquiry as cognitive science and psychology in the first place!

I don’t want to be overly cynical here, but I do want to caution usability professionals and interaction designers in general: user testing can be helpful, but it can also be misleading. It can also be used as a powerful political gambit or rhetorical expedient. If you’re going to test something, make sure you’re asking the right kinds of questions. User testing can be used to tune or optimize design, but cannot and should not substitute for creativity or thoughtful trial and error.

Comments

Jan 3 12:03 pm
Jon Lax said:

Early in my career an account executive was complaining about being in focus groups for two days straight. He complained how useless they were and said, “If you want to study gorillas do you go to the zoo and watch them?” That always stuck with me.

Lab-based testing puts people in a zoo and then tries to extrapolate real-world behaviors from them. With technology and adaptive systems it is far better to take the money you would spend on testing and use it to make two or three monthly upgrades to the product after it launches. Nothing is perfect on launch. Accept it and move on.

Jan 4 1:07 pm
Dan Dickinson said:

David, thank you. This articulates better than I ever could why it’s always seemed such a waste of time to test two dozen users in a cage.

Jon, I’m giddy with anticipation about the first time I get to use that zoo quote…

Jan 9 11:38 am
John said:

As a designer who has completed numerous types of user research (ethnographic, focus groups, formal lab studies and informal studies), I think the key word you’re missing is insight. No process ensures success, but getting actual users in front of the product provides an opportunity to gain insight that you may not have from your previous experience.

You are right that the setup needs to be properly done and the methods need to evolve. I think that’s the problem: most companies don’t take the time to set up the research, or they have the wrong people doing it. Done correctly, there are more opportunities to obtain insight, and good insight leads to innovation.

For the most part, it’s not necessary for companies in the marketing communication industry to focus on detailed research since your output has a shelf life of 8 months or even less…why invest in this? However, there are other IT industries where it’s extremely important. Next time you’re frustrated with your applications or on-line services, think about the research or usability report that was probably ignored by the designer.

In closing, there are very few great products, but with user research there can be many more better products. Watch what they do, not what they say.
