Usability Testing


For most researchers, usability testing is deeply ingrained in what we do. It’s the process of seeing how “usable” your product is. Sometimes it means seeing how easy it is for website visitors to find a button or a piece of information. Sometimes usability has meanings that are more complex or esoteric, depending on what you’re building. But testing for usability will always involve exploring how well (or not) your target audience is able to accomplish the things you want them to be able to do, which (hopefully) are more or less the same things they want to accomplish.

How does usability testing work?

Generally, researchers will ask participants to attempt a series of assigned tasks, and then observe and record how well they can accomplish those tasks. Their ability to do the things you want them to do informs your next steps—namely, whether to make changes to your design or not, and which changes to make.

Usability testing can have both qualitative and quantitative elements, which we’ll get to below.

You don’t need a special lab or facility to conduct a usability test, though you can use one if it feels like the best choice. Lots of folks use a conference room, or even run tests remotely from a home office. We’ll get more into the nitty-gritty of this below as well.

Usability tests are most often conducted by a moderator. The moderator comes ready with a list of pre-determined tasks they’d like the participants to try to execute in order to put the product or prototype through its paces. If the tester can't do the thing, or has trouble doing the thing, it means your design needs some work.

Usability testing can probe any aspect of your product's function, but that doesn't mean you should do it all in one go. For clarity of results and organization of ideas, each usability test should be organized around specific questions. This will keep your test and its results manageable.

After the test is done, you’ll gather and analyze data, and then make choices about your best next steps.

When does usability testing make sense?

You'll want to run usability testing when you have a specific question that a test can help to answer. There are many different types of questions you can ask.

For example:

  • We want users to be able to find the information they’re looking for through several different paths—through search, through navigation, through tags. Does each path work effectively?
  • We want users to be able to complete a purchase—from product to payment—in under 4 clicks.
  • Uploading is the most important action, and so we want the upload button to be the most obvious thing on the page.
  • In the case of a game: We want users to be able to jump and crouch on this level.
  • In the case of a service: After users are drawn to our touch-screen kiosk, we want them to tap the “start” feature first.

Before you leap into testing, you should already have a functional version or prototype of your product. This means usability testing isn’t an appropriate method for the discovery phase. However, it is appropriate for just about every stage after discovery: it’s often employed from the moment you can interact with a prototype until long after the initial launch, in order to improve and optimize.

A quick illustration of Quantitative vs. Qualitative by Matt P. Lavoie

Qualitative vs. quantitative usability testing

Usability testing can be qualitative, quantitative, or both. The difference lies in the questions you need answered.

Quantitative

Quantitative data is numeric. Analysis is primarily statistical and can be entirely objective; an average success rate for a given task will be the same no matter who calculates the average. Research subjects (in this case, your test participants) are a representative sample of a larger population, so the results can be applied to the entire population. For example, quantitative testing can tell you that X% of users have trouble with a certain function.
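
For example, a task’s success rate is just the share of participants who completed it. A minimal sketch in Python, with hypothetical outcomes:

```python
# Hypothetical pass/fail outcomes for one task, one entry per participant.
outcomes = [True, True, False, True, False, True, True, True]

# The average is objective: anyone who computes it gets the same number.
success_rate = sum(outcomes) / len(outcomes)
print(f"Task success rate: {success_rate:.0%}")  # 6/8 -> 75%
```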

Qualitative

Qualitative data is not numeric; it may be narrative or descriptive. Analysis involves extracting usable information from the data in a way that minimizes bias. For example, the raw data might be narrative descriptions of a tester attempting a series of tasks. The analyzed results would then consist of a report on which aspects of which tasks the tester had trouble with and what kind of trouble it was. Qualitative analysis yields results that can't be generalized to the entire population, but it provides insights that help explain the quantitative results. Qualitative data can tell you why users might have trouble, for example.

Qualitative results in usability studies

There is rarely such a thing as a quantitative-only usability test. You can’t help but observe nuance as you watch a user interact with your product. Observing people try to do a thing for the first time that you already know how to do is bound to engender observations that are amusing, frustrating, surprising, the whole gamut. This sort of qualitative material can help you get at the why behind your quantitative results.

The main challenge of qualitative data in usability studies will be to avoid drawing conclusions that would require quantitative data and statistical analysis. For example, if all seven of your testers sail through all their tasks easily, it would be natural to conclude that your product is perfect because "everyone" can use it, no problem. The human mind instinctively draws conclusions about what is normal or likely based on what you experience and what you hear about from others. But the reason we invented statistics is that our instinctive conclusions are sometimes very wrong.
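
One way to see how wrong: the “rule of three” from statistics says that if you observe zero failures in n trials, the 95% upper confidence bound on the true failure rate is still roughly 3/n. A quick sketch for the seven-tester example above:

```python
# "Rule of three": with zero observed failures in n trials, the 95% upper
# confidence bound on the true failure rate is approximately 3 / n.
n_testers = 7  # all seven sailed through their tasks

upper_bound = 3 / n_testers
print(f"True failure rate could still be as high as {upper_bound:.0%}")  # ~43%
```

Seven flawless sessions, in other words, are still statistically consistent with roughly two out of five users failing.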

Conducting a qualitative usability test

Conducting usability testing isn't difficult, but there can be a lot of moving parts, so proper planning and attention to detail are a good idea, especially when you’re new to it!

Planning for the test

Not every moderator is great at thinking on the fly, or knows the product well enough to ad-lib. A plan that anticipates various eventualities gives the moderator a roadmap for the conversation. A clearly written plan will also help you explain the goals of the test to your team, and help you achieve buy-in if you need it. There are a number of points your plan should include, from the scope of the test to the number of testers, and we outline them all below.

Define the scope and purpose

What do you want to learn from this test and how many of your product’s qualities and features do you need to probe to get that information?

To figure out what parts of your product to prioritize for testing, you might wish to explore...

When testing a prototype:

  • What concerned the development team during the design and build process?
  • Are there features that involved compromise or conflict between teams or key players during the build and design?
  • Are there aspects of the design you’re just not sure about?
  • Have you reinvented the wheel somewhere to try to be innovative and cool, but you’re just not sure it’ll work?

When testing a live product:

  • Which features get the most use from users enjoying the product?
  • Which features contribute most directly toward your product’s mission or end goal?

In a redesign/rebuild:

If you’re getting lost in the soup as you define test goals, return to your product goals. That’s ground zero, and it can continue to guide your research goals for usability testing and beyond.

Location and equipment

You can conduct your test in a lab, in a meeting room or office, or anywhere else that makes a good place to observe users interacting with your product. You can go to them, they can come to you, or you can observe remotely.

Many usability tests are done in a lab setting where observers (your team) sit out of the way, sometimes behind police-station-style two-way mirrors, while the moderator sits in the testing area with the participant. The participants generally know they’re being watched, but they can’t see or hear the team behind the glass.

Remote/online usability tests are also a great option, especially if your business caters to niche audiences spread across the world. Remote tests can save time and lots of money. They might sacrifice some of the intimacy a lab setting affords, but the ease of remote testing more than makes up for anything that might get lost.

Once you decide on the best approach, any other logistical requirements will, for the most part, become obvious.

The test might require nothing besides your tester, the device with your product or prototype on it, and a couple of clipboards and pens so your observers can take notes. Or, you might need multiple recording devices (if raw results will need to be shared with a distributed team, say), screen capture software, refreshments….

Scheduling

You’ll choose which days and times work best for your teams and testers. You can plan your day(s) of testing ahead of time by figuring out how long each test will take, how many tests you’d like to run per day, and how much time to leave between testers.

A great starting point for your schedule: 90-minute testing sessions, 30-minute breaks between them, and as many participants as you can fit into your day, especially if you’re renting a special space.

90-minute sessions tend to be long enough to let a tester meander through their process without overwhelming or boring the tester, moderator, or observers. If you feel a shorter or longer session might benefit the test, adjust as needed.
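
As a quick sanity check on that starting point, here’s the arithmetic for a hypothetical eight-hour testing day:

```python
# How many 90-minute sessions, separated by 30-minute breaks, fit in a day?
# (No break is needed after the last session.)
day_minutes = 8 * 60  # a hypothetical eight-hour testing day
session, gap = 90, 30

# The first session costs 90 minutes; each additional one costs 90 + 30.
sessions = 1 + (day_minutes - session) // (session + gap)
print(sessions)  # 4 sessions: 4 * 90 + 3 * 30 = 450 of 480 minutes
```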

Staff

Your testing staff will include a moderator and several observers. There may also be other assistants on hand, such as note-takers or camera crew. The moderator should be someone with good people skills and enough insight to steer the conversation where it needs to go. They should be conscious of tactics they can use to avoid bias during the interview. Most of all, they should be a comfortable and skilled interviewer.

You might hire someone for the task—a professional moderator who you educate about your product and goals. Or, it could be someone on your team who’s good with people and also already knows the product. It might be you! Or it might be one of your researchers. Whoever it is will set the tone for the study.

The observers won’t interact with the tester, so you can bring anyone in to observe who wants to benefit from the research. Observers can also guide the moderator during or between sessions, so key stakeholders should be included.

Metrics

To know where to look for data, just look at your test goals. What’s the purpose of the test?

You’ll have a guide listing the tasks you want your tester to accomplish, plus any other relevant questions, and you’ll track the tester as they move through those tasks. For example:

What do you expect to happen when users click here? What actually happens? The metric then becomes: users successfully completed the task 55% of the time.

At the end, you might want to ask what their overall experience was like. Fun? Frustrating? The metric becomes: 25% of users were frustrated, 25% had fun, and 50% rated the experience neither fun nor frustrating.
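
One way to keep those numbers consistent is to log each session in a structured form and tally the results afterward. A minimal sketch, with hypothetical task and rating names:

```python
from collections import Counter

# Hypothetical session log: one record per participant.
sessions = [
    {"completed_purchase": True,  "experience": "fun"},
    {"completed_purchase": False, "experience": "frustrating"},
    {"completed_purchase": True,  "experience": "neutral"},
    {"completed_purchase": True,  "experience": "neutral"},
]

completion = sum(s["completed_purchase"] for s in sessions) / len(sessions)
print(f"Completed purchase: {completion:.0%}")  # 3/4 -> 75%

for rating, count in Counter(s["experience"] for s in sessions).most_common():
    print(f"{rating}: {count / len(sessions):.0%}")  # neutral 50%, fun 25%, ...
```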

Scenarios

Consider whether to break your test up into sub-tests. For example, if your product is a game, you might want to test a mobile version and a desktop version. If your user base is diverse, you might prefer testers from multiple demographic categories.

Testers

When crafting a plan for testers, consider questions like:

  • How many testers do you want overall?
  • How many subjects will you have time for?
  • What criteria do your participants need to meet?
  • How are you going to recruit them?
  • How long will the recruitment process take?
  • How will you compensate each tester, and is it in the budget?
  • Do you need them to sign a non-disclosure agreement?
  • Other logistics: location, directions, parking

Recruiting Testers

If you have access to a pool of people typical of your users—like an established customer base, for example—you can recruit among them. You can also recruit participants from outside of your established pool. Either way, you can use User Interviews to find and organize your participants.

For a qualitative-only usability test, five to ten testers overall makes sense. More than ten, and you’re not likely to reveal anything brand new; ten people’s qualitative data is about as much as you can probably handle anyway. Remember that if you’re doing your own recruiting, some people won’t show up (usually 10-20%, depending on your recruiting methods) and others might not complete the test. You’ll want a pool of a few extras, say 5-10%, as potential fill-ins.

Usability testing is all about learning how your users use your product, not telling them how you want them to use it.

Moderating the Test

The moderator is in charge of the test itself, from setting the tone, to asking questions, to providing the tester with all necessary information—and no unnecessary information.

Types of Moderation

A good moderator will encourage each tester to share their thought process, to think aloud as it were. There are four basic approaches for doing so, each with its own advantages and disadvantages. Pick the one that best suits your needs.

  • Concurrent Think Aloud (CTA) means asking the tester to think aloud as they work. The big drawback is that talking while working skews timing, so you won't learn how long the task takes when the user isn't talking. Also, not everyone is equally aware of their own thought processes.
  • Retrospective Think Aloud (RTA) means asking the tester to verbally repeat their thinking after the fact. You can show a video of the test as a prompt. The risk is that the tester might not remember their thought process well.
  • Concurrent Probing (CP) means asking the tester "why?" as needed throughout the test. The repeated interruptions could alter the results, but CP might not take as long as CTA.
  • Retrospective Probing (RP) means asking detailed questions after the fact. Again, recall could be an issue, but your questions may help them remember, and the task is never interrupted or delayed.

Assigning Tasks

When you give testers a task, state the goal of the task; do not provide instructions. See if the tester can accomplish the goal. Use clear, ordinary, non-technical language. If you're worried about being clear, working from a pre-written script helps, and numbering your tasks will help you organize your data later (see the sketch below). If the tester does the task the wrong way, that’s information. Don’t correct them. Your product is being tested, not the tester. Though it can be difficult to watch the tester struggle with the task at hand, it's important to let them navigate through it on their own to gain the best data possible.
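
A task script along these lines can be as simple as a numbered list of goal statements. A sketch, with hypothetical tasks, showing goal phrasing rather than instructions:

```python
# A numbered task script: each entry states a goal, never step-by-step
# instructions. Task numbers make observations easy to organize later.
tasks = [
    (1, "Find the store's return policy."),
    (2, "Buy a birthday gift for a friend who likes hiking."),
    # Not: "Click the search icon, type 'returns', and open the first result."
]

for number, goal in tasks:
    print(f"Task {number}: {goal}")
```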

Taking Notes

The moderator can't take notes and moderate at the same time. You can simplify note taking by including an observer (or observers) to take notes as the test happens. If you don't have observers at your disposal, you can record your session. If you're conducting your test remotely, you can record directly from your video conferencing service. Some services, like Zoom, also include auto-transcription services, which can be especially useful for taking quotes from the interview later.

Analyzing and Reporting Your Findings

Raw test data is usually your notes from the test; occasionally you’ll also have recordings. Before analyzing the data, clean up and expand the notes so that they would make sense to someone who wasn’t present for the test. Use the recordings, if you have them, to expand and enrich the notes: you can highlight specific moments and bring a little more of the person you were interviewing into the data. The 30-minute buffer between sessions is a great time to do this!

Analyzing qualitative data consists mostly of going through the notes and "translating" them into clear, succinct descriptions of what the tester did and why. If two testers did essentially the same thing, the descriptions should be the same so that patterns across testers are clear. Then look for patterns. Did one task stump everybody? Did it stump them all the same way, or were there a variety of stumpage types? Are there certain types of tasks that caused problems, and if so, how and why? Record all the patterns you notice.

Based on the patterns you find, you should be able to clearly define a number of problems that the current version of your product has. Rank these by severity. Which must be fixed or the product won't work? Which are so frustrating that they could drive away users? And which are mild enough that, in a pinch, you could go ahead and market your product without fixing them?
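
If your notes use consistent issue tags, the pattern-finding and severity ranking can be almost mechanical. A minimal sketch, assuming hypothetical tags and severity labels:

```python
from collections import Counter

# Hypothetical observations: (tester, task number, issue tag, severity).
observations = [
    ("P1", 2, "missed the upload button", "blocker"),
    ("P2", 2, "missed the upload button", "blocker"),
    ("P3", 2, "missed the upload button", "blocker"),
    ("P1", 4, "unclear error message", "frustrating"),
    ("P4", 4, "unclear error message", "mild"),
]

# Count how many testers hit each (issue, severity) pair, then sort worst-first.
severity_rank = {"blocker": 0, "frustrating": 1, "mild": 2}
counts = Counter((issue, sev) for _, _, issue, sev in observations)
for (issue, sev), n in sorted(counts.items(), key=lambda kv: severity_rank[kv[0][1]]):
    print(f"{sev}: {issue} ({n} testers)")
```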

Next, go ahead and write up a report, summarizing how you conducted the test, what your results were, and what problems (ordered by severity) you found through analysis. Recommend solutions to each problem. Include quotes and video clips from your testing sessions to show those who weren't there what your usability problems look like in action. If you are not going to be in charge of implementing your solutions, it will be up to you to use the report to make your case for your recommendations.

Usability testing is a must, not a nice-to-have

Businesses can no longer afford to skip usability testing. A product simply should not be launched without at least one round of testing to see whether users are able to interact with the product in the way you want them to.

There's always a chance that test results will expose a serious, hard-to-solve flaw and send your team all the way back to the drawing board. That's a little intimidating. But as long as you pay attention and plan properly, usability testing is not actually difficult to execute. The benefits are well worth the effort, and in fact, it’s now a business must.

In a best-case scenario, you’ll discover some aspects of your design that can be improved, and you’ll come away with an idea of how best to make those improvements. At this point, your design and development teams should draft new prototypes and test those again, until you’re satisfied that your product does what it needs to do well enough to launch. Other factors, like budget, the general urgency of your team, your pipeline of other projects, and myriad other business concerns, figure into when a product is ready to launch (or re-launch). But successful rounds of qualitative usability testing should move the needle in a meaningful way toward launch, or toward continued improvements to live products.

Remember that coaching users through a product, or trying after the fact to convince a user why the product is good, doesn't work very well and costs far more money in the end.
