Assignment 4: Heuristic evaluation


As Carolyn Snyder writes, "Paper prototyping is a variation of usability testing where the representative users perform realistic tasks by interacting with a paper version of the interface that is manipulated by a person 'playing computer' who doesn't explain how the interface is intended to work." In this assignment, you will conduct heuristic evaluations (HEs) of another team's paper prototypes. In studio, you will be assigned a group to evaluate.


The heuristic evaluations are a way to highlight usability issues in your prototypes. They will follow the "How to Conduct a Heuristic Evaluation" and "Ten Usability Heuristics" readings by Jakob Nielsen. Using Nielsen's ten heuristics, you as an evaluator will list as many usability issues as possible. It matters less exactly which heuristic each issue falls under; the heuristics are there to help evaluators find a variety of different types of usability issues. For each usability issue found, the evaluator should write a description of the issue that is as concise and as specific as possible. For a thorough explanation of the heuristic evaluation process, see the course videos.

Step 1: (Team) Conduct Multiple Walkthroughs

First of all, you need to master the skill of operating your paper prototypes. Don't embarrass yourself in front of your expert evaluators by taking five minutes to find the next bit of the prototype to swap in. The smoother your paper prototypes run, the better. All team members need to learn how to act as the computer behind the prototypes, so go through a couple of practice runs of each of your two prototypes. Take turns, with one of you being the evaluator and one being the computer. Practice runs like these are called walkthroughs. Walkthroughs will get you comfortable operating the paper prototype and help you identify problems with it (for example, missing pieces or dead ends).

Step 2: (Team) Receive 3 Heuristic Evaluations

Now that you can run your paper prototypes smoothly, you are ready to conduct heuristic evaluation (HE) sessions. At least two team members should be present for all sessions. To facilitate this, we will assign 3 evaluators to your team in studio and provide time during the next lecture for you to run back-to-back evaluations (one at a time). We highly recommend that you finish your heuristic evaluations over the weekend, so that lecture time can be dedicated to Step 4, which students have found the most difficult to schedule in the past.

For each HE session, one person from your team will be the facilitator. What's a facilitator? The facilitator should greet the evaluator, explain how the session will work, and give a brief introduction to your prototypes. One person (or more, if possible) should play the computer. Once the evaluation has started, the facilitator (and any other available team members) should observe and take notes and pictures.

Make things easy for your evaluator by printing a sheet of Nielsen's heuristics. Prepare a sheet for them to fill out while evaluating. (Try using this worksheet.) A good idea is to set up a spreadsheet on a laptop that you and the evaluator can share. (This spreadsheet makes it easy for the evaluator to do their assignment and for you to improve the design.) Remember that the evaluator is the expert. Let them explore and evaluate the interface as they choose, but make sure that they go over each of your two proposed prototypes.

For mobile apps of our course's scale, each evaluator should take about 20 minutes.

Step 3: (Individual) Be an Expert Evaluator Once

As an evaluator, keep in mind that you are the expert and you should go through the interface the way that you would prefer to. Be thorough and write down all the problems you can find. Don't try to be "nice" by not reporting problems; everything you find will help the team improve their interface. Since you are evaluating two prototypes, try to make your feedback comparative. For example, if a prototype had a big problem that you also found in the other prototype, make a note of it. If one prototype had a problem that, for some interesting reason or another, was solved or not an issue in the other prototype, make a note of that as well. You don't have to compare both prototypes in every sentence of your evaluation (and it's even fine if most of your feedback isn't comparative), but you should highlight enough similarities and differences between the two prototypes that the group receiving your feedback will understand the advantages and drawbacks of each design. Your feedback should help the group understand and decide which features of which designs they should implement in the coming weeks.

Use Nielsen's heuristics as a guide and specify what heuristic(s) each problem is related to. If you come across problems that don't fall neatly into particular heuristics, mark that no heuristics apply. As long as your discussion of the problems clearly shows that you understand and are trying to apply Nielsen's heuristics, you will get full credit in the heuristics category (see the grading rubric). Getting the problem written down with a severity rating is the more important part. Use Nielsen's Severity Ratings for Usability Problems.
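One possible way to keep your issue list consistent is sketched below in Python. It is only an illustration, not a required format: the `Issue` record, the `to_csv` and `coverage` helpers, the numeric keys for the heuristics, and the three example issues are all hypothetical. The heuristic names follow Nielsen's "Ten Usability Heuristics" and the severity labels paraphrase his 0 to 4 severity scale.

```python
import csv
import io
from dataclasses import dataclass

# Nielsen's ten usability heuristics. The numeric keys are an assumption
# made here for convenient reference; they follow the order of his list.
HEURISTICS = {
    1: "Visibility of system status",
    2: "Match between system and the real world",
    3: "User control and freedom",
    4: "Consistency and standards",
    5: "Error prevention",
    6: "Recognition rather than recall",
    7: "Flexibility and efficiency of use",
    8: "Aesthetic and minimalist design",
    9: "Help users recognize, diagnose, and recover from errors",
    10: "Help and documentation",
}

# Nielsen's severity ratings (labels paraphrased).
SEVERITY = {
    0: "Not a usability problem",
    1: "Cosmetic problem",
    2: "Minor usability problem",
    3: "Major usability problem",
    4: "Usability catastrophe",
}


@dataclass(frozen=True)
class Issue:
    prototype: int      # 1 or 2
    description: str    # concise, specific description of the problem
    heuristics: tuple   # heuristic numbers; empty tuple if none apply
    severity: int       # 0-4

    def __post_init__(self):
        if self.severity not in SEVERITY:
            raise ValueError(f"severity must be 0-4, got {self.severity}")
        unknown = [h for h in self.heuristics if h not in HEURISTICS]
        if unknown:
            raise ValueError(f"unknown heuristic numbers: {unknown}")


def to_csv(issues):
    """Render issues as CSV text that can be pasted into a shared spreadsheet."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["prototype", "description", "heuristics", "severity"])
    for issue in issues:
        names = "; ".join(HEURISTICS[h] for h in issue.heuristics) or "none apply"
        rating = f"{issue.severity} ({SEVERITY[issue.severity]})"
        writer.writerow([issue.prototype, issue.description, names, rating])
    return buf.getvalue()


def coverage(issues):
    """Return (number of issues, number of distinct heuristics used)."""
    used = {h for issue in issues for h in issue.heuristics}
    return len(issues), len(used)


# Hypothetical example issues, for illustration only.
EXAMPLE_ISSUES = [
    Issue(1, "No way to go back from the settings screen", (3,), 3),
    Issue(2, "Save button is labeled differently on each screen", (4,), 2),
    Issue(2, "Font on the help page is slightly small", (), 1),
]

if __name__ == "__main__":
    print(to_csv(EXAMPLE_ISSUES))
    print(coverage(EXAMPLE_ISSUES))
```

The `coverage` helper gives a quick count of how many issues you have logged and how many distinct heuristics they touch, which you can compare against the Volume of Feedback row of the grading rubric.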

If you use a Google document, make sure that both you and the team you are evaluating have access to your evaluation. Go over both of the team's prototypes. You can use the comparison to inform your feedback. The evaluation will be part of your submission this week.

For mobile apps of our course's scale, each evaluator should take about 20 minutes.

Step 4: (Individual) Meet with Other Evaluators

Meet with the two (or more for uneven group sizes) other evaluators of the same prototypes and (at least) one member of the prototype's design team. Together, discuss the general characteristics of the UI you evaluated, and suggest potential improvements to address major usability problems that you identified. Aggregate your evaluation with the other evaluations for the prototypes. Did you agree on the most severe usability problems with the other evaluators? Come to an agreement on which problems are most important, then brainstorm potential solutions with the team you evaluated.

Finally, with the other evaluators, distill all of this down to one paragraph where you address the major problems you all identified, as well as the potential solutions. Include a sentence or two reflecting on what kinds of things you found heuristic evaluation valuable for, and what kinds of things it's not very useful for. All evaluators of the same prototypes will submit the same paragraph.
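If it helps your discussion, the aggregation step can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a required procedure: the `aggregate` function and the example data are hypothetical, each evaluator is assumed to list a given problem at most once, and the evaluators are assumed to have already agreed on a short shared label for each problem so that matching reports can be grouped. Ranking by mean severity is one reasonable choice; your group may prefer another.

```python
from collections import defaultdict
from statistics import mean


def aggregate(evaluations):
    """Combine per-evaluator findings into one ranked list.

    evaluations maps an evaluator's name to a list of
    (issue_label, severity) pairs, with severity on Nielsen's 0-4 scale.
    """
    ratings = defaultdict(list)
    for findings in evaluations.values():
        for label, severity in findings:
            ratings[label].append(severity)

    summary = [
        {"issue": label, "found_by": len(sev), "mean_severity": mean(sev)}
        for label, sev in ratings.items()
    ]
    # Most severe first; ties broken by how many evaluators found the issue.
    summary.sort(key=lambda r: (-r["mean_severity"], -r["found_by"], r["issue"]))
    return summary


# Hypothetical findings from three evaluators, for illustration only.
EXAMPLE_EVALUATIONS = {
    "evaluator_a": [("no back button", 3), ("tiny labels", 1)],
    "evaluator_b": [("no back button", 4)],
    "evaluator_c": [("no back button", 3), ("tiny labels", 2)],
}

if __name__ == "__main__":
    for row in aggregate(EXAMPLE_EVALUATIONS):
        print(row)
```

Problems that every evaluator found with a high rating are natural candidates for the joint paragraph. Problems found by only one evaluator are worth discussing too, since they may be real issues that the others missed.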

Step 5: (Team) Create a Home Screen

To get your feet wet with web development of your app, create a home screen. It should incorporate your evaluators' suggestions. It doesn't need to be pretty yet: don't worry about styling.

Student Examples

Example 1 - This student did an incredibly thorough heuristic evaluation. We especially like how this student incorporated comparative feedback. Keep in mind that this assignment was slightly different last year.

Example 2 - This is a weak example of a heuristic evaluation, as it wasn’t very thorough. Its general structure is decent, however: problems are listed for each prototype, and the heuristics and severity scores are identified. This example would have benefited from more comparison of the prototypes.

Frequently Asked Questions

How many heuristic violations should we find?

We're not requiring an exact number of heuristic violations; you can follow the assignment examples as a guideline for what we are expecting.

Do we need to differentiate between the prototypes in our evaluations?

Yes, you should clearly label "Prototype 1" or "Prototype 2," and make your feedback as comparative as possible.

Can we change our paper prototypes between submitting A03 and getting heuristic evaluation done for A04?

Yes. You might have received feedback in studio or from your TA in your grades for A03, and you should feel free to modify your prototypes to get the best feedback during heuristic evaluation.


This assignment will be submitted individually, in a single formatted PDF file with the following items:

  • A typed heuristic evaluation of another group's two paper prototypes. This comprises a bulleted list of usability issues you found, along with their severity, for each prototype. Include comparative feedback between the two prototypes. (Relation to Nielsen’s Heuristics, Volume of Feedback, Quality of Feedback, Severity Ratings)
  • A paragraph, written jointly with the other evaluators, addressing the major problems identified with the prototypes and potential solutions. Include a sentence or two reflecting on what kinds of things you found heuristic evaluation valuable for, and what kinds of things it's not very useful for. All evaluators of the same prototypes will submit this paragraph. (Aggregate Evaluation & Reflection)
  • A link to the team's webapp home screen. (Home Screen)

Submit your formatted PDF here

Evaluation Criteria & Grading Rubric

Each category is scored at one of four levels: Nope, Weak, Proficient, or Mastery.

Relation to Nielsen’s Heuristics (3 points)

  • Nope: Minimal relation to Nielsen’s heuristics.
  • Weak: Analysis and relations to Nielsen's heuristics are overly vague or tough to understand.
  • Proficient: Evaluation applies heuristics to partner's prototype in a useful, organized way, most of the time. There may be a few parts, though, that don't seem as related to the heuristics, which may make the reader wonder if the evaluator was still becoming familiar with the heuristics during the evaluation.
  • Mastery: Clearly grounded in Nielsen's heuristics. This evaluation could be used in the lecture about Nielsen's heuristics.

Volume of Feedback (3 points)

  • Nope: No interpretable, individually-generated feedback. Could be empty, illegible, copied from other group members, or all generated as a group (rather than individually).
  • Weak: A small amount of legible, individually-generated feedback. The evaluator missed a majority of the obvious heuristic violations in the prototype.
  • Proficient: There is a good amount of feedback, but there probably could be more. A reader gets the feeling that the evaluator was trying to be "nice" or "hold back" certain feedback.
  • Mastery: At least 15 violations found, employing at least 8 of the 10 heuristics. You really couldn't ask for more. Clearly the evaluator "showed no mercy."

Quality of Feedback (3 points)

  • Nope: No feedback given.
  • Weak: Most of the feedback was obvious (you could have come up with it without really going through the HE process) or vague (the designers might not be sure what the problem you're referring to is).
  • Proficient: There is a good amount of high-quality feedback, but some feedback may still be obvious or vague. It also may lack a useful comparison between the two prototypes.
  • Mastery: Insightful, widely varied feedback that compared the two prototypes. This feedback will give the designers a solid grasp of the advantages and drawbacks of each design, and help them decide which design or which features to implement.

Severity Ratings (2 points)

  • Nope: At least half of the problems have no ratings.
  • Weak: A large fraction of the problems have poorly-chosen severity ratings. A severity rating is poorly chosen when serious problems, such as fundamental issues with the core information architecture, are rated low severity, or when minor issues, like small layout issues, are rated high severity.
  • Proficient / Mastery: Pretty much all of the problems have appropriate severity ratings. As a result, the group that made the prototypes can make a good prioritized list of problems to address in their designs.

Aggregate Evaluation & Reflection (3 points)

  • Nope: None submitted online.
  • Weak: The conclusions reached by the evaluators as a group were just a rehash of all the individual evaluations. No potential solutions to major usability problems were brainstormed.
  • Proficient: The conclusions reached by the evaluators as a group were just a rehash of all the individual evaluations. Potential solutions to major usability problems were brainstormed, but they were obvious, and something that the group that made the prototypes could have figured out for themselves.
  • Mastery: The conclusions reached by the evaluators as a group were more insightful than the aggregation of all the individual evaluations. Potential solutions to major usability problems were creative and insightful.

Home Screen (3 points)

  • Nope: No link to an HTML home screen provided. Image-based mock-ups (e.g., Photoshop) receive no credit.
  • Weak: HTML home screen has little content.
  • Proficient: HTML home screen appears to have most of its content. It does not need to be pretty or highly stylized.
  • Mastery: Home screen content is very thoroughly developed. It does not need to be pretty or highly stylized.

Outside the Box (1 point)

Multiple parts of the feedback were innovative and interesting. The student has obviously spent a lot of time and energy applying Nielsen's Heuristics in a clever manner.