This article is about usability evaluation. For a list of Heuristic analysis topics ranging from application of heuristics to antivirus software, see Heuristic analysis.
A heuristic evaluation is a usability inspection method for computer software that helps to identify usability problems in the user interface design. Evaluators examine the interface and judge its compliance with recognized usability principles (the "heuristics"). These evaluation methods are now widely taught and practiced in the new media sector, where user interfaces are often designed quickly and on budgets that may leave little money for other types of interface testing.
Introduction
The main goal of heuristic evaluations is to identify any problems associated with the design of user interfaces. Usability consultants Rolf Molich and Jakob Nielsen developed this method on the basis of several years of experience in teaching and consulting about usability engineering. Heuristic evaluations are one of the most informal methods[1] of usability inspection in the field of human–computer interaction. There are many sets of usability design heuristics; they are not mutually exclusive and cover many of the same aspects of user interface design. The usability problems that are discovered are often categorized, typically on a numeric scale, according to their estimated impact on user performance or acceptance. The evaluation is frequently conducted in the context of use cases (typical user tasks), to provide feedback to the developers on the extent to which the interface is likely to be compatible with the intended users' needs and preferences.
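The categorization of problems on a numeric scale can be illustrated with a minimal sketch. The article does not specify a scale; the 0 to 4 scale, the `Finding` record, and the sample findings below are assumptions made only for illustration.

```python
from dataclasses import dataclass

# Assumed 0-4 severity scale (illustrative; the text only says "a numeric scale").
SEVERITY_LABELS = {
    0: "not a problem",
    1: "cosmetic",
    2: "minor",
    3: "major",
    4: "catastrophic",
}


@dataclass(frozen=True)
class Finding:
    """One usability problem noted by an evaluator (hypothetical record layout)."""
    location: str     # where in the interface the problem was seen
    heuristic: str    # which usability principle it violates
    description: str  # what the evaluator observed
    severity: int     # key into SEVERITY_LABELS

    def __post_init__(self):
        if self.severity not in SEVERITY_LABELS:
            raise ValueError(f"severity must be one of {sorted(SEVERITY_LABELS)}")


def prioritize(findings):
    """Return findings ordered from most to least severe."""
    return sorted(findings, key=lambda f: f.severity, reverse=True)


# Made-up sample findings, not results from any real evaluation.
findings = [
    Finding("search page", "Visibility of system status", "No loading indicator", 2),
    Finding("checkout form", "Error prevention", "No confirmation before payment", 4),
    Finding("footer", "Aesthetic and minimalist design", "Crowded link list", 1),
]

for f in prioritize(findings):
    print(f"[{f.severity} {SEVERITY_LABELS[f.severity]}] {f.location}: {f.description}")
```

Sorting by severity gives developers a prioritized list, which is one common use of such ratings; other orderings (for example by affected use case) are equally possible.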
The simplicity of heuristic evaluation is beneficial at the early stages of design and prior to user-based testing. This usability inspection method does not rely on users, whose involvement can be burdensome because of the need for recruiting, scheduling, a place to perform the evaluation, and payment for participant time. In the original published report, Nielsen stated that four experiments showed that individual evaluators were "mostly quite bad" at doing heuristic evaluations and suggested that multiple evaluators were needed, with their results aggregated, to produce an acceptable review. Most heuristic evaluations can be accomplished in a matter of days. The time required varies with the size of the artifact, its complexity, the purpose of the review, the nature of the usability issues that arise in the review, and the competence of the reviewers. Heuristic evaluation is often conducted before user testing, either to identify areas to include in that testing or to eliminate perceived design issues beforehand.
Although heuristic evaluation can uncover many major usability issues in a short period of time, a common criticism is that its results are highly influenced by the knowledge of the expert reviewer(s). This "one-sided" review repeatedly produces different results from software performance testing, with each type of testing uncovering a different set of problems.
Methodology
Heuristic evaluations are conducted in a variety of ways depending on the scope and type of project. As a general rule, established frameworks are used to reduce bias and maximize findings within an evaluation. Heuristic evaluation has various pros and cons, many of which depend on the resources and time available to those conducting it.
Pros: Because the evaluator works through a detailed list of criteria, the process is thorough and provides good feedback on areas that could be improved. In addition, since it is carried out by several people, the designer receives feedback from multiple perspectives. As it is a relatively straightforward process, there are fewer ethical and logistical concerns in organizing and executing the evaluation.
Cons: Although there is a specific set of criteria, the quality of the evaluation is limited by the skill and knowledge of the evaluators. This leads to another issue: finding experts and people qualified enough to conduct the evaluation, although this is not a problem when such evaluators are readily available. In addition, because the evaluations are largely personal observations, the results contain no hard data, and the designer must weigh the findings with these limitations in mind.
Number of Evaluators
According to Nielsen, three to five evaluators are recommended for a study.[2] Having more than five evaluators does not necessarily increase the number of insights and may add more cost than benefit to the overall evaluation.
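The diminishing return from additional evaluators can be illustrated with a simple independent-detection model: if each evaluator finds any given problem with probability p, then n evaluators together are expected to find a share 1 - (1 - p)^n of the problems. The sketch below is an illustration under that assumption only; the detection rate of 0.3 is an arbitrary example value, not a figure taken from this article, and real rates vary with the product and the evaluators.

```python
def proportion_found(evaluators: int, detection_rate: float) -> float:
    """Expected share of problems found by `evaluators` independent evaluators,
    assuming each finds any given problem with probability `detection_rate`."""
    if evaluators < 0:
        raise ValueError("evaluators must be non-negative")
    if not 0.0 <= detection_rate <= 1.0:
        raise ValueError("detection_rate must be between 0 and 1")
    return 1.0 - (1.0 - detection_rate) ** evaluators


# Assumed per-evaluator detection rate, chosen only for illustration.
ASSUMED_RATE = 0.3

for n in range(1, 9):
    total = proportion_found(n, ASSUMED_RATE)
    gain = total - proportion_found(n - 1, ASSUMED_RATE)
    print(f"{n} evaluators: {total:.0%} of problems found (gain {gain:.0%})")
```

Under this model each added evaluator contributes less than the previous one, which is consistent with the observation that going beyond five evaluators may cost more than it returns.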
Individual and Group Process
A heuristic evaluation must begin with individual assessments, with results aggregated only afterwards, in order to reduce group confirmation bias.[2] Each evaluator should examine the prototype independently before joining group discussions to pool insights.
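The aggregation step that follows the individual sessions can be sketched as follows. This is a minimal illustration, not a prescribed procedure: the report layout, the rule that two findings with the same location and heuristic describe the same problem, and the sample data are all assumptions.

```python
from collections import defaultdict


def aggregate(reports):
    """Merge independently recorded findings into one list.

    `reports` maps an evaluator name to a list of
    (location, heuristic, severity) tuples. Findings that share a location
    and a heuristic are treated as the same problem (a simplifying assumption).
    """
    merged = defaultdict(lambda: {"evaluators": set(), "severities": []})
    for evaluator, items in reports.items():
        for location, heuristic, severity in items:
            entry = merged[(location, heuristic)]
            entry["evaluators"].add(evaluator)
            entry["severities"].append(severity)

    summary = [
        {
            "location": location,
            "heuristic": heuristic,
            "reported_by": len(entry["evaluators"]),
            "mean_severity": sum(entry["severities"]) / len(entry["severities"]),
        }
        for (location, heuristic), entry in merged.items()
    ]
    # Most widely reported problems first, then the most severe.
    summary.sort(key=lambda row: (row["reported_by"], row["mean_severity"]), reverse=True)
    return summary


# Made-up reports from three evaluators who worked separately.
reports = {
    "evaluator_a": [("login", "Error prevention", 3),
                    ("search", "Visibility of system status", 2)],
    "evaluator_b": [("login", "Error prevention", 4)],
    "evaluator_c": [("login", "Error prevention", 3),
                    ("help page", "Help and documentation", 1)],
}

for row in aggregate(reports):
    print(row)
```

Keeping the per-evaluator reports separate until this step preserves the independence of the individual assessments; the merged list can then serve as the basis for the group discussion.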
Observer Trade-offs
There are costs and benefits associated with adding an observer to an evaluation session.[2]
In a session without an observer, evaluators need to record their individual observations in a written report as they interact with the product or prototype. This option requires more time and effort from the evaluators, and further time for those conducting the study to interpret the individual reports. However, it is less costly because it avoids the overhead of hiring observers.
With an observer, evaluators can provide their analysis verbally while the observer transcribes and interprets their findings. This option reduces the evaluators' workload and the time needed to interpret findings from multiple evaluators.
Nielsen's heuristics
Jakob Nielsen's heuristics are probably the most-used usability heuristics for user interface design. An early version of the heuristics appeared in two papers by Nielsen and Rolf Molich published in 1989-1990.[3][4] Nielsen published an updated set in 1994,[5] and the final set still in use today was published in 2005:[6]
Visibility of system status: The system should always keep users informed about what is going on, through appropriate feedback within reasonable time.
Match between system and the real world: The system should speak the user's language, with words, phrases and concepts familiar to the user, rather than system-oriented terms. Follow real-world conventions, making information appear in a natural and logical order.
User control and freedom: Users often choose system functions by mistake and will need a clearly marked "emergency exit" to leave the unwanted state without having to go through an extended dialogue. Support undo and redo.
Consistency and standards: Users should not have to wonder whether different words, situations, or actions mean the same thing. Follow platform conventions.
Error prevention: Even better than good error messages is a careful design which prevents a problem from occurring in the first place. Either eliminate error-prone conditions or check for them and present users with a confirmation option before they commit to the action.
Recognition rather than recall: Minimize the user's memory load by making objects, actions, and options visible. The user should not have to remember information from one part of the dialogue to another. Instructions for use of the system should be visible or easily retrievable whenever appropriate.
Flexibility and efficiency of use: Accelerators—unseen by the novice user—may often speed up the interaction for the expert user such that the system can cater to both inexperienced and experienced users. Allow users to tailor frequent actions.
Aesthetic and minimalist design: Dialogues should not contain information which is irrelevant or rarely needed. Every extra unit of information in a dialogue competes with the relevant units of information and diminishes their relative visibility.
Help users recognize, diagnose, and recover from errors: Error messages should be expressed in plain language (no codes), precisely indicate the problem, and constructively suggest a solution.
Help and documentation: Even though it is better if the system can be used without documentation, it may be necessary to provide help and documentation. Any such information should be easy to search, focused on the user's task, list concrete steps to be carried out, and not be too large.
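In practice, a list such as the one above is often used as a checklist against which findings are tallied. The following sketch shows one possible way to do this; the enum, the `coverage` helper, and the sample data are hypothetical and are not part of Nielsen's method.

```python
from collections import Counter
from enum import Enum


class NielsenHeuristic(Enum):
    """The ten heuristics listed above, in the same order."""
    VISIBILITY_OF_SYSTEM_STATUS = "Visibility of system status"
    MATCH_WITH_REAL_WORLD = "Match between system and the real world"
    USER_CONTROL_AND_FREEDOM = "User control and freedom"
    CONSISTENCY_AND_STANDARDS = "Consistency and standards"
    ERROR_PREVENTION = "Error prevention"
    RECOGNITION_RATHER_THAN_RECALL = "Recognition rather than recall"
    FLEXIBILITY_AND_EFFICIENCY = "Flexibility and efficiency of use"
    AESTHETIC_AND_MINIMALIST_DESIGN = "Aesthetic and minimalist design"
    ERROR_RECOVERY = "Help users recognize, diagnose, and recover from errors"
    HELP_AND_DOCUMENTATION = "Help and documentation"


def coverage(violations):
    """Count findings per heuristic, including heuristics with no findings."""
    counts = Counter(violations)
    return {heuristic: counts.get(heuristic, 0) for heuristic in NielsenHeuristic}


# Made-up list of heuristics violated by the problems noted in a session.
noted = [
    NielsenHeuristic.ERROR_PREVENTION,
    NielsenHeuristic.VISIBILITY_OF_SYSTEM_STATUS,
    NielsenHeuristic.ERROR_PREVENTION,
]

for heuristic, count in coverage(noted).items():
    print(f"{heuristic.value}: {count}")
```

Listing heuristics with zero findings makes it visible which principles produced no reported problems, which may indicate either a sound design or an area the evaluators did not examine.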
Gerhardt-Powals' cognitive engineering principles
Although Nielsen is considered the expert and field leader in heuristic evaluation, Jill Gerhardt-Powals developed a set of cognitive engineering principles for enhancing human-computer performance.[7] These heuristics, or principles, are similar to Nielsen's heuristics but take a more holistic approach to evaluation. Gerhardt-Powals' principles[8] are listed below.
Automate unwanted workload: Eliminate mental calculations, estimations, comparisons, and any unnecessary thinking, to free cognitive resources for high-level tasks.
Reduce uncertainty: Display data in a manner that is clear and obvious to reduce decision time and error.
Fuse data: Bring together lower level data into a higher level summation to reduce cognitive load.
Present new information with meaningful aids to interpretation: New information should be presented within familiar frameworks (e.g., schemas, metaphors, everyday terms) so that information is easier to absorb.
Use names that are conceptually related to function: Display names and labels should be context-dependent, which will improve recall and recognition.
Group data in consistently meaningful ways: Within a screen, data should be logically grouped; across screens, it should be consistently grouped. This will decrease information search time.
Limit data-driven tasks: Use color and graphics, for example, to reduce the time spent assimilating raw data.
Include in the displays only that information needed by the user at a given time: Exclude extraneous information that is not relevant to current tasks so that the user can focus attention on critical data.
Provide multiple coding of data when appropriate: The system should provide data in varying formats and/or levels of detail in order to promote cognitive flexibility and satisfy user preferences.
Practice judicious redundancy: This tenth principle was devised to resolve the possible conflict between Principle 6 (group data in consistently meaningful ways) and Principle 8 (include only the information needed at a given time): in order to be consistent, it is sometimes necessary to include more information than may be needed at a given time.
Shneiderman's Eight Golden Rules of Interface Design
Ben Shneiderman's book Designing the User Interface: Strategies for Effective Human-Computer Interaction (1986), published a few years before Nielsen's heuristics, presented his popular list of "Eight Golden Rules".[9][10]
Strive for consistency: Consistent sequences of actions should be required in similar situations ...
Enable frequent users to use shortcuts: As the frequency of use increases, so do the user's desires to reduce the number of interactions ...
Offer informative feedback: For every operator action, there should be some system feedback ...
Design dialog to yield closure: Sequences of actions should be organized into groups with a beginning, middle, and end ...
Offer simple error handling: As much as possible, design the system so the user cannot make a serious error ...
Permit easy reversal of actions: This feature relieves anxiety, since the user knows that errors can be undone ...
Support internal locus of control: Experienced operators strongly desire the sense that they are in charge of the system and that the system responds to their actions. Design the system to make users the initiators of actions rather than the responders.
Reduce short-term memory load: The limitation of human information processing in short-term memory requires that displays be kept simple, multiple page displays be consolidated, window-motion frequency be reduced, and sufficient training time be allotted for codes, mnemonics, and sequences of actions.
Weinschenk and Barker classification
In 2000, Susan Weinschenk and Dean Barker[11] created a categorization of heuristics and guidelines used by several major providers into the following twenty types:[12]
User Control: The interface will allow the user to perceive that they are in control and will allow appropriate control.
Human Limitations: The interface will not overload the user’s cognitive, visual, auditory, tactile, or motor limits.
Modal Integrity: The interface will fit individual tasks within whatever modality is being used: auditory, visual, or motor/kinesthetic.
Accommodation: The interface will fit the way each user group works and thinks.
Linguistic Clarity: The interface will communicate as efficiently as possible.
Aesthetic Integrity: The interface will have an attractive and appropriate design.
Simplicity: The interface will present elements simply.
Predictability: The interface will behave in a manner such that users can accurately predict what will happen next.
Interpretation: The interface will make reasonable guesses about what the user is trying to do.
Accuracy: The interface will be free from errors.
Technical Clarity: The interface will have the highest possible fidelity.
Flexibility: The interface will allow the user to adjust the design for custom use.
Fulfillment: The interface will provide a satisfying user experience.
Cultural Propriety: The interface will match the user’s social customs and expectations.
Suitable Tempo: The interface will operate at a tempo suitable to the user.
Consistency: The interface will be consistent.
User Support: The interface will provide additional assistance as needed or requested.
Precision: The interface will allow the users to perform a task exactly.
Forgiveness: The interface will make actions recoverable.
Responsiveness: The interface will inform users about the results of their actions and the interface’s status.
Domain or culture-specific heuristic evaluation
For an application with a specific domain and culture, the heuristics mentioned above may fail to identify potential usability problems.[13] This limitation arises because general heuristics do not take the domain-specific and culture-specific features of an application into account. It has led to the introduction of domain-specific or culture-specific heuristic evaluation.[14]