Kientz, J.A., Choe, E.K., Birch, B., Maharaj, R., Fonville, A., Glasson, C. & Mundt, J. (2010) Heuristic evaluation of persuasive health technologies. Proceedings of the 1st ACM International Health Informatics Symposium (IHI '10), 555-564. doi: 10.1145/1882992.1883084.
1. Purpose of the research:
Develop a set of 10 heuristics intended to find problems in persuasive technologies, and compare them with Nielsen's heuristics to see whether specifically designed heuristics could be more helpful for evaluating persuasive technologies.
2.1 How to Define Heuristics
The research group first reviewed the related literature and compiled a master list of all usability guidelines and heuristics for persuasive technologies. They then narrowed down the list by combining similar guidelines, prioritizing them, and discussing them in a process similar to affinity diagramming, ultimately arriving at 10 heuristics. This list of 10 heuristics enables evaluators to focus on the most important aspects and also allows the researchers to make a direct comparison with Nielsen's 10 heuristics.
The 10 heuristics are as follows (explanations of each can be found in the paper):
Appropriate Functionality; Not Irritating or Embarrassing; Protect Users’ Privacy; Use of Positive Motivation Strategies; Usable and Aesthetically Appealing Design; Accuracy of Information; Appropriate Time and Place; Visibility of User’s Status; Customizability; and Educate Users.
As you might notice, these heuristics overlap somewhat with Nielsen's. This was intentional and necessary, because Nielsen's list reflects fundamental usability principles.
2.2 How to Conduct Evaluation
The researchers chose two web-based applications to evaluate: Mindbloom and MyPyramid BlastOff. The former is a website designed to track progress toward users' life goals, including health goals, while the latter is an online game aiming to educate children about healthy food choices. These two examples of persuasive technologies were chosen because any evaluator could access them easily from any place with an internet connection.
The researchers also recruited 10 evaluators, among whom were graduate students in HCI-related programs, one game designer, and one web coordinator. They were randomly assigned to two groups, experimental and control, which evaluated the applications using the new heuristics and Nielsen's heuristics, respectively. The evaluation process basically followed Nielsen's instructions; however, the researchers assigned the severity ratings afterwards themselves, rather than having the evaluators do so.
3. Main Findings:
There were several interesting findings.
The researchers claimed that the new heuristics could discover more severe issues, find severe issues more frequently, and identify more issues useful for improving the persuasive aspects of the evaluated interfaces.
What is more interesting to me is that they found the first two heuristics in the list were associated with the highest number of issues. This pattern was consistent in both the experimental and control groups. The finding suggests that the order in which heuristics are given to evaluators might influence what they find. Thus, we could intentionally randomize the order, or place the heuristics in order of importance, to obtain better data.
I would give this paper 3 out of 5 points. Though the findings are somewhat interesting, and valuable for the evaluation of persuasive technologies, I doubt its scientific significance for the following reasons:
First, as the authors stated at the beginning of the paper, designing specialized heuristics was already a trend in the usability evaluation field. What this paper did was duplicate the research procedure of similar papers and apply it to persuasive technologies.
What is more, I saw too much manipulation by the researchers in the experiment. For example, since the recruited evaluators were not expert-level (average self-rated experience = 2.3, where 1 = no experience and 4 = very experienced), the researchers had to re-group the issues found during evaluation and assign the severity ratings themselves. Without any verbalized comments recorded by an observer (see also Nielsen's instructions), it is hard to tell how much subjective opinion was introduced during this process, which might skew the final conclusions.
Generally, I do not doubt the overall conclusion of this paper, and I agree that specialized heuristics can facilitate the evaluation of specific applications better than Nielsen's heuristics. However, for a research paper whose claims rest on statistical analysis, I would like to see more rigorous control of the experiment.