21.3.3 Which Strategy Performs Best

We conducted a series of experiments to investigate which strategy from Table 21.3 is best (see [10] for details).

In Experiment 1 (see Figure 21.1), we investigated how people would solve this problem, using the User as Wizard evaluation method [13]. Participants were given individual ratings identical to those in Table 21.1; these ratings were chosen to be able to distinguish between strategies. Participants were asked which items the group should watch if there was time for one, two, ..., seven items. We compared participants' decisions and rationale with those of the aggregation strategies. We found that participants cared about fairness, and about preventing misery and starvation ("this one is for Mary, as she has had nothing she liked so far"). Participants' behaviour reflected that of several of the strategies (e.g. the Average, Least Misery, and Average Without Misery were used), while other strategies (e.g. Borda count, Copeland rule) were clearly not used.

In Experiment 2 (see Figure 21.2), participants were given item sequences chosen by the aggregation strategies, as well as the individual ratings in Table 21.1. They rated how satisfied they thought the group members would be with those sequences, and explained their ratings. We found that the Multiplicative Strategy (which multiplies the individual ratings) performed best, in the sense that it was the only strategy for which all participants thought its sequence would keep all members of the group satisfied. Borda count, Average, Average Without Misery and Most Pleasure also performed quite well. Several strategies (such as Copeland rule, Plurality voting and Least Misery) could be discarded, as they were clearly judged to result in misery for group members.

We also compared the participants' judgements with predictions by simple satisfaction modelling functions. Amongst others, we found that more accurate predictions resulted from using:

• quadratic ratings, which, for example, make the difference between a rating of 9 and 10 bigger than that between a rating of 5 and 6;
• normalization, which takes into account that people rate in different ways: e.g., some always use the extremes of a scale, while others only use its middle.

21.4 Impact of Sequence Order

As mentioned in Section 21.2, we are particularly interested in recommending a sequence of items. For example, for a personalised news program on TV, a recommender may select seven news items to be shown to the group. To select the items, it can use an aggregation strategy (such as the Multiplicative Strategy) to combine individual preferences, and then select the seven items with the highest group ratings. Once the items have been selected, the question arises in what order to show them in the news program. For example, it could show the items in descending order of group rating, starting with the highest rated item and ending with the lowest rated one. Or, it could mix up the items, showing them in a random order.

However, the problem is actually far more complicated than that. Firstly, in responsive environments, the group membership changes continuously, so deciding on the next seven items to show based on the current members seems not a sensible
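The selection step described for the news program can be sketched as follows. The Multiplicative Strategy (multiply individual ratings, pick the items with the highest group ratings, show them in descending order) is from the text above; the item names, group size and rating values are invented purely for illustration:

```python
from math import prod

# Hypothetical individual ratings (1-10) for candidate news items;
# one rating per group member. These values are invented, not those
# of Table 21.1.
ratings = {
    "item_a": [8, 9, 7],
    "item_b": [10, 2, 9],
    "item_c": [6, 6, 6],
    "item_d": [9, 8, 10],
}

# Multiplicative Strategy: the group rating of an item is the
# product of the individual ratings.
group_rating = {item: prod(rs) for item, rs in ratings.items()}

# Select the top-rated items (top 2 here for brevity, rather than
# seven) and show them in descending order of group rating.
sequence = sorted(group_rating, key=group_rating.get, reverse=True)[:2]
print(sequence)  # item_d (9*8*10 = 720) before item_a (8*9*7 = 504)
```

Note how the multiplication punishes any low individual rating: item_b has two ratings of 9 or more, but the single rating of 2 drags its group rating below that of the uniformly mediocre item_c, which is exactly the misery-avoiding behaviour participants valued.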
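The two modelling refinements found to improve prediction accuracy in Section 21.3.3 (quadratic ratings and normalization) can be sketched as below; the exact satisfaction functions used in the experiments are given in [10], so these formulas are only illustrative assumptions:

```python
def quadratic(rating):
    # Squaring widens gaps at the top of the scale:
    # 10**2 - 9**2 = 19, while 6**2 - 5**2 = 11.
    return rating ** 2

def normalize(rating, persons_ratings):
    # Map a rating onto [0, 1] relative to the range this person
    # actually uses, compensating for "extremes of the scale" raters
    # versus "middle of the scale" raters. (Min-max scaling is an
    # assumed choice here, not necessarily the one used in [10].)
    lo, hi = min(persons_ratings), max(persons_ratings)
    return (rating - lo) / (hi - lo) if hi > lo else 1.0

# An extreme rater and a middle-of-the-scale rater can mean the same
# thing with very different raw numbers:
extreme = [1, 5, 10]   # uses the whole 1-10 scale
middle = [4, 5, 6]     # only uses the middle of the scale
assert normalize(10, extreme) == normalize(6, middle) == 1.0
```

After normalization, each person's top rating counts equally, and applying the quadratic transform on top of it makes near-top ratings stand out further.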
