Reward & Punishment
by Christoph Hauert, Version 2.2, February 2005.
Minigames capturing the essence of public goods experiments show that even in the absence of rationality assumptions, both punishment and reward fail to bring about prosocial behaviour. This holds in particular for the well-known ultimatum game, which emerges as a special case. But reputation can induce fairness and cooperation in populations adapting through learning or imitation. Indeed, the inclusion of reputation effects in the corresponding dynamical models leads to the evolution of economically productive behaviour, with agents contributing to the public good and either punishing those who don't, or rewarding those who do. Reward and punishment correspond to two types of bifurcation with intriguing complementarity. The analysis suggests that reputation is essential for fostering social behaviour among selfish agents, and that it is considerably more effective with punishment than with rewards.
The following pages illustrate several research articles and provide interactive Java applets to visualize and experiment with the system's dynamics for parameter settings of your choice.
Reward, Punishment & Reputation
Cooperation is abundant in nature and ranges from bacterial colonies and group defense or predator inspection behavior to human interactions including health insurance, public transportation, environmental issues etc. Nevertheless, it is far from obvious how costly cooperative behavior could have evolved under Darwinian selection. In order to model such situations, social dilemmas such as the prisoner's dilemma and the public goods game have received considerable attention. A social dilemma is characterized by a conflict of interest between the welfare of the community and the performance of an individual. In pairwise interactions, for example, the act of cooperation produces a benefit for the partner at some cost to the cooperator. If both cooperate, they both get the benefit less the cost, but each individual is tempted to defect, hoping to get away with the benefit without having to bear the cost. In fact, it is always better to defect no matter what the other player does. Consequently cooperation will vanish - but then both players are worse off than if they had cooperated. This is the famous prisoner's dilemma. Public goods interactions are essentially a generalization of the prisoner's dilemma to groups of arbitrary size. For further details on the prisoner's dilemma see the tutorials on 2×2 games and on cooperation in structured populations. For an introduction to cooperation in larger groups see the tutorial on public goods games.
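The dominance argument above can be checked with a few lines of Python. This is only a minimal sketch: the function name `payoff` and the values of the benefit `b` and the cost `c` are illustrative assumptions, not taken from the articles; any values with b > c > 0 give the same ordering.

```python
# Pairwise interaction as described in the text: cooperation gives the
# partner a benefit b at a cost c to the cooperator.
# The numbers are illustrative assumptions (any b > c > 0 works).
b = 3.0  # benefit received from a cooperating partner
c = 1.0  # cost paid by a cooperator


def payoff(me_cooperates: bool, partner_cooperates: bool) -> float:
    """Payoff to 'me' in a single pairwise interaction."""
    gain = b if partner_cooperates else 0.0
    loss = c if me_cooperates else 0.0
    return gain - loss


for partner in (True, False):
    # Defecting pays more than cooperating, whatever the partner does ...
    assert payoff(False, partner) > payoff(True, partner)

# ... yet mutual cooperation pays more than mutual defection.
assert payoff(True, True) > payoff(False, False)
```

With these numbers mutual cooperation yields b - c = 2 for each player, whereas mutual defection yields 0, which is the dilemma in a nutshell.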
Both the prisoner's dilemma and public goods games require additional mechanisms to overcome the conflict of interest and allow cooperation to persist. This can be achieved, for example, by introducing repeated interactions, population structure, optional interactions, or by adding a second stage to the interaction where defection can be punished or cooperation rewarded. The fear of punishment as well as the prospect of a reward create strong incentives to cooperate and contribute to the public good. However, since reward and punishment are costly, 'rational' players will choose not to punish or reward their co-players. But then, if nobody punishes (rewards), the incentive to cooperate vanishes. Consequently punishment (reward) alone is unable to promote persistent cooperation.
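The reasoning about costly punishment can be made concrete with a small two-stage public goods sketch. Everything specific here is an assumption for illustration and not the model of the cited articles: the strategy labels 'D', 'C' and 'P', the function name `group_payoffs`, the multiplication factor `r`, the contribution cost `c`, the fine `beta` and the punishment cost `gamma`.

```python
def group_payoffs(group, r=3.0, c=1.0, beta=1.5, gamma=0.5):
    """Payoffs in one public goods interaction followed by punishment.

    group: list of strategies, each one of
      'D' - defector (does not contribute, does not punish)
      'C' - contributor who does not punish
      'P' - contributor who punishes every defector
    All parameter values are illustrative assumptions.
    """
    n = len(group)
    contributors = sum(s != "D" for s in group)
    defectors = n - contributors
    punishers = group.count("P")
    # Stage 1: contributions are multiplied by r and shared equally by all.
    share = r * c * contributors / n
    result = []
    for s in group:
        p = share
        if s != "D":
            p -= c                  # cost of contributing
        if s == "P":
            p -= gamma * defectors  # stage 2: cost of punishing
        if s == "D":
            p -= beta * punishers   # stage 2: fines received
        result.append(p)
    return result


with_punisher = group_payoffs(["P", "C", "D", "D"])
without_punisher = group_payoffs(["C", "C", "D", "D"])

# The punisher earns less than the contributor who refrains from punishing ...
assert with_punisher[0] < with_punisher[1]
# ... but once nobody punishes, defectors earn more than contributors.
assert without_punisher[2] > without_punisher[0]
```

In this sketch a single punisher is enough to push the defectors' payoff below that of the non-punishing contributor, yet the punisher is the worst off in the group, which is why punishment alone tends to disappear.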
However, when population structure is introduced in addition, or if players carry a reputation, then punishment turns out to be very efficient - in accordance with our experiences from everyday life. Both mechanisms add some sort of social control, either through interactions within a limited local neighborhood or through reputation that builds on a player's behavior in previous interactions and may become known to other members of the population (e.g. spreading of gossip). In The Prince, Machiavelli nicely addresses the efficacy of punishment:
In nature a rich variety of punishment mechanisms is realized, ranging from toxin production in bacteria to the complex legal systems of human societies. In contrast, rewarding mechanisms seem to be limited to higher organisms, and perhaps even to humans. Interestingly, even the simplest models indicate that rewarding mechanisms lead to far more complicated dynamics that make it much more difficult to establish and maintain cooperation. Consequently, rewarding mechanisms do not allow for similarly clear-cut conclusions as in the case of punishment.
Further research on the dynamics in the reward scenario is in progress and will be added here.
Punishment and the ultimatum game
In a typical experimental setup of the ultimatum game an experimenter offers $100 to two people under the condition that they agree on how to share the sum according to specific rules. By tossing a coin one participant is assigned the role of the 'proposer' and the other participant is the 'responder'. The proposer can now make an offer of how to split the money. If the responder accepts, then the money is divided as suggested. However, if the responder refuses the offer, then both players forfeit the money: the experimenter takes the $100 bill and walks away. So, how much should you offer, and what share of the total should you insist on? According to rational reasoning one should of course offer the smallest possible amount and that offer should be accepted - after all $1 is better than nothing! - or is it not?
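The rules of the game fit into a few lines of Python. This is a minimal sketch under stated assumptions: the names `ultimatum_payoffs` and `responder_accepts`, and the idea of describing the responder by a minimal acceptable offer (`threshold`), are illustrative choices and not part of the experimental protocol.

```python
def ultimatum_payoffs(total, offer, accepted):
    """Return (proposer payoff, responder payoff) for one round."""
    if not 0 <= offer <= total:
        raise ValueError("offer must lie between 0 and the total")
    if accepted:
        return total - offer, offer
    return 0, 0  # rejection: the experimenter keeps the money


def responder_accepts(offer, threshold):
    """Hypothetical responder who accepts any offer of at least 'threshold'."""
    return offer >= threshold


TOTAL = 100  # dollars, as in the example above

# A 'rational' responder accepts even the smallest positive offer.
rational = ultimatum_payoffs(TOTAL, 1, responder_accepts(1, threshold=1))

# A responder who insists on at least $20 rejects the same offer.
demanding = ultimatum_payoffs(TOTAL, 1, responder_accepts(1, threshold=20))

print(rational)   # (99, 1)
print(demanding)  # (0, 0)
```

Rejecting the $1 offer costs the responder one dollar but costs the proposer ninety-nine.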
In actual experiments, the vast majority of offers lie between 40% and 50% and essentially all offers below 20% are rejected. This basic pattern has been confirmed in countless experimental studies - independent of country, religion, culture, sex or monetary amount. Only very recently have deviations been documented in small-scale societies, i.e. in tribes that live in isolated areas far from modern civilization. These results suggest that the observed behavior in the ultimatum game is linked to the experience of markets and trading.
Neat, but why is this game interesting in the first place? Traditional economics builds on the assumption of homo oeconomicus, i.e. assuming that humans are rational utility maximizers, where utility is essentially reduced to monetary values, at least in market situations. The results of the ultimatum game demonstrate that the basic rationality assumption might be flawed and could be rather sensitive to the context. Moreover, the ultimatum game is intrinsically linked to fairness considerations (low offers are perceived as an insult and are followed by retaliation in the form of rejection), which then leads to the emergence of social norms. Whatever the reasons are, the huge difference between theoretical prediction and experimental evidence clearly calls for an explanation.
So far, the link to social dilemmas, and to the prisoner's dilemma in particular, has not become apparent, although aspects of punishment were briefly touched on in the form of retaliation. To illustrate this connection, consider a simplified version of the ultimatum game where only two offers are possible, a high and a low offer. Rational players should thus make the low offer and accept it. Making a high offer can be related to cooperation because it is costly to the proposer. Similarly, rejecting an offer relates to punishment - rejection is costly but the loss of the opponent is much higher. In fact, this analogy can be shown with full mathematical rigor (Sigmund et al. PNAS 2001).
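The 'rational' outcome of this simplified game can be reproduced by backward induction. The sketch below is illustrative only: the amounts `HIGH` and `LOW`, the function name `payoffs` and the description of the responder by a single flag are assumptions, not the notation of Sigmund et al. (2001).

```python
TOTAL = 100          # sum to be shared
HIGH, LOW = 50, 20   # the only two admissible offers (assumed values)


def payoffs(offer, responder_accepts_low):
    """(proposer, responder) payoffs; the high offer is always accepted."""
    accepted = offer == HIGH or responder_accepts_low
    return (TOTAL - offer, offer) if accepted else (0, 0)


# Step 1: the responder's best reply to a low offer is to accept it ...
best_accepts_low = max((True, False), key=lambda a: payoffs(LOW, a)[1])

# Step 2: ... so the proposer's best choice is the low offer.
best_offer = max((HIGH, LOW), key=lambda o: payoffs(o, best_accepts_low)[0])

# Rejecting the low offer is costly for the responder,
# but far more costly for the proposer.
responder_loss = payoffs(LOW, True)[1] - payoffs(LOW, False)[1]
proposer_loss = payoffs(LOW, True)[0] - payoffs(LOW, False)[0]

print(best_accepts_low, best_offer)   # True 20
print(responder_loss, proposer_loss)  # 20 80
```

Here the high offer plays the role of cooperation (the proposer gives up 30) and rejection plays the role of punishment (the responder gives up 20 to impose a loss of 80).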
This tutorial provides additional and complementary information and interactive experiences to the following research articles:
- Sigmund, K., Hauert, Ch. & Nowak, M. A. (2001) Reward and punishment, Proc. Natl. Acad. Sci. USA 98, 10757-10762.
- Brandt, H., Hauert, Ch. & Sigmund, K. (2003) Punishment and reputation in spatial public goods games, Proc. R. Soc. Lond. B 270, 1099-1104.
- Hauert, Ch., Haiden, N. & Sigmund, K. (2004) The dynamics of public goods, DCDS-B 4 (3), 575-587.
For the development of these pages the help and advice of the following two people was of particular importance: first, my thanks go to Karl Sigmund for helpful comments on the game theoretical parts, and second, to Urs Bill for introducing me to the Java language and for his patience and competence in answering my many technical questions. Financial support of the Swiss National Science Foundation is gratefully acknowledged.