Hamilton's original work on inclusive fitness theory assumed additivity of costs and benefits. Recently, it has been argued that an exact version of Hamilton's rule for the spread of a pro-social allele (rb > c) holds under nonadditive pay-offs, so long as the cost and benefit terms are defined as partial regression coefficients rather than pay-off parameters. This article examines whether one of the key components of Hamilton's original theory can be preserved when the rule is generalized to the nonadditive case in this way, namely that evolved organisms will behave as if trying to maximize their inclusive fitness in social encounters.

Introduction

Inclusive fitness theory is a widely used framework for studying the evolution of social behaviour. Hamilton's original formulation of the theory contains two distinct though related ideas (Hamilton, 1964, 1971). The first is Hamilton's rule, the famous criterion (rb > c) for when an allele coding for a social behaviour will be favoured by selection. This aspect of the theory fits with the ‘gene's eye’ view of evolution. The second is maximization of inclusive fitness, rather than classical fitness, as the ‘goal’ towards which an individual's social behaviour will appear designed. This aspect fits with the traditional individualist view of evolution and is frequently employed by behavioural ecologists.

The relation between these two aspects of inclusive fitness theory is not fully settled. Much theoretical work has focused solely on the first aspect; indeed, the notion of individuals ‘trying’ to maximize their inclusive fitness is often omitted from expositions of kin selection theory. However, recently Grafen (2006, 2009), Queller (2011) and Gardner et al. (2011) have argued for the central importance of inclusive fitness maximization as the ‘goal’ of individual behaviour; Grafen (2006) provides a population-genetic foundation for the idea. This goes some way to reconciling the two aspects of Hamilton's theory.

The work of Grafen, Queller and Gardner et al. suggests an intriguing link between social evolution and rational choice theory. In effect, these authors are arguing that inclusive fitness plays the role of a utility function in rational choice, that is, it is the quantity that an evolved organism will behave as if it is trying to maximize. Thus, Gardner et al. (2011) write ‘we can imagine the individual adjusting her inclusive fitness…by altering her behaviour’, before choosing an action which brings maximal inclusive fitness (pp. 1039–40). This way of thinking about evolution is an instance of what Sober (1988) called ‘the heuristic of personification’, which says that a trait will be favoured by natural selection if and only if a rational individual, seeking to maximize its fitness, would choose that trait over the alternatives. In effect, Gardner et al. are suggesting that this heuristic is valid in social settings, where the trait in question is a social action, so long as ‘fitness’ is defined as inclusive fitness.

Our aim here is to propose a particular way of formalizing this ‘rational actor heuristic’ in the context of social evolution and to ask how generally it applies. This is a pressing question because Grafen's (2006) argument that evolution will lead to inclusive fitness maximizing behaviour assumes additivity of costs and benefits. This assumption is quite restrictive as in many social situations, the benefit that a given action confers on a recipient may depend on the recipient's own type (Frank, 1998; Lehmann & Rousset, 2014a). In our simple model below we find that if the additivity assumption is made, then the rational actor heuristic, with inclusive fitness as the individual's utility function, applies neatly. However, matters are more complex if there is nonadditivity.

Asking whether the rational actor heuristic applies is different from asking whether Hamilton's rule itself applies in nonadditive scenarios. This latter question has been extensively discussed in the literature. The upshot is that an exact version of Hamilton's rule does apply under nonadditivity, so long as the cost and benefit terms are suitably defined (Queller, 1992; Frank, 1998, 2013; Gardner et al., 2011); though the biological significance of the resulting rule has been questioned (Allen et al., 2013; Birch, 2014; Birch & Okasha, 2015). However, this does not settle the issue about individual maximization that is our focus here.

The structure of this article is as follows. Firstly, we study social evolution using a simple additive Prisoner's dilemma model and show how the rational actor heuristic applies. Secondly, we consider a nonadditive variant of the same model and ask whether a similar conclusion holds. Finally, we discuss the significance of the results obtained.

The case of additive pay-offs

Additive Prisoner's dilemma

Consider a simple model of the evolution of social behaviour of the sort used in evolutionary game theory. An infinite population of haploid asexual organisms engage in pairwise social interactions in every generation. Organisms are of two types, altruists (A) and selfish (S). A types perform an action that is costly for themselves but benefits their partner; S types do not perform the action. Type is hard-wired genetically and perfectly inherited.

An organism's pay-off from the social interaction depends on its own type and its partner's type. Pay-offs are interpreted as increases in lifetime reproductive fitness over a unit baseline. The social action is assumed to affect only the actor and their partner, and thus local interaction is assumed absent. An A type incurs a cost of c as a result of its action (a change of −c in its own pay-off) and confers a benefit of b on its partner, where c > 0 and b > 0; thus, the game is a Prisoner's dilemma.

Pay-offs to the actor, referred to as ‘personal pay-offs’, are shown in Table 1. We let V(i, j) denote the pay-off to an actor from playing i when her opponent plays j, where i, j ∈ {A, S}. Note that pay-offs are additive: an altruist alters their own pay-off by −c and their partner's pay-off by b, irrespective of the type of their partner.

Table 1. Additive Prisoner's dilemma

		Partner
		A	S
Actor	A	b − c	−c
Actor	S	b	0

There are three pair-types in the population, AA, AS and SS, whose relative frequencies in the initial generation are f_AA, f_AS and f_SS, respectively, where f_AA + f_AS + f_SS = 1. The overall frequency of the A type in the initial generation is denoted p, where $p = f_{AA} + \tfrac{1}{2} f_{AS}$. The change in p over one generation is denoted Δp.

The sign and magnitude of Δp depend on the rules by which the pairs are formed. If pairing is random, then the S type must be fitter overall, so Δp will be negative. However, if pairing is assortative, then the A type may be fitter overall, for the benefits of altruistic actions then fall disproportionately on other altruists. Random pairing means that the frequency distribution of the pair-types will be binomial, that is f_AA = p², f_AS = 2p(1 − p) and f_SS = (1 − p)².

Where pairing is nonrandom, a simple regression analysis yields a measure of the statistical correlation between social partners. We use the variable p_i to indicate an organism's own type and $p'_i$ to indicate its partner's type; thus p_i = 1 if the ith organism is an A, p_i = 0 otherwise; and $p'_i = 1$ if the ith organism is paired with an A, $p'_i = 0$ otherwise. We then compute the linear regression of $p'_i$ on p_i, given by $\beta_{p'p} = \mathrm{Cov}(p', p)/\mathrm{Var}(p)$, which is a standard way of defining the r term of Hamilton's rule. Henceforth, we refer to $\beta_{p'p}$ as r.

In the early kin selection literature, r was often defined in genealogical terms, for example as the probability that actor and partner share an allele that is identical by descent, yielding the familiar values of $\tfrac{1}{2}$ for full sibs, $\tfrac{1}{2}$ for offspring and $\tfrac{1}{4}$ for grandoffspring (see Michod & Hamilton, 1980). In some ways, this definition of r is the more natural one for expressing the idea that organisms value their relatives’ reproduction in proportion to how closely related they are. However, the statistical definition of r, above, yields a version of Hamilton's rule that is more generally applicable.

In the context of pairwise interactions, r can be conveniently expressed as a difference in conditional probabilities:

$r = \Pr(\text{partner is } A \mid \text{actor is } A) - \Pr(\text{partner is } A \mid \text{actor is } S)$

It follows that r ranges from −1 (perfect disassortment) to 1 (perfect assortment); when pairing is random, r = 0.
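As a minimal sketch of the conditional-probability form of r, the following Python function computes p and r from the three pair-type frequencies. The function name `relatedness` and its argument names are illustrative assumptions, not notation from the model; the conditional probabilities are obtained by counting individuals, each pair contributing two.

```python
def relatedness(f_aa, f_as, f_ss):
    """Return (p, r) for pair-type frequencies f_AA, f_AS, f_SS (summing to 1)."""
    p = f_aa + 0.5 * f_as                 # overall frequency of the A type
    if not 0.0 < p < 1.0:
        raise ValueError("r is undefined when the population is monomorphic")
    pr_a_given_a = f_aa / p               # P(partner is A | actor is A)
    pr_a_given_s = 0.5 * f_as / (1 - p)   # P(partner is A | actor is S)
    return p, pr_a_given_a - pr_a_given_s


# Random pairing at p = 0.4: the two conditional probabilities coincide.
print(relatedness(0.16, 0.48, 0.36))
# Only AA and SS pairs: perfect assortment.
print(relatedness(0.3, 0.0, 0.7))
# Only AS pairs: perfect disassortment.
print(relatedness(0.0, 1.0, 0.0))
```

Under random pairing the result for r is zero up to floating-point rounding; the two extreme pairings recover the endpoints of the range stated above.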

Evolutionary analysis

In the Appendix, we show that the change in p over one generation is given by:

$\Delta p = \frac{(rb - c)\,\mathrm{Var}(p)}{\overline{w}}$ (1)

where $\overline{w}$ is average population fitness. As Var(p) is non-negative, this tells us that so long as 0 < p < 1, the A type will increase in frequency in the population whenever rb > c, which is of course Hamilton's rule. As b and c are fixed parameters of the pay-off matrix, this condition for the spread of A is frequency-independent so long as r itself does not change as the population evolves. Constancy of r across generations will sometimes be a reasonable assumption, for the pattern of assortment in the population, which is what r measures, may be determined by biological factors, for example dispersal, which are independent of the social trait that is evolving. This assumption is considered further in the Discussion section.

With the constant r assumption, the outcome of the evolutionary process is easily determined. If rb > c the A type will spread to fixation; if rb < c the S type will spread to fixation; if rb = c there will be no evolutionary change.
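The three outcomes can be illustrated by iterating the one-generation recursion numerically. This is only a sketch under stated assumptions: r is held constant, c < 1 so that all fitnesses stay positive, Var(p) = p(1 − p) for the 0/1 type indicator, and mean fitness is taken to be 1 + p(b − c) (unit baseline plus the average pay-off). The function names `next_p` and `iterate` are hypothetical.

```python
def next_p(p, r, b, c):
    """One generation of the additive model: p + (rb - c) Var(p) / mean fitness."""
    var_p = p * (1 - p)            # variance of the 0/1 type indicator
    w_bar = 1 + p * (b - c)        # mean fitness over a unit baseline
    return p + var_p * (r * b - c) / w_bar


def iterate(p0, r, b, c, generations=2000):
    """Apply next_p repeatedly, starting from frequency p0."""
    p = p0
    for _ in range(generations):
        p = next_p(p, r, b, c)
    return p


print(iterate(0.1, r=0.5, b=0.3, c=0.1))   # rb > c: A approaches fixation
print(iterate(0.9, r=0.2, b=0.3, c=0.1))   # rb < c: S approaches fixation
print(iterate(0.5, r=0.5, b=0.2, c=0.1))   # rb = c: frequency does not change
```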

Rational actor analysis: preliminaries

To apply a rational actor analysis, we transpose our evolutionary model to a rational choice context. We consider two players playing a symmetric game. Each player has two pure strategies, A and S. If a player plays a mixed strategy, this means that they randomize over their pure strategies; thus, π_A denotes the mixed strategy in which A is played with probability π_A and S with probability 1 − π_A. The pay-off to a mixed strategy is then simply its expected pay-off.

Each player has a utility function which measures how desirable they find the possible outcomes of the game; we assume that both players have the same utility function. Each player's goal is to maximize their utility function. One possibility is that the utility function is given by the personal pay-offs in Table 1 above, in which case we write U(i, j) = V(i, j), where U(i, j) is the utility a player gets from playing i when their partner plays j; i, j ∈ {A, S}. There are other possibilities too; see below.

Once the players’ utility function has been specified, the next step is to seek the Nash equilibrium (or equilibria) of the game. (A Nash equilibrium is a pair of strategies, possibly mixed, one for each player, each of which is a best response to the other.) Game theory predicts that if the players are rational, they will end up at a Nash equilibrium of the game (see for example Binmore, 2007). We can then ask whether the Nash equilibria of the game correspond to the outcomes of the evolutionary process described above. If so, we can conclude that evolution will lead organisms to behave as if trying to maximize the utility function in question.

This is a natural way of formalizing the rational actor heuristic in a game-theoretic context. It differs somewhat from Grafen's (2006) formalization of the same idea, which posits ‘links’ between gene-frequency change and individual optimization. Our approach allows recovery of Grafen's main result, and by taking optimization to include best-response, that is optimal choice conditional on the other player's choice, extends easily to the nonadditive case. A similar approach is found in Alger & Weibull (2012) and Lehmann et al. (2015).

Utility as inclusive fitness

One possibility to explore is that a player's utility function depends on their partner's pay-off as well as their own. For example, suppose that a player's utility for any outcome is given by the quantity: personal pay-off plus r times partner's pay-off, that is U(i, j) = V(i, j) + rV(j, i). Applying this transformation to the personal pay-offs yields Table 2 below, which we refer to as the ‘inclusive fitness pay-off matrix’.

Table 2. Additive PD with inclusive fitness pay-offs

		Partner
		A	S
Actor	A	(b − c)(r + 1)	−c + rb
Actor	S	b − rc	0
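The transformation from personal pay-offs to inclusive fitness pay-offs can be sketched in a few lines of Python. The dictionary representation, the function names `personal_payoffs` and `inclusive_fitness_payoffs`, and the numerical values of b, c and r are illustrative assumptions.

```python
def personal_payoffs(b, c):
    """Table 1: V[(i, j)] is the pay-off from playing i against a partner playing j."""
    return {("A", "A"): b - c, ("A", "S"): -c, ("S", "A"): b, ("S", "S"): 0.0}


def inclusive_fitness_payoffs(V, r):
    """U(i, j) = V(i, j) + r * V(j, i): own pay-off plus r times partner's pay-off."""
    return {(i, j): V[(i, j)] + r * V[(j, i)] for (i, j) in V}


U = inclusive_fitness_payoffs(personal_payoffs(b=3.0, c=1.0), r=0.5)
for profile, utility in U.items():
    print(profile, utility)
```

With b = 3, c = 1 and r = 0.5 the four entries agree with the algebraic forms in Table 2, for instance (b − c)(r + 1) = 3 in the top-left cell.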

This transformation was first suggested by Hamilton (1971) and has been discussed by Grafen (1979), Bergstrom (1995), Wade & Breden (1980), Day & Taylor (1998), Taylor & Nowak (2007) and Martens (2015). It is a natural formalization of the idea that an actor, in their social behaviour, will care about their partner's pay-off, discounted by relatedness, as well as their own pay-off. Clearly, other transformations of the personal pay-off matrix are also conceivable.

The pay-offs in Table 2 do not correspond exactly to the verbal definition of inclusive fitness in Hamilton (1964), which was ‘the personal fitness which an individual actually expresses…once it is stripped of all components which can be considered as due to the individual's social environment…then augmented by certain fractions of the quantities of harm and benefit which the individual himself causes to the fitnesses of his neighbours…The fractions in question are simply the coefficients of relationship’ (p. 8). This definition is sometimes but not always adhered to in the literature.

The discrepancy between Table 2 and Hamilton's definition arises because in the left column, the actor's pay-off has not been stripped of the component that is due to the partner's altruistic action (b) and has been augmented by r times the partner's entire pay-off, rather than the portion of that pay-off that is caused by the actor (b). Applying Hamilton's definition exactly would lead to the pay-off matrix in Table 3 below.

Table 3. Additive PD with Hamilton (1964) pay-offs

		Partner
		A	S
Actor	A	−c + rb	−c + rb
Actor	S	0	0

Note that Table 3 derives from Table 2 by subtraction of the quantity b − rc from the left column. In game-theoretic terms, Table 3 is thus a ‘local shift’ of Table 2 (and vice versa), which means that their Nash equilibria are necessarily identical (Weibull, 1995). Therefore, if the players’ utility function is given by Table 3, game theory predicts exactly the same outcome(s) as if it were given by Table 2. So although taking Table 2 as the definition of inclusive fitness involves an element of ‘double counting’ – which Hamilton's definition was designed to avoid – it is harmless.
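The reason a local shift leaves the equilibria untouched can be made explicit by comparing, column by column, the advantage of playing A over S. The worked step below uses U_2 and U_3 as labels (introduced here only for illustration) for the utilities in Tables 2 and 3.

```latex
% Against an A partner (left column, where the shift b - rc is subtracted):
U_2(A,A) - U_2(S,A) = (b - c)(1 + r) - (b - rc) = rb - c
U_3(A,A) - U_3(S,A) = (rb - c) - 0 = rb - c

% Against an S partner (right column, unchanged by the shift):
U_2(A,S) - U_2(S,S) = U_3(A,S) - U_3(S,S) = rb - c
```

Since a player's best response depends only on these within-column differences, subtracting a constant from a whole column cannot change which strategy is a best response to which.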

In fact, there is a positive reason to prefer Table 2 as the definition of inclusive fitness, in a game-theoretic context, for Hamilton's definition does not generalize easily to nonadditive pay-offs. With nonadditivity, it is unclear how to decide which component of the actor's pay-off is ‘caused’ by its partner's action and vice versa (cf. Allen et al., 2013). By contrast, the definition used in Table 2 – actor pay-off plus r times partner pay-off – applies just as well to the nonadditive case. In order not to prejudge the issue of whether inclusive fitness maximization, or a similar result, obtains under nonadditivity, this is the definition preferred here.

Rational actor analysis: results

Suppose firstly that the utility function is personal pay-off (Table 1). It is easy to see that (S, S) is the only Nash equilibrium of the game, as S strongly dominates A, that is each player does strictly better by playing S irrespective of their partner's choice. This familiar result shows that the rational actor heuristic fails for this choice of utility function, as it would have us conclude that altruism can never evolve, which we know to be false.

What if the utility function is inclusive fitness pay-off (Table 2)? In that case, we can show the following. If rb > c, then (A, A) is the unique Nash equilibrium; if rb < c, then (S, S) is the unique Nash equilibrium; if rb = c, then (A, A) and (S, S) are both Nash equilibria, as is every pair of mixed strategies, so game theory makes no prediction about the players’ choices (see Appendix for proof).

It follows that with additive pay-offs, defining utility as inclusive fitness makes the rational actor heuristic valid. The condition for the A type to evolve, rb > c, is identical to the condition for (A, A) to be the unique Nash equilibrium of the rational game, and similarly for S (Table 4). This supports the idea that evolution will lead organisms to appear as if trying to maximize their inclusive fitness, just as Hamilton originally argued.

Table 4. Rational actor heuristic with utility = inclusive fitness

rb > c ⇔ A evolves ⇔ (A, A) is unique Nash equilibrium

rb < c ⇔ S evolves ⇔ (S, S) is unique Nash equilibrium

rb = c ⇔ no evolution ⇔ all pairs of strategies, pure and mixed, are Nash equilibria

⇔ means ‘if and only if’.

An equivalent perspective on the situation is this. The quantity (rb − c) equals the difference in a player's inclusive fitness pay-off between playing A and S, irrespective of what its partner does (see Table 2). Thus, we can determine whether the A type will evolve by asking whether a rational agent, who wants to maximize their inclusive fitness, would choose A over S. In short, equating utility with inclusive fitness ensures that the rational agent's choice coincides with the ‘choice’ made by natural selection.
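This perspective can be checked numerically by computing the advantage of A over S against each possible partner, first with personal pay-offs and then with inclusive fitness pay-offs. The helper name `gains_from_altruism` and the parameter values are assumptions chosen for illustration.

```python
def gains_from_altruism(U):
    """Return A's utility advantage over S against an A partner and against an S partner."""
    return (U[("A", "A")] - U[("S", "A")], U[("A", "S")] - U[("S", "S")])


b, c, r = 3.0, 1.0, 0.5
table1 = {("A", "A"): b - c, ("A", "S"): -c, ("S", "A"): b, ("S", "S"): 0.0}
table2 = {(i, j): table1[(i, j)] + r * table1[(j, i)] for (i, j) in table1}

print(gains_from_altruism(table1))  # both entries equal -c, so S is always preferred
print(gains_from_altruism(table2))  # both entries equal r*b - c
```

In both tables the two entries coincide, which is what additivity amounts to here: the preferred action does not depend on the partner's choice.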

A caveat: uniqueness

One important caveat is needed. In the above model, the inclusive fitness pay-off matrix (Table 2) is not the unique utility function that yields the rb > c condition for action A to be chosen over S. In game theory, the utility function is only ever unique up to choice of origin and unit; so any affine transformation (of the form U′ = aU + b, where $a, b \in \mathbb{R}$, a > 0) will leave all Nash equilibria of the game unchanged. Furthermore, a ‘local shift’ of the utility function, which involves adding a constant to any column of the utility matrix, will also leave unchanged the Nash equilibria, as noted above.

One local shift of the inclusive fitness pay-off matrix (Table 2) is of particular interest. If we add the quantity (rc − rb) to the left-hand column of Table 2, we get the matrix in Table 5 below.

Table 5. Additive PD with Grafen 1979 pay-offs

		Partner
		A	S
Actor	A	(b − c)	−c + rb
Actor	S	(1 − r)b	0

The pay-offs in Table 5 are related to the personal pay-offs (Table 1) by the transformation U(i, j) = rV(i, i) + (1 − r)V(i, j). This transformation was first suggested by Grafen (1979), hence the label ‘Grafen 1979 pay-off’; see Bergstrom (1995), Day & Taylor (1998), Alger & Weibull (2012) for discussion. By contrast with the inclusive fitness pay-offs (Table 2), which involve adding r times partner's pay-off to the actor's personal pay-off, the Grafen 1979 pay-offs involve taking an (r, 1 − r) weighted average of the personal pay-off that would accrue to the actor if their partner had chosen the same as the actor and if their partner made the choice that they actually did.
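A short sketch of the Grafen 1979 transformation, together with a check that it differs from the inclusive fitness pay-offs only by a constant per column. The function name `grafen_payoffs`, the dictionary layout and the parameter values are illustrative assumptions.

```python
def grafen_payoffs(V, r):
    """U(i, j) = r * V(i, i) + (1 - r) * V(i, j)."""
    return {(i, j): r * V[(i, i)] + (1 - r) * V[(i, j)] for (i, j) in V}


b, c, r = 3.0, 1.0, 0.5
V = {("A", "A"): b - c, ("A", "S"): -c, ("S", "A"): b, ("S", "S"): 0.0}   # Table 1
table5 = grafen_payoffs(V, r)
table2 = {(i, j): V[(i, j)] + r * V[(j, i)] for (i, j) in V}              # Table 2

# For each partner strategy j, collect the differences Table 5 minus Table 2.
# A local shift means every column yields a single constant.
shift = {j: {table5[(i, j)] - table2[(i, j)] for i in ("A", "S")} for j in ("A", "S")}
print(table5)
print(shift)
```

For these values the left column is shifted by rc − rb = −1 and the right column is unchanged, as the additive analysis requires.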

As the Grafen 1979 pay-off matrix (Table 5) is a local shift of the inclusive fitness pay-off matrix (Table 2), the Nash equilibria of the resulting games are identical; thus, the rational actor heuristic works equally well with either. [This is because in both cases, (rb − c) is the pay-off difference between playing A and S.] Therefore, although our simple model has vindicated Hamilton's claim that evolution will lead organisms to behave as if trying to maximize their inclusive fitness, it is important to see that inclusive fitness [whether defined our way (Table 2) or in Hamilton's original way (Table 3)] is not the unique quantity of which this maximization claim is true.

Nonadditive pay-offs

To determine whether the above results generalize to the nonadditive case, we consider a modified Prisoner's dilemma in which the pay-off to an A type paired with another A type is (b − c + d) rather than (b − c). So the parameter d quantifies the deviation from pay-off additivity, or synergistic effect, when two A types are paired together; d can be either positive or negative. The resulting pay-off structure (Table 6) is sometimes referred to as a ‘synergy game’ (van Veelen, 2009).

Table 6. Nonadditive Prisoner's dilemma (‘synergy game’)

		Partner
		A	S
Actor	A	b − c + d	−c
Actor	S	b	0

Again, we assume that pairs of organisms are drawn from an infinite population to play the game; type is genetically hard-wired and mutation is absent.

Evolutionary analysis

As before, Δp denotes the change in frequency of the A type over a generation. Unsurprisingly, rb > c is no longer the condition for Δp to be positive. However, an exact version of Hamilton's rule can be recovered by suitably defining the cost and benefit terms, as emphasized by Gardner et al. (2011), whose approach we follow here. (A different approach, not discussed here, incorporates nonadditive pay-offs into Hamilton's rule by a weak selection approximation; see for example Lehmann & Rousset, 2014b.)

For each individual i, we let w_i denote its actual reproductive fitness (number of offspring). We then write w_i as a linear regression on p_i and $p'_i$ (the type of i's partner):

$w_i = \alpha + \beta_{wp \cdot p'}\, p_i + \beta_{wp' \cdot p}\, p'_i + e_i$ (2)

where α is baseline fitness; $\beta_{wp \cdot p'}$ is the partial regression of an individual's fitness on their own type, controlling for their partner's type; $\beta_{wp' \cdot p}$ is the partial regression of an individual's fitness on their partner's type, controlling for their own type; and e_i is the residual. These partial regression coefficients quantify the average effect (sensu Fisher, 1930) of the actor's action, and of their partner's action, on the actor's fitness.

Following Hamilton (1964), instead of considering the effect on the actor's fitness of their partner's action, $\beta_{wp' \cdot p}$, we can consider the effect on their partner's fitness of the actor's action, denoted $\beta_{w'p \cdot p'}$. These two partial regression coefficients are numerically identical (Taylor et al., 2007). (This is the well-known switch from ‘neighbour-modulated’ to ‘inclusive’ fitness.) Following Gardner et al. (2011), we denote the $\beta_{wp \cdot p'}$ and $\beta_{w'p \cdot p'}$ coefficients as −C and B, respectively.

Importantly, eqn 2 can be fitted whether or not the true relation between w_i, p_i and $p'_i$ is linear. In the nonadditive case under consideration that relation is nonlinear (since d ≠ 0), which implies that the partial regression coefficients −C and B will be functions of population-wide gene frequencies, and liable to change as the population evolves. Therefore, unlike c and b, which are fixed pay-off parameters, −C and B are population variables.

Following Gardner et al. (2007, p. 219), we can write explicit expressions for −C and B in terms of r, p, and the parameters of the pay-off matrix b, c and d. This yields:

$-C = -c + \frac{d\,[r + (1 - r)p]}{1 + r}$ (3)

$B = b + \frac{d\,[r + (1 - r)p]}{1 + r}$ (4)

We can then derive the following expression for evolutionary change:

$\Delta p = \frac{(rB - C)\,\mathrm{Var}(p)}{\overline{w}}$ (5)

where $\overline{w}$ is average population fitness (see Appendix). Equation 5 tells us that when 0 < p < 1, the A type will increase in frequency if and only if rB > C. This is a generalized version of Hamilton's rule, applicable whether pay-offs are additive or not.

The quantity (rB − C), whose sign determines whether altruism spreads, can be computed by adding eqn 3 to r times eqn 4. After simplifying, this yields:

$rB - C = rb - c + d\,[r + (1 - r)p]$ (6)

Note that (rB − C) is a function of p, so satisfaction of rB > C in generation t does not imply its satisfaction in generation t + 1. Selection is thus frequency-dependent, and neither type will necessarily spread to fixation. A polymorphic equilibrium will obtain when p = [c − r(b + d)]/d[1 − r]; the stability of this equilibrium depends on the sign of d. The full evolutionary dynamics are summarized in Table 7 below (see Appendix for proof).
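The frequency dependence of −C and B can be seen by fitting the regression directly. The sketch below computes the two partial regression coefficients from the joint distribution of own type and partner type, assuming constant r, 0 < p < 1, |r| < 1, and fitness equal to a unit baseline plus the Table 6 pay-off. The function name and the helper symbols k and m are hypothetical; the coefficient on partner type is reported as B, relying on the numerical identity noted above.

```python
def regression_cost_benefit(p, r, b, c, d):
    """Return (-C, B): partial regression of fitness on own type x and partner type y."""
    k = r + (1 - r) * p                  # P(partner is A | actor is A)
    m = (1 - r) * p                      # P(partner is A | actor is S)
    # Joint distribution of (own type, partner type) over individuals.
    joint = {(1, 1): p * k, (1, 0): p * (1 - k),
             (0, 1): (1 - p) * m, (0, 0): (1 - p) * (1 - m)}

    def fitness(x, y):
        return 1 - c * x + b * y + d * x * y    # unit baseline plus Table 6 pay-off

    def own(x, y):
        return x

    def partner(x, y):
        return y

    def mean(f):
        return sum(q * f(x, y) for (x, y), q in joint.items())

    def cov(f, g):
        return mean(lambda x, y: f(x, y) * g(x, y)) - mean(f) * mean(g)

    vxx, vyy, vxy = cov(own, own), cov(partner, partner), cov(own, partner)
    cwx, cwy = cov(fitness, own), cov(fitness, partner)
    det = vxx * vyy - vxy ** 2                  # nonzero when 0 < p < 1 and |r| < 1
    minus_C = (cwx * vyy - cwy * vxy) / det     # coefficient on own type
    B = (cwy * vxx - cwx * vxy) / det           # coefficient on partner type
    return minus_C, B


r, b, c, d = 0.5, 0.6, 0.1, -0.3
for p in (0.2, 0.4, 0.8):
    minus_C, B = regression_cost_benefit(p, r, b, c, d)
    print(p, minus_C, B, r * B + minus_C)       # last column is rB - C
```

For these illustrative parameters the last column changes sign as p increases, which is the frequency dependence described above; it vanishes at p = 1/3, the value given by the polymorphism formula.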

Table 7. Evolutionary dynamics of nonadditive PD

Case 1: r < 1, d > 0
(i)	rb − c + rd ≥ 0	A evolves to fixation
(ii)	rb − c + d ≤ 0	S evolves to fixation
(iii)	rb − c + d > 0 > rb − c + rd	Unstable polymorphism at p = [c − r(b + d)]/d[1 − r]
Case 2: r < 1, d < 0
(i)	rb − c + d ≥ 0	A evolves to fixation
(ii)	rb − c + rd ≤ 0	S evolves to fixation
(iii)	rb − c + d < 0 < rb − c + rd	Stable polymorphism at p = [c − r(b + d)]/d[1 − r]
Case 3: r = 1
(i)	b − c + d > 0	A evolves to fixation
(ii)	b − c + d < 0	S evolves to fixation
(iii)	b − c + d = 0	No evolutionary change
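The stable and unstable polymorphisms in Table 7 can be illustrated by iterating the dynamics from different starting frequencies. This is a sketch under assumptions: r is constant, parameters are chosen so that all fitnesses stay positive, the fitness difference between types is taken to be rb − c + d[r + (1 − r)p], and mean fitness is computed from the Table 6 pay-offs. The function names and parameter values are hypothetical.

```python
def next_p(p, r, b, c, d):
    """One generation of the synergy game with constant r."""
    k = r + (1 - r) * p                       # P(partner is A | actor is A)
    selection = r * b - c + d * k             # fitness of A minus fitness of S
    w_bar = 1 + p * (b - c) + d * p * k       # mean fitness over a unit baseline
    return p + p * (1 - p) * selection / w_bar


def iterate(p0, r, b, c, d, generations=3000):
    """Apply next_p repeatedly, starting from frequency p0."""
    p = p0
    for _ in range(generations):
        p = next_p(p, r, b, c, d)
    return p


# Case 2(iii): d < 0 and rb - c + d < 0 < rb - c + rd (stable polymorphism at p = 1/3).
stable = [iterate(p0, 0.5, 0.6, 0.1, -0.3) for p0 in (0.05, 0.9)]
# Case 1(iii): d > 0 and rb - c + d > 0 > rb - c + rd (unstable polymorphism at p = 1/3).
unstable = [iterate(p0, 0.5, 0.3, 0.35, 0.3) for p0 in (0.25, 0.4)]
print(stable, unstable)
```

In the first parameter set both starting points converge on the interior equilibrium; in the second, populations starting below it lose A and those starting above it fix A.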

The general version of Hamilton's rule embodied in eqn 5 raises interesting interpretive questions. Some have argued that the rule in this form has little explanatory value (Nowak et al., 2011; Allen et al., 2013), whereas others have seen the generality of the rule as an advantage, a proof that inclusive fitness theory does not rely on restrictive assumptions (Gardner et al., 2011). This debate has been analysed elsewhere (Birch, 2014; Birch & Okasha, 2015) and is not the focus here.

Instead, our question is this. Given that eqn 5 is true, and given the resulting evolutionary dynamics, can the rational actor heuristic be applied? Will evolution lead organisms to behave as if maximizing a utility function, and if so what is it?

Importantly, the answer to this question cannot simply be read off eqn 5. In the additive case, there was a simple link between Hamilton's rule and a utility function with the desired property: rb − c > 0 was the condition for the A type to spread, and (rb − c) the utility difference between playing A and S. One might hope to extrapolate this to the nonadditive case by simply replacing (rb − c) with (rB − C) in Table 2. However, as B and C are functions of p, they cannot meaningfully feature in the utility function.

The reason is as follows. The point of the rational actor heuristic is to find a link between gene-frequency dynamics and a ‘goal’ that organisms behave as if they are trying to achieve. Such a link would be trivial if the ‘goal’ were allowed to change as gene frequencies change. For the heuristic to have any value, the goal must remain fixed. So our task is to find a utility function whose arguments are restricted to the pay-off parameters (b, c and d), and the relatedness coefficient r, which makes the rational actor heuristic work.

Rational actor analysis

To address this question, we again transpose the evolutionary model to a rational choice context and study the Nash equilibria of the resulting game. Suppose firstly that the utility function is given by the inclusive fitness pay-off transformation, that is personal pay-off plus r times partner pay-off. This yields the pay-offs in Table 8 below.

Table 8. Nonadditive PD with inclusive fitness pay-offs

		Partner
		A	S
Actor	A	(b − c + d)(r + 1)	−c + rb
Actor	S	b − rc	0

The Nash equilibria are then as follows:

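$(A, A) \text{ is a Nash equilibrium} \iff rb - c + d(1 + r) \geq 0$

$(S, S) \text{ is a Nash equilibrium} \iff rb - c \leq 0$

$(\pi_A, \pi_A) \text{ with } \pi_A = \frac{c - rb}{d(1 + r)} \text{ is a Nash equilibrium} \iff 0 < \pi_A < 1$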

It follows that, unlike in the additive case, the rational actor heuristic does not work when utility is defined as inclusive fitness. The condition for (A, A) to be a Nash equilibrium is not identical to the condition for A to evolve to fixation, and similarly for S. Furthermore, the condition for there to be a mixed-strategy Nash equilibrium is not the same as the condition for there to be a polymorphism. So it is not true that at evolutionary equilibrium, organisms will behave as if trying to maximize their inclusive fitness.
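A numerical instance of this mismatch may help. The sketch below builds the inclusive fitness pay-offs for one illustrative parameter set lying in Case 2(iii) of Table 7, finds the symmetric mixed-strategy equilibrium from the indifference condition, and compares it with the polymorphic frequency. The helper name `interior_symmetric_mix` and the parameter values are assumptions; only symmetric strategy profiles are examined.

```python
def interior_symmetric_mix(U):
    """Probability on A that makes a player indifferent between A and S, or None."""
    gain_vs_a = U[("A", "A")] - U[("S", "A")]   # advantage of A against an A partner
    gain_vs_s = U[("A", "S")] - U[("S", "S")]   # advantage of A against an S partner
    if gain_vs_a == gain_vs_s:
        return None                              # additive case: no unique interior mix
    pi = gain_vs_s / (gain_vs_s - gain_vs_a)
    return pi if 0.0 < pi < 1.0 else None


r, b, c, d = 0.5, 0.6, 0.1, -0.3
V = {("A", "A"): b - c + d, ("A", "S"): -c, ("S", "A"): b, ("S", "S"): 0.0}   # Table 6
table8 = {(i, j): V[(i, j)] + r * V[(j, i)] for (i, j) in V}                  # Table 8

nash_mix = interior_symmetric_mix(table8)
polymorphism = (c - r * (b + d)) / (d * (1 - r))
print(nash_mix, polymorphism)
```

Here the equilibrium weight on A under inclusive fitness pay-offs is 4/9, whereas the polymorphism sits at 1/3, so the two predictions come apart.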

Can we find a utility function modulo which the rational actor heuristic works? The answer is yes. The Grafen 1979 pay-off matrix, which to recall is derived from the personal pay-off matrix by the transformation U(i, j) = rV(i, i) + (1 − r)V(i, j), does the trick. This yields the pay-offs in Table 9 below.

Table 9. Nonadditive PD with Grafen 1979 pay-offs

		Partner
		A	S
Actor	A	(b − c + d)	−c + rb + rd
Actor	S	(1 − r)b	0

The Nash equilibria are then as follows:

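$(A, A) \text{ is a Nash equilibrium} \iff rb - c + d \geq 0$

$(S, S) \text{ is a Nash equilibrium} \iff rb - c + rd \leq 0$

$(\pi_A, \pi_A) \text{ with } \pi_A = \frac{c - r(b + d)}{d(1 - r)} \text{ is a Nash equilibrium} \iff 0 < \pi_A < 1$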

This restores the rational actor heuristic. In particular, if (A, A) is the only pure-strategy Nash equilibrium, then A evolves to fixation; if (S, S) is the only pure-strategy equilibrium, then S evolves to fixation. If there is a mixed-strategy Nash equilibrium but no pure-strategy equilibria, the population evolves to a stable polymorphism; if there is a mixed-strategy Nash equilibrium and both (A, A) and (S, S) are pure-strategy equilibria, then there is an unstable polymorphism; in both cases, the weights on A and S in the mixed-strategy Nash equilibrium equal the proportions of A and S in the polymorphism. Thus, there is a tight correspondence between the Nash equilibria and the evolutionary dynamics, summarized in Table 10 (see Appendix for proof).

Table 10. Rational actor heuristic, utility = Grafen 1979 pay-off

(A, A) is only pure N.E. ⇒ A evolves to fixation

(S, S) is only pure N.E. ⇒ S evolves to fixation

(π_A, π_A) is the only N.E. ⇔ stable polymorphism at p = π_A

(π_A, π_A), (A, A), (S, S) all N.E. ⇔ unstable polymorphism at p = π_A

Note: π_A = (c − r(b + d))/d(1 − r).
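The value of π_A in the note can be recovered from the indifference condition for a mixed strategy. In the worked step below, U_9 is a label introduced here for the Table 9 utilities, and the final comparison with the selection term is derived from the regression definitions of B and C under the model's assumptions rather than quoted from the text.

```latex
% Expected advantage of A over S against a partner who plays A with probability \pi_A:
\pi_A\,[U_9(A,A) - U_9(S,A)] + (1 - \pi_A)\,[U_9(A,S) - U_9(S,S)]
  = \pi_A\,(rb - c + d) + (1 - \pi_A)\,(rb - c + rd)
  = rb - c + rd + \pi_A\, d(1 - r)

% Indifference (setting the expression to zero) gives
\pi_A = \frac{c - r(b + d)}{d(1 - r)}

% The selection term has the same form, with p in place of \pi_A:
rB - C = rb - c + rd + p\, d(1 - r)
```

The advantage of A under Grafen 1979 pay-offs against a π_A-mixer is thus the same linear function as the selection term at frequency p, which is why the mixed equilibrium and the polymorphism coincide.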

The upshot is that with nonadditive pay-offs, the rational actor heuristic will work so long as the utility function is defined as Grafen 1979 pay-off, rather than inclusive fitness pay-off. Again any affine transformation of the Grafen 1979 pay-off matrix, or any local shift, will also preserve the correspondences above. Note that, unlike in the additive case, the Grafen 1979 pay-off matrix (Table 9) is not a local shift of the inclusive fitness pay-off matrix (Table 8). This is why the rational actor heuristic fails if utility is defined as inclusive fitness in the nonadditive case.

Discussion

Hamilton's original formulation of inclusive fitness theory assumed additivity of costs and benefits. A number of authors have emphasized that an exact version of Hamilton's rule holds with nonadditive pay-offs, so long as the −C and B terms are suitably defined. Here, we have focused on the relevance of pay-off additivity not for Hamilton's rule itself, but for Hamilton's (logically distinct) claim that evolution will lead organisms to behave as if trying to maximize their inclusive fitness, understood here to mean personal pay-off plus r times partner pay-off.

In a recent critique, Allen et al. (2013) observe that arguments for inclusive fitness maximization all rely on pay-off additivity, and that where selection is frequency-dependent, fitness maximization need not generally occur. They write: ‘evolution does not, in general, lead to the maximization of inclusive fitness or any other quantity’ (p. 20138).

Our analysis partly supports this conclusion. Here, we have understood maximization to include best-response, so that the presence of frequency dependence does not automatically preclude a maximization principle from holding; we have allowed the utility function to be any function of the pay-off parameters b, c and d and the relatedness coefficient r. At the evolutionary equilibrium of our simple nonadditive model, it is not true that organisms behave as if trying to maximize their inclusive fitness pay-off. However, there is a somewhat similar quantity – Grafen 1979 pay-off – that organisms do behave as if they are trying to maximize.

It is an open question whether our positive result – maximization of Grafen 1979 pay-off – extends to more complicated models of social evolution, for example those that incorporate local interaction, multiple social partners or class structure, or to genetic architectures more realistic than haploid inheritance. There is no guarantee that it does, as such models typically lead to more complicated evolutionary dynamics than those assumed here. As has been emphasized before, a valid maximization argument must always deduce the quantity being maximized, if any, from the underlying evolutionary dynamics (Mylius & Diekmann, 1995).

Also, we have assumed that the coefficient of relatedness, r, remains constant as the population evolves. Without this assumption, it makes little sense to allow the utility function to depend on r, as this would be tantamount to positing a changing ‘goal’ and so would again trivialize the rational actor heuristic. In some inclusive fitness models, r is in fact a dynamic variable rather than a constant (e.g. van Baalen & Rand, 1998), so it cannot be assumed that our results, or ones like them, can be derived for these models.

Our negative result, that maximization of inclusive fitness only holds with additive pay-offs, is in line with previous results by Bergstrom (1995) and Lehmann & Rousset (2014a); it supports some of the claims made by opponents of inclusive fitness theory such as Allen & Nowak (2015). The key logical point to note is that although a version of Hamilton's rule is indeed a fully general evolutionary principle, as Gardner et al. (2011) stress, no principle about individual maximization can be deduced directly from this form of the rule. Whether such a principle holds, and if so what the quantity being maximized is, needs to be shown on a case-by-case basis.

Finally, what are the implications for biological practice? Behavioural ecologists have often used inclusive fitness maximization as a way to interpret observed behaviour in the field, in line with Hamilton's original suggestion. Our analysis suggests that this will not always be possible. If an observed social behaviour fails to maximize an individual's inclusive fitness, defined as personal pay-off plus r times partner's pay-off, the behaviour may nonetheless be adaptive and the population at an evolutionary equilibrium. Moreover, the quantity we have called ‘Grafen 1979 pay-off’ will serve the needs of the behavioural ecologist seeking to identify the ‘goal’ of evolved behaviour in a broader range of cases than will inclusive fitness itself.

Acknowledgments

Thanks to Ken Binmore, Bengt Autzen, Jonathan Birch, Steve Frank, Herb Gintis, Alan Grafen, Andy Gardner and two anonymous referees for their comments and discussion. This work was supported by the European Research Council Seventh Framework Program (FP7/2007–2013), ERC Grant agreement no. 295449.

Appendix 1

Additive pay-offs

(i) By the covariance formula of Price (1970), we have:

$urn:x-wiley:1010061X:media:jeb12808:jeb12808-math-0031$

where w_i is the fitness of the ith individual and w_A is the average fitness of the A type.

From Table 1 and the definition of r,

$urn:x-wiley:1010061X:media:jeb12808:jeb12808-math-0032$

$urn:x-wiley:1010061X:media:jeb12808:jeb12808-math-0033$

Substituting into the expression for Δp and simplifying:

$urn:x-wiley:1010061X:media:jeb12808:jeb12808-math-0034$

which is eqn 2.

(ii) When utility = inclusive fitness pay-off (Table 2), then by the definition of Nash Equilibrium (N.E.):

$urn:x-wiley:1010061X:media:jeb12808:jeb12808-math-0035$

$urn:x-wiley:1010061X:media:jeb12808:jeb12808-math-0036$

Therefore, (A, A) and (S, S) are both N.E. ⇔ rb = c.

Let π_A be an arbitrary mixed strategy that plays A with probability π_A.

Then, by the definition of mixed strategy pay-off:

$urn:x-wiley:1010061X:media:jeb12808:jeb12808-math-0038$

where U(A, π_A) is the expected pay-off to strategy A played against π_A, etc.

By Table 2:

$urn:x-wiley:1010061X:media:jeb12808:jeb12808-math-0039$

$urn:x-wiley:1010061X:media:jeb12808:jeb12808-math-0040$

Therefore, U(A, π_A) = U(S, π_A) ⇔ rb = c ⇔ (π_A, π_A) is a N.E.

This establishes the correspondences in Table 4.
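The additive result in (ii) can be checked numerically. The sketch below is illustrative only: it assumes the standard additive donation game (an A player pays c to give b to its partner, with baseline fitness omitted) as a stand-in for Table 1, and it builds utility as personal pay-off plus r times partner pay-off. The function names are hypothetical and are not taken from the article.

```python
def inclusive_fitness_utility(b, c, r):
    """Utility matrix U[(own, other)] = personal pay-off + r * partner pay-off.

    Assumed personal pay-offs (additive donation game, baseline omitted):
    A pays c and confers b on its partner; S does neither.
    """
    personal = {
        ("A", "A"): b - c,
        ("A", "S"): -c,
        ("S", "A"): b,
        ("S", "S"): 0.0,
    }
    return {
        (own, other): personal[(own, other)] + r * personal[(other, own)]
        for (own, other) in personal
    }


def expected_utility(utility, own, prob_a):
    """Expected utility of a pure strategy against a partner playing A with prob_a."""
    return prob_a * utility[(own, "A")] + (1.0 - prob_a) * utility[(own, "S")]


def advantage_of_a(b, c, r, prob_a):
    """U(A, pi) - U(S, pi); in the additive case this equals rb - c for every pi."""
    utility = inclusive_fitness_utility(b, c, r)
    return expected_utility(utility, "A", prob_a) - expected_utility(utility, "S", prob_a)


def is_symmetric_pure_ne(utility, strategy):
    """(s, s) is a Nash equilibrium iff s is a best response to itself."""
    other = "S" if strategy == "A" else "A"
    return utility[(strategy, strategy)] >= utility[(other, strategy)]
```

Because the advantage of A over S does not depend on the partner's mixing probability, every mixed profile is an equilibrium exactly when rb = c, which is the knife-edge case described above.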

Nonadditive pay-offs

(i) By the covariance formula of Price (1970), we have:

$urn:x-wiley:1010061X:media:jeb12808:jeb12808-math-0043$

From eqn 3, we have: $urn:x-wiley:1010061X:media:jeb12808:jeb12808-math-0044$

Substituting into the expression for ∆p:

$urn:x-wiley:1010061X:media:jeb12808:jeb12808-math-0045$

which is eqn 5.

(ii) Equation 6 tells us that rB − C = (rb − c) + d[r + p(1 − r)]

Therefore, A evolves to fixation from any initial frequency

⇔ (rb − c) + d[r + p(1 − r)] > 0 for all p ∈ (0, 1)

Similarly, S evolves to fixation from any initial frequency

⇔ (rb − c) + d[r + p(1 − r)] < 0 for all p ∈ (0, 1)

Polymorphic equilibrium obtains when rB − C = 0. Solving for p gives:

$p^{*} = \frac{c - r(b + d)}{d(1 - r)}$

If d > 0, then ∆p < 0 when p < p* and ∆p > 0 when p > p*, so the equilibrium is unstable. If d < 0, the equilibrium is stable.
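The equilibrium frequency and its stability can be traced step by step from the expression for rB − C in (ii). The following is a sketch under the assumptions r < 1 and d ≠ 0, taking the sign of ∆p to match the sign of rB − C, as the stability statement above does.

```latex
\begin{align*}
rB - C &= (rb - c) + d\,[\,r + p(1 - r)\,] = 0 \\
d(1 - r)\,p^{*} &= c - rb - rd \\
p^{*} &= \frac{c - r(b + d)}{d(1 - r)} \\[4pt]
\frac{\partial\,(rB - C)}{\partial p} &= d(1 - r)
\end{align*}
```

With r < 1, the slope of rB − C in p has the sign of d. When d > 0, rB − C is negative below p* and positive above it, so small departures from p* are amplified; when d < 0, the signs reverse and departures are corrected.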

Suppose that r < 1 and d > 0 (case 1).

Then, (rb − c) + d[r + p(1 − r)] → (rb − c + rd) as p → 0
and (rb − c) + d[r + p(1 − r)] → (rb − c + d) as p → 1
Therefore, A evolves to fixation ⇔ (rb − c) + rd ≥ 0.
Similarly, S evolves to fixation ⇔ (rb − c) + d ≤ 0.
When rb − c + d > 0 > rb − c + rd, then 0 < [c − r(b + d)]/(d[1 − r]) < 1, so an unstable polymorphic equilibrium exists.

Suppose that r < 1 and d < 0 (case 2).

By identical reasoning to case 1:
A evolves to fixation ⇔ (rb − c) + d ≥ 0.
S evolves to fixation ⇔ (rb − c) + rd ≤ 0.
When rb − c + d < 0 < rb − c + rd, then 0 < [c − r(b + d)]/(d[1 − r]) < 1, so a stable polymorphism evolves.

Suppose that r = 1 (case 3).

Then, rB − C = b − c + d. Therefore,
A evolves to fixation ⇔ b − c + d > 0
S evolves to fixation ⇔ b − c + d < 0
No evolutionary change occurs ⇔ b − c + d = 0

This establishes the evolutionary dynamics in Table 7.
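The case analysis in (ii) can be summarized as a small classifier. This is a minimal sketch rather than the authors' own procedure: it uses only the expression rB − C = (rb − c) + d[r + p(1 − r)] and the conditions for cases 1 to 3 above. The function names and the returned labels are hypothetical, and treating d = 0 like case 3 (rB − C constant in p) is an added assumption.

```python
def net_benefit(p, r, b, c, d):
    """rB - C = (rb - c) + d[r + p(1 - r)] at frequency p of type A."""
    return (r * b - c) + d * (r + p * (1.0 - r))


def classify_dynamics(r, b, c, d):
    """Return (outcome, p_star) following cases 1-3 of the appendix.

    p_star is None unless an interior equilibrium exists.
    """
    at_zero = net_benefit(0.0, r, b, c, d)  # limit as p -> 0: rb - c + rd
    at_one = net_benefit(1.0, r, b, c, d)   # limit as p -> 1: rb - c + d

    # Case 3 (r = 1), and by assumption d = 0: rB - C does not depend on p.
    if at_zero == at_one:
        if at_zero > 0:
            return ("A fixes", None)
        if at_zero < 0:
            return ("S fixes", None)
        return ("no change", None)

    # Cases 1 and 2: rB - C is monotone in p, so the endpoint limits decide.
    if min(at_zero, at_one) >= 0:
        return ("A fixes", None)
    if max(at_zero, at_one) <= 0:
        return ("S fixes", None)

    # Otherwise rB - C changes sign at an interior frequency p*.
    p_star = (c - r * (b + d)) / (d * (1.0 - r))
    outcome = "unstable polymorphism" if d > 0 else "stable polymorphism"
    return (outcome, p_star)
```

For example, `classify_dynamics(0.5, 2.0, 2.0, 1.5)` falls under case 1 with rb − c + d > 0 > rb − c + rd, and `classify_dynamics(0.5, 4.0, 1.0, -1.5)` falls under case 2 with the inequalities reversed.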

(iii) When utility = inclusive fitness pay-off (Table 8), then by the definition of Nash equilibrium:

$urn:x-wiley:1010061X:media:jeb12808:jeb12808-math-0056$

$urn:x-wiley:1010061X:media:jeb12808:jeb12808-math-0057$

Let π_A be an arbitrary mixed strategy. Then, (π_A, π_A) is a N.E. if and only if U(A, π_A) = U(S, π_A). From Table 8,

$urn:x-wiley:1010061X:media:jeb12808:jeb12808-math-0058$

$urn:x-wiley:1010061X:media:jeb12808:jeb12808-math-0059$

Therefore, $urn:x-wiley:1010061X:media:jeb12808:jeb12808-math-0060$

(iv) When utility = Grafen 1979 pay-off (Table 9), then by the definition of Nash equilibrium:

$urn:x-wiley:1010061X:media:jeb12808:jeb12808-math-0061$

$urn:x-wiley:1010061X:media:jeb12808:jeb12808-math-0062$

Let π_A be an arbitrary mixed strategy. Then, (π_A, π_A) is a N.E. if and only if U(A, π_A) = U(S, π_A). From Table 9,

$urn:x-wiley:1010061X:media:jeb12808:jeb12808-math-0063$

$urn:x-wiley:1010061X:media:jeb12808:jeb12808-math-0064$

Therefore, $urn:x-wiley:1010061X:media:jeb12808:jeb12808-math-0065$

If (A, A) is the sole pure-strategy N.E.

$urn:x-wiley:1010061X:media:jeb12808:jeb12808-math-0066$

$urn:x-wiley:1010061X:media:jeb12808:jeb12808-math-0067$

$urn:x-wiley:1010061X:media:jeb12808:jeb12808-math-0068$

Similarly, (S, S) is sole pure-strategy N.E. ⇒ S spreads to fixation.

Consider mixed strategy (π_A, π_A), where $urn:x-wiley:1010061X:media:jeb12808:jeb12808-math-0069$

Then, (π_A, π_A) is the only N.E.

$urn:x-wiley:1010061X:media:jeb12808:jeb12808-math-0070$

$urn:x-wiley:1010061X:media:jeb12808:jeb12808-math-0071$

$urn:x-wiley:1010061X:media:jeb12808:jeb12808-math-0072$

Also, (π_A, π_A), (A, A) and (S, S) are all N.E.

$urn:x-wiley:1010061X:media:jeb12808:jeb12808-math-0073$

$urn:x-wiley:1010061X:media:jeb12808:jeb12808-math-0074$

$urn:x-wiley:1010061X:media:jeb12808:jeb12808-math-0075$

This establishes the correspondences in Table 10.

References

Alger, I. & Weibull, J.W. 2012. A generalization of Hamilton's rule – love others how much? J. Theor. Biol. 299: 42–54. doi:10.1016/j.jtbi.2011.05.008
Allen, B. & Nowak, M.A. 2015. Games among relatives revisited. J. Theor. Biol. 378: 103–116.
Allen, B., Nowak, M.A. & Wilson, E.O. 2013. Limitations of inclusive fitness. Proc. Natl. Acad. Sci. USA 110: 20135–20139. doi:10.1073/pnas.1317588110
van Baalen, M. & Rand, D.A. 1998. The unit of selection in viscous populations and the evolution of altruism. J. Theor. Biol. 193: 631–648. doi:10.1006/jtbi.1998.0730
Bergstrom, T. 1995. On the evolution of altruistic ethical rules for siblings. Am. Econ. Rev. 85: 58–81.
Binmore, K. 2007. Playing for Real. Oxford University Press, Oxford. doi:10.1093/acprof:oso/9780195300574.001.0001
Birch, J. 2014. Hamilton's rule and its discontents. Br. J. Philos. Sci. 65: 381–411. doi:10.1093/bjps/axt016
Birch, J. & Okasha, S. 2015. Kin selection and its critics. Bioscience 65: 22–32. doi:10.1093/biosci/biu196
Day, T. & Taylor, P.D. 1998. Unifying genetic and game theoretic models of kin selection for continuous traits. J. Theor. Biol. 194: 391–407. doi:10.1006/jtbi.1998.0762
Fisher, R.A. 1930. The Genetical Theory of Natural Selection. Clarendon Press, Oxford.
Frank, S.A. 1998. Foundations of Social Evolution. Princeton University Press, Princeton, NJ. doi:10.1515/9780691206820
Frank, S.A. 2013. Natural selection VII. History and interpretation of kin selection theory. J. Evol. Biol. 26: 1151–1184. doi:10.1111/jeb.12131
Gardner, A., West, S.A. & Barton, N. 2007. The relation between multilocus population genetics and social evolution theory. Am. Nat. 169: 207–226. doi:10.1086/510602
Gardner, A., West, S.A. & Wild, G. 2011. The genetical theory of kin selection. J. Evol. Biol. 24: 1020–1043. doi:10.1111/j.1420-9101.2011.02236.x
Grafen, A. 1979. The hawk-dove game played between relatives. Anim. Behav. 27: 905–907. doi:10.1016/0003-3472(79)90028-9
Grafen, A. 2006. Optimization of inclusive fitness. J. Theor. Biol. 238: 541–563. doi:10.1016/j.jtbi.2005.06.009
Grafen, A. 2009. Formalizing Darwinism and inclusive fitness theory. Philos. Trans. R. Soc. B Biol. Sci. 364: 3135–3141. doi:10.1098/rstb.2009.0056
Hamilton, W.D. 1964. The genetical evolution of social behaviour. J. Theor. Biol. 7: 1–52. doi:10.1016/0022-5193(64)90038-4
Hamilton, W.D. 1971. Selection of selfish and altruistic behaviour in some extreme models. In: Narrow Roads of Gene Land, Vol. 1, pp. 198–228. W. H. Freeman, New York, NY.
Lehmann, L. & Rousset, F. 2014a. Fitness, inclusive fitness and optimization. Biol. Philos. 29: 181–195. doi:10.1007/s10539-013-9415-x
Lehmann, L. & Rousset, F. 2014b. The genetical theory of social behaviour. Philos. Trans. R. Soc. B Biol. Sci. 369: 20130357. doi:10.1098/rstb.2013.0357
Lehmann, L., Alger, I. & Weibull, J. 2015. Does evolution lead to maximizing behaviour? Evolution 69: 1858–1873. doi:10.1111/evo.12701
Martens, J. 2015. Hamilton meets causal decision theory. Br. J. Philos. Sci., in press.
Michod, R. & Hamilton, W.D. 1980. Coefficients of relationship in sociobiology. Nature 288: 694–697. doi:10.1038/288694a0
Mylius, S.D. & Diekmann, O. 1995. On evolutionarily stable life histories, optimization and the need to be specific about density dependence. Oikos 74: 218–224. doi:10.2307/3545651
Nowak, M.A., Tarnita, C.E. & Wilson, E.O. 2011. Nowak et al. reply. Nature 471: E9–E10. doi:10.1038/nature09836
Price, G.R. 1970. Selection and covariance. Nature 227: 520–521. doi:10.1038/227520a0
Queller, D.C. 1992. A general model for kin selection. Evolution 46: 376–380. doi:10.2307/2409858
Queller, D.C. 2011. Expanded social fitness and Hamilton's rule for kin, kith and kind. Proc. Natl. Acad. Sci. USA 108: 10792–10799. doi:10.1073/pnas.1100298108
Sober, E. 1988. Three differences between evolution and deliberation. In: Modeling Rationality, Morality and Evolution (P. Danielson, ed.), pp. 408–422. Oxford University Press, Oxford.
Taylor, C. & Nowak, M.A. 2007. Transforming the dilemma. Evolution 61: 2281–2292. doi:10.1111/j.1558-5646.2007.00196.x
Taylor, P.D., Wild, G. & Gardner, A. 2007. Direct fitness or inclusive fitness: how shall we model kin selection? J. Evol. Biol. 20: 301–309. doi:10.1111/j.1420-9101.2006.01196.x
van Veelen, M. 2009. Group selection, kin selection, altruism and cooperation: when inclusive fitness is right and when it can be wrong. J. Theor. Biol. 259: 589–600. doi:10.1016/j.jtbi.2009.04.019
Wade, M. & Breden, F. 1980. The evolution of cheating and selfish behaviour. Behav. Ecol. Sociobiol. 7: 167–172. doi:10.1007/BF00299360
Weibull, J. 1995. Evolutionary Game Theory. MIT Press, Cambridge, MA.


Volume 29, Issue 3, March 2016, Pages 473–482
