Volume 90, Issue 5 pp. 2129-2159
Original Articles

A General Framework for Robust Contracting Models

Daniel Walton, Gabriel Carroll

Corresponding Author

Gabriel Carroll

Department of Economics, University of Toronto

First published: 20 October 2022
Formerly titled “When are Robust Contracts Linear?” We thank Rohan Pitchford, Kieron Meagher, Andrés Carvajal, Ilya Segal, Idione Meneghel, Oleg Itskhoki, Ayça Kaya, Marina Halac, Stephen Morris, Matt Jackson, and Laura Doval for helpful comments and discussions, as well as audiences at ANU, BYU, UC Davis, Caltech, Johns Hopkins, and Texas A&M. This research was supported by a Sloan Foundation Fellowship and an NSF CAREER grant. Parts of this work were done while the second author was visiting the Cowles Foundation at Yale and the Research School of Economics at ANU, and he gratefully acknowledges their hospitality. Authors are listed in random order; both contributed equally. An earlier version of this paper was part of the first author's PhD thesis at Stanford University.

Abstract

We study a class of models of moral hazard in which a principal contracts with a counterparty, which may have its own internal organizational structure. The principal has non-Bayesian uncertainty as to what actions might be taken in response to the contract, and wishes to maximize her worst-case payoff. We identify conditions on the counterparty's possible responses to any given contract that imply that a linear contract solves this maxmin problem. In conjunction with a Richness property motivated by much previous literature, we identify a Responsiveness property that is sufficient—and, in an appropriate sense, also necessary—to ensure that linear contracts are optimal. We illustrate by contrasting several possible models of contracting in hierarchies. The analysis demonstrates how one can distill key features of contracting models that allow their findings to be carried beyond the bilateral setting.

1 Introduction

Suppose that a principal wishes to write an incentive contract to induce productive effort. How should she structure the incentives so as to optimally ensure their effectiveness? A rich theoretical literature has explored this question, giving arguments in favor of one or another form of contract.

However, this literature has generally focused on models involving a single principal and a single agent. In reality, agency often takes place beyond simple bilateral relationships. For example, the principal may be a firm or government office, procuring a good of unpredictable quality from a supplier, and committing to a payment that depends on the realized quality; but the supplier has its own internal agency problem, since the representative who signs the contract with the principal may not be the same worker who produces the good. Can we abstract away from the specificity of bilateral contracting models to understand when and why their lessons carry over to models of more complex organizations?

In this paper, we focus on one such lesson that has arisen frequently: that linear contracts—which simply pay some fixed fraction of the output produced—perform well by aligning the expected payoffs of the two parties. The literature on this theme has generally drawn on the idea that there may be a large space of possible actions by the agent, a notion that we formalize subsequently under the name of “Richness.” With a narrow space of possible actions, the structure of the optimal contract may be nonlinear and finely tuned to the known possibilities; but under richness, any such nonlinearities are vulnerable to strategic gaming by the agent. Incarnations of this idea have appeared in static moral hazard models (Diamond (1998), Carroll (2015), Barron, Georgiadis, and Swinkels (2020), Antić (2021)), dynamic moral hazard (Holmström and Milgrom (1987)), and screening as well (Malenko and Tsoy (2020)).

We focus more specifically on the following version of the argument, based on robustness to uncertainty: Consider a principal (“she”) and an agent (“he”), both risk-neutral, who agree to a contract that pays the agent 1/4 of whatever output he produces. Suppose that the principal does not know exactly what productive actions the agent is able to take, but she knows he has some action available that will give him an expected payoff of at least 1500 under this contract. Then, even without any information about what other actions are available, the principal can be sure the agent will get a payoff at least 1500 and, therefore, she gets at least 4500 for herself (since she receives 3/4 of output against the agent's 1/4). This argument was developed previously by Carroll (2015), which formalized this idea of a guarantee for the principal via a worst-case criterion, and showed more generally that linear contracts are optimal under such a criterion.
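The arithmetic behind this guarantee can be spelled out in a few lines. The sketch below is only an illustration: the function name `principal_guarantee` and its parameters are hypothetical (they do not come from the paper), and it assumes, as in the example, risk-neutral parties, a nonnegative effort cost, and a linear contract whose share lies strictly between 0 and 1.

```python
from fractions import Fraction


def principal_guarantee(share: Fraction, agent_floor: Fraction) -> Fraction:
    """Lower bound on the principal's expected payoff under a linear contract.

    Hypothetical helper for illustration. The agent's payoff is
    share * E[y] minus a nonnegative effort cost, so a payoff of at least
    agent_floor forces E[y] >= agent_floor / share. The principal keeps the
    remaining (1 - share) of expected output.
    """
    if not 0 < share < 1:
        raise ValueError("share must lie strictly between 0 and 1")
    min_expected_output = agent_floor / share
    return (1 - share) * min_expected_output


# The example from the text: a 1/4 share and an agent payoff of at least 1500.
print(principal_guarantee(Fraction(1, 4), Fraction(1500)))  # prints 4500
```

The bound uses only the agent's known payoff floor, which is why it holds regardless of what other actions the agent may have.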

To see the difficulty in generalizing this conclusion beyond the simple bilateral setting, consider now a three-player hierarchy: The principal contracts with a supervisor (also “she”); the supervisor then subcontracts with an agent, and the agent chooses the action that determines output. Payments in both contracts are functions of output only. There are (at least) three natural ways to write this model, with different informational assumptions:

  • (i) As in the bilateral model, the principal knows some actions available to the agent, but there may be other actions that she does not know about. The supervisor, however, fully knows the agent's production technology.
  • (ii) The supervisor may know more actions than the principal does, but she suspects that the agent has still more actions available. Thus, the supervisor maximizes a worst-case objective with respect to unknown actions the agent may have; the principal has uncertainty over both the agent's possible actions and the supervisor's knowledge.
  • (iii) The supervisor knows no more than the principal does; both of them face the same uncertainty about the agent's technology (and both maximize for the worst case).

We outline these three models more carefully in Section 2. All three of them allow a large space of possible actions by the agent, and indeed, all three satisfy our formal Richness condition. Yet, it turns out that linear contracts maximize the principal's worst-case criterion in models (i) and (ii), but not in model (iii) in general. (This will be shown in our later analysis.) Thus, the details of the model matter, and it may not be initially obvious what, beyond Richness, is needed.

Our paper aims to identify the additional condition at a high level of generality and, in the process, to obtain a better understanding of the essential ingredients behind the linearity argument. To do this, we abstract away from any particular organizational form. Instead, the principal contracts with a counterparty of unspecified structure. The principal's uncertainty about the environment is described by a correspondence Φ, where $\Phi(w)$ specifies the distributions over output that she thinks may potentially arise when she offers contract w. The principal wants to choose w to maximize her expected net profit in the worst case. Throughout, we maintain the background assumptions that the principal is risk-neutral and uses the worst-case criterion; the focus is on understanding the properties of Φ that are central to the linearity argument.

We formalize the Richness property by requiring that, whenever some distribution over output is a possible response to a given contract w (i.e., lies in $\Phi(w)$), any other distribution with the same expected output but higher expected payment to the counterparty is also possible. That is, the counterparty has the flexibility to extract maximal payment for a given expected output. In this form, it is clear that Richness by itself cannot pick out linear contracts, since it says nothing about how Φ varies when the contract changes.

The needed additional property that we identify is Responsiveness, which expresses that the counterparty's behavior responds to the incentives provided by expected payment. Responsiveness requires that when one contract w is replaced by a new contract $w'$, such that every distribution that might have been chosen in response to w earns a higher expected payment than before, while some other distribution not chosen under w earns a lower payment than before, this unchosen distribution remains unchosen.

We show that the Richness and Responsiveness properties together imply that linear contracts give the best guarantees for the principal. This allows the lesson about the robustness of linear contracts to apply to a broad class of models of contracting with diverse organizational forms. We formally develop the general framework, define Richness and Responsiveness, and present this result in Section 3.

Having noted that, as a supplement to the Richness property, Responsiveness is sufficient to make linear contracts optimal, we next ask if it is also necessary. To address this question, in Section 4, we give a converse result. For this, we develop an auxiliary framework in which contracts specify payment as a function of a physical outcome, and Φ describes the counterparty's behavior in response to such contracts; the principal's value for each possible outcome is a separate parameter of the model. (For example, one can think of a supplier that can produce different goods, and a contract specifies a payment for each. How the supplier reacts to any given contract is independent of how much the principal values each good.) The Responsiveness property and a strengthened version of Richness can be expressed in this setting. When they hold, our main linearity result immediately implies that the principal can always maximize her guarantee by offering a linear contract, meaning one that pays proportionally to the principal's value for the realized outcome. Our converse shows that, once we ignore certain contracts that can be ruled out a priori as never optimal, if Responsiveness is violated, then there exists a valuation for the principal under which linear contracts are not optimal. This shows that our Responsiveness condition captures, at a formal level, the specific cross-contract restriction on behavior needed for the linearity result.

After presenting the results above, in Section 5 we analyze the robust principal-agent model and hierarchical models (i)–(ii) sketched above in more detail to indicate how the Richness and Responsiveness properties can be verified. (The Online Supplementary Material, Walton and Carroll (2022), in Section S-1, presents two further applications to illustrate the breadth of our framework.) Section 6 examines hierarchical model (iii), where Responsiveness is violated and linear contracts can fail to be optimal.

A reader might think that our whole exercise is unnecessary because the models where linear contracts turn out to be optimal can be easily reduced to bilateral contracting models anyway. This reaction is misplaced. For example, one might try to reduce hierarchical model (i) to a robust principal-agent model by combining the supervisor and agent into a single entity, whose cost of producing any output distribution is defined as the cost for the supervisor to induce the agent to choose that distribution in the original hierarchical model. However, this reduction fails because it does not preserve the structure of uncertainty needed to apply the result from the robust principal-agent model. In Section 7, we explain this failure in further detail. We also describe what the worst-case environment in the hierarchical model actually looks like; it is quite different from the worst-case in the principal-agent model.

Although the substantive results in this paper concern linear contracts, our conceptual framework more generally offers a way to express and prove results for contracting models without relying on a particular organizational structure. Robustness arguments for linear contracts naturally call for such a framework, because, as Section 7 shows, there does not seem to be any easy reduction argument that would allow us to directly extend the results from bilateral environments to more complex ones. In principle, the same methodology is applicable in the analysis of other forms of contracts and properties of contracting models that favor them. To illustrate by example, we demonstrate, in Section S-3 of the Online Supplementary Material, a result on concave contracts analogous to our main theorem on linear contracts: under Responsiveness and a particular weakening of Richness, concave contracts are optimal.

Our work connects to several branches of literature. First, it naturally relates to the body of work on linear contracts and their robustness against large spaces of actions, mentioned previously. While many of the arguments in this literature are thematically related, which helps motivate our interest in focusing on linear contracts, we certainly do not claim that all of these previous findings are special cases of the results developed here. Also related is the literature explaining other kinds of simple incentive structures as robust in unknown environments, such as Frankel (2014), Garrett (2014), and Carroll and Meng (2016).

There is also considerable previous work on incentives in hierarchies and more complex structures, mostly focusing on comparison across organizational forms (surveyed in Mookherjee (2006, 2013)). (There is a separate and more distant strand of literature on hierarchies, such as Tirole (1986), that focuses on issues of collusion.) Yet there seems to be little work studying how organizational structure interacts with the optimal choice of contractual form.

Finally, closest in spirit to the present work are several recent papers that study robust moral hazard contracting in different organizational environments. In particular, there is the work of Dai and Toikka (2022), which studies robust incentives for a team (and which inspired one of the additional applications explored in the Online Supplementary Material). Others in this line are Marku, Ocampo, and Tondji (2022), which takes up a common agency model, and Kambhampati (2022), which considers two agents who produce independently but are known to share a common technology.

2 Overview of Examples

We begin with brief descriptions of our main example applications, meant to give context for the general framework introduced in Section 3. The examples will be presented in formal detail in Section 5.

Robust Principal-Agent Model. In the basic application, the principal contracts directly with an agent, offering a contract that specifies payment as a function of output. Limited liability applies (in this example and throughout the paper): the contract can never pay less than zero. The agent can take any of various actions; an action is modeled as a pair, consisting of a probability distribution over output and a (nonnegative) effort cost incurred by the agent. The principal knows of some set of actions that are definitely available to the agent. But the principal does not know the true production technology, that is, the set of actions actually available. For any contract she can offer, she evaluates it based on her guaranteed payoff, that is, her expected net profit (after paying the agent) in the worst case over all possible technologies consistent with her knowledge. The guarantee of a contract is typically strictly positive, because the principal knows that the agent is optimizing under the true production technology, so will not take a totally unproductive action if he is known to have a better action available. The analysis of Carroll (2015) showed that the best guarantee for the principal is attained by a linear contract, and we shall recover this result as one instance of our general framework.

Hierarchical Model (i). In this model, the principal offers a contract to a supervisor, again specifying (nonnegative) payment as a function of output. The supervisor, after seeing this contract, in turn offers a contract to the agent, also specifying (nonnegative) payment as a function of output. The agent privately chooses his action, output is produced, and then both the supervisor and agent are paid according to their respective contracts.

In this model, we assume that the supervisor knows the agent's technology, so when she writes a contract, she is solving a standard, Bayesian version of a principal-agent problem, in which the “output” produced by the agent is not the output in the original model but rather the payment received by the supervisor. The principal, as before, knows only some actions available to the agent, but does not know the full technology, and evaluates contracts by the worst-case expected payoff over possible technologies.

Hierarchical Model (ii). The hierarchical structure is as in the previous model, but now the supervisor's knowledge is different: she may know of actions that the principal does not, but is uncertain as to whether there are still more actions available, and writes her contract with the agent to maximize her own worst-case guarantee. Note that the relationship between the supervisor and the agent is now described by the robust principal-agent model above; this implies that the supervisor has an optimal contract in which she offers the agent some fixed fraction of the payment she receives from the principal.

The principal does not know the full technology, nor how much of it is known by the supervisor, and again uses the worst-case criterion.

Hierarchical Model (iii). In this version of the hierarchical model, the supervisor and the principal are symmetrically uninformed: the supervisor knows only as much about the technology as the principal does, and (as in model (ii)) maximizes a worst-case guarantee when contracting with the agent.

Model (iii) can be expressed in the language of our general framework below, but it does not satisfy the conditions for our linearity result (in particular, the Responsiveness property is violated), and indeed the result may fail, as we shall show in Section 6.

In Section S-1 of the Online Supplementary Material, we give two more examples to illustrate the breadth of potential applications for our framework. In the first example, a supervisor contracts with a team of two agents who play differentiated roles in producing output. The second example is a simplified version of the model of Dai and Toikka (2022); the principal contracts directly with a team of agents, and we simply assume that payments to the team are split equally among the agents.

One might argue that these models make some demanding assumptions. The limited liability restriction, which we maintain throughout, may be less natural in the kinds of firm-to-firm settings where hierarchies naturally arise than it would be in contracting with individuals. In addition, hierarchical models (ii) and (iii) impose the worst-case criterion as a positive description of the supervisor's behavior, unlike in the principal-agent model, where worst-case maximization can be viewed simply as a language for expressing statements about robustness properties of contracts. Nonetheless, our goal here is not to write down the most defensible model of contracting in hierarchies, but simply to illustrate that there are various models one could consider, only some of which deliver linear contracts, and thereby to motivate the search for their common features.

3 Main Framework and Result

First, some notational conventions. We write $\Delta(X)$ for the space of Borel distributions on a metric space X. We equip $\Delta(X)$ with the weak topology. For $x \in X$, $\delta_x$ is the degenerate distribution putting probability 1 on x. We also write $\mathbb{R}_+$ for the set of nonnegative real numbers, and equip it with the usual topology. We write $\operatorname{conv}(X)$ for the convex hull of X, when $X \subseteq \mathbb{R}$. We write $C(X)$ for the space of continuous functions from X to $\mathbb{R}$, equipped with the sup-norm, $\|f\| = \sup_{x \in X} |f(x)|$. Recall that when X is compact, $C(X)$ is a Banach space. We write $C^+(X)$ for the subset of $C(X)$ consisting of functions whose values lie in $\mathbb{R}_+$.

3.1 The Modeling Framework

There is a principal, who contracts with a counterparty, which will subsequently produce (stochastic) output that accrues naturally to the principal. The principal can provide incentives by promising payments to the counterparty.

There is an exogenously given set $Y \subseteq \mathbb{R}_+$ of possible output values. We assume Y is nonempty and compact, normalize $\min Y = 0$, and denote $\bar{y} = \max Y$. A contract is a function $w \in C^+(Y)$. Note that this definition incorporates the limited liability restriction: the contract must pay a nonnegative amount.

We are particularly interested in linear contracts, which are of the form
$$w(y) = \alpha y,$$
where $\alpha \ge 0$ is a constant. The special case $\alpha = 0$ is called the zero contract.

We take as given a nonempty-valued correspondence $\Phi : C^+(Y) \rightrightarrows \Delta(Y)$, the outcome correspondence. $\Phi(w)$ describes the set of distributions over output y that the counterparty may generate in response to contract w, from the principal's point of view. The multiple-valuedness of $\Phi(w)$ thus reflects the principal's uncertainty (about the production technology, or other aspects of the environment). Note that the interpretation of $F \in \Phi(w)$ is not simply that distribution F may be physically feasible, but rather that it might actually occur in response to contract w. For example, if the principal knows that the counterparty is able to produce output 0 with probability 1, but would never do so in response to w because some other distribution is better incentivized, then we would have $\delta_0 \notin \Phi(w)$. For now, we treat the correspondence Φ as exogenously given; in each of the individual applications in Section 5, we will in turn define Φ from more primitive objects.

Any contract w is then evaluated by its worst-case guarantee for the principal across environments. Since the principal's ex post payoff equals the output she receives minus the payment made to the counterparty, the relevant criterion is
$$V(w) = \inf_{F \in \Phi(w)} \mathbb{E}_F[y - w(y)].$$
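To make the worst-case criterion concrete, here is a minimal sketch that evaluates it on a finite example. Everything in it is an assumption made for illustration: distributions are represented as dictionaries from output levels to probabilities, the set of possible responses is replaced by a short hypothetical list (so the infimum is a minimum), and the names `expectation`, `guarantee`, and `bonus` are our own.

```python
from fractions import Fraction


def expectation(dist, f):
    """Expected value of f(y) when y is drawn from dist (dict: output -> probability)."""
    return sum(p * f(y) for y, p in dist.items())


def guarantee(w, responses):
    """Worst-case expected net profit E_F[y - w(y)] over a finite list of responses.

    `responses` is a hypothetical stand-in for the set of distributions the
    principal thinks may arise under contract w.
    """
    return min(expectation(F, lambda y: y - w(y)) for F in responses)


def bonus(y):
    """Hypothetical bonus contract: pays 40 only at the top output level."""
    return 40 if y == 100 else 0


responses = [
    {0: Fraction(1, 2), 100: Fraction(1, 2)},  # net profit 50 - 20 = 30
    {50: Fraction(1)},                         # net profit 50 - 0 = 50
]
print(guarantee(bonus, responses))  # prints 30
```

The two candidate responses have the same expected output, yet the guarantee is driven by the one that collects the bonus more often.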

We now consider the following properties that Φ may have.

Richness. Suppose $w \in C^+(Y)$, $F \in \Phi(w)$, and $F' \in \Delta(Y)$ is another distribution such that $\mathbb{E}_{F'}[y] = \mathbb{E}_F[y]$ and $\mathbb{E}_{F'}[w(y)] \ge \mathbb{E}_F[w(y)]$. Then $F' \in \Phi(w)$.

This property essentially says that the set of possible responses to a given contract is sufficiently broad: for any distribution that the counterparty might produce, any other distribution with the same expected output but higher average payment to the counterparty is also possible. Even more simply put, for any given expected output, the principal worries that the counterparty will extract the highest possible average payment.
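The "highest possible average payment" at a given expected output can be computed directly when the output set is finite: it is the upper concave envelope of the contract evaluated at that expected output. The sketch below illustrates this under stated assumptions; the function name `max_expected_payment` and the bonus contract are hypothetical, and the code relies on the standard fact that, with a finite output set, the maximum is attained by a distribution supported on at most two output levels.

```python
from fractions import Fraction


def max_expected_payment(w, outputs, mu):
    """Highest E_F[w(y)] over distributions F on `outputs` with E_F[y] = mu.

    Interpolates w between every pair (lo, hi) with lo <= mu <= hi and keeps
    the best value. Returns None if mu lies outside the range of `outputs`.
    """
    best = None
    for lo in outputs:
        for hi in outputs:
            if not lo <= mu <= hi:
                continue
            if lo == hi:
                value = Fraction(w(lo))
            else:
                # Weight on hi that makes the two-point mixture have mean mu.
                weight_hi = Fraction(mu - lo, hi - lo)
                value = (1 - weight_hi) * w(lo) + weight_hi * w(hi)
            if best is None or value > best:
                best = value
    return best


def bonus(y):
    """Hypothetical bonus contract: pays 40 only at the top output level."""
    return 40 if y == 100 else 0


print(max_expected_payment(bonus, [0, 50, 100], 50))  # prints 20
```

In this example, producing 50 for sure earns nothing, while an even gamble between 0 and 100 has the same expected output and earns 20 on average; under Richness, the principal must allow for the latter.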

Responsiveness. Suppose $w, w' \in C^+(Y)$ and $F \in \Delta(Y)$ such that $F \notin \Phi(w)$. Suppose that $\mathbb{E}_{F'}[w'(y)] \ge \mathbb{E}_{F'}[w(y)]$ for all $F' \in \Phi(w)$, while $\mathbb{E}_F[w'(y)] \le \mathbb{E}_F[w(y)]$. Then $F \notin \Phi(w')$.

This property expresses how the possible outcomes respond to the incentives provided by expected payment. If an “old” contract w is replaced by a “new” contract $w'$, such that any distribution that the counterparty might have produced under the old contract now pays more (in expectation) than before, while some other distribution F pays less than before, the counterparty will not switch to choosing F.

One way to understand Responsiveness is to consider a standard principal-agent problem without uncertainty: The counterparty is a single agent, and there is some fixed, mutually known set of output distributions F that he can produce, each with an associated cost $c(F) \ge 0$. When the principal offers contract w, the agent chooses F to maximize $\mathbb{E}_F[w(y)] - c(F)$. Thus $\Phi(w)$ is the set of maximizers F. This model satisfies Responsiveness: Consider any $w, w'$, and $F \notin \Phi(w)$ for which the hypotheses of Responsiveness are satisfied. If F is not even feasible, clearly $F \notin \Phi(w')$. Otherwise, F is feasible but not optimal under w. Then let $F' \in \Phi(w)$ be an optimal choice. So $\mathbb{E}_{F'}[w(y)] - c(F') > \mathbb{E}_F[w(y)] - c(F)$. Since $\mathbb{E}_{F'}[w'(y)] \ge \mathbb{E}_{F'}[w(y)]$ while $\mathbb{E}_F[w'(y)] \le \mathbb{E}_F[w(y)]$, we have $\mathbb{E}_{F'}[w'(y)] - c(F') > \mathbb{E}_F[w'(y)] - c(F)$, that is, F remains nonoptimal under $w'$.
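The argument in this paragraph can be checked numerically on a small instance. The following sketch is only illustrative and rests on several assumptions of ours: a finite list of hypothetical actions (distribution, cost), exact rational arithmetic so that ties are detected reliably, and a reading of "pays more" and "pays less" as weak inequalities. It checks only feasible distributions, since infeasible ones are never chosen under any contract.

```python
from fractions import Fraction


def expectation(dist, f):
    """Expected value of f(y) when y is drawn from dist (dict: output -> probability)."""
    return sum(p * f(y) for y, p in dist.items())


def best_responses(w, actions):
    """Set of maximizers of E_F[w(y)] - c(F) over a finite list of (F, cost) pairs."""
    payoffs = [expectation(F, w) - cost for F, cost in actions]
    top = max(payoffs)
    return [F for (F, _), u in zip(actions, payoffs) if u == top]


def responsiveness_holds(w, w_prime, actions):
    """Return False only if the hypotheses hold but an unchosen F becomes chosen."""
    chosen_old = best_responses(w, actions)
    chosen_new = best_responses(w_prime, actions)
    # Hypothesis: every distribution chosen under w is paid weakly more under w_prime.
    if any(expectation(F, w_prime) < expectation(F, w) for F in chosen_old):
        return True  # hypotheses fail, so there is nothing to check
    for F, _ in actions:
        unchosen = F not in chosen_old
        paid_less = expectation(F, w_prime) <= expectation(F, w)
        if unchosen and paid_less and F in chosen_new:
            return False
    return True


# Hypothetical technology on outputs {0, 100}: low, medium, and high effort.
actions = [
    ({0: Fraction(1)}, Fraction(0)),
    ({0: Fraction(1, 2), 100: Fraction(1, 2)}, Fraction(10)),
    ({100: Fraction(1)}, Fraction(30)),
]


def old_contract(y):
    return Fraction(3, 10) * y


def new_contract(y):
    return Fraction(1, 2) * y


print(responsiveness_holds(old_contract, new_contract, actions))  # prints True
```

Here the agent moves from medium to high effort when the share rises; the low-effort distribution, which is paid no more than before, stays unchosen, as the property requires.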

Intuitively, we would expect Responsiveness to be satisfied when the counterparty is a single agent who maximizes expected value as in the example above, or more generally, when the counterparty has a “leader” who understands the environment and maximizes expected value (such as the supervisor in hierarchical model (i)). But it also turns out to be satisfied in some other models, such as hierarchy (ii) where the leader is not expected-value-maximizing. We discuss further in Section 5.3.

For simplicity, we have not included a participation constraint. An earlier version of the paper (Walton and Carroll (2019)) describes how such a constraint can be accommodated, by restricting the principal's choice of contracts to a subset of $C^+(Y)$, interpreted as the set of contracts that the counterparty is sure to accept, and assuming an appropriate structure on this subset.

3.2 Linearity Result

Now we come to the first main result: our conditions are sufficient for optimality of linear contracts.

Theorem 1. Suppose the correspondence $\Phi$ has the Richness and Responsiveness properties. Then, for any contract w, there is a linear contract $w'$ such that $V(w') \ge V(w)$.

We prove Theorem 1 constructively, by using the worst-case scenario under w to determine the slope of $w'$. The argument is illustrated in Figure 1. First, for any given level of expected output, say μ, Richness ensures that the principal is worried about the highest expected payment ν; this highest expected payment is given by the concavification of w, call it ŵ. Concavity of ŵ implies that the ratio $\hat{w}(\mu)/\mu$, the expected payment per dollar of expected output, is decreasing in μ. So, among all distributions in $\Phi(w)$, the one with the lowest expected output μ is also the one for which the fraction of output ceded to the counterparty is highest; hence, this must be the worst-case distribution. The resulting outcome is shown as point $(\mu^*, \nu^*)$ in Figure 1. We then take $w'$ to be the linear contract that passes through this same point. Under $w'$, the payment-per-output ratio is constant, so this new contract both pays more than the old one for higher expected output and pays less for lower expected output. By Responsiveness, $w'$ can only motivate the counterparty to produce higher expected output than w, which in turn means higher expected profit for the principal, since linearity ensures that expected output and expected profit are aligned.
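The monotonicity step in this sketch can be written out explicitly. The derivation below is a sketch of the standard concavity argument, not the paper's own wording; it uses the notation ŵ from the text and assumes $0 < \mu_1 < \mu_2$ together with $\hat{w}(0) \ge 0$, which holds because payments are nonnegative.

```latex
\hat{w}(\mu_1)
  \;\ge\; \frac{\mu_1}{\mu_2}\,\hat{w}(\mu_2)
          + \Bigl(1 - \frac{\mu_1}{\mu_2}\Bigr)\hat{w}(0)
  \;\ge\; \frac{\mu_1}{\mu_2}\,\hat{w}(\mu_2)
\quad\Longrightarrow\quad
\frac{\hat{w}(\mu_1)}{\mu_1} \;\ge\; \frac{\hat{w}(\mu_2)}{\mu_2}.
```

The first inequality is concavity applied to $\mu_1$ written as a mixture of $\mu_2$ and 0. It follows that the principal's share $1 - \hat{w}(\mu)/\mu$ is weakly increasing in μ, so wherever that share is positive, her worst-case profit $\mu\,(1 - \hat{w}(\mu)/\mu)$ is smallest at the lowest expected output.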

Figure 1. Constructing a linear contract w′ with as good or better guarantee than some initial contract w.

The above proof sketch is imprecise about the distinction between weak and strict inequalities, and also implicitly assumes that the inf in the definition of $V(w)$ is attained, which it may not be. The full proof below fills in these gaps. While we have used the concavification ŵ for intuition, the formal proof does not need to refer to it.

Proof. We may assume that $V(w) > 0$, since otherwise we can just take $w'$ to be the zero contract. Now, let $\lambda = \sup_{F \in \Phi(w)} \mathbb{E}_F[w(y)] / \mathbb{E}_F[y]$. Note that this expression is well-defined, since every $F \in \Phi(w)$ satisfies $\mathbb{E}_F[y] \ge \mathbb{E}_F[y - w(y)] \ge V(w) > 0$. In fact, for every $F \in \Phi(w)$, we have $\mathbb{E}_F[w(y)] / \mathbb{E}_F[y] \le 1 - V(w)/\mathbb{E}_F[y] \le 1 - V(w)/\bar{y}$, and thus λ is bounded above by $1 - V(w)/\bar{y} < 1$.

Now define the linear contract $w'(y) = \lambda y$. We will show that $V(w') \ge V(w)$.

Let $F_1, F_2, \ldots$ be a sequence of distributions in $\Phi(w)$ approaching the inf in the definition of $V(w)$: $\mathbb{E}_{F_k}[y - w(y)] \to V(w)$. By taking a subsequence, we can assume that $F_k$ converges to some limiting distribution $F^*$. Put $\mu^* = \mathbb{E}_{F^*}[y]$, $\nu^* = \mathbb{E}_{F^*}[w(y)]$ and $\lambda^* = \nu^*/\mu^*$. Since $\mathbb{E}_{F_k}[w(y)] / \mathbb{E}_{F_k}[y] \le \lambda$ for each k, we have $\lambda^* \le \lambda$. We claim that equality must hold. If not, pick $\lambda'$ with $\lambda^* < \lambda' < \lambda$. By definition of λ, there exists $F' \in \Phi(w)$ such that $\mathbb{E}_{F'}[w(y)] > \lambda' \mathbb{E}_{F'}[y]$. Since $F' \in \Phi(w)$ and $\lambda' < 1$, we have

$$(1 - \lambda')\,\mathbb{E}_{F'}[y] \;>\; \mathbb{E}_{F'}[y - w(y)] \;\ge\; V(w) \;=\; (1 - \lambda^*)\,\mu^* \;>\; (1 - \lambda')\,\mu^*, \quad\text{so}\quad \mathbb{E}_{F'}[y] > \mu^*.$$
Hence, for sufficiently large k, we have $\mathbb{E}_{F'}[y] > \mathbb{E}_{F_k}[y]$ as well. Take $F''$ to be an appropriate convex combination of $F'$ and $\delta_0$ so that $\mathbb{E}_{F''}[y] = \mathbb{E}_{F_k}[y]$. Then we have $\mathbb{E}_{F''}[w(y)] \ge \lambda' \mathbb{E}_{F''}[y]$ (since the latter inequality holds both for the component $F'$ and, trivially, for $\delta_0$). Hence, either $\mathbb{E}_{F_k}[w(y)] \ge \mathbb{E}_{F''}[w(y)]$, or else we can apply Richness to the distributions $F_k$ and $F''$ to conclude $F'' \in \Phi(w)$. Either way, $\Phi(w)$ contains a distribution with expected output $\mathbb{E}_{F_k}[y]$ and expected payment at least $\lambda' \mathbb{E}_{F_k}[y]$. This means that $V(w) \le (1 - \lambda')\,\mathbb{E}_{F_k}[y]$. But taking $k \to \infty$ gives $V(w) \le (1 - \lambda')\,\mu^* < (1 - \lambda^*)\,\mu^* = V(w)$, a contradiction. Thus, we conclude $\lambda^* = \lambda$ as claimed.

This implies that $V(w) = \mu^* - \nu^* = (1 - \lambda)\,\mu^*$. We can complete the proof by showing that $V(w') \ge (1 - \lambda)\,\mu^*$. Since every distribution F satisfies $\mathbb{E}_F[y - w'(y)] = (1 - \lambda)\,\mathbb{E}_F[y]$, it suffices to show that $\mathbb{E}_F[y] \ge \mu^*$ for every $F \in \Phi(w')$.

Consider any distribution urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0120 such that urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0121; we need to show urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0122. Put urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0123. Let F be a convex combination of urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0124 and urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0125 such that urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0126. We have urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0127, since this inequality holds both for the component urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0128 and (trivially) for urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0129. Since urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0130, we have urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0131. Moreover, urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0132, while for every urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0133, we have urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0134; thus, Responsiveness implies urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0135. Since F has the same expected output and the same expected payment under urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0136 as urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0137 does, Richness then implies urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0138. Q.E.D.

This shows that Richness and Responsiveness are sufficient for linear contracts to be optimal; but are they necessary? In Section 4, we will give a converse result that goes some way toward addressing this question.

For the moment, we simply note that neither property can be dropped entirely. Richness alone would not give us the result, since we clearly need some assumption on how urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0139 varies with w. For a more concrete example, in Section 6 we will note that in hierarchical model (iii), Φ satisfies Richness, but the conclusion of Theorem 1 can fail.

To see that Responsiveness alone is not sufficient, just consider a standard principal-agent problem without uncertainty, as was used to illustrate Responsiveness above. As is well known, usually a nonlinear contract is strictly optimal. For example, under standard specifications with a discrete output space and just two possible distributions F, an optimal contract pays only for the one realization of output that achieves the highest likelihood ratio, and pays zero for all other realizations.
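The likelihood-ratio logic can be illustrated with a small numerical sketch. It is not taken from the paper: we assume a risk-neutral agent protected by limited liability (payments cannot be negative), a costless distribution `f_low`, a costly distribution `f_high` that the principal wishes to induce, and full support for both. All names and numbers are hypothetical.

```python
from fractions import Fraction as Fr

def bonus_contract(f_low, f_high, cost):
    """Pay only at the output with the highest likelihood ratio f_high/f_low.

    The bonus is set so that the agent is exactly indifferent between
    the two distributions (assumed setting: risk neutrality, payments >= 0).
    """
    ratios = [h / l for h, l in zip(f_high, f_low)]
    y_star = max(range(len(ratios)), key=ratios.__getitem__)
    gain = f_high[y_star] - f_low[y_star]  # positive when the ratio exceeds 1
    w = [Fr(0)] * len(f_high)
    w[y_star] = cost / gain
    return w

def linear_contract(outputs, f_low, f_high, cost):
    """Cheapest contract of the form w(y) = alpha * y that induces f_high."""
    mean = lambda f: sum(p * y for p, y in zip(f, outputs))
    alpha = cost / (mean(f_high) - mean(f_low))
    return [alpha * y for y in outputs]

def expected_payment(w, f):
    return sum(p * pay for p, pay in zip(f, w))

outputs = [Fr(0), Fr(1), Fr(2)]
f_low = [Fr(1, 2), Fr(1, 4), Fr(1, 4)]
f_high = [Fr(1, 4), Fr(1, 4), Fr(1, 2)]
cost = Fr(1, 10)

bonus = bonus_contract(f_low, f_high, cost)
linear = linear_contract(outputs, f_low, f_high, cost)
print(expected_payment(bonus, f_high), expected_payment(linear, f_high))
```

In this example, both contracts leave the agent just willing to choose the costly distribution, but the bonus contract does so at a lower expected payment than the linear one.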

3.3 Existence of Optimum

We have still been imprecise on one point: The verbal interpretation given to Theorem 1 is that a linear contract is optimal for the principal. Indeed, if an optimal contract exists, then there is one that is linear. However, it may happen that no optimal contract exists. In this case, under the conditions of Theorem 1, the supremum payoff urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0140 is approached, but not attained, by linear contracts.

It can be useful to have a handy way to check that existence is indeed satisfied in any given model. Define the correspondence urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0141 by urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0142 (recall that urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0143 was the linear contract of slope α).

Proposition 2. Suppose that Φ satisfies Richness and Responsiveness. If moreover urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0144 is lower hemicontinuous, then there exists a contract maximizing urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0145 (and, in fact, the maximum is attained by a linear contract).

The proof (a straightforward limiting argument) is in the Appendix. In the examples in Section 5, we use this result to show that an optimal contract exists.

4 A Converse Result

We have argued that, given Richness, the addition of Responsiveness is sufficient for optimal contracts to be linear. To argue that we have really identified the right condition, we should show that Responsiveness is necessary as well. Of course, in the framework so far, this cannot be exactly right: many contracts are clearly far from optimal (e.g., any contract that always pays more than the value of output), so violations of Responsiveness among such contracts are irrelevant. This suggests we should look for a more general framework, in which Responsiveness can be defined at the level of a class of models, so that Responsiveness becomes necessary to ensure that all instances within the class deliver linear contracts.

Specifically, we will now consider a framework in which output is not directly measured in payoff units. Instead, the counterparty produces “physical” outputs; for example, the counterparty may be a supplier that can produce different types of goods. Contracts specify payment as a function of the physical output. The principal, in turn, derives some monetary value from each possible physical output. The principal's valuation of outputs is now an additional parameter in the model; it matters for the principal's preferences but is irrelevant to the counterparty's behavior. Here, Responsiveness will be sufficient to ensure that, no matter what this valuation is, a linear contract (one that pays proportionally to the principal's value) is optimal, and we will argue that Responsiveness is essentially necessary for this conclusion as well.

Thus, for this section only, we consider a given nonempty set Z of physical outputs. We take Z to be finite (and endowed with the discrete metric), and we denote a typical element by z. A contract is now a function urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0146. We take as given a nonempty-valued outcome correspondence urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0147.

We reformulate the Richness property for this setting as follows:

Strong Richness.

  • (a) Suppose urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0148, urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0149, and urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0150 such that urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0151. Then urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0152.
  • (b) For every urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0153, the set urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0154 is closed.

Part (a) of this property is stronger than our original Richness property because it drops the restriction that F and urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0155 should have the same expected output; this restriction cannot be formulated when output does not have a numeric value. We do not view this loss as a major sacrifice. In our applications in Section 5, this restriction matters only because it allows us to include tie-breaking assumptions on the counterparty's behavior (e.g., assuming that if the agent is indifferent between multiple distributions, he chooses the one that is better for the principal); without such assumptions, an optimal contract can sometimes fail to exist. Part (b) of Strong Richness is a technical condition that helps rule out some inconvenient boundary cases.

Together, parts (a) and (b) imply that, for each w, there exists a threshold urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0156 such that urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0157.

We can formulate Responsiveness exactly as in our main framework, since it made no reference to the value of output.

Responsiveness. Suppose urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0158, and urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0159 such that urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0160. Suppose that urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0161 for all urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0162, while urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0163. Then urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0164.

We define a valuation to be a function urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0165 such that urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0166. For any valuation and any contract, the principal's guarantee is then defined as
urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0167
Setting the minimum value of v to 0 is just a normalization that brings this setting into line with the assumption urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0168 in our previous framework.

We can now say that a contract w is linear given v if there exists a constant urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0169 such that urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0170 for all z.

We will also say that w is grounded if urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0171. We will focus our attention on grounded contracts (note that, given any v, any linear contract is grounded). This is justified if, for example, urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0172 whenever w and urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0173 are two contracts that differ by a constant: then, if a contract is not grounded, the principal can subtract a constant from it without changing incentives and so improve her own payoff. (All of our example applications satisfy this property of invariance to constant translations.)
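To make these definitions concrete, the following sketch works with a finite set of physical outputs. It rests on two readings that are our assumptions rather than quotations of the formal definitions: a contract is taken to be linear given v when w(z) = αv(z) for a single constant α, and grounded when its smallest payment is zero. The output labels and numbers are hypothetical.

```python
from fractions import Fraction as Fr

def ground(w):
    """Subtract the smallest payment, so the cheapest output pays zero."""
    low = min(w.values())
    return {z: pay - low for z, pay in w.items()}

def is_grounded(w):
    # Assumed reading of "grounded": the minimum payment equals zero.
    return min(w.values()) == 0

def linear_slope(w, v):
    """Return alpha with w(z) == alpha * v(z) for all z, or None if none exists."""
    pivot = max(v, key=v.get)  # an output with the highest valuation
    if v[pivot] == 0:          # valuation identically zero
        return Fr(0) if all(pay == 0 for pay in w.values()) else None
    alpha = Fr(w[pivot]) / Fr(v[pivot])
    return alpha if all(w[z] == alpha * v[z] for z in v) else None

# Hypothetical valuation, normalized so that its minimum is zero.
v = {"basic": 0, "standard": 4, "premium": 10}
# A contract equal to a constant plus one quarter of the valuation.
w = {"basic": Fr(1), "standard": Fr(2), "premium": Fr(7, 2)}

print(is_grounded(w), linear_slope(w, v))
print(is_grounded(ground(w)), linear_slope(ground(w), v))
```

Here w is neither grounded nor linear given v, while subtracting the constant yields a grounded contract that is linear given v with slope 1/4.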

Suppose that Φ satisfies Strong Richness and Responsiveness. Theorem 1 implies that, given any valuation v, for any contract w, there is a contract urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0174 that is linear given v and satisfies urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0175.

A conjecture might be that this conclusion fails whenever Responsiveness is violated: that is, if Φ satisfies Strong Richness but not Responsiveness, there exists some choice of valuation v for which a nonlinear contract gives a strictly higher guarantee than any linear one. We will show a slightly weaker version of this statement: Given Φ, some contracts can be quickly ruled out as never optimal regardless of v, except in degenerate cases. We can think of these contracts as “irrelevant.” We will show that if there is a violation of Responsiveness involving “relevant” (and grounded) contracts, then v can be chosen so that a nonlinear contract does better than linear.

Given Φ satisfying Strong Richness, let h be the threshold function defined above; and, for urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0176 and urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0177, write βw for the contract obtained by scaling w pointwise by β.

Now say that a contract w is scaling-dominated if, for every v such that urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0178, there exists some urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0179 such that urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0180. These are the contracts we regard as irrelevant. They can be identified as nonoptimal using only a very small part of the correspondence Φ (namely, its values on the scalar multiples βw). Thus we can identify whether w is scaling-dominated using only the values of h on its scalar multiples, and the next proposition characterizes explicitly when this happens.

Proposition 3. Assume Strong Richness, and let w be a grounded contract. Then w is scaling-dominated if and only if it satisfies at least one of the following five conditions:

  • (i) urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0181.
  • (ii) There exists urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0182 such that urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0183.
  • (iii) For every positive number K, there exists urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0184 such that urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0185.
  • (iv) For every number urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0186, there exists urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0187 such that urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0188 and urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0189.
  • (v) There exist urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0190 and urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0191 such that urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0192 and
    urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0193

Moreover, if none of these conditions is satisfied, we can choose v such that w is linear given v, urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0194, and urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0195 for all urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0196.

With our focus on contracts that are not scaling-dominated, we can now give our statement on the necessity of Responsiveness.

Theorem 4. Assume Φ satisfies Strong Richness but fails to satisfy Responsiveness between some two contracts urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0197, where urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0198 is grounded and not scaling-dominated. Then the valuation v can be chosen so that w gives a strictly higher guarantee than any linear contract.

The proofs of Proposition 3 and Theorem 4 are left to the Appendix; we just give sketches here.

For the “if” direction of Proposition 3, we go through the conditions one by one. In each case, we use the statement of the condition to identify the relevant rescaling βw that does better than w. (In conditions (iii) and (iv), where β depends on K, we must first choose K appropriately depending on the valuation v.) For the “only if” direction, it suffices to prove the last statement of the proposition; thus we wish to choose v to be proportional to the given contract w (thus making w linear) and pick the constant of proportionality so that w is in fact optimal among linear contracts. Writing down the conditions needed for this to happen, it turns out that we run into difficulty precisely when one of (i)–(v) holds.

For Theorem 4, the key observation is that when Responsiveness is violated between contracts urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0199, and urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0200 is linear, then w gives a strictly better guarantee than urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0201 does. (Refer back to Figure 1: a violation of Responsiveness means that some low-output distributions are possible under urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0202 but not under w.) Thus, to prove Theorem 4, we can invoke the last statement of Proposition 3 to choose v such that the given urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0203 is optimal among linear contracts. The foregoing observation then implies that w does strictly better than urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0204, and thus better than any linear contract.

As a final remark, we note that the lengthy conditions in Proposition 3 can be somewhat simplified if we assume that Φ satisfies the following additional property.

Scaling Monotonicity. For any urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0205 and any urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0206, urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0207.

Given Strong Richness, this property is equivalent to the statement that urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0208 is weakly increasing in β. In words, it says that the counterparty responds positively to rescaling of incentives (but does not say anything about responses to changes in the shape of incentives). It is satisfied in the robust principal-agent model and all three versions of the hierarchical model.

If Scaling Monotonicity is imposed, then condition (ii) of the proposition can never arise, and conditions (iii)–(v) can be rewritten respectively as (iii) urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0209, (iv) urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0210, (v) urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0211, where
urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0212

5 Applications

We now return to our main framework where output urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0213 is directly measured in payoff units. We proceed to detail the various applications that were previewed in Section 2, showing how they are instances of the general framework. We indicate how to verify Richness and Responsiveness, as well as lower hemicontinuity. (Some of the formalities are deferred to the Online Supplementary Material.) Hierarchical model (iii) does not satisfy Responsiveness and so is left to the next section.

5.1 Robust Principal-Agent Model

In this model, the counterparty consists of a single agent. An action the agent may take is modeled as a pair urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0214. If action urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0215 is taken, output is drawn according to the distribution F and the agent incurs an effort cost of c. We define a technology to be a nonempty, compact subset of urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0216, interpreted as the set of actions available to the agent. Given a contract urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0217 and technology urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0218, the agent maximizes objective
urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0219
over urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0220. We assume that the principal does not know urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0221. Instead, there is an exogenously given technology urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0222, representing all actions that are known by the principal to be available to the agent. The agent's actual technology is known only to satisfy urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0223.
Given this description of the model, we can define the outcome correspondence. Let urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0224; this is the set of distributions that can result when the agent maximizes urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0225 over urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0226 given w. Continuity of w and compactness of urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0227 ensure that urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0228 is nonempty. Finally, as a tie-breaking condition, we assume that when the agent is indifferent among actions, he chooses the one best for the principal; we call such actions principal-preferred. Formally, this set is denoted as urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0229. This assumption helps to ensure that an optimal contract w exists (discussed further below). Now the outcome correspondence is defined as
urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0230
The principal then evaluates contracts according to urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0231.
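For a single, fixed, finite technology, the agent's choice and the principal-preferred tie-breaking rule can be sketched as follows. This is an illustration under assumptions, not the paper's formal definition: we take the agent's objective to be expected payment minus effort cost and the principal's payoff to be expected output minus expected payment, and we do not attempt the worst case over all technologies containing the known one. The technology and contract are hypothetical.

```python
from fractions import Fraction as Fr

def expect(dist, g):
    """Expected value of g(y) when y is drawn from dist = {output: probability}."""
    return sum(p * g(y) for y, p in dist.items())

def agent_optimal(w, technology):
    """Actions (F, c) maximizing the assumed agent objective E_F[w(y)] - c."""
    payoff = lambda a: expect(a[0], w) - a[1]
    best = max(payoff(a) for a in technology)
    return [a for a in technology if payoff(a) == best]

def principal_preferred(w, technology):
    """Among agent-optimal actions, those maximizing E_F[y - w(y)]."""
    candidates = agent_optimal(w, technology)
    profit = lambda a: expect(a[0], lambda y: y - w(y))
    best = max(profit(a) for a in candidates)
    return [a for a in candidates if profit(a) == best]

# Hypothetical linear contract of slope 1/2.
w = lambda y: y / 2

# Hypothetical technology: each action is a pair (distribution, effort cost).
idle = ({Fr(0): Fr(1)}, Fr(0))
work = ({Fr(0): Fr(1, 2), Fr(1): Fr(1, 2)}, Fr(1, 4))
technology = [idle, work]

chosen = principal_preferred(w, technology)
print(chosen)
```

In this example, the agent is indifferent between the two actions, and the tie-breaking rule selects the costly action because it gives the principal the higher expected payoff.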

This is the same model as the one considered in Carroll (2015). In our framework, we reproduce the main result of that paper, by verifying that Richness and Responsiveness hold in this model. We also verify that the restricted correspondence urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0232 is lower hemicontinuous, so a maximizing linear contract exists; this existence is needed later when we embed this model in a principal-supervisor-agent hierarchy, as it ensures that the supervisor's behavior is well-defined.

Proposition 5. There exists a linear contract maximizing urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0233.

Essentially, Richness holds because, for any F that might be chosen for some technology, the more-remunerative urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0234 might then also be chosen if it turned out to also be available. Responsiveness holds by the same argument as in the principal-agent model without uncertainty sketched in Section 3.1, repeated for each possible technology. The formal proof ends up a bit lengthy because of tie-breaking technicalities.

Proof. (Richness) Let urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0235, urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0236, so that there exists a technology urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0237 containing action urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0238 such that urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0239 for all urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0240. Let urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0241 such that urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0242 and urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0243. Consider an alternative technology urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0244. Then

urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0245
for all urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0246, so urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0247. If urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0248, then F being principal-preferred implies that urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0249 is principal-preferred in urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0250. Otherwise, urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0251, and hence the agent strictly prefers taking action urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0252 to all other actions in urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0253, so urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0254 is principal-preferred since it is the only element of urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0255. So urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0256.

(Responsiveness) Let urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0257 and urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0258 satisfy the conditions of the Responsiveness property. If we can show for any technology urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0259 that urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0260, then urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0261. Take any technology urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0262 containing urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0263 for some urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0264, and take urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0265. By the hypothesis of Responsiveness and agent optimality,

urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0266 (1)
If any of these inequalities is strict, urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0267. Otherwise, they are all equalities. In this case, it is possible for urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0268 only if urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0269 and urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0270 as well. If this holds, principal tie-breaking under w and the hypothesis of Responsiveness along with the equalities in (1) imply
urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0271
in which case F is not principal-preferred under urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0272, so urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0273.

Now we have verified Richness and Responsiveness, so by Theorem 1, we can restrict to linear contracts when maximizing urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0274. It remains to verify that urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0275 is lower hemicontinuous.

(Lower Hemicontinuity) Let urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0276, urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0277, and let urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0278 be an open neighborhood of F in urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0279. We want to show that there exists urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0280 such that urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0281 implies that urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0282 is nonempty, where urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0283 is the Euclidean ball of radius η around α restricted to urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0284.

If urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0285, then any technology urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0286 containing urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0287 has urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0288 for any urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0289, and F is principal-preferred, giving urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0290 already. So we can assume that urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0291. Then choose urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0292 such that urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0293.

Let urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0294 be a technology for which the agent produces distribution F. We can assume without loss that urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0295. By Berge's theorem, urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0296 defined by urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0297 is continuous.

We claim that there exists some η such that urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0298 implies urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0299. If urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0300, then this claim follows from the fact that urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0301 together with continuity of urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0302. So focus on urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0303, and suppose no such η exists. Then there is some sequence of urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0304 and corresponding optimal actions urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0305, which we can assume (by taking a subsequence) converge to urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0306, where urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0307. This contradicts that F has largest mean among zero-cost actions in urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0308 (which follows from principal-preferred tie-breaking).

Hence, for any urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0309, constructing new technology urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0310 yields urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0311 as the unique maximizer of urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0312 over urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0313, so urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0314 urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0315 urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0316, and urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0317. Q.E.D.

One comment on interpretation: The above proof relies (as do many others later) on adding an arbitrary action of the form urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0318 to the technology. It may seem unrealistic to allow the agent to produce large amounts of output at zero cost. However, the fact that the unknown actions are totally unrestricted is not crucial; the logic can be carried over to more detailed models that incorporate lower bounds on plausible effort costs. (See Carroll (2015), Section II.A, for more details.)

5.2 Hierarchical Model (i)

The three hierarchical models that we analyze have the following structure. The principal contracts with a supervisor, who, after observing this contract, writes a contract with an agent. We assume that, for reasons outside the model, the principal cannot contract directly with the agent. We assume that the supervisor does not directly affect production in any way; the only role the supervisor plays is as an intermediary between the principal and the agent. Technology for the agent is the same as in Section 5.1. The contract from the principal to the supervisor is the w of our general framework; the contract from the supervisor to the agent is denoted urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0319, and we assume both contracts depend solely on output, so that urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0320.

The agent's objective is the same as in the robust principal-agent model, but now the agent receives payment from the supervisor, not directly from the principal. Thus, given contract urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0321 and technology urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0322, the agent maximizes objective urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0323 over urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0324.

In all versions of the hierarchical model, we assume that the principal does not know urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0325. As in the robust principal-agent model, there is an exogenously given technology urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0326, representing all actions known by the principal to be available to the agent. Let urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0327 be defined as before, noting that urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0328 refers to the contract between supervisor and agent (henceforth “S-A contract”).

In hierarchical model (i), we assume that the supervisor is perfectly informed of urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0329, which must include the actions known to the principal, that is, urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0330. The supervisor wants to maximize the expected difference between payments from the principal and payments to the agent. In addition, we restrict the set of permitted S-A contracts to some exogenously given compact set urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0331, which is assumed to contain all linear contracts with slope in the interval urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0332. This assumption is necessary to ensure the supervisor always has a best reply. (Without some such restriction, one can find situations, similar to Mirrlees (1999), in which no optimal contract for the supervisor exists. See the earlier version, Walton and Carroll (2019), for an example and more discussion.)

To formally specify the supervisor's behavior, first, for any w, urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0333 and urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0334, define urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0335 urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0336. Thus urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0337 is the set of distributions that are best for the supervisor among the agent's optimal choices. This again represents a tie-breaking condition, and we refer to elements of urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0338 as supervisor-preferred. The supervisor's objective in hierarchical model (i) is then
urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0339
The subscript in urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0340 is a slight abuse of notation: this is a set of distributions, not a single distribution, but the expectation is well-defined since it is the same for all distributions in this set (and the set is nonempty; see the proof of Lemma 6 below). The “i” in urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0341 stands for “informed.”
We impose one more layer of tie-breaking—favoring principal-preferred actions—in order to achieve lower hemicontinuity of the outcome correspondence. Explicitly, define urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0342. Now define
urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0343
In words, for the fixed principal-supervisor (“P-S”) contract w and true technology urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0344, this is the set of output distributions that are possible, given that the supervisor is choosing contract urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0345 to optimize urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0346, and the agent is maximizing given urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0347, along with the tie-breaking conditions.
Finally, the outcome correspondence in hierarchical model (i) is defined as
urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0348
and the principal's corresponding objective is denoted urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0349.

This completes the description of the model. We should make sure urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0350 is nonempty-valued; this is done in the lemma below.

Lemma 6. For each w, urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0351 is nonempty.

The proofs of this lemma and of the remaining results in this section are deferred to the Online Supplementary Material, as they are either technical or similar to previous proofs.

To analyze the model, it helps to restate the definition of urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0352. The distribution F lies in urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0353 if and only if there exists a triple urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0354, where urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0355 is a technology, urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0356, and urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0357, that satisfies the following conditions:

  • (a) Supervisor maximization: the contract urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0358 maximizes urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0359 over urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0360;
  • (b) Agent maximization: action urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0361 lies in urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0362 and, given contract urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0363, this action maximizes the agent's payoff over urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0364;
  • (c) Supervisor-preferred tie-breaking: given urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0365, action urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0366 maximizes the supervisor's payoff over actions satisfying (b);
  • (d) Principal-preferred tie-breaking: given urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0367, action urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0368 maximizes the principal's payoff over actions satisfying (b)–(c).

A triple urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0369 satisfying these conditions will be called a PSA(i)-certificate for F under w.

The following lemma shows that in searching for such a certificate, we can focus on cases where urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0370 and urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0371 (recall that urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0372 denotes the zero contract).

Lemma 7. For any w and F, we have urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0373 if and only if there exists a PSA(i)-certificate for F under w of the form urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0374.

To see why this is, take the following perspective on the supervisor's problem: Any choice of contract urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0375 will induce the agent to produce some distribution F. Rather than view the supervisor as choosing urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0376, we can view her as directly choosing what F to induce, and then inducing it in the least costly way. Now, if there is some technology urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0377 under which the supervisor would choose to induce F, then under urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0378, the supervisor is all the more inclined to induce F, since she can do so costlessly by offering the agent the zero contract. The lemma follows from this observation, together with careful verification of the tie-breaking conditions.

We can now show that the model falls under our general framework, and thus we have the following.

Proposition 8. There exists a linear contract maximizing urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0379.

We verify the Richness and Responsiveness properties, as well as the lower hemicontinuity property, by arguments very similar to those used in the robust principal-agent model. Along the way, Lemma 7 helps to simplify by reducing the space of possibilities to consider.

5.3 Hierarchical Model (ii)

Hierarchical model (ii) closely resembles the previous hierarchical model. The key difference is that the supervisor is not perfectly informed of urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0380. Instead, the supervisor is now uncertain about urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0381, but we assume she is at least as well informed as the principal. Specifically, the principal knows about technology urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0382, the supervisor knows about technology urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0383, and the true technology is urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0384, with urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0385. The principal, for her part, is uncertain about both urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0386 and urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0387. Since the model continues to focus on the principal's problem, urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0388 is a primitive of the model, whereas urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0389 and urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0390 are free variables. We also no longer restrict the supervisor to contracts in urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0391, since such a restriction is not needed for existence of an optimal S-A contract; thus the supervisor may offer any contract urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0392.

In this model, ignoring the principal for a moment, the relationship between supervisor and agent looks much like the robust principal-agent model. We now formally describe the supervisor's behavior. Define urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0393 and urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0394 as in hierarchical model (i). The supervisor's objective in hierarchical model (ii) is
urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0395
where urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0396 is the informed supervisor objective of hierarchical model (i), and urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0397 is the supervisor's knowledge of the technology. We write “u” to denote “uninformed.” In words, the supervisor maximizes expected money received minus money paid, given the agent's strategic response, in the worst case over all possible technologies containing urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0398.
For fixed P-S contract w, and technologies urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0399, define
urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0400
This is the analogue of the set urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0401 from hierarchical model (i). It now depends also on urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0402 since this determines the supervisor's behavior.
Now, we define the outcome correspondence for hierarchical model (ii) to be
urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0403
and the principal's objective is denoted by urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0404.

As before, we should check nonemptiness.

Lemma 9. For each w, urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0405 is nonempty.

We comment that this is where we use the lower hemicontinuity result from the robust principal-agent model: it ensures that the supervisor has an optimal choice of urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0406—in fact, one in which urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0407 is proportional to w—so that the union in the definition of urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0408 is not empty.

To analyze the model, we proceed in a fashion similar to the previous hierarchical model. Distribution F is in urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0409 if there exists a situation in which the agent would choose it—where now such a situation is described by a quadruple urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0410, meeting conditions analogous to (a)–(d) in hierarchical model (i). We refer to such a quadruple as a PSA(ii)-certificate for F under w. We can formulate an analogue of Lemma 7 for this model (see Lemma S-6 in the Online Supplementary Material): F lies in urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0411 if and only if there exists such a PSA(ii)-certificate in which urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0412, urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0413 and, moreover, urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0414.

This fact is useful in showing that the model satisfies our two properties, and thereby obtaining our linearity result.

Proposition 10. There exists a linear contract maximizing urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0415.

Again, the proof follows the same basic argument as for the principal-agent model (and as for hierarchical model (i)).

It might not be obvious that this model satisfies Responsiveness, since the supervisor is no longer modeled as an expected-utility maximizer. However, the key is that Responsiveness is a property on the correspondence Φ, which gathers the counterparty's behavior across all possible environments (in this case, all possible urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0416 and urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0417); it does not have to hold in each environment individually. In this model, Lemma S-6 shows, in effect, that we can restrict attention to a crucial subset of possible environments: those in which the supervisor knows that she can induce distribution F for free by offering the zero contract, and she chooses to do so. In this subset of environments, the supervisor does act like an expected-utility maximizer, and this suffices to imply Responsiveness.

6 An Example Where Responsiveness Fails

In this section, we describe version (iii) of the hierarchical model, where the principal is no longer uncertain as to what the supervisor does and does not know. Instead, the supervisor, like the principal, knows only that the technology urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0418 is a superset of the given urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0419, and she contracts with the agent so as to maximize her own worst-case payoff. We give an example to show that linear contracts can fail to be optimal. In the process, we also observe that this model satisfies Richness, but not Responsiveness in general. In fact, this example is an illustration of Theorem 4, as will be discussed briefly later.

Here are the details. We take urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0420 as in the first two versions of the model. For any contract urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0421 to the agent, and any true technology urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0422, define urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0423 as before. Define urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0424 as in model (i), and define urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0425 as in model (ii). This is the supervisor's objective.

Assume for the moment that urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0426 (in the language of Section 4, w is grounded). As in model (ii), we can apply the robust principal-agent analysis to the supervisor-agent relationship here to conclude that the supervisor has an optimal contract that takes the form urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0427 for some constant urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0428; however, there may also exist optimal choices of urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0429 that are not of this form. We assume that the supervisor uses an optimal contract of this form (but if multiple choices of β are optimal, we remain agnostic about which one is chosen). This restriction on the supervisor's behavior simplifies the analysis, but it is not in keeping with model (ii), where no such restriction was made. With a little more work, one can verify that the lessons of this section are unchanged if the restriction is removed; details are included in an earlier version of this paper (Walton and Carroll (2019)).

Accordingly, define
urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0430
This is the set of distributions that may be chosen when the agent's technology is urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0431, and the supervisor has presented him with a contract that is optimal and linear (from the supervisor's point of view) given urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0432. The union arises due to the possibility of multiple optimal choices of β. And now, the outcome correspondence is given by
urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0433

(For simplicity, we do not bother with principal-preferred tie-breaking; adding such a tie-break would not change the substantive conclusions.)

For completeness, we should also define urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0434 when w is not grounded, although such contracts will play no role in the subsequent analysis. For simplicity, for any such w, let urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0435 be the grounded contract obtained by subtracting the constant urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0436 from w, and just put urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0437.

Now urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0438 is fully defined, and accordingly, the principal's objective urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0439 is defined as usual.

Now let us analyze the model. We first formally note, as promised previously, the following.

Proposition 11. The correspondence urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0440 satisfies Richness.

This is fairly immediate; the formal proof is in the Online Supplementary Material.

Next, if the principal offers a (grounded) contract w, what fraction β will the supervisor share with the agent? The analysis of the robust principal-agent problem (see Carroll (2015)) gives the answer: the supervisor will identify action urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0441 for which urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0442 is maximal, and as long as this quantity is positive, the supervisor will set the corresponding value urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0443. (If there happens to be more than one optimal urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0444, then all corresponding β's are optimal for the supervisor. Also, if there is no known action with urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0445, then the supervisor cannot obtain a positive guarantee, so every urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0446 is optimal—they all give the supervisor a guarantee of zero.) Accordingly, let us say that the contract w targets the action urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0447 if this action maximizes urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0448 over urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0449, and the contract is nondegenerate if the corresponding value of urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0450 is strictly positive.
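The targeting rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the formulas are not legible in this copy, so we assume, following the robust principal-agent analysis, that the supervisor ranks each known action by sqrt(E_F[w]) - sqrt(c) and, when this is positive, shares the fraction β = sqrt(c / E_F[w]). The function name `supervisor_targets` and the cost values are hypothetical; the payments 70, 50, 0 and 70, 0, 0 are the ones mentioned later in this section for w and ŵ.

```python
import math

def supervisor_targets(known_actions, tol=1e-12):
    """Map each targeted action to the supervisor's share beta.

    known_actions: dict name -> (expected payment E_F[w], cost c).
    ASSUMPTION (not legible in the source): actions are ranked by
    sqrt(E_F[w]) - sqrt(c), and beta = sqrt(c / E_F[w]).
    Returns {} in the degenerate case, where every beta gives a zero guarantee.
    """
    scores = {name: math.sqrt(ew) - math.sqrt(c)
              for name, (ew, c) in known_actions.items()}
    best = max(scores.values())
    if best <= tol:
        return {}
    # Targeted actions have E_F[w] > c >= 0, so the division is safe.
    return {name: math.sqrt(c / ew)
            for name, (ew, c) in known_actions.items()
            if scores[name] >= best - tol}

# Hypothetical costs for the three deterministic actions.
costs = {"H": 36.0, "L": 1.0, "0": 0.0}
w = {"H": 70.0, "L": 50.0, "0": 0.0}      # pays for both H and L
w_hat = {"H": 70.0, "L": 0.0, "0": 0.0}   # does not pay for the output of L

def actions_under(contract):
    """Pair each action's payment under `contract` with its cost."""
    return {a: (contract[a], costs[a]) for a in costs}

print(supervisor_targets(actions_under(w)))
print(supervisor_targets(actions_under(w_hat)))
```

With these hypothetical costs, w targets action L while ŵ targets action H, which is the steering effect the example below relies on. Ties are kept on purpose, since every tied β is optimal for the supervisor.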

We can further use the principal-agent analysis to explicitly characterize the possible responses by the agent (proof in the Online Supplementary Material).

Lemma 12. Suppose w is grounded and nondegenerate, and urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0451 is an optimal choice for the supervisor. A distribution urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0452 lies in urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0453 for some urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0454 if and only if

urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0455 (2)
where urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0456 is the targeted action leading to slope β.

Therefore, if w is grounded and nondegenerate, urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0457 consists of all distributions urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0458 that satisfy (2) for some targeted action urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0459.

Henceforth, for concreteness, let us focus on a particular, parametric specification of urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0460. Assume that Y is finite, so that we can ignore the continuity restriction on contracts, and let urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0461 be elements of Y with urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0462. Also let urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0463 be positive numbers with urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0464. Let
urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0465
Thus, there are three known actions, all deterministic. For brevity, we will call these actions “action H,” “action L,” and “action 0.”

Suppose first that the principal uses a linear contract, urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0466. We can see which action is targeted depending on the value of α:

  • for urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0467, the contract targets action 0;
  • for urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0468, the contract targets action L;
  • for urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0469, the contract targets action H,

where urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0470. (At boundary cases, two actions are targeted.)

For a contract targeting 0, no positive guarantee is possible. For a contract targeting L, Lemma 12 shows that the possible distributions are the ones for which urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0471, or equivalently urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0472. Since the principal's payoff is urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0473, the payoff guarantee from the contract with slope α is
urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0474
By identical reasoning, for a linear contract targeting H, the payoff guarantee is
urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0475 (3)
So overall, the principal's guarantee is
urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0476
(And at the boundary values of α, the guarantee is given by the lower of the two neighboring formulas.)
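For readers who want the intermediate steps, the guarantee for a linear contract targeting L can be traced as follows. This is a sketch under assumptions rather than a restatement of the displayed formulas, which are not legible in this copy: we write y_L and c_L for the output and cost of action L (our notation), take the supervisor's share to be the robust principal-agent optimum, and use only the facts that the agent can always fall back on action L and that costs are nonnegative.

```latex
\begin{align*}
\beta_L &= \sqrt{\frac{c_L}{\alpha y_L}}
  && \text{(supervisor's share when targeting } L\text{)} \\
\beta_L \alpha\, E_F[y] - c &\ge \beta_L \alpha\, y_L - c_L
  && \text{(agent weakly prefers } (F,c) \text{ to action } L\text{)} \\
E_F[y] &\ge y_L - \frac{c_L}{\beta_L \alpha}
        = y_L - \sqrt{\frac{c_L\, y_L}{\alpha}}
  && \text{(using } c \ge 0\text{)} \\
(1-\alpha)\, E_F[y] &\ge (1-\alpha)\left( y_L - \sqrt{\frac{c_L\, y_L}{\alpha}} \right)
  && \text{(principal keeps the share } 1-\alpha\text{)}
\end{align*}
```

Replacing (y_L, c_L) by the output and cost of action H gives the corresponding bound for a contract targeting H.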
Now consider a value of α such that the linear contract urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0477 targets L. Consider instead the nonlinear contract urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0478 given by urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0479, and urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0480 for all other y. Evidently, this contract cannot target L. As long as urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0481, it targets H (rather than 0). And in this case, Lemma 12 tells us that the distributions the agent may produce are the ones for which urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0482, or equivalently, those that produce urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0483 with probability at least urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0484. Since the principal receives urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0485 if output urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0486 realizes, and receives at least 0 for any other output, the principal's guarantee from urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0487 is at least
urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0488 (4)
This is exactly the same as in (3), but note that it applies for a wider range of values of α, since α only needs to be higher than urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0489, which is less than urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0490.

This suggests that we can choose numeric values for the parameters such that the value of (4), for some appropriate urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0491, is higher than the maximum value of urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0492. For example, take urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0493. It can be checked that the maximum value of urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0494 occurs at urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0495 and is equal to 5. However, this value of α lies in the range urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0496, and the corresponding value of expression (4) is equal to urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0497. Thus the guarantee from urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0498 at urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0499 is higher than the guarantee from any linear contract. This comparison is shown in Figure 2.
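The comparison can be reproduced qualitatively with a short script. The parameter values below are hypothetical and are not the ones used in the text (those are not legible in this copy); the bound `(1 - alpha) * (y - sqrt(c * y / alpha))` is our reading of the guarantee when an action with output y and cost c is targeted, and the targeting score sqrt(payment) - sqrt(cost) is likewise an assumption taken from the robust principal-agent analysis.

```python
import math

# HYPOTHETICAL parameters (not the paper's): (output, cost) per known action.
Y_H, C_H = 100.0, 36.0
Y_L, C_L = 16.0, 1.0
ACTIONS = {"H": (Y_H, C_H), "L": (Y_L, C_L), "0": (0.0, 0.0)}

def targeted(pay, tol=1e-12):
    """Actions the supervisor may target, given payments pay[name].
    ASSUMPTION: she ranks actions by sqrt(payment) - sqrt(cost)."""
    scores = {a: math.sqrt(pay[a]) - math.sqrt(c) for a, (_, c) in ACTIONS.items()}
    best = max(scores.values())
    return [] if best <= tol else [a for a in scores if scores[a] >= best - tol]

def bound(alpha, y, c):
    """Principal's bound (1 - alpha) * (y - sqrt(c * y / alpha)) when the
    deterministic action (y, c) is targeted (assumed form)."""
    return (1 - alpha) * (y - math.sqrt(c * y / alpha))

def linear_guarantee(alpha):
    """Guarantee from w(y) = alpha * y; at ties, the lower bound applies."""
    pay = {a: alpha * y for a, (y, _) in ACTIONS.items()}
    targets = targeted(pay)
    if not targets:
        return 0.0  # only action 0 is targeted: no positive guarantee
    return min(bound(alpha, *ACTIONS[a]) for a in targets)

def nonlinear_bound(alpha):
    """Lower bound for w_hat, which pays alpha * Y_H at output y_H and 0 otherwise."""
    pay = {"H": alpha * Y_H, "L": 0.0, "0": 0.0}
    return bound(alpha, Y_H, C_H) if targeted(pay) == ["H"] else 0.0

alphas = [k / 1000 for k in range(1, 1000)]
best_linear = max(linear_guarantee(a) for a in alphas)
best_alpha_hat = max(alphas, key=nonlinear_bound)

print("best linear guarantee on grid:", round(best_linear, 3))
print("w_hat bound at alpha =", best_alpha_hat, ":",
      round(nonlinear_bound(best_alpha_hat), 3))
```

With these hypothetical numbers, the best linear guarantee on the grid is roughly 8.5, while the bound for ŵ reaches roughly 9.0 at a slope where the linear contract would still target L. The grid search is a deliberately crude choice; it avoids any claim about closed-form maximizers.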

Figure 2. Guarantee from a linear contract, in example for hierarchical model (iii).

What is happening in this example? An intuition is that if the principal were restricted to linear contracts, she would like to leave the supervisor just a modest fraction of the output, but then the supervisor is less inclined than the principal is to offer the agent incentives targeted at action H. The principal can steer the supervisor back toward incentivizing H instead of L by not paying for the output of L. This gets the supervisor to offer stronger incentives to the agent, which in turn guards against the agent taking relatively unproductive actions. Notice, also, that this use of nonlinear contracts to get the supervisor to target H rather than L would not have worked in hierarchical model (i) (or model (ii)), because there the supervisor may know of other actions besides H or L. In the worst case there, the supervisor targets some new action that produces output urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0500 with some probability and 0 otherwise, and nonlinear w will not help the principal avoid this bad outcome.

We can also relate the failure of linearity in model (iii) back to the Responsiveness property. Notice that the property fails between urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0501 and urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0502 (with the specific value urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0503), since urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0504 pays weakly more than urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0505 for any distribution, and pays equally much when the distribution is supported on urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0506, yet some such distributions are in urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0507 but not in urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0508. Given that Responsiveness is violated, we should expect from the results of Section 4 that the numerical parameters can be set in such a way that linear contracts are nonoptimal. In fact, our example can be cast in the framework of that section, by viewing “output H, output L, output 0” as physical outputs whose numerical values are initially left unspecified, and considering a contract w that pays 70, 50, 0 for these outputs, respectively (and ŵ pays 70, 0, 0). Applying Theorem 4 to this violation of Responsiveness leads to the valuation that values these outputs at urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0509, thus recovering precisely this example.

7 Analysis of Hierarchical Models (i) and (ii)

Now that we have shown how several models fit into our linearity framework, we return to study hierarchical models (i) and (ii) in more detail. This analysis serves two purposes. First, we demonstrate by example why hierarchical models do not straightforwardly rewrite as special cases of the original robust principal-agent model. This is important, since if the lessons from principal-agent models already effortlessly extended to other organizational environments, our main linearity result would be redundant. And second, we sketch how one may numerically compute the optimal contract slopes and guarantees in hierarchical models (i) and (ii), thus completing the process of solving for the optimal contract in these models.

A further illustration of what one can do using a detailed analysis of multiple different contracting models appears in Section S-2 of the Online Supplementary Material. That section takes the principal-agent model, together with hierarchical models (i) and (ii), and places them side-by-side to compare the principal's payoffs.

7.1 Nonequivalence of Hierarchical Model (i) to the Robust Principal-Agent Model

One may be tempted to try to reduce hierarchical model (i) to a special case of the robust principal-agent model, as follows: collapse the supervisor and agent into a single “modified agent,” whose cost of producing any distribution F is simply the cost that the supervisor would have to pay to incentivize F in the original hierarchical model. We show here why this reduction fails.

Formally, given a known technology urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0510 in hierarchical model (i), define urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0511 by the following procedure: for each urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0512, let urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0513 subject to urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0514, and let urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0515 be the corresponding value of the min (if there is no urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0516 satisfying the constraint then put urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0517). Then define urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0518. Thus we keep the possible output distributions in urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0519 the same, but adjust the cost of the action to mimic the expected amount the supervisor would have to pay to induce the action under urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0520. We then apply the robust principal-agent framework with known technology urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0521. Is urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0522, given urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0523, equal to urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0524 given urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0525?

To illustrate, consider urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0526 where urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0527 and urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0528. With this simple known technology, urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0529, since urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0530 makes urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0531 a maximizer of urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0532 over urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0533, and urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0534 makes urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0535 a maximizer of the same objective over urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0536. Now consider a linear contract urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0537. Suppose the true technology in the hierarchical model (i) case includes just one additional action urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0538 where urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0539. A straightforward computation shows that the supervisor chooses to induce urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0540, offering urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0541 (and using tie-breaking) to get a payoff of urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0542, rather than inducing the agent to choose urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0543 (which requires urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0544) for a payoff of urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0545, and hence urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0546.

Now we consider urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0547 in the robust principal-agent framework, with the same linear contract urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0548. The agent's optimal choice is urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0549, since urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0550. Indeed, across all technologies urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0551, the lowest-mean (worst-case) action that can occur is one that has zero cost and mean 3c. Hence urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0552, with mean 2c, is not possible, and therefore urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0553 given urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0554.

Thus one cannot reduce hierarchical model (i) to the robust principal-agent model as proposed. (Likewise, this reduction would not work for model (ii) either.) A simple intuition is that adding an extra, unknown action to the agent's technology urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0555 in the hierarchical model can potentially increase the supervisor's cost of incentivizing the original known actions, whereas in the principal-agent model, the costs of the known actions remain unchanged. Thus, the uncertainty about urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0556 has different effects in the two models. This is further underscored in the next subsection where we identify the worst-case technology in the hierarchical model (i), which differs in structure from the worst case in the robust principal-agent model.

Admittedly, the counterexample above does not rule out the possibility that some other, subtler argument might be available to reduce the hierarchical model to a special case of the robust principal-agent model. However, it at least suggests (to us) that any such argument would be nonobvious enough that the hierarchical model is most naturally viewed as a separate model, as we have presented it.

7.2 Optimal Contract Slope

We have argued in Section 5 that optimal contracts in both hierarchical models (i) and (ii) are linear. Here, we characterize the optimal slope of the linear contract. The main task is to identify the worst-case technology for any given linear contract, which allows us to replace the infimum in the principal's objective with a more explicit function. We sketch the steps here; the method is fully laid out in an earlier version of this paper (Walton and Carroll (2019)).

We begin with the analysis for hierarchical model (i). Assume the principal offers a particular linear contract urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0557. The first step is to show that, in defining urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0558, rather than taking the union over all possible technologies urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0559, we can consider a much smaller class of technologies. In particular, we can focus on technologies where the agent can produce any output distribution, and his cost of doing so depends only on the mean of the distribution and, moreover, this cost is a convex, nondecreasing function of the mean. Intuitively, once we have focused on linear contracts, only the mean output should matter for all parties involved. Thus a technology may be identified with a cost function urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0560, where the agent can produce any distribution F at cost urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0561. Note that such a choice of κ is consistent with the requirement urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0562 if and only if urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0563 for every urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0564.

The next step is to identify the lowest mean output that might be induced under a given linear contract urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0565 and known technology urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0566. To do this, we show that for any κ as above, the supervisor's cost of inducing any given mean output μ is urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0567; therefore, the supervisor chooses μ to maximize urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0568. Thus, for any value of μ, the supervisor will induce mean output μ only if urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0569 for all urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0570. This is a family of inequality constraints on κ, and in the worst case, these inequalities will be binding. In particular, the worst-case mean output μ is the lowest value for which κ can be chosen to satisfy the equality urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0571 for all urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0572, while still respecting the constraint that urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0573 for all urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0574. We denote this worst-case mean output by urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0575. The differential equation leads to a one-parameter family of solutions for κ, and we simply pick the worst one that satisfies the urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0576 constraint. Finally, urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0577. In the earlier version of the paper, we fill in these steps with calculations, showing that the maximum value of urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0578 over α can be more explicitly characterized as
urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0579
where
urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0580
and that the corresponding optimal value of α is equal to urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0581 for the choices of urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0582 and μ that attain the max. We thus have a fairly explicit description of the optimal contract, and the principal's guarantee, though there is no fully closed-form solution.
We now briefly discuss hierarchical model (ii); again, the details are in the earlier version of the paper. For any linear contract urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0583, urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0584 that the principal offers, we can exploit the analysis of the robust principal-agent model to characterize optimal behavior for the supervisor under each possible urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0585. The task is further simplified by Lemma S-6, telling us that F is a possible response by the agent if and only if it can occur under a technology urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0586 such that urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0587 and such that the supervisor offers the zero contract under urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0588. So we just need to identify the distributions F for which some such urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0589 exists. This leads to the characterization
urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0590
where urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0591 denotes the subset urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0592.
This identifies the minimum expected output that the agent could potentially produce when the principal offers contract urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0593. We can use this characterization to compute more explicitly the guarantee from any given linear contract, and then optimize over α. In this case, the computation turns out to give a closed-form solution:
urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0594
and the optimal value of α equals urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0595 for the urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0596 that solves the maximization on the right-hand side.

We comment also that, in hierarchical model (ii) (unlike (i) but similar to the robust principal-agent model), the worst case is attained under a technology urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0597 that consists of urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0598 plus just one additional action. This suggests a seeming “discontinuity” between models (ii) and (iii): if the supervisor is allowed to know just one additional action that the principal does not know, the result looks very different from the case in which their knowledge coincides exactly. Of course, this difference depends on the fact that the one additional action can be anything, so that there is already great uncertainty on the principal's part about how the supervisor will behave.

8 Conclusion

Economic analysis uses stylized, simplified models to develop concepts. In particular, agency theory commonly works with bilateral principal-agent models. But actual agency relations may be embedded in much more complex organizations. An effective theory should not fall silent when confronted with this variety.

In this paper, we have examined arguments concerning linear contracts, and their ability to provide robustness to uncertainty by aligning the payoffs of the two parties without being sensitive to details of the environment. To avoid assuming any specific organizational structure, we have proposed a “black-box” framework for reasoning about contracts, working in terms of the outcome correspondence Φ mapping contracts to possible responses. Past literature has pointed to the broad space of possible actions as key to robustness of linear contracts. We have formalized this in our framework as a Richness property on the correspondence Φ. But Richness alone is not enough to allow comparisons across contracts, let alone to identify optimal contracts. We identified a further Responsiveness property, requiring that when contracts change, the possible responses vary in a way that resembles maximizing expected payment. We showed that, as a supplement to Richness, Responsiveness is sufficient—and, in an appropriate sense, essentially necessary—to ensure that linear contracts are optimally robust, in the sense of solving a maxmin problem. We illustrated this in more detail by describing several specific ways to write down a model of hierarchical contracting, all of which satisfy Richness, and showing how Responsiveness distinguishes versions that lead to linear contracts from a version that does not. Our more detailed analysis of the individual models also showed that, even though multiple models lead to linear contracts being optimal, these models are not equivalent to each other.

Our focus has been on understanding when and why linear contracts are robust to uncertainty, without relying on the bilateral principal-agent structure. But a similar methodology could potentially be applied to many other classes of contracts. We illustrate this in the Online Supplementary Material, Section S-3, by showing how alternative conditions on Φ would lead to concave rather than linear contracts. Another potential future application of the approach would be foundations for debt contracts: Antić (2021) shows how changing the space of uncertainty in a robust agency model can lead to debt contracts, rather than linear contracts, as optimal (and, a bit farther afield, Malenko and Tsoy (2020) do the same in a screening model); our work naturally suggests the question of whether some alternative properties on the correspondence Φ would reproduce this finding in an organization-free way. More generally, we hope that our work will spur the development of an appropriate methodology to separate the analysis of formal incentives from assumptions on the organizational environment in which they operate.

  • 1 To be precise, in the models we study, it may happen that the optimum of the principal's objective is not attained, so that linear contracts only approach the supremum. For now, we will ignore this distinction.
  • 2 Limited liability is indeed important. If we instead allowed payments to be arbitrarily negative, we would need to add a participation constraint. If this were done following the approach indicated at the end of Section 3.1, one can show that it would always be optimal to use a “selling the firm” contract, giving the counterparty all the output minus some constant.
  • 3 The continuity assumption on contracts is not actually needed for the linearity result; we impose it only to guarantee existence of best responses in the applications, so that everyone's behavior is well-defined. In any case, it has no bite when Y is an arbitrarily fine discrete grid, so we do not view it as a substantive restriction.
  • 4 To be precise, in order for the optimal contract to exist, we should modify the model by specifying that whenever the agent is indifferent between multiple actions, he chooses the one preferred by the principal. For simplicity, we have skipped over this here. We do make the analogous tie-breaking provision (and show in detail that Responsiveness still holds) for the applications in Section 5.
  • 5 Note that, as in model (ii), the supervisor is not restricted to a compact set of contracts.
  • 6 Garrett, Georgiadis, Smolin, and Szentes (2020) independently study a problem of optimal technology design in a principal-agent setting that shares some features with this analysis.

Appendix: Additional Proofs

    Here are proofs omitted from Sections 3 and 4 of the main paper. Remaining proofs are in the Online Supplementary Material.

    Proof of Proposition 2. As noted in the text, Theorem 1 ensures that urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0599 is approached within the set of linear contracts. Moreover, for any contract urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0600 whose slope α is greater than 1, urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0601, so it is sufficient to restrict attention to urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0602. Thus we need only verify that, on the restricted domain urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0603, urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0604 has a maximum.

    Define urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0605, and let urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0606 be a sequence of values such that urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0607. By compactness we may assume urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0608 has a limit urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0609. Assume for contradiction that urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0610. Put urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0611. The definition of urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0612 means there exists urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0613 such that

    urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0614
    Now, lower hemicontinuity means that for k large enough, there exists urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0615 such that urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0616. Then
    urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0617
    (Here, the first inequality follows from the definition of urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0618, and the other steps are straightforward.) This contradicts the assumption urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0619. Q.E.D.

    Proof of Proposition 3. (“If” direction) We proceed through the conditions one by one, showing that each implies that w is scaling-dominated. Let v be a valuation; we need to show that either urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0620 or urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0621 for some β. For brevity, we write urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0622 rather than urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0623 (and likewise urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0624, etc.).

    If (i) holds, then urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0625 is all of urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0626, so urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0627.

    If (i) is not satisfied but (ii) is for some β, then urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0628. So,

    urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0629
    where the strict inequality comes from urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0630 since urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0631.

    Next, suppose (i)–(ii) are not satisfied but (iii) is. Let urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0632 be the output at which w attains its minimum value subject to urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0633. Let urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0634 be the output for which w attains its highest value less than urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0635 (this exists since w is grounded and (i) is assumed not to hold, so that urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0636). Let urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0637. Notice that, given any distribution F for which urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0638, if we gradually move probability mass from output levels where urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0639 to those with urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0640, the expected value of w increases by at least urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0641 times the amount of mass moved; therefore, by moving at most urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0642 mass, we can reach a distribution urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0643 with urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0644.

    Let urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0645. Take urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0646 in condition (iii), and consider the value of β given by that condition.

    Let urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0647 be the worst-case distribution for contract βw. Thus, urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0648. We also know that urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0649, so urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0650.

    If urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0651, then urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0652, and we already have urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0653. So assume urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0654. As noted above, we can move at most urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0655 probability mass from urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0656 to obtain a distribution urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0657 with urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0658, and

    urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0659
    (recalling urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0660). Therefore,
    urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0661
    (The second inequality in the above chain uses urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0662.) Finally, applying condition (iii), the last right-hand side is
    urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0663

    Next, suppose conditions (i)–(iii) do not hold but (iv) does. Consider the problem of minimizing urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0664 subject to the constraint urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0665; let urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0666 attain the minimum, and let r be the corresponding objective value. Note that if the constraint were replaced by urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0667, this new minimization problem would have objective value equal to urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0668, by definition of urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0669. We observe that in this latter minimization problem, any solution must satisfy the constraint with equality: otherwise the constraint would not be binding, so we could remove it, but then the value of the problem is simply urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0670, that is, urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0671, and we are done. This observation implies that urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0672, and also that for every z such that urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0673, we must have urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0674 strictly.

    Choose λ such that

    urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0675
    (interpreting the min as ∞ if no such z exists). The latter inequality implies urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0676 for all z such that urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0677. Now, if we consider the problem of minimizing urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0678 subject to urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0679, again the constraint must hold with equality at the optimum: otherwise, the constraint is not binding, and the minimum is attained when F is degenerate on some z with urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0680, but we know that the objective value for any such F is higher than urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0681 and the latter value is attained by urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0682. Therefore, this minimization problem is again solved by urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0683. We conclude that
    urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0684(5)

    Now put urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0685, and let β be as given by condition (iv). Note that we must have urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0686, since (iv) rearranges to urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0687. Define the quantity urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0688. Then, for any distribution F such that urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0689, we have

    urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0690
    and, therefore,
    urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0691
    (Here, the first inequality comes from rearranging (5), which applies since urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0692.)

    We thus have urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0693 as needed.

    Finally, suppose that conditions (i)–(iv) do not hold but (v) does. Let urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0694 be as in that condition. We aim to show that one of urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0695, urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0696, or urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0697 holds. So, assume that none of these holds, and seek a contradiction.

    Take urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0698 to minimize urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0699 subject to urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0700, and similarly, urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0701 to minimize urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0702 subject to urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0703. Note that urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0704, since otherwise the left side of the comparison (v) equals 1 but the right side is more than 1, impossible. Consequently, we have urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0705, since otherwise

    urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0706
    contrary to assumption. (The strict inequality step is where we use urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0707.) Also, urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0708.

    Write urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0709, urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0710, and urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0711, so we know that urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0712. Put

    urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0713

    We claim urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0714. Indeed, since urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0715 and likewise urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0716, we have

    urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0717
    Here, the first inequality is because urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0718 is increasing in x for urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0719, the third inequality is similar, and the middle strict inequality comes from condition (v). Cross-multiplying,
    urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0720(6)
    Then
    urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0721
    where the inequality follows from rearranging (6).

    Moreover, since urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0722, we have urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0723. Combining this with the assumptions urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0724 and urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0725, we have

    urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0726
    This is our desired contradiction.

    (“Only if” direction) Evidently, it suffices to prove the last statement of the proposition, since the conclusion of that sentence implies w is not scaling-dominated.

    Assume none of (i)–(v) holds for w. Let urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0727. Set urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0728 and urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0729 (which we define as ∞ if urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0730). For urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0731, it must be that urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0732, otherwise (ii) would hold. Furthermore, this inequality is strict, otherwise (iii) would hold (unless urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0733 but then (i) would hold). For any such urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0734, then urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0735 is well-defined and is ≥1; hence urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0736. We are ensured urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0737, again by the negation of (iii). We also have that urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0738: either urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0739 and urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0740, or if B is nonempty and urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0741, property (v) holds. Finally, we know urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0742, otherwise (iv) would hold.

    Choose any finite K such that urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0743 and urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0744; the previous paragraph ensures this is possible. Set the valuation urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0745. This is a valuation (i.e., its minimum value is 0) because w is grounded. Thus w is linear given v, with slope urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0746. For any urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0747, under contract βw, the worst-case expected profit is urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0748 (or lower, if urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0749). So, to establish that urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0750, it suffices to show

    urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0751(7)
    For urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0752, this holds since urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0753 and urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0754 is positive. For urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0755, this holds since urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0756 and urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0757 is positive. If urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0758, then urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0759, unless urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0760, in which case urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0761. Thus (7) holds. Finally, for urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0762, urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0763, and we have urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0764 for this case as well. Q.E.D.

    Proof of Theorem 4. Let v be the valuation obtained by applying the last statement of Proposition 3 to urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0765. Thus urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0766 for some urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0767.

    We hold v fixed. We will show that urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0768, which is sufficient, since we already know that urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0769 has the highest guarantee among linear contracts.

    Let F be the distribution for which Responsiveness fails. Since urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0770, it suffices to show that

    urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0771(8)
    By the hypothesis of Responsiveness, the left-hand side of (8) equals
    urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0772
    the last inequality by definition of urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0773. Applying the other part of the hypothesis of Responsiveness, and using urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0774, we have
    urn:x-wiley:00129682:media:ecta200450:ecta200450-math-0775
    which shows (8). Q.E.D.
