Dissociable regulation of instrumental action within mouse prefrontal cortex
Abstract
Evaluation of the behavioral ‘costs’, such as effort expenditure, relative to the benefits of obtaining reward is a major determinant of goal-directed action. Neuroimaging evidence suggests that the human medial orbitofrontal cortex (mOFC) is involved in this calculation and thereby guides goal-directed and choice behavior, but this region’s functional significance in rodents is unknown despite extensive work characterizing the role of the lateral OFC in cue-related response inhibition processes. We first tested mice with mOFC lesions in an instrumental reversal task lacking discrete cues signaling reinforcement; here, animals were required to shift responding based on the location of the reinforced aperture within the chamber. Mice with mOFC lesions acquired the reversal but failed to inhibit responding on the previously reinforced aperture, while mice with prelimbic prefrontal cortex lesions were unaffected. When tested on a progressive ratio schedule of reinforcement, mice with prelimbic cortical lesions were unable to maintain responding, resulting in declining response levels. Mice with mOFC lesions, by contrast, escalated responding. Neither lesion affected sensitivity to satiety-specific outcome devaluation or non-reinforcement (i.e. extinction), and neither had effects when placed after animals were trained on a progressive ratio response schedule. Lesions of the ventral hippocampus, which projects to the mOFC, resulted in similar response patterns, while lateral OFC and dorsal hippocampus lesions resulted in response acquisition, though not inhibition, deficits in an instrumental reversal. Our findings thus selectively implicate the rodent mOFC in braking reinforced goal-directed action when reinforcement requires the acquisition of novel response contingencies.
Introduction
Several distinct brain structures have been implicated in learning causal relationships between behavioral responses and their consequences, and in adjusting behavior based on changes in response contingencies (Balleine & Dickinson, 1998). For example, lesions of the prelimbic prefrontal cortex (PL) in rats impair instrumental response acquisition under conditions of uncertain reward availability (Corbit & Balleine, 2003). Human neuroimaging reports suggest that the medial orbitofrontal cortex (mOFC), which lies ventral to the PL within the frontal pole, also regulates behavioral responding based on action–outcome response contingencies (Valentin et al., 2007; Tanaka et al., 2008). Nonetheless, rodent studies aimed at understanding how the brain guides actions to obtain desired outcomes have historically focused on more dorsal prefrontal structures.
The mOFC comprises part of the medial prefrontal cortex (PFC) network in rodents and primates that sends projections to the striatum and receives projections from downstream structures via the dorsomedial thalamus (Öngür & Price, 2000). In monkeys, the mOFC appears to process reward expectations relative to instrumental response contingencies and outcome costs (Roberts, 2006); this is in contrast to the lateral OFC (lOFC), famously essential for inhibitory control during stimulus–response reversal learning (Iverson & Mishkin, 1970; McAlonan & Brown, 2003). The rodent mOFC shares some anatomical characteristics with the rostrally situated ventral PL (Öngür & Price, 2000; Thierry et al., 2000), but recent studies show the rodent mOFC sends projections to selective patches of the dorsomedial striatum, as occurs in the primate mOFC (Schilman et al., 2008). These projections are distinctive from those associated with the PL and infralimbic cortices (Berendse et al., 1992). Moreover, the region has distinctive organization and cytoarchitectonic boundaries in the mouse, as in higher organisms (Van de Werd et al., 2010). Site-selective lesions of the mOFC in this species might therefore be expected to have distinct consequences for goal-directed action relative to other medial PFC structures.
Here, we reversed the action–outcome response requirement to acquire food reinforcement in mice with mOFC lesions. Mice with PL lesions served as a comparison, and only mOFC lesions impaired response inhibition. We also tested sensitivity to a progressive ratio schedule of reinforcement; here, mOFC lesion mice escalated responding. Satiety-specific outcome devaluation and post-training lesion experiments suggested this phenotype could be attributed to deficient acquisition of novel response requirements.
Ventral hippocampal (vHC) projections uniquely excite mOFC neurons (Ishikawa & Nakamura, 2003), and may endow this region with information regarding the emotional and motivational salience of an outcome. We therefore tested mice with vHC lesions in the same tasks with the hypothesis that response profiles would resemble those generated by mOFC, and not PL, lOFC or dorsal hippocampal (dHC) lesions. Our findings reveal correlations between structure and function that largely agree with those established in rats and primates, and dissociate major structures within the mouse PFC along both rostral-caudal and medial-lateral axes. Given the increasing utility of transgenic and knockout mice in modeling human behavior, better verification of response inhibition loci in this species may be an essential step in understanding the biological bases of goal-directed action.
Materials and methods
Experimental animals were male C57BL/6 mice (initially 10–12 weeks old) from Charles River Laboratories (Kingston, NY, USA). Mice were food-restricted (90 min access/day) to maintain approximately 92% original body weight. Tests were conducted during the light phase of a 12-h light cycle (07.00 h onwards). Procedures were approved by the Yale University Animal Care and Use Committee.
Instrumental training
Experimenters used standard operant conditioning chambers for mice (16 × 14 × 12.5 cm) controlled by MedPC software (Med-Associates, Georgia, VT, USA). Head entries into three nose poke recesses and a food magazine were detected by photocell, and a dispenser delivered grain-based food pellets (20 mg; Bio-Serv, Frenchtown, NJ, USA) upon completion of the response requirement. Mice were initially trained to perform the operant response (nose poke) with approximately 12, 25-min sessions (1/day) during which one, two or three responses yielded food reinforcement, i.e. a variable ratio 2 (VR2) schedule of reinforcement. The location of the reinforced aperture (right or left) was counterbalanced, with the center nose poke never reinforced. Mice were required to retrieve pellets before earning more.
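As a rough illustration of the VR2 schedule described above, the sketch below draws each response requirement from {1, 2, 3}. The uniform draw, the function names and the fixed seed are assumptions made for illustration; the source states only that one, two or three responses yielded a pellet, and the actual MedPC program is not reproduced here. The requirement that mice retrieve each pellet before earning another is not modeled.

```python
import random


def vr2_requirements(n_reinforcers, rng):
    """Draw the number of responses required for each reinforcer.

    Assumption: each requirement is drawn uniformly from {1, 2, 3},
    which gives a mean requirement of 2 responses per pellet.
    """
    return [rng.choice((1, 2, 3)) for _ in range(n_reinforcers)]


def pellets_earned(n_responses, requirements):
    """Count pellets earned by n_responses under a list of requirements."""
    earned = 0
    remaining = n_responses
    for req in requirements:
        if remaining < req:
            break  # not enough responses left to complete this ratio
        remaining -= req
        earned += 1
    return earned


# Hypothetical usage: 1000 simulated requirements with a fixed seed.
rng = random.Random(0)
reqs = vr2_requirements(1000, rng)
mean_req = sum(reqs) / len(reqs)
```

Under these assumptions the mean requirement converges on two responses per pellet, which is what the 'VR2' label denotes.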
Surgery
Mice were first trained to obtain food as described above, then administered site-specific N-methyl-D-aspartate (NMDA) infusions. For group designations, mice were matched based on reinforcements earned during training. Mice were then anesthetized with 1 : 1 2-methyl-2-butanol and tribromoethanol (Sigma Aldrich, St Louis, MO, USA) diluted 40-fold with saline, or with pentobarbital. The shaved head was placed in a stereotaxic frame (David Kopf Instruments, Tujunga, CA, USA). The scalp was incised, skin retracted, the head leveled based on bregma and lambda, and coordinates were located using the Kopf Instruments digital coordinate system with resolution of 1/100 mm. A single hole was drilled, and NMDA (20 μg/μL; Sigma) or sterile saline was infused over 1 min (0.1 μL/hemisphere) with needles aimed at +2.8 AP, −2.3 DV, ± 0.1 ML for mOFC lesions, or +2.0 AP, −2.5 DV, ± 0.1 ML for PL lesions (Paxinos & Franklin, 2003; Gourley et al., 2008, 2009). Needles remained in the brain for 2 additional min after infusion. Mice were then sutured and allowed at least 1 week for recovery before food restriction resumed. Before testing, mice were given two to three ‘reminder’ sessions identical to training. Here, sham and lesion mice did not differ in the number of reinforced responses performed.
Instrumental ‘reversal’ and extinction tests
In the instrumental reversal task, the location of the active nose poke aperture was ‘reversed’, such that the previously non-reinforced aperture on the opposite side of the chamber was reinforced, with no consequences for responding on the originally reinforced aperture. The schedule of reinforcement was VR2, as in training, with one 15-min session/day for 7 consecutive days; the ‘reversal’ occurred on the first day, and the subsequent test sessions served to generate acquisition curves for responding on the newly reinforced aperture and inhibition of the previously reinforced response (i.e. ‘perseverative’ responding). Reinforced and perseverative responses were analysed by two-factor (lesion × session) repeated-measure (RM) analysis of variance (anova) with Tukey’s post hoc comparisons. Extinction sessions were conducted in the same animals after mice had acquired the reversal; here, all reinforcement was withheld, regardless of animals’ responding, for five daily 15-min test sessions. Responses made on the ‘active’ aperture were analysed by two-factor (lesion × session) RM-anova.
Progressive ratio testing
A separate group of mice was tested on the classic progressive ratio schedule of reinforcement (Hodos, 1961), in which mice initially trained to perform an instrumental response for food reinforcement on a highly reinforcing schedule (VR2 here) are, at test, required to respond progressively more for each subsequent reinforcer. The ‘break point ratio’, referring to the highest number of responses an animal completes for a single food reinforcer, serves as the dependent variable. Mice were trained as described above, lesions were placed, and then mice were shifted to a progressive ratio schedule with a linearly increasing response–reinforcement requirement (1, 5, 9 and so on, each requirement being 4 responses higher than the last), with 1 session/day for 5 consecutive days. Sessions ended when animals executed no responses for 5 consecutive min or reached 2 h in the chambers. Break point ratios did not differ between the mOFC and PL sham groups, which were therefore combined. Break point ratios were analysed by two-factor (lesion × session) RM-anova with Tukey’s post hoc comparisons. In a separate analysis, break point ratios on Days 2–5 were normalized to each individual animal’s Day 1 value to further evaluate whether responding increased or decreased across multiple sessions. These values were arcsine-transformed to ensure normality (Ferguson, 1978), then analysed by one-factor RM-anova.
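The progressive ratio procedure above can be sketched in a few lines: the requirement grows by 4 with each reinforcer, the break point is the highest ratio completed before the session ends, and later sessions are expressed relative to Day 1. This is a minimal sketch under stated assumptions, not the authors' MedPC program: the function names are hypothetical, response times are assumed to be sorted and given in seconds from session start, and the 5-min pause is measured from the previous response (or from session start). The arcsine transform is omitted because the source does not specify its exact form.

```python
def pr_requirement(k, start=1, step=4):
    """Responses required for the k-th reinforcer (k = 1, 2, 3, ...)."""
    return start + step * (k - 1)


def break_point(response_times_s, pause_limit_s=5 * 60,
                session_limit_s=2 * 60 * 60):
    """Return the highest ratio completed before the session ends.

    The session ends after 5 min without a response or at 2 h,
    whichever comes first. Returns 0 if no ratio is completed.
    """
    completed = 0   # highest ratio completed so far
    k = 1           # index of the reinforcer currently being worked for
    count = 0       # responses made toward the current ratio
    last_t = 0.0    # time of the previous response (or session start)
    for t in response_times_s:
        if t > session_limit_s or t - last_t >= pause_limit_s:
            break   # session has already ended
        count += 1
        last_t = t
        if count == pr_requirement(k):
            completed = pr_requirement(k)
            k += 1
            count = 0
    return completed


def percent_of_day1(break_points):
    """Express Day 2 onward break points as a percentage of Day 1."""
    day1 = break_points[0]
    return [100.0 * bp / day1 for bp in break_points[1:]]
```

For example, fifteen evenly spaced responses complete the ratios 1, 5 and 9 (1 + 5 + 9 = 15), so the break point is 9; a value above 100% from `percent_of_day1` indicates escalation relative to Day 1 and a value below 100% indicates decline.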
In a subsequent experiment, mice were trained to nose poke as described, and then trained on the progressive ratio schedule for five sessions before surgery. Mice were matched based on break point ratios, and lesions were placed as described above. After recovery, five more progressive ratio test sessions were conducted. Sham groups did not differ and were combined, and break point ratios were analysed by two-factor (lesion × session) RM-anova.
Outcome devaluation
We also tested sensitivity to devaluation of the food outcome using a satiety-specific prefeeding procedure. Here, mice were allowed 30 min access to the reinforcer pellets in a clean cage before a 15-min test session conducted in extinction, as is standard practice. Responding was normalized to a non-devalued, i.e. ‘valued’, session conducted the following day – a 15-min test session also conducted in extinction, prior to which food pellets were not available. Again, sham groups did not differ and were combined, and groups were compared by anova. The mice used in this experiment were the same as those used in the first progressive ratio study described above.
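The normalization described above amounts to expressing responding after prefeeding as a percentage of responding in the non-devalued session. A minimal sketch follows; the function name is hypothetical and the raw response counts in the usage note are illustrative, not data from this study.

```python
def percent_of_valued(devalued_responses, valued_responses):
    """Responding after prefeeding as a percentage of the 'valued' session.

    Values below 100 indicate that devaluation reduced responding.
    """
    if valued_responses <= 0:
        raise ValueError("valued-session responding must be positive")
    return 100.0 * devalued_responses / valued_responses
```

With illustrative counts of 30 responses after prefeeding and 60 in the valued session, the function returns 50.0, i.e. responding fell to half of the non-devalued level.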
vHC, dHC and lOFC lesions
Mice with vHC, dHC and lOFC lesions were also generated and tested in the instrumental reversal and progressive ratio tasks for comparison to mice with medial PFC lesions. We used the same surgical methods as described above, with the following exceptions – for vHC lesions, four holes were drilled in the skull, and NMDA was infused at −3.0 AP, −4.0 DV, ± 2.75 ML and −3.4 AP, −4.0 DV, ± 3.0 ML in a volume of 0.1 μL/site over 1 min, with the needles left in place for 4 additional min. For dHC lesions, four holes were drilled, and NMDA was infused in a volume of 0.1 μL/site over 1 min, with needles remaining for 2 additional min at −1.3 AP, −2.0 DV, ± 1.0 ML and −2.1 AP, −2.2 DV, ± 1.5 ML (from Chowdhury et al., 2005). Note that these mice were used in a water maze experiment for an independent study before instrumental training and testing. For lOFC lesions, two holes were drilled, and NMDA was infused at +2.6 AP, ± 1.2 ML, −2.8 DV (from Bissonette et al., 2008) in a volume of 0.1 μL/site over 1 min, with the needles left in place for 4 additional min.
Histology
After behavioral testing, mice were deeply anesthetized with pentobarbital and transcardially perfused with chilled saline and 4% paraformaldehyde. Brains were stored for 48 h, then transferred to 30% w/v sucrose, and sliced into 40-micron-thick sections on a microtome (−15 ± 1 °C). Every third section was immunostained for NeuN (Millipore, Billerica, MA, USA; Rb; 1 : 500) and glial fibrillary acidic protein (GFAP; Dakocytomation, Carpinteria, CA, USA; Ms; 1 : 1000). AlexaFluor goat IgGs (Invitrogen, Carlsbad, CA, USA; 1 : 300) served as secondary antibodies. Slices were imaged, and lesions were graphically transposed onto corresponding mouse brain atlas images (Paxinos & Franklin, 2003).
Results
The medial PFC (specifically, the PL) is associated with action–outcome associative learning (Balleine & Dickinson, 1998; Killcross & Coutureau, 2003), while lOFC lesions retard stimulus–response learning in reversal tasks (e.g. Schoenbaum et al., 2002). Situated at the junction of the ventrolateral orbital and PL cortices in both primates and rodents, the mOFC may be expected to influence behavioral responding in tasks that require updating stimulus–response or action–outcome associations, but little is known about this structure in these contexts. Here, mice were initially trained to respond on a nose poke aperture in the northeastern corner of an operant conditioning chamber, then were required to shift responding to an aperture in the northwestern corner or vice versa for reinforcement. A VR schedule of reinforcement was used, with no discrete cues signaling reinforcement delivery. Thus, the task required mice to update action–outcome – as opposed to stimulus–response – associative relationships in order to obtain food pellets, and subsequent sensitivity to outcome devaluation confirmed mice responded based on action–outcome response contingencies (below).
In agreement with a classic report on the effects of mOFC lesions in monkeys (Iverson & Mishkin, 1970), mOFC lesions increased ‘perseverative’ responding – responding on the previously-reinforced aperture despite non-reinforcement (main effect of lesion F1,21 = 5.5, P = 0.03; lesion × session F6,126 = 2.1, P = 0.06; Fig. 1C) – while acquisition of the newly reinforced response was unaffected (main effect of lesion F < 1; lesion × session F6,126 = 1.7, P = 0.1; Fig. 1B). Mice with PL lesions also acquired the newly reinforced response (main effect and interaction F < 1; Fig. 1D) but, in this case, perseverative responding was unaffected (effect of lesion F1,7 = 1.2, P = 0.3; session × lesion F < 1; Fig. 1E), indicating distinct roles for these adjacent medial PFC structures.

Fig. 1. Distinct response patterns in an instrumental reversal task in mice with discrete medial prefrontal lesions. (A) Experimental timeline. Mice were first trained to perform an instrumental response for food. The break in the timeline corresponds to surgery and recovery. Next, mice were given two to three ‘reminder’ sessions, then the response requirement was reversed on the first of several sessions that constitute the response acquisition and suppression curves below. (B) Mice with medial orbitofrontal cortex (mOFC) lesions appropriately shifted instrumental responding to the newly reinforced aperture, (C) but were unable to coincidentally suppress ‘perseverative’ responding on the previously reinforced aperture. (D) Responding on the newly reinforced aperture during reversal was unaffected by prelimbic prefrontal cortex (PL) lesions. (E) Perseverative responding was also unchanged. (F) Composites of the largest and smallest lesions for all mOFC experiments are shown. (G) PL lesions are also represented; note that approximately one-quarter of these lesions included the infralimbic cortex, as here. All lesions were bilateral, and atlas images are reprinted with permission from Paxinos & Franklin (2003), with coordinates relative to bregma indicated. Symbols represent means (+SEM) per treatment group (*P < 0.05 compared with sham). NMDA, N-methyl-d-aspartate.
mOFC and PL lesions in this and all other experiments were largely separated on both dorso-ventral and rostro-caudal planes, with the typical mOFC lesion at the rostral-most tip of the frontal pole and larger mOFC lesions spreading laterally to include the ventral OFC (Fig. 1F). mOFC lesions were rostral enough to avoid the infralimbic cortex, although some GFAP staining in the rostral PL was noted. Approximately 50% of mOFC lesion mice had some degree of GFAP staining along the needle track in at least one hemisphere. PL lesions were caudal to mOFC lesions and encompassed the PL and anterior cingulate cortex, with spread to the infralimbic cortex in some mice (Fig. 1G). Two mOFC and six PL lesions were unilateral, and two ‘mOFC’ mice appeared to have lesions only in the infralimbic cortex; these animals were excluded.
To further characterize the role of the mOFC in behavioral inhibitory processes, we generated another group of lesion mice and conducted five test sessions in which mice were required to respond on a progressive ratio schedule of reinforcement for food. When break point ratios were analysed, an interaction between lesion and session was detected (F8,160 = 2.6, P = 0.01), and subsequent post hoc tests indicated mice with mOFC lesions escalated responding, achieving break point ratios that differed from sham mice at a trend level of significance during Session 2 (P = 0.057), and that were significantly higher than sham levels during subsequent test sessions (P ≤ 0.03; Fig. 2A).

Fig. 2. Medial prefrontal lesions regulate progressive ratio break point ratios. (A) When tested on a progressive ratio schedule of reinforcement, break point ratios achieved by mice with medial orbitofrontal cortex (mOFC) lesions were initially indistinguishable from sham levels on Session 1, but then escalated. By contrast, break point ratios in mice with prelimbic prefrontal cortex (PL) lesions appeared to decline. (B) To verify this impression, break point ratios on Days 2–5 were normalized to those achieved on Day 1, revealing consistently lower response levels in PL mice. (C) Representative GFAP staining for mice in this experiment is shown with mOFC lesion at top and PL at bottom. Symbols represent means (+SEM) per treatment group (*P < 0.05 compared with sham).
By contrast, mice with PL lesions differed from sham mice at a trend level of significance during the final session (P = 0.07), suggestive of a declining response pattern (Fig. 2A). To clarify this possibility, we calculated each animal’s break point ratio as a percentage of its Day 1 baseline. Both sham and mOFC lesion mice shifted responding upward to 135% and 159%, respectively, of Day 1 baseline. By contrast, mice with PL lesions shifted downward, achieving break point ratios that were, on average, 72% of baseline across several sessions (main effect of lesion F2,40 = 4.6, P = 0.02; post hoc P < 0.05; Fig. 2B). Representative GFAP staining in lesion mice from these experiments is shown (Fig. 2C); as indicated in Fig. 1, mOFC and PL lesions were distinguishable by rostro-caudal position within the mPFC.
To confirm that the effects of PL lesions were not simply attributable to insensitivity to the previously learned action–outcome association, we devalued the food outcome with 30-min prefeeding with the reinforcer pellets used in the task. All mice consumed equivalent amounts of food during this prefeeding period (relative to shams, P ≥ 0.2; not shown). Subsequently, sham, mOFC and PL mice showed the expected attenuation of instrumental responding with no difference between groups (F2,32 = 1.4, P = 0.3; Fig. 3A). Mice also extinguished responding at equivalent rates when reinforcement was withheld across several sessions (effect of lesion and lesion × session F < 1; Fig. 3B). These findings are in agreement with the argument that, under normal circumstances, the PL invigorates reinforced instrumental responding by maintaining sensitivity to the motivational value of the outcome (Corbit & Balleine, 2003), but lesions do not impact upon action–outcome associative relationships acquired prior to lesion placement (Ostlund & Balleine, 2005).

Fig. 3. Sensitivity to outcome devaluation, non-reinforcement and progressive ratio training prior to lesion placement. (A) Post-training medial PFC lesions did not affect sensitivity to satiety-specific outcome devaluation, as indicated by reduced responding in all groups after the devaluation procedure relative to responses made in the absence of devaluation (represented by the dashed line at 100%). (B) Similarly, when reinforcement was withheld in extinction training, all mice showed the expected decline in responding with no differences between groups. (C) A separate group of mice was trained to respond on a progressive ratio schedule of reinforcement prior to surgery, then tested again after lesions were placed. Under these conditions, neither medial PFC lesion affected responding. Bars here represent mean break point ratios achieved during five test sessions prior to lesion placement and five sessions after (+SEM). Otherwise, symbols and bars represent means (+SEM) for individual sessions. mOFC, medial orbitofrontal cortex; PL, prelimbic prefrontal cortex.
Implicit in this interpretation is the idea that if lesions were placed after the progressive ratio response requirements had been learned, responding would not be affected. To address this possibility, we trained another group of mice to acquire reinforcement, then further trained these animals to respond on a progressive ratio schedule of reinforcement with five test sessions before mOFC or PL lesion placement. We then tested the same animals on a progressive ratio schedule after recovery (five sessions). Before lesion placement, responding did not differ by group designation, as determined by break point ratio (main effect of group and group × session F < 1). After lesions were placed, break points remained unchanged (main effect of group and group × session F < 1; Fig. 3C). Our findings thus suggest that neither the mOFC nor PL is required for progressive ratio performance once response parameters have been learned. These data also provide the first evidence for mOFC involvement in the acquisition, and not expression, of an instrumental response schedule in rodents.
vHC lesions disinhibit instrumental responding
In addition to its well-established role in spatial learning and memory, the hippocampus regulates motivational sensitivity to food and drug reward. Moreover, stimulation of the ventral sector uniquely excites neurons within the mOFC (Ishikawa & Nakamura, 2003), suggesting this region may provide the mOFC with information regarding the motivational salience of an appetitive outcome and thereby contribute to its regulation of goal-directed behavior. In this case, lesions of the vHC might be expected to result in similar response patterns relative to lesions of the mOFC. Indeed, vHC lesion mice successfully shifted responding to a newly reinforced aperture in an instrumental reversal task (main effect of lesion F < 1; lesion × session interaction F6,90 = 1.6, P = 0.15; Fig. 4A), but showed an impairment in response inhibition on the previously reinforced aperture, specifically during the initial test sessions (lesion × session interaction F6,90 = 3, P = 0.01; Sessions 1–3 post hoc P ≤ 0.05; Fig. 4B). Moreover, as with mOFC lesions, vHC lesions increased progressive ratio break points (main effect of lesion F1,10 = 6.5, P = 0.03; Fig. 4C). Histological analyses indicated vHC lesions were largely limited to the ventral 50% of the caudal hippocampus, though some larger lesions spread dorsally, resulting in GFAP staining in the intermediate hippocampus. Lesions tended to be biased towards the rostral extent (e.g. bregma −2.7) of the vHC or the caudal extent (e.g. bregma −3.7; Fig. 4D), but this distinction did not appear to affect behavioral responding in our tasks.

Fig. 4. vHC lesions mimic medial orbitofrontal cortex (mOFC) lesions. (A) Because the vHC projects to the mOFC, we generated mice with vHC lesions to compare response patterns. Again, acquisition of responding on the previously non-reinforced aperture was unaffected, while (B) ‘perseverative’ responding in reversal was exaggerated. (C) vHC lesions also increased break point ratios. (D) Histological analyses verified GFAP staining predominantly in the vHC; gray represents the largest lesion and black the smallest. All lesions were bilateral, and coordinates relative to bregma are indicated. Symbols represent means (+SEM) per treatment group (*P < 0.05 compared with sham). NMDA, N-methyl-d-aspartate.
lOFC and dHC lesions have distinctive effects in an instrumental reversal task
A premise of this manuscript is that the mouse mOFC regulates action–outcome response flexibility in a manner that is unique relative to related prefrontal structures. In a final series of experiments, we dissociated the mOFC from lOFC by placing lesions in the lOFC and testing mice in the reversal and progressive ratio tasks. Mice with dHC lesions were also generated, as the dHC has no projections to the OFC (Cenquizca & Swanson, 2007), and recent studies highlight its functional and genetic dissociation from the vHC (Dong et al., 2009; cf. Fanselow & Dong, 2010), so lesions of this region might also be expected to produce distinctive effects in these two tasks. Saline-infused mice did not differ and were combined for representative purposes.
As predicted, lOFC and dHC lesions produced distinctive response patterns in the instrumental reversal task that were dissimilar to mOFC lesion response profiles. First, both lOFC and dHC lesions delayed the acquisition of the reversal (session × lesion interaction F12,132 = 3.9, P < 0.001), though in distinct ways – mice with lOFC lesions responded less than sham mice during Session 3 (P = 0.02), but not later (Fig. 5A). By contrast, mice with dHC lesions appeared to reverse during early sessions, but were unable to achieve optimal responding, as indicated by fewer responses during the final test sessions (Sessions 6–7, P < 0.006), perhaps because optimal responding depended on the spatial location of the aperture within the operant conditioning chamber (Mahut, 1971; Whishaw & Tomie, 1997). Somewhat surprisingly, lOFC lesions facilitated the extinction of responding on the previously reinforced aperture (session × lesion interaction F12,132 = 2.6, P = 0.004, Session 1, P < 0.001), thus the mOFC and lOFC compartments were dissociable on both instrumental reversal response and response suppression measures. dHC lesions had no effect on response inhibition (all P ≥ 0.09; Fig. 5B).

Fig. 5. Lateral orbitofrontal cortex (lOFC) and dorsal hippocampal (dHC) lesions produce distinct response patterns. (A) Mice with lesions of the lOFC or dHC were also tested. lOFC lesions delayed the acquisition of responding on the newly reinforced nose poke, resulting in significantly fewer responses during Session 3, but not later. dHC lesions also retarded reversal, resulting in less responding during the final test sessions (6–7). (B) Mice with dHC lesions appropriately extinguished responding on the non-reinforced aperture, while lOFC mice decreased responding during Session 1. (C) dHC lesions elevated break point ratios, while lOFC lesions had no significant effects. (D) GFAP staining confirmed lOFC lesions spared medial PFC structures; representative GFAP staining at ∼ bregma 2.8 is shown. (E) Lesion composites are also provided, with gray outlining the largest lesions and black the smallest. All lesions targeted the lOFC, most spread to the ventral OFC, and 28% of lOFC lesions also resulted in GFAP staining in the dorsolateral orbital cortex (‘DLO’ in Paxinos & Franklin, 2003), as indicated at top. (F) dHC lesions are also represented; note the vHC is largely spared. All lesions were bilateral, and coordinates relative to bregma are indicated. Symbols represent means (+SEM) per treatment group (*P < 0.05, **P < 0.001 compared with sham).
Unlike mOFC lesions, lOFC lesions had no significant effects on break point ratios when mice were required to respond on a progressive ratio schedule of reinforcement, and mice with dHC lesions achieved higher ratios (main effect of lesion F2,30 = 8.5, P < 0.001; post hoc P = 0.004; Fig. 5C), as has been previously reported in rats (Schmelzeis & Mittleman, 1996). Histological analyses indicated lOFC infusions resulted in prominent GFAP staining in the lOFC and lateral ventral OFC that spared medial prefrontal structures in all mice (Fig. 5D and E). Twenty-eight percent of lOFC infusions resulted in particularly large lesions that spread laterally to affect the dorsolateral orbital cortex (‘DLO’ in Paxinos & Franklin, 2003). dHC lesions were restricted to the rostral dHC, and most encompassed all major subregions (Fig. 5F). In several mice, NMDA spread ventrally such that GFAP staining was detected in the intermediate hippocampus, but the vHC was spared. In fact, a subset of animals in both the dHC and vHC groups had prominent GFAP staining within the intermediate hippocampus. Thus, disparate behavioral response patterns in these groups are presumed to be due to cell death within the non-overlapping dorsal and ventral regions, respectively. Two dHC lesions were unexpectedly non-detectable, and two were unilateral; these animals were excluded.
Discussion
It was recently argued that there are no good models of prefrontal function in mice (Bissonette et al., 2008); indeed, few behavioral tasks thought to rely in whole or in part on the PFC – based on lesion studies in rats and non-human primates – have been validated by lesion studies in mice. This is unlike canonically hippocampus-dependent tasks, such as trace conditioning (e.g. Chowdhury et al., 2005) and the Morris water maze (e.g. Pittenger et al., 2002). Moreover, the vast majority of rodent anatomical studies of the PFC use rats. These practices are paradoxically apposed to a growing reliance on transgenic and knockout mice to model psychiatric diseases commonly characterized by disordered goal-directed action and, generally, deficits thought to derive from abnormalities in medial prefrontal cytoarchitecture, biochemistry and/or network activity. The goals of this study were thus twofold: (1) to develop protocols to place anatomically discrete lesions along the medial wall of the mouse PFC and (2) to compare the effects of PL and ventromedial PFC – i.e. mOFC – lesions on behavioral flexibility based on action–outcome (also termed ‘response–outcome’), as opposed to stimulus–response, associations. Findings are summarized in Table 1.
Lesion site | Acquisition of a new response in reversal | ‘Perseverative’ responding in reversal | Progressive ratio break points |
---|---|---|---|
PL | – | – | Decline |
mOFC | – | ↑ | ↑ |
vHC | – | ↑ | ↑ |
lOFC | Delayed | ↓ | – |
dHC | ↓ | – | ↑ |
Table 1. Response patterns relative to saline-infused mice are organized by lesion site (in rows) and response type (in columns) in an instrumental reversal task and on a progressive ratio response schedule. Directionality of change is indicated where appropriate. dHC, dorsal hippocampal; lOFC, lateral orbitofrontal cortex; mOFC, medial orbitofrontal cortex; PL, prelimbic prefrontal cortex; vHC, ventral hippocampal.
Effects of mOFC lesions
Most notably, our findings suggest the rodent mOFC facilitates goal-directed response inhibition under circumstances that require the adoption of novel response strategies, with the caveat that lesion effects were detected only in the presence of appetitive reinforcement, i.e. lesions did not affect responding during non-reinforced (extinction) test sessions. Comparable roles for the primate mOFC were recently proposed (Kringelbach, 2005; Roberts, 2006; Tanaka et al., 2008), but evidence for mOFC involvement in inhibitory control in rodents, in general or under specific conditions, is, to date, indirect (Cetin et al., 2004), despite the identification of a medial compartment in both the rat and mouse OFC (Uylings et al., 2003; Van de Werd et al., 2010). It is notable, however, that in previous studies, rats with large medial PFC lesions that included the mOFC showed increased perseverative responding in stimulus–response reversal tasks (Aggleton et al., 1995; Chudasama & Robbins, 2003), while more selective lesions of the cingulate and infralimbic cortices or PL spared (de Bruin et al., 1994; Aggleton et al., 1995; Joel et al., 1997; Ragozzino et al., 1999; Dias & Aggleton, 2000; Boulougouris et al., 2007) or partially spared (Sutherland et al., 1988; Chudasama & Robbins, 2003) responding in a variety of intramodal shifting tasks, as with PL lesions here.
Human neuroimaging reports implicate the mOFC in encoding the value of available actions relative to available reinforcers (Arana et al., 2003; Erk et al., 2002; O’Doherty et al., 2003; Paulus & Frank, 2003; Elliott et al., 2008). A report by Plassmann et al. (2007) showed selective activity in the mOFC during willingness-to-pay calculations, a finding that may be particularly germane to this study, as we argue that, without the mOFC, modestly food-restricted mice were unable to calculate the appropriate ‘pay’ – effort expenditure – relative to the outcome value when responding for food on a progressive ratio schedule of reinforcement. Specifically, mice must choose between performing an action and withholding responding to end the session and non-contingently receive chow upon returning to the home cage. Control mice establish a low, steady pattern of responding, while mOFC mice withhold responding only at higher break points. Mice with progressive ratio schedule experience prior to lesion placement responded appropriately, indicating the effect is selective to acquisition of the progressive ratio response requirements.
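To make the ‘pay’ calculation concrete, the following is a minimal sketch of how a break point could be derived from the number of responses emitted in a session. It assumes a hypothetical arithmetic schedule in which the response requirement starts at `start` and increases by `step` after each reinforcer; the function name and both parameter values are illustrative assumptions and are not the schedule parameters used in this study.

```python
from itertools import count


def break_point(total_responses, start=1, step=4):
    """Return the last ratio requirement fully completed in a session.

    Assumption (not from the study): the requirement begins at `start`
    and grows by `step` after each reinforcer is earned.
    """
    remaining = total_responses
    last_completed = 0  # 0 means no ratio was completed
    for requirement in count(start, step):
        if remaining < requirement:
            # Not enough responses left to finish this ratio.
            return last_completed
        remaining -= requirement
        last_completed = requirement
```

Under these assumed parameters, an animal emitting 15 responses completes the ratios 1, 5 and 9 and so reaches a break point of 9, whereas one emitting 28 responses reaches a break point of 13. Escalated responding of the kind described for mOFC mice would appear as a larger response total, and therefore a higher break point, before responding ceases.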
In contrast to mOFC lesions, PL lesions reduced break point ratios (see also Gourley et al., 2008) and, in monkeys, ventral PL neurons are more active during ‘self-initiated’ response trials – in which animals respond for water reinforcement in the absence of discrete cues – than in cued trials (Bouret & Richmond, 2010). These patterns support the argument put forth in a previous report that the PL serves to motivate instrumental responding when reinforcement is uncertain (Corbit & Balleine, 2003).
Post-training medial PFC lesions preserve sensitivity to action–outcome relationships
Instrumental sensitivity to satiety-specific outcome devaluation was intact in mice with medial prefrontal lesions placed after instrumental training. This finding in PL mice is consistent with a previous report in rats (Ostlund & Balleine, 2005), but whether the mOFC lesion profile is also consistent with previous work is less obvious. Monkeys with broad OFC lesions including the mOFC in an early study were insensitive to prefeeding devaluation, but it is unclear whether the animals were responding for the food outcome or the discrete cues that accompanied reinforcement (Butter et al., 1963). More recently, monkeys with mOFC-inclusive lesions were unable to suppress responding for an object associated with a devalued food, but when asked to perform a response for the food itself, instrumental responding diminished (Baxter et al., 2000; Izquierdo & Murray, 2004; Izquierdo et al., 2004), as here and in a previous study in rats with large OFC lesions (Ostlund & Balleine, 2007). Our results with discrete mOFC lesions thus suggest this region is important for adopting new behavioral strategies based on action–outcome associative relationships when reinforcement requirements change, such as in a spatial reversal, in shifting to a new response schedule or in detour reaching tasks (in monkeys: Wallis et al., 2001), but not in maintaining a representation of the action–outcome associative relationship itself.
Interactions between the hippocampus and mOFC
Large lesions of the hippocampus have historically resulted in hyper-sensitivity to food and drug reward and a general increase in appetitive behavior in rats (Jarrard, 1964; Kimble & Kimble, 1965; Whishaw & Mittleman, 1991; Wilkinson et al., 1993; Schmelzeis & Mittleman, 1996; Mittleman et al., 1998; Kelley & Mittleman, 1999), consistent with the conclusion that the hippocampus gates reward sensitivity and with elevated break point ratios in mice with either dHC or vHC lesions here. In other contexts, the hippocampus can be functionally and anatomically dissociated along the dorso-ventral axis – the dorsal sector is classically associated with spatial learning and memory, and the ventral with the emotional and motivational salience of outcomes (Fanselow & Dong, 2010). vHC lesions result, for example, in reduced hyponeophagia – i.e. increased willingness to seek food despite novel environmental stimuli (Bannerman et al., 2002, 2003). The vHC also sends direct ipsilateral projections to the mOFC, and stimulation of these sites results in the excitation of single units within the mOFC (Ishikawa & Nakamura, 2003). Excitation of these projections may facilitate mOFC-mediated response inhibition; consistent with this hypothesis, vHC lesions mimicked the effects of mOFC lesions, though it is unclear why mOFC lesions resulted in a delay in escalated responding on the progressive ratio schedule of reinforcement, while vHC lesions did not.
The mOFC and vHC may alternatively regulate the acquisition of novel response contingencies via projections that converge onto single neurons within the nucleus accumbens core (French & Totterdell, 2002). This model is consistent with reports that the acquisition of VR response schedules requires NMDA receptors and downstream protein kinase activity within the core subregion (Baldwin et al., 2000, 2002), and potentially with evidence that the vHC regulates the balance between tonic and phasic activation of dopamine neurons within the nucleus accumbens (Floresco et al., 2001), which would be expected to impact upon an animal’s ability to detect and acquire a novel response contingency.
Effects of dHC and lOFC lesions
As anticipated based on connectivity patterns and previous studies, neither dHC nor lOFC lesions produced behavioral profiles that were similar to mOFC lesions. For example, both dHC and lOFC mice showed deficits acquiring the ‘reversed’ response contingency – mice with dHC lesions were unable to fully acquire the new response, presumably due to the spatial component of the response requirement (Whishaw & Tomie, 1997), while mice with lOFC lesions showed acquisition delays.
Recent evidence indicates the rat lOFC encodes reward prediction error and uses this information to guide future choice behavior (Sul et al., 2010). As anticipated by models of reward prediction error (Rescorla & Wagner, 1972), the rat lOFC encodes prediction errors that are both positive, indicating reinforcement for a given action is better than anticipated, and negative, indicating reinforcement is worse than expected. Moreover, recent adaptations of prediction error theory can account for learning from ‘missing’ reward in action–outcome associative settings (Redish et al., 2007). Thus, the delay in response reversal in lOFC mice here may reflect inactivation of a region that enables the acquisition of novel choices based on whether previous choices resulted in reward or no reward. This model cannot, however, obviously account for facilitated extinction of non-reinforced responding in lOFC mice. This effect has also been previously reported in rats (Grakalic et al., 2010), and suggests that, under some circumstances, the lOFC retards the extinction of goal-directed activities – perhaps by maintaining sensitivity to stimulus–response associations that promote habitual instrumental responding – but further studies are necessary.
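The sign convention for prediction errors can be illustrated with a minimal sketch of a Rescorla–Wagner-style update for a single action. This is a simplified, single-predictor form in which the salience and learning-rate terms of the original model are folded into one assumed parameter, `alpha`; the function name, parameter values and reward sequence are illustrative and are not fitted to any data in this study.

```python
def rescorla_wagner(rewards, alpha=0.1, v0=0.0):
    """Track expected value and prediction error across trials.

    rewards: sequence of obtained outcomes (e.g. 1 = reward, 0 = no reward).
    alpha:   assumed learning rate (illustrative value).
    v0:      assumed initial expected value.
    Returns (values, errors), one entry per trial.
    """
    value = v0
    values, errors = [], []
    for reward in rewards:
        error = reward - value   # positive: better than expected; negative: worse
        value += alpha * error   # move the expectation toward the outcome
        values.append(value)
        errors.append(error)
    return values, errors
```

For example, `rescorla_wagner([1, 1, 0], alpha=0.5)` yields prediction errors of 1.0, 0.5 and -0.75: the two rewarded trials produce positive errors that shrink as the expectation grows, and the omitted reward on the third trial produces a negative error, the kind of ‘missing’ reward signal discussed above.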
Conclusions
The rodent medial PFC contains multiple cytoarchitectonically distinct subregions that can be differentiated based on efferent and afferent projection patterns, with dorsal regions – including the dorsal PL – sharing similar functions that differ from those of the ventral medial PFC, which includes the mOFC (Heidbreder & Groenewegen, 2003; Vertes, 2004; Schilman et al., 2008). We show that selective mOFC lesions produce distinctive behavioral effects relative to more dorsally and caudally situated PL lesions in mice performing food-reinforced instrumental tasks. Specifically, mOFC lesions increased perseverative responding in an ‘instrumental’ reversal task, as well as responding for food reinforcement on a progressive ratio schedule of reinforcement, resulting in effort expenditure that outstripped the value of the reinforcer. Mice trained to respond on a progressive ratio schedule prior to lesion placement were unaffected; we thus propose a role for the rodent mOFC in facilitating goal-directed response inhibition specifically in the presence of appetitive reinforcement and under circumstances that require the acquisition of novel response strategies. Such a model has relevance to psychiatric illnesses commonly characterized by disordered goal-directed action.
Acknowledgements
The authors thank Dr Philip Corlett for valuable feedback. This work was supported by the National Institutes of Health (DA011717, MH025642 and MH066172 to J.R.T.; MH079680 to S.L.G.), the Connecticut Department of Mental Health and Addiction Services (J.R.T., C.P.), and the Interdisciplinary Research Consortium on Stress, Self-control and Addiction (UL1-DE19586 and the NIH Roadmap for Medical Research/Common Fund, AA017537).
Abbreviations
- dHC: dorsal hippocampal
- GFAP: glial fibrillary acidic protein
- lOFC: lateral orbitofrontal cortex
- mOFC: medial orbitofrontal cortex
- NMDA: N-methyl-d-aspartate
- PFC: prefrontal cortex
- PL: prelimbic prefrontal cortex
- RM: repeated-measures
- vHC: ventral hippocampal
- VR: variable ratio