Volume 18, Issue 9 pp. 2627-2637

Neural architecture of choice behaviour in a concurrent interval schedule

Tobias Kalenscher

Institute of Cognitive Neuroscience, Ruhr-University Bochum, GAFO 05-623, 44780 Bochum, Germany

Bettina Diekamp

Institute of Cognitive Neuroscience, Ruhr-University Bochum, GAFO 05-623, 44780 Bochum, Germany

Onur Güntürkün

Institute of Cognitive Neuroscience, Ruhr-University Bochum, GAFO 05-623, 44780 Bochum, Germany

First published: 10 November 2003
Correspondence: Dr T. Kalenscher, as above.

Abstract

Concurrent interval schedules are classic experimental paradigms that are traditionally employed in psychological research on choice behaviour. To analyse the neural basis of choice in a concurrent fixed interval schedule, pigeons were trained to peck on two response keys. Responses were differentially rewarded in key specific short or long time intervals (SI vs. LI). Using tetrodotoxin, we reversibly blocked the neostriatum caudolaterale (NCL, the avian functional equivalent of the prefrontal cortex), avian caudate-putamen and nucleus accumbens to examine their contribution. A detailed analysis of baseline choice behaviour revealed that response distribution and key affinity were determined by cued or time-related expectancy for rewards on the SI key. The pigeons' response frequency increased on the SI key and decreased on the LI key with increasing temporal proximity to the SI reward and pigeons switched to the LI key after reward delivery. Pecking bursts on the LI key were negatively correlated with bursts on the SI key. Neostriatum caudolaterale inactivation did not affect pecking activity per se but interfered with reward-related temporal modulation of pecking frequency, switching pattern and coupling of LI to SI pecks. Blockade of caudate-putamen resulted in a complete behavioural halt, while inactivation of nucleus accumbens diminished operant behaviour without affecting consummatory responses. These data suggest that the NCL is tuned via indirect striato-pallial projections to integrate cued or time-related reward expectancy into a response selection process in order to set, maintain or shift goals. The NCL possibly feeds forward the resulting motor commands to the caudate-putamen for execution.

Introduction

Decision making is the process of selecting one response option among two or more response alternatives (Schall, 2001). It requires the neural representation of rewards, choice consequences, their relative motivational values and the conversion of choice information into categorical motor commands. Neuropsychological (Bechara et al., 2000) and imaging (Rogers et al., 1999) studies with humans indicate that the prefrontal cortex (PFC), in particular the orbitofrontal cortex, is a key structure for prospective decision making. Single units in the orbital PFC of monkeys encode relative reward preference (Tremblay & Schultz, 1999) and there is an indication that the primate PFC engages in response selection and conversion of evidence into categorical motor commands (Kim & Shadlen, 1999). The PFC communicates with a large range of structures that also contribute to decision making, such as amygdala (Baxter et al., 2000), parietal cortex (Platt & Glimcher, 1999), cingulate cortex (Shima & Tanji, 1998) and, most prominently, the mesolimbic and mesocortical dopamine systems including ventral striatum and ventral tegmental area (Schultz et al., 1997; Schultz, 1997, 2002; Berridge & Robinson, 1998; Waelti et al., 2001).

On the behavioural side, a great wealth of data on decision making has been collected in the past 50 years. In psychological research, choice behaviour has traditionally been approached using concurrent interval schedules, typically with pigeons as laboratory animals. In a concurrent fixed interval (CFI) schedule, the pigeons' key pecks are rewarded with distinct, key-specific reward frequencies. The pigeons' reward frequency-dependent response distribution is described by the so-called matching law: the relative number of pecks on one key matches the relative number of rewards on that key (Herrnstein, 1961, 1970; Shimp, 1969, 1971; White & Davison, 1973).
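The matching law described above can be written as B_a / (B_a + B_b) = R_a / (R_a + R_b), where B is the number of pecks and R the number of rewards on each key. The following is a minimal sketch of that relation; the function name and the example numbers are illustrative assumptions and are not taken from the study.

```python
def matching_prediction(rewards_a: float, rewards_b: float) -> float:
    """Predicted share of pecks on key A under strict matching.

    Strict matching: B_a / (B_a + B_b) = R_a / (R_a + R_b).
    """
    total = rewards_a + rewards_b
    if total <= 0:
        raise ValueError("at least one reward is required")
    return rewards_a / total


# Illustrative numbers only (not data from the study):
# 120 rewards on key A and 40 on key B predict 75% of pecks on key A.
share_a = matching_prediction(120, 40)
print(f"predicted share of pecks on key A: {share_a:.2f}")
```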

Such classic psychological schedules constitute experimental approaches to choice behaviour that are different to the paradigms used in current neuroscientific research on decision making. First, birds, and not mammals, are usually used as laboratory animals. Second, the different nature and response requirements of the tasks call into question whether similar neural mechanisms account for choice behaviour.

In this study, we therefore aim to identify the neural architecture required for guiding choice behaviour in CFI schedules. In particular, we evaluate how the avian ‘PFC’, its mesolimbic dopamine system and executive motor structures contribute to the distribution of responses.

The neostriatum caudolaterale (NCL) is a pallial multimodal association structure in the avian brain. It is neither homologous nor similar to the mammalian striatum but is considered functionally equivalent to the mammalian PFC because of cyto- and chemoarchitectonic, connectional, behavioural and electrophysiological similarities (Mogensen & Divac, 1982, 1993; Waldmann & Güntürkün, 1993; Durstewitz et al., 1999; Kröner & Güntürkün, 1999; Diekamp et al., 2002a,b). In addition, the avian brain contains structures that are widely accepted as being homologous to the mammalian nucleus accumbens (NAc) and somatomotor caudate/putamen (CPu) of the basal ganglia (Reiner et al., 1984; Durstewitz et al., 1999; Mezey & Csillag, 2002). To determine the contribution to choice behaviour, we reversibly inactivated the NCL, NAc and CPu by local injections of tetrodotoxin (TTX) and performed a detailed analysis of the impact on the pigeons' response distribution in a CFI schedule.

Materials and methods

Subjects and surgery

All subjects were kept and treated according to the German guidelines for the care and use of animals in neuroscience, and all procedures were approved by the committee of the State of Nordrhein Westfalen, Germany. Eleven pigeons (Columba livia) were used in this experiment. During training, all animals were kept on a food deprivation schedule at approximately 90% of their free-feeding body weight.

Pigeons were prepared for injections of TTX by chronically implanting four guide cannulae (gauge 22) into the NCL (11 mm length) and two cannulae into the CPu or NAc (15 mm length). Cannulae were implanted under anaesthesia with ketamine (approx. 4 mg/100 g i.m.; Ketavet; Pharmacia & Upjohn, Germany) and xylazine (0.8 mg/100 g i.m.; Rompun; Bayer, Germany). Guide cannulae for the NCL were positioned within the borders of the NCL as defined by Kröner & Güntürkün (1999) and were fixed to the skull with dental acrylic. For the NCL, two cannulae per hemisphere were vertically inserted to reach the following coordinates: A4.5–6.5; L5.0 and 7.5 and D1.0 [all dorsal–ventral coordinates relative to brain surface and according to the pigeon brain atlas by Karten & Hodos (1967)].

According to Reiner et al. (1984) and Durstewitz et al. (1999), the avian homologue of the somatomotor parts of the CPu stretches from the caudolateral lobus parolfactorius (LPO) into the medial palaeostriatum augmentatum. One cannula per hemisphere was inserted to reach the CPu at the following coordinates: A10.5; L3.0 and D5.5. Substantial evidence indicates that the avian homologue of the NAc is located in the rostro-ventro-medial part of the LPO (Wild, 1987; Durstewitz et al., 1999; Mezey & Csillag, 2002), although it is important to mention that at least one study suggests a functional similarity between more caudal areas of the avian LPO and the core of the rodent NAc (Izawa et al., 2003; Cardinal et al., 2001). One cannula per hemisphere was inserted to reach the rostro-ventro-medial part of the LPO at the following coordinates: A11.5; L1.5 and D5.5. To minimize confusion of avian and mammalian nomenclature, we will refer to the LPO/medial palaeostriatum augmentatum as the CPu and to the rostro-ventro-medial LPO as the NAc throughout this article unless specified otherwise. However, as the NCL is similar, but presumably not homologous to the mammalian PFC, we will use the avian terminology when referring to this structure.

The lower tips of all guide cannulae were positioned 1 mm above the actual dorsal–ventral target coordinates, as injection cannulae were 1 mm longer than the guide cannulae. During TTX injections, injection cannulae (gauge 28) of 12 mm (NCL) or 16 mm length (CPu and NAc) were inserted into the implanted guide cannulae. After surgery, animals were allowed to recover for at least 7 days with access to food ad libitum and were then put back on the deprivation schedule.

Apparatus

Pigeons were trained and tested in a cubic aluminium box (35 × 35 × 35 cm) that was equipped with two round pecking keys, two feeders and one houselight. All items were symmetrically arranged on the front wall of the box. Pecking keys had a diameter of 2.5 cm and were positioned 21 cm above the floor and 20 cm apart from each other. Feeders were located 13 cm below each pecking key. The white houselight was in the centre of the front wall, 28 cm above the floor. The houselight and the green illuminated pecking keys were switched on during the entire training and testing sessions. The houselight, feeders and keys were controlled by a computer via a 16-channel I/O card.

Pretraining

The pigeons first received an autoshaping procedure to acquire the association between responding to a pecking key and subsequent food reward. They were then trained with a 25-s fixed interval schedule on only one of the two keys (see Procedure section for details about fixed interval schedules). After three to four training sessions, the key was turned off, the second key was switched on and pigeons were trained on an 83-s fixed interval schedule. After three to four sessions, pigeons were again trained in the 25-s fixed interval schedule on the first key for one session. Subsequently, both keys were concurrently activated and pigeons were trained in the CFI schedule.

Procedure

In the CFI schedule of this experiment, one key was associated with the fixed short time interval of 25 s (SI key) and the other long interval key (LI key) was associated with an 83-s interval. The assignment of short interval and long interval to the left and right key was balanced. Six animals were trained with the SI key as the left key and the LI key as the right key (SI–LI) and five animals were trained in the LI–SI condition.

Both intervals were initialized at the beginning of a session. Pecks on a key had no effect during the lapse of the interval. Only the first peck after the end of the interval was rewarded by access to food for 3 s. After the 3-s access to food, the interval of the rewarded key was reinitialized. Reward delivery, lapse and reinitialization of the interval on one key did not affect the interval runtime on the other key.
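The schedule rules above can be sketched as a small state machine per key. This is a hedged illustration, not the control software used in the study: the class name `FixedIntervalKey` and its fields are hypothetical, and only the 25-s and 83-s intervals and the 3-s food access come from the text.

```python
from dataclasses import dataclass, field

REWARD_S = 3.0  # duration of food access, from the text


@dataclass
class FixedIntervalKey:
    """One key of the concurrent fixed interval schedule (hypothetical model)."""

    interval_s: float                      # 25 s (SI key) or 83 s (LI key)
    armed_at: float = field(init=False)    # time from which a peck is rewarded

    def __post_init__(self) -> None:
        # Both intervals start running at session onset (t = 0 s).
        self.armed_at = self.interval_s

    def peck(self, t: float) -> bool:
        """Return True if a peck at time t (in seconds) is rewarded."""
        if t < self.armed_at:
            return False  # pecks during the running interval have no effect
        # The interval is reinitialized after the 3-s access to food.
        self.armed_at = t + REWARD_S + self.interval_s
        return True


# Each key keeps its own state, so events on one key never touch the other.
si_key = FixedIntervalKey(25.0)
li_key = FixedIntervalKey(83.0)
print(si_key.peck(10.0), si_key.peck(25.0), li_key.peck(25.0))
```

Keeping one independent object per key mirrors the statement that reward delivery and reinitialization on one key do not affect the interval runtime on the other key.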

Each training and testing session lasted 60 min. Pigeons were trained until they received 85% of the maximally available rewards on each key in at least three consecutive sessions. Typically, about eight to ten training sessions per animal were required to reach a stable criterion. After the criterion was reached, pigeons were subjected to surgery and, after recovery, tested to check for potential surgery-induced behavioural changes.

Testing and injections

All 11 pigeons were subjected to two NCL treatment blocks comprising bilateral NCL injections. In addition, five of these animals (three SI–LI and two LI–SI animals) received one bilateral NAc treatment each and one unilateral right and one unilateral left NAc treatment. Five of the remaining animals (three SI–LI and two LI–SI animals) received one bilateral CPu treatment and one unilateral right and one unilateral left CPu treatment. The remaining animal from the LI–SI group had to be removed from CPu–NAc analysis because of misplaced and/or damaged cannulae. The sequence of bilateral NCL and bilateral and unilateral CPu–NAc injections was randomised.

Each treatment block comprised four testing sessions: one session in a baseline condition (no treatment), one TTX condition (TTX injection), one vehicle condition (saline injection) and one post-TTX condition (no injection). The post-TTX condition served to determine whether TTX or saline injections had any long-term deteriorating effects on the animal's behaviour.

In TTX or saline sessions, pigeons received infusions of 1 µL TTX (Fugu sp.; Calbiochem, Germany; 10 ng/1 µL NaCl) or 1 µL saline (0.9% w/v) per cannula (total volume: NCL injections, 4 µL; bilateral CPu–NAc injections, 2 µL and unilateral CPu–NAc injections, 1 µL). For TTX or saline infusions, the injection cannulae were inserted into the guide cannulae. We used a microinfusion pump equipped with two 1-µL Hamilton syringes to deliver the volume at a flow rate of 0.2 µL/min. After each injection, the injection cannulae remained in place for another 5 min to allow for diffusion of the injected volume. To infuse through all four NCL cannulae, we performed this procedure twice. Testing sessions always began exactly 60 min after termination of injection. After TTX sessions, the pigeons were not trained or tested again for at least 48 h to allow the TTX effects to reverse fully. Time between the baseline, saline and post-TTX sessions was usually 1 day.

Extensive previous experience shows that TTX injection is a suitable method to study neural inactivation without major undesired effects (Ambrogi Lorenzini et al., 1997). The TTX dilution, injection volumes and optimal injection-testing period were adopted from Ambrogi Lorenzini et al. (1997) and Zhuravin & Bures (1991) and tested for appropriateness in previous, unpublished experiments in our laboratory. These preliminary experiments confirmed that the TTX dose employed in the present study equalled the minimum dose required to produce measurable effects. Tetrodotoxin blocks Na+ channels of the cell membrane and is widely used to temporarily and reversibly inactivate discrete brain regions (see Ambrogi Lorenzini et al., 1997). It diffuses slowly and is metabolically degraded at a high rate. Tetrodotoxin in the given volume and dosage only produces considerable effects when injected not more than 1 mm distant from the target site (Zhuravin & Bures, 1991). Accordingly, the concentration of the TTX dilution used in the present study decays to approximately 10% of the original concentration at 1 mm distance from the injection site and to 0% at 2 mm distance. It takes maximum effect 30–120 min following injection, with later onset at greater distance from the injection area (Zhuravin & Bures, 1991). Therefore, TTX effects are probably restricted to a radius of approx. 1 mm around the injection site. Hence, the main action of TTX in the present study was almost certainly confined to the target areas NCL, CPu and NAc.
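The infusion figures above can be checked with a short worked calculation. The sketch below is an illustration under stated assumptions: the flow rate is taken to apply to each syringe, the two syringes are assumed to run in parallel, and infusion rounds are assumed to follow each other without a gap. The function name `infusion_plan` is hypothetical.

```python
TTX_NG_PER_UL = 10.0       # concentration, from the text
UL_PER_CANNULA = 1.0       # infused volume per cannula, from the text
FLOW_UL_PER_MIN = 0.2      # pump flow rate, from the text
DIFFUSION_WAIT_MIN = 5.0   # cannulae left in place after each infusion
SYRINGES = 2               # assumption: two cannulae infused in parallel


def infusion_plan(n_cannulae: int) -> dict:
    """Total volume, TTX mass and minimum handling time for one treatment."""
    total_ul = n_cannulae * UL_PER_CANNULA
    rounds = -(-n_cannulae // SYRINGES)  # ceiling division
    minutes_per_round = UL_PER_CANNULA / FLOW_UL_PER_MIN + DIFFUSION_WAIT_MIN
    return {
        "total_ul": total_ul,
        "total_ng": total_ul * TTX_NG_PER_UL,
        "rounds": rounds,
        "minutes": rounds * minutes_per_round,
    }


# Four NCL cannulae, two bilateral CPu-NAc cannulae, one unilateral cannula.
for label, n in (("NCL", 4), ("bilateral CPu-NAc", 2), ("unilateral CPu-NAc", 1)):
    print(label, infusion_plan(n))
```

Under these assumptions the four NCL cannulae require two rounds, which agrees with the statement that the procedure was performed twice.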

Histology

To reconstruct the locations of the guide cannulae, the pigeons were deeply anaesthetized with Equithesin (4.5 mL/kg body mass i.m.) and transcardially perfused with 0.9% saline (40 °C) and a 4% (w/v) paraformaldehyde solution (4 °C). The brains were removed, postfixed and cut into 40-µm frontal sections on a freezing microtome. Every third slice was kept and stained with cresyl violet. The lowest point of the lesion left by the cannulae was considered the injection site.

Measurements

Basic parameters

We recorded the total pecking rate and total number of rewards.

Time histograms

One of the aims of this study was to perform a detailed analysis of the response distribution in order to identify any putative underlying choice pattern. All raw pecking data were sampled at 10 Hz. For each session, we calculated a time histogram for pecks on each key by adding up the number of pecks within 500-ms bins.
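The binning step can be sketched as follows. This is a minimal illustration that assumes pecks are stored as integer sample indices at 10 Hz, so that one 500-ms bin spans five samples; the storage format and the function name `time_histogram` are assumptions, not details given in the text.

```python
SAMPLE_HZ = 10    # sampling rate, from the text
BIN_MS = 500      # bin width, from the text
SAMPLES_PER_BIN = SAMPLE_HZ * BIN_MS // 1000  # 5 samples per bin


def time_histogram(peck_samples, n_samples):
    """Count pecks per 500-ms bin.

    peck_samples: sample indices (at 10 Hz) at which a peck was registered
                  on one key (assumed storage format).
    n_samples:    total number of samples in the session.
    """
    n_bins = -(-n_samples // SAMPLES_PER_BIN)  # ceiling division
    counts = [0] * n_bins
    for s in peck_samples:
        if 0 <= s < n_samples:
            counts[s // SAMPLES_PER_BIN] += 1
    return counts


# A 60-min session holds 36 000 samples and therefore 7200 bins.
print(len(time_histogram([], 60 * 60 * SAMPLE_HZ)))
```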

Perireward time histogram

To quantify and analyse temporal changes in the relative occurrence of pecks with regard to reward delivery, we additionally computed perireward time histograms (PRTHs) for rewards on the SI and LI key. Computation of the PRTHs was done as follows. Taking the time histogram, we centred a time window of 50 s around every reward on the SI key (or 166 s around rewards on the LI key), covering pecking activity on both keys 25 s before onset and 25 s after onset of SI reward delivery (or 83 s before and 83 s after reward onset on the LI key). Data within the windows were cumulated across all rewards on the SI or LI key, respectively. The resulting PRTH was then normalized to the total number of pecks (cumulated pecks within one bin divided by the total number of pecks × 100). In the following, the PRTH around rewards on the SI key will be referred to as the SI PRTH and that on the LI key as the LI PRTH.
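The PRTH computation described above can be sketched in a few lines. Two points are assumptions rather than details from the text: windows that would extend beyond the session are skipped, and the normalization uses the total of the cumulated pecks in the window, so that the resulting PRTH sums to 100. The function name is hypothetical.

```python
def perireward_histogram(hist, reward_bins, half_window_bins):
    """Cumulate a time histogram around rewards and normalize to per cent.

    hist:             pecks per bin on one key (e.g. 500-ms bins)
    reward_bins:      bin indices at which reward delivery started
    half_window_bins: bins before and after reward onset
                      (25 s at 500-ms bins = 50 bins for the SI key)
    """
    width = 2 * half_window_bins
    cumulated = [0] * width
    for r in reward_bins:
        start = r - half_window_bins
        stop = r + half_window_bins
        if start < 0 or stop > len(hist):
            continue  # assumption: drop windows cut off by the session edges
        for i in range(width):
            cumulated[i] += hist[start + i]
    total = sum(cumulated)
    if total == 0:
        return [0.0] * width
    # Assumption: normalize to the total pecks within the cumulated window.
    return [100.0 * c / total for c in cumulated]


# Toy data: two usable rewards (bins 4 and 8) and one too close to the start.
toy_hist = [0, 1, 2, 3, 0, 1, 2, 3, 0, 0]
print(perireward_histogram(toy_hist, [4, 8, 1], half_window_bins=2))
```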

Reward-related modulation of pecking activity on the SI key

We investigated whether expectancy of rewards on the SI key affected pecking activity on the SI and LI keys. As a measure of reward-related increase or decrease in pecking activity, we fitted straight lines to the prereward SI PRTH for pecks on both keys (see insets of Fig. 2C and D). Only data before reward delivery were considered; all data during or following reward delivery were removed from the analysis. A positive linear gradient indicates an increase in pecking activity over time towards the reward on the SI key and a negative gradient indicates a decrease in pecking activity. As the PRTH for rewards on the LI key did not exhibit any continuous increase/decrease (Fig. 2E), no lines were fitted to the LI PRTH.
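A linear gradient of this kind can be obtained with an ordinary least-squares fit. The text does not name the fitting method, so least squares is an assumption here, as are the function name and the choice of seconds as the time unit.

```python
def linear_gradient(prereward_prth, bin_s=0.5):
    """Least-squares slope of a prereward PRTH, in PRTH units per second.

    prereward_prth: normalized pecking activity per bin, reward bins excluded
    bin_s:          bin width in seconds (500 ms in the text)
    """
    n = len(prereward_prth)
    if n < 2:
        raise ValueError("need at least two bins to fit a line")
    x = [i * bin_s for i in range(n)]
    mean_x = sum(x) / n
    mean_y = sum(prereward_prth) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, prereward_prth))
    return sxy / sxx


# Toy curves: activity rising towards reward gives a positive gradient,
# activity falling towards reward gives a negative one.
print(linear_gradient([0.0, 1.0, 2.0, 3.0]), linear_gradient([3.0, 2.0, 1.0, 0.0]))
```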

Reward-related response distributions in baseline and tetrodotoxin (TTX) conditions. (A) Clipping of an exemplar time histogram from one animal during baseline condition. ▪, pecks on the SI key; □, pecks on the LI key. SI- and LI-labelled arrows indicate a rewarded peck on the SI or LI key, respectively. Bursts of pecking activity on the SI key were mostly terminated by a reward on the SI key and followed by a switch to the LI key. Subsequently, this animal remained on the LI key for a few seconds and, independent of the occurrence of reward delivery, switched back to the SI key to engage in another burst of activity. (B) The same type of graph after TTX injections into the neostriatum caudolaterale (NCL). The characteristic response bursts on the SI key disappeared after NCL inactivation and switches between the keys occurred throughout the entire session, not merely after SI rewards. (C–F) Normalized perireward time histograms (PRTH) illustrate the mean response activity prior to and following reward delivery at timepoint 0 (dashed line) for rewards on the SI key (SI PRTH, C and D) and for rewards on the LI key (LI PRTH, E and F). The black curves refer to pecks on the SI key and grey curves to pecks on the LI key. Typical PRTHs are shown for one animal during baseline condition (C and E) and after TTX injections into the NCL (D and F). (C) Under baseline condition the pecking activity on the SI key gradually increased towards reward delivery on the SI key whereas pecking activity on the LI key gradually decreased towards the SI reward. After 3 s access to the reward, the same response pattern was built up. Inset: Fitted lines (dotted black lines) into the prereward baseline SI PRTH as a measure of the reward-related increase/decrease in pecking frequency. (D) After NCL inactivation the typical reward-related temporal modulation of pecking activity is levelled out and fitted lines (inset) show a very shallow gradient. 
(E) With respect to the LI reward there was no continuous increase or decrease of pecking activity in the baseline condition. Pecks on the LI and SI keys rather occurred in 25-s periodical bursts with an approx. 90° phase shift between the LI and SI key, suggesting that frequency modulations on the LI key were related to events on the SI key rather than rewards on the LI key. (F) The negative coupling of activity typically observed in the LI PRTH is blurred after NCL inactivation.

Coupling between activity on the SI and LI keys

As a measure of the interdependency of key pecks, we calculated the correlation (Pearson's linear correlation coefficient) between data for the SI key and data for the LI key of the prereward LI PRTH. A positive correlation indicates that an increase in pecking frequency on the SI key is accompanied by an increase in frequency on the LI key, whereas a negative correlation indicates that an increase in SI frequency goes together with a decrease in LI frequency. A correlation coefficient near zero suggests that there is no linear relationship between pecks on the LI key and pecks on the SI key. Correlation coefficients were computed only for the LI PRTHs.
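Pearson's linear correlation coefficient between the two prereward curves can be computed as sketched below. The implementation is a plain textbook version written for illustration; the function name and the toy data are assumptions.

```python
import math


def pearson_r(si_curve, li_curve):
    """Pearson's linear correlation between SI and LI pecking curves."""
    if len(si_curve) != len(li_curve) or len(si_curve) < 2:
        raise ValueError("curves must have equal length of at least 2")
    n = len(si_curve)
    mean_si = sum(si_curve) / n
    mean_li = sum(li_curve) / n
    cov = sum((a - mean_si) * (b - mean_li) for a, b in zip(si_curve, li_curve))
    var_si = sum((a - mean_si) ** 2 for a in si_curve)
    var_li = sum((b - mean_li) ** 2 for b in li_curve)
    if var_si == 0 or var_li == 0:
        raise ValueError("correlation is undefined for a constant curve")
    return cov / math.sqrt(var_si * var_li)


# Toy data: LI activity falls whenever SI activity rises, so r is negative.
print(pearson_r([1.0, 2.0, 3.0, 2.0], [3.0, 2.0, 1.0, 2.0]))
```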

Key switches

A decision is manifest when a pigeon switches from one key to the other. To identify any putative hidden reward-related switching patterns, we computed PRTHs with switches, instead of pecks, for every session and every animal. As switches were less frequent than pecks, we opted for a larger bin size of 2 s. Switch PRTHs were normalized with reference to the total number of all switches in one direction (SI to LI switches vs. LI to SI switches). Apart from that, the resulting switch PRTHs were analogous to the peck PRTHs. All switch PRTHs were averaged across animals. We computed a 95% confidence interval for the resulting mean switch PRTH. Values exceeding the 95% confidence interval were considered systematically locked to reward delivery.
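Extracting switches from a peck record can be sketched as follows. The sketch assumes a time-ordered list of (time, key) pairs and timestamps each switch at the first peck on the new key; both choices, like the function name, are assumptions. The confidence interval is not reproduced here because the text does not specify how it was computed.

```python
def find_switches(pecks):
    """Return (time_s, from_key, to_key) for every change of key.

    pecks: time-ordered list of (time_s, key) with key "SI" or "LI"
           (assumed data format).
    """
    switches = []
    previous_key = None
    for time_s, key in pecks:
        if previous_key is not None and key != previous_key:
            # Assumption: a switch is timed at the first peck on the new key.
            switches.append((time_s, previous_key, key))
        previous_key = key
    return switches


toy_pecks = [(1.0, "SI"), (2.0, "SI"), (3.0, "LI"), (4.0, "LI"), (5.0, "SI")]
all_switches = find_switches(toy_pecks)
# Split by direction, as the switch PRTHs are normalized per direction.
si_to_li = [s for s in all_switches if s[1] == "SI"]
li_to_si = [s for s in all_switches if s[1] == "LI"]
print(si_to_li, li_to_si)
```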

Data analysis

We determined the effects of NCL blockade on any of these parameters with a two-factorial ANOVA for repeated measures with the main factors treatment block (measurement 1 vs. measurement 2) and treatment (baseline vs. TTX injection vs. saline injection vs. post-treatment). The effects of bilateral CPu or NAc inactivation were analysed using a one-factorial ANOVA for repeated measures. Left and right unilateral CPu–NAc injections were analysed using a two-factorial ANOVA for repeated measures (treatment and hemisphere as main factors). Significance was assumed when P < 0.05. Whenever appropriate, we tested the directed hypothesis that parameters after TTX injection were smaller compared with control conditions using posthoc one-sided paired comparisons. Moreover, in some cases one-sample t-tests were used to test for differences from zero and two-tailed correlation tests were performed to test for significant correlations between the variables.
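Of the tests listed above, the one-sample t-test against zero is simple enough to sketch directly. The code below computes only the t statistic and its degrees of freedom; obtaining a P value would need a t distribution from a statistics library. The function name and the toy values are assumptions, and the repeated-measures ANOVA is not covered.

```python
import math


def one_sample_t(values, mu0=0.0):
    """Return (t, degrees_of_freedom) for a one-sample t-test against mu0."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)  # sample variance
    if variance == 0:
        raise ValueError("t is undefined when all values are identical")
    standard_error = math.sqrt(variance / n)
    return (mean - mu0) / standard_error, n - 1


# With one value per bird and 11 birds, the test has 10 degrees of freedom.
toy_gradients = [0.8, 1.1, 0.9, 1.3, 1.0, 0.7, 1.2, 0.9, 1.1, 1.0, 0.95]
t_value, df = one_sample_t(toy_gradients)
print(f"t({df}) = {t_value:.2f}")
```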

Results

Histology

All except two injection sites were located within the target structures. For NCL injections, 42 of 44 sites were located between A 4.25 and 6.5 and two sites were located anterior to A 6.5 (Fig. 1A). For CPu injections, all sites were located within a range of ± 0.5 mm from A 10.5 (Fig. 1B) and for NAc injections, all sites were located within a range of ± 0.25 mm from A 11.5 (Fig. 1C).

Injection sites. Schematic coronal sections of the pigeon brain and coordinates according to Karten & Hodos (1967) showing the injection sites for tetrodotoxin and/or saline solutions. The black dots represent the lower tips of the cannulae. Light grey areas depict the target areas: (A) neostriatum caudolaterale according to Kröner & Güntürkün (1999); (B) caudate/putamen according to Reiner et al. (1984) and Durstewitz et al. (1999) and (C) nucleus accumbens according to Durstewitz et al. (1999) and Mezey & Csillag (2002).

Baseline behaviour and neostriatum caudolaterale inactivation

Temporal modulation of pecking frequency

Baseline behaviour

A cut-out of a typical time histogram for one animal in the baseline condition is captured in Fig. 2A. Visual inspection suggests that this pigeon distributed its pecks on both keys according to a stereotypic pecking pattern. It spent more time pecking on the SI key than on the LI key. Upon reward delivery on the SI key, the animal switched to the LI key and remained there for some time, irrespective of reward earning on the LI key. After a while, the pigeon switched back to the SI key and continued to exclusively peck on the SI key until receiving the next reward. These observations suggest that the time interval on the SI key, but presumably not on the LI key, determined the pecking pattern, that switches from the SI to the LI key were reward driven, whereas switches from the LI to the SI key were not, and that pecks on the LI key were closely coupled to events on the SI key.

Exemplar baseline PRTHs from one bird for rewards on the SI key (Fig. 2C) and LI key (Fig. 2E) support these speculations. The SI PRTH shows a roughly linear increase of pecking activity on the SI key with increasing temporal proximity to the timepoint of reward delivery at 0 s. Correspondingly, pecking activity on the LI key decreased towards the reward on the SI key. This reward-related modulation of pecking activity indicates that pecking activity on both keys was closely linked to the lapse of the short interval. Immediately after reward delivery, pecking frequency on the LI key dramatically increased, presumably because pigeons had switched away from the rewarded SI key to the unrewarded LI key. In the LI PRTH (Fig. 2E), there was no continuous increase or decrease of activity on either key towards the timepoint of LI reward delivery. Rather, pecks occurred in periodic activity bursts of approximately 25 s duration. Moreover, pecks on the LI key were negatively correlated with pecks on the SI key. Thus, unlike in the SI PRTH (Fig. 2C), the activity bursts in the LI PRTH were not phase locked to the lapse of the interval; instead, spurts of pecking on the LI key appeared to be phase shifted relative to the pecking frequency on the SI key.

Tetrodotoxin injections into the neostriatum caudolaterale

The time histogram illustrates that the typical bursts of responses on the SI key disappeared after bilateral TTX injections into the NCL and key switches occurred throughout the entire session, not only after SI rewards (Fig. 2B). Accordingly, the SI PRTH (Fig. 2D) shows that the typical increase of SI activity towards the SI reward was substantially less pronounced and the decrease of LI activity was nonexistent or even reversed. In order to statistically compare the reward-related modulation of pecking activity on the SI and LI keys in TTX and control conditions, we contrasted the linear gradients of the curves that were fitted to the prereward SI PRTH of all animals (see fitted lines in the insets of Fig. 2C and D).

One-sample t-tests revealed that the individual mean baseline linear coefficients were significantly different from zero for pecks on the SI key (t(10) = 13.94, P < 0.001) and the LI key (t(10) = −2.94, P < 0.05). Tetrodotoxin dramatically interfered with the reward-related increase/decrease of pecking activity (Fig. 3A) on the SI key (F(3,30) = 46.55, P < 0.001) and the LI key (F(3,30) = 23.33, P < 0.001). The increase of pecking activity on the SI key towards reward on the same key was significantly reduced (posthoc tests for TTX vs. each of the other three conditions: all F(1,10) > 47.30, all P < 0.001). Likewise, the decrease of pecks on the LI key towards the reward on the SI key was significantly different after TTX injections (posthoc tests: all F(1,10) > 18.20, all P < 0.001). In fact, the sign of the normally negative gradient of LI pecks towards SI reward was reversed after TTX injections. No treatment block or interaction effects were found (all F < 2.70, all P > 0.13).

Effects of tetrodotoxin (TTX) injections into the neostriatum caudolaterale (NCL) on the temporal modulation of pecking frequency. (A) Mean (± SEM) linear gradients of the lines fitted to the prereward SI perireward time histograms (PRTH) (see insets in Fig. 2C and D) in the baseline and control post-TTX condition are positive for the SI key and negative for the LI key. After TTX injections into the NCL, the gradient on the SI key was significantly decreased, whereas the typically negative gradient of LI activity was significantly changed into the positive direction. (B) Correlation coefficients between pecks on LI and SI keys in the prereward LI PRTH reveal that pecking bursts on the LI key were negatively coupled to pecks on the SI key in the baseline and control conditions. After TTX injections into the NCL, the negative correlation was significantly reduced to near-zero. Main effects: ***P < 0.001.

The LI PRTH of the same bird (Fig. 2F) illustrates that the typical negative correlation between activity bursts on the LI and SI keys in baseline conditions was greatly reduced after NCL inactivation. To quantify the effects of NCL blockade on the negative correlation, we compared the linear correlation coefficients between pecks on the SI and LI keys of the prereward PRTH.

One-sample t-tests revealed that the individual mean baseline correlations significantly differed from zero (t(10) = −3.84, P < 0.01). After TTX injections, pecks on the LI key were decoupled from pecks on the SI key (Fig. 3B). The correlation coefficient dropped from r = −0.27 (baseline) to r = −0.01 after TTX injections. ANOVA and posthoc tests revealed that TTX injections significantly reduced the negative correlation compared with all control conditions (F(3,30) = 7.47, P < 0.005; posthoc tests, all F(1,10) > 5.70, all P < 0.05). No treatment block (order) or interaction effects were found (all F < 1.10, all P > 0.37).

Switching pattern

Baseline pattern

Switch PRTHs for switches away from the rewarded key were computed to test whether the probability of the occurrence of a key switch was significantly higher immediately after reward delivery. For the baseline condition, averaged mean switch PRTHs were calculated for switches from the SI to the LI key before/after rewards on the SI key (Fig. 4A) and for switches from the LI to the SI key before/after rewards on the LI key (Fig. 4E). The majority of SI to LI switches occurred immediately after reward delivery (Fig. 4A). The bins at the beginning of an interval crossed the upper threshold of the confidence interval, indicating that the probability of the occurrence of an SI to LI switch was higher immediately after a reward on the SI key than at any other timepoint during an experimental session. Hence, pigeons did indeed systematically switch away from the rewarded key after receiving a reward on the SI key. No such reward-related switch pattern was found after rewards on the LI key (Fig. 4E). The switch PRTH occasionally crossed the confidence interval threshold but not immediately after reward delivery. Rather, switches appeared to occur in periodic bursts approx. 25 s apart. These bursts presumably corresponded to reward-related switches after rewards on the SI key rather than to rewards on the LI key.

Key switch pattern. Switch perireward time histograms (PRTH) for reward-triggered switches (A–D) from the SI to the LI key and (E) from the LI to the SI key. Switch PRTHs are mostly analogous to peck PRTHs but contain information about the temporal incidence of key switches, and not pecks, relative to reward delivery. The horizontal lines represent the upper threshold of the confidence interval. (A) Baseline switch PRTH for switches from the SI to the LI key with respect to reward on the SI key. The probability of a switch was considerably higher after reward delivery, suggesting that pigeons systematically switched from the SI to the LI key after being rewarded on the SI key. (B) SI to LI switch PRTH after tetrodotoxin (TTX) injection into the neostriatum caudolaterale (NCL). The lack of a confidence interval crossing after reward delivery suggests that, after NCL blockade, the animals refrained from switching to the alternative key after being rewarded for a peck on the SI key. (C and D) The baseline switch pattern was conserved in all control measurements. (E) No reward-related switching pattern occurred in the baseline switch PRTH for switches from the LI to the SI key with respect to LI rewards. Switches were presumably related to the time course of the short interval and not to reward delivery on the LI key.

Tetrodotoxin injections into the neostriatum caudolaterale Neostriatum caudolaterale blockade affected the reward-related SI to LI switch pattern (Fig. 4B). After TTX injections, the probability of SI to LI switches was not significantly higher after a reward on the SI key than at any other timepoint during a testing session, suggesting that NCL inactivation reduced the probability that a switch occurred after SI reward delivery. In contrast, the switching patterns in the saline and post-TTX conditions (Fig. 4C and D) were similar to the pattern observed during the baseline condition. The ANOVA confirmed that NCL inactivation reduced the relative number of postreward switches. After TTX injections, the relative number of switches within the first bin after reward offset was significantly reduced compared with all control conditions (F3,30 = 9.33, P < 0.001, posthoc test, all F1,10 > 11.50, all P < 0.05). Again, we did not find any significant treatment block effects in any of the tests or any interaction effects between treatment blocks and treatment (all F < 1.3, all P > 0.30).

Basic parameters

Pecking rate Under baseline conditions, pigeons made on average 2544 responses (± 240 SEM) during a 60-min training session. After TTX injections into the NCL, the mean pecking rate was reduced to 1522 pecks (± 220). The ANOVA revealed a main treatment effect (F3,30 = 24.92, P < 0.001) and posthoc tests confirmed that the pecking rate was significantly reduced after TTX injections compared with control conditions (all F1,10 > 27.00, all P < 0.001). We did not find any significant effects of treatment block in any of the tests or any interaction effects (all F < 0.78, all P > 0.5).

Reward turnover Under baseline conditions, pigeons obtained on average 92.6% (± 0.9) of the maximally available rewards. After NCL blockade, pigeons earned 81.8% (± 4.0) of all available rewards. Accordingly, the ANOVA revealed a treatment effect (F3,30 = 9.08, P < 0.001) on reward turnover and pigeons obtained significantly fewer rewards after TTX injections than under baseline, saline or post-TTX conditions (all F1,10 > 8.50, all P < 0.01). There was no order effect for repeated measurements in any of the tests or any interaction effects between repeated measurement and treatment (all F < 3.50, all P > 0.09).

Does the reduction in pecking rate after neostriatum caudolaterale inactivation account for altered choice behaviour?

Some animals showed a greater reduction in pecking rate after TTX treatment than others and, likewise, NCL inactivation differentially affected reward turnover, linear gradient and switching pattern. If the reduction in pecking rate accounts for the reduced reward turnover and/or any of the choice parameters, then one would expect that those birds with a greater reduction in pecking rate also show a larger effect in any of those other parameters.

To test whether the reduction in pecking rate predicts the TTX effects on reward turnover, linear coefficients and relative number of postreward switches, we correlated the pecking rate reduction with each of these parameters. Tetrodotoxin effects were defined as the difference between parameter values in baseline and TTX conditions. There was a significant correlation between the reduction in pecking rate and the reduction in reward earning (r = 0.629, P < 0.001), indicating that those pigeons which showed a greater reduction in pecking rate after NCL inactivation earned fewer rewards than less affected birds. However, there was only a small and nonsignificant correlation between the reduction in pecking rate and the TTX effect on the linear coefficient of the curve fitted to the SI activity in the SI PRTH (r = 0.139, P > 0.530). Likewise, there was no significant correlation between the reduction in pecking rate and the TTX effect on the coefficient of the curve fitted to the LI data (r = −0.206, P > 0.35). This shows that the TTX-induced changes in the reward-related increase of pecking activity on the SI key or in the reward-related decrease on the LI key were not significantly related to the reduction in pecking rate. Accordingly, there was only a small and nonsignificant correlation between the reduction in pecking rate and the TTX effect on the number of SI to LI switches in the first bin after SI reward offset (r = 0.141, P > 0.53), indicating that there was no significant linear relationship between the reduced pecking rate and the altered switching pattern. These results suggest that the decrease in reward earning was related to the reduction in pecking rate after NCL inactivation but that the reduced pecking rate cannot explain the observed alteration of the switching pattern.
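The logic of this control analysis can be sketched as follows: a TTX effect is computed per measurement as baseline minus TTX, and the effect on pecking rate is then correlated with the effect on another parameter. This is a minimal sketch using a plain Pearson product-moment coefficient. The function names are hypothetical, and the source does not state which software or which correlation routine was used.

```python
from math import sqrt


def ttx_effect(baseline, ttx):
    """TTX effect per measurement: baseline value minus TTX value."""
    if len(baseline) != len(ttx):
        raise ValueError("baseline and ttx must have the same length")
    return [b - t for b, t in zip(baseline, ttx)]


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length samples."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / sqrt(sxx * syy)
```

A call such as `pearson_r(ttx_effect(rate_base, rate_ttx), ttx_effect(switch_base, switch_ttx))` would then give the kind of coefficient reported above. The argument names stand for per-measurement values that are not reproduced here.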

Blockade of the caudate/putamen

After bilateral and unilateral TTX injections into the CPu, animals appeared paralysed and unresponsive when handled or moved. Moreover, although all pigeons were continuously food deprived, they did not eat when offered food. They did not exhibit any sign of distress or any other typical response to human contact. Accordingly, three of five animals entirely ceased to peck after bilateral TTX injections and the remaining two animals also showed a dramatically reduced pecking rate. An ANOVA also showed that the treatment had a significant effect on the total pecking rate (F3,12 = 25.09, P < 0.001) (Fig. 5A) which was significantly lower after TTX injections compared with all control conditions (posthoc tests: all F1,4 > 21.80, all P < 0.01). Accordingly, the total pecking rate was also decreased after unilateral injections into either hemisphere (F3,12 = 10.21, P < 0.005; posthoc tests: all F1,4 > 9.90, all P < 0.05). There was no effect of the injection side in the left or right hemisphere and no interaction effect (all F < 0.45, all P > 0.71). Due to the large reduction in activity after CPu injections, we restricted our examination to the analysis of the pecking rate.

Fig. 5. Mean pecking rate (± SEM) during control conditions and after bilateral and unilateral tetrodotoxin (TTX) injections into (A) the caudate/putamen (CPu) and (B) the nucleus accumbens (NAc). (A) The total pecking rate was significantly reduced after bilateral or unilateral TTX injections into the pigeons' CPu compared with all control conditions. (B) The total pecking rate was significantly reduced after TTX injections into the NAc compared with the control conditions. Unilateral TTX injections into the NAc of either hemisphere had no significant effect on the pecking rate. Main effects: ***P < 0.001; n/s, not significant.

Blockade of the nucleus accumbens

After bilateral TTX injections into the NAc, four of five animals entirely ceased to peck and the remaining animal also had a greatly reduced pecking rate (Fig. 5B). Accordingly, the ANOVA revealed a treatment effect (F3,12 = 13.34, P < 0.001) and posthoc tests confirmed that the pecking rate after bilateral TTX injections was significantly reduced compared with all control conditions (all F1,4 > 10.60, all P < 0.05).

After unilateral injections, the ANOVA did not reveal any significant treatment effect or significant treatment vs. hemisphere interactions (all F3,12 < 2.70, all P > 0.09). However, there was a significant main effect for hemisphere (F1,4 = 11.59, P < 0.05). Two-sided posthoc tests showed that the overall pecking rate across all treatment conditions was significantly lower in the left hemisphere treatment block compared with the right hemisphere treatment block (F1,4 = 11.59, P < 0.05).

Despite the reduction in pecking rate, pigeons appeared normal when approached in their home cage and showed the typical signs of distress when removed from their home cage. Most interestingly, they would feed when provided with food. Thus, in contrast to the effects after CPu inactivation, here the consummatory response was still preserved but pigeons refused to work for food.

Discussion

In this study, we outlined, for the first time, a neural network required for guiding choice behaviour in CFI schedules in pigeons. First, we performed an in-depth analysis of the choice pattern and described how pigeons typically distribute their pecks between response alternatives. Second, we demonstrated that an intact NCL is required to generate this choice pattern. Third, we showed that the NAc and CPu might play an important role in motivational aspects and motor execution of the response pattern. Combining our results with data from the literature, we propose a network loop for generating choice behaviour. In this network, the NAc can indirectly modulate NCL activity via striato-thalamo-pallial pathways. By this means the NCL is tuned to organize choice behaviour with respect to reward anticipation and generate an optimized response distribution. Motor commands are fed forward to the CPu for execution.

To our knowledge, this study is the first to attempt a neuroscientific account of the underlying mechanism of choice behaviour in a paradigm that has a long history in behavioural research. Our histogram analysis approach has never been used to describe the response distribution in concurrent interval schedules. This study, furthermore, extends previous research by demonstrating a role of the avian NCL in dynamically implementing time- and context-dependent reward expectancy into the response selection process. Our results confirm that the NCL is a suitable model for the mammalian PFC.

Normal choice behaviour

We intended to contrast choice behaviour after TTX injections with normal choice behaviour. It was, therefore, necessary to first identify the response distribution under baseline conditions. On average, pigeons placed progressively more pecks on the SI key with increasing temporal proximity to reward delivery on the SI key. Accordingly, the closer a reward on the SI key, the fewer pecks were placed on the LI key. Birds anticipate reward delivery (Diekamp et al., 2002a,b), hence it is likely that this temporal response profile was determined by expectancy of SI rewards. The ability for reward-related temporal adjustment of pecking frequency requires an internal timer and an algorithm that links time estimation with reward expectancy, resulting in time-related reward anticipation.

A decision is manifest when a pigeon switches from one key to another. Our subjects reliably switched away from the rewarded key immediately following a reward on the SI key, again pointing to the putative choice-determining role of reward anticipation: an SI reward predicts that there is no immediate further reward to be expected on the SI key. As key switches were reward triggered, the behavioural change corresponded to low reward expectancy, presumably in order to engage in more reward-promising activities.

For the long interval, no climbing/descending pecking activity was found but activity oscillated with a frequency of approx. 1/25 Hz and in negative correlation to bursts on the SI key. This suggests that pecks on the LI key were not driven by expectancy of LI rewards but were rather determined by events on the SI key. Accordingly, the probability of switches from LI to SI key was not higher after reward delivery.

In summary, the response distribution on both keys seems to be exclusively determined by anticipation of events on the SI key. Choice behaviour can be characterized by (i) climbing/descending pecking activity related to the timepoint of reward delivery on the SI key; (ii) switches from the SI to the LI key following rewards on the SI key and (iii) negative coupling of pecks on the LI key with pecks on the SI key.

Temporary inactivation of the neostriatum caudolaterale interferes with normal choice behaviour

In the present study, bilateral injections of TTX into the NCL, but not sham injections, disrupted the birds' typical choice pattern. First, NCL inactivation levelled out the reward-related temporal modulation of pecking activity. The characteristic climbing activity on the SI key towards the end of the short interval was significantly reduced after NCL blockade. The normally descending activity on the LI key towards the SI reward was, in fact, even somewhat ascending after NCL inactivation, indicating that the pigeons slightly increased rather than decreased their pecking activity on the LI key with increasing proximity to the SI reward. Taken together, choice behaviour seemed to be not related or inappropriately related to the estimated timepoint of reward delivery, suggesting that pigeons either lacked reward anticipation, incorrectly estimated the time interval or were unable to utilize time information for behavioural planning. Second, the probability of switches from one key to another after rewarded pecks on the SI key was not significantly different from the switch probability at any other moment during a session. As outlined above, SI rewards represented contextual cues predicting the nonoccurrence of immediate further SI rewards. The fact that, after NCL inactivation, switches occurred with equal probability throughout the entire interval and were not confined to the time after SI reward delivery suggests that pigeons were unable to utilize the contextual cues for reward prediction or failed to integrate cued low reward expectancy into behavioural organization. Third, the negative correlation between pecking bursts on LI and SI keys converged towards zero, suggesting that pecks on the LI key were decoupled from pecks on the SI key.

Tetrodotoxin injections decreased the total pecking rate but did not abolish pecking per se. Our results show that the TTX effects on the animals' choice pattern were the result of NCL blockade and cannot be attributed to the TTX-induced reduction in pecking rate, although the rate reduction was related to the decrease in reward earning.

Interestingly, the subjects' basic ability for making choices was not completely abolished after NCL inactivations as they still placed more responses on the SI key than on the LI key, indicating a preserved preference for the SI key. This observation suggests that the NCL does not seem to be necessary for the general ability for preference-based response selection. Dietrich et al. (1997, 1998) have shown that lesions of the rat frontal cortex flatten the response distribution in single Fixed Interval schedules. They propose a role of the rodent PFC in providing a basis for the timing mechanism and time estimation in general. However, in addition to the impact on the temporal response profile, NCL inactivations in the present study also influenced the contextually cued key-switching pattern. Hence, due to the blockade effects on both time- and context-related response profiles, the current results cannot merely be explained by lack of ability for time estimation. A more likely cause for the altered choice pattern seems to be a generally impaired ability to integrate temporal and contextual reward expectation into behavioural organization.

In summary, TTX injections into the NCL affected the pigeons' choice behaviour. Neostriatum caudolaterale inactivation interfered with reward anticipation-related key affinity and temporal modulation of pecking frequency, probably due to an inability to integrate time- and context-dependent reward expectation into the response selection process. This putative function of the NCL is in agreement with studies on humans and primates showing that neurons in the NCL (Kalt et al., 1999; Diekamp et al., 2002b) and PFC (Goldman-Rakic, 1995; Tremblay & Schultz, 2000a,b) exhibit anticipatory responses to upcoming rewards and integrate reward anticipation into working memory for response planning and execution (Rainer et al., 1999; Fuster, 2000; Miller & Cohen, 2001).

Caudate/putamen is essential for response execution and nucleus accumbens is necessary for representing behavioural goals during the course of the interval

Tetrodotoxin injections into the NAc and CPu had substantially different effects from injections into the NCL, suggesting that the effects on choice behaviour after NCL inactivation cannot be attributed to the presence of TTX in brain tissue per se but are more probably the consequence of loss of particular functions subserved by the affected neural subregion.

More importantly, injections into the CPu and NAc also shed light on their putative contribution to choice behaviour and allowed us to speculate about their role in the reward-related response distribution. The temporary inactivation of the pigeon's CPu resulted in a total stop of behaviour. Pigeons did not react to any kind of stimulus, including food and human contact. It seems evident, therefore, that an intact CPu is necessary to execute any kind of motor behaviour. The avian striatum receives a direct projection from the NCL (Kröner & Güntürkün, 1999). Hence, it is possible that the CPu is involved in the motor execution of the choice pattern that is generated and then fed forward by NCL networks.

In contrast, blocking the NAc did not affect normal behaviour. Pigeons responded normally to various kinds of stimuli, including human contact. Most importantly, the animals did eat when offered food. Nevertheless, most animals refused to peck during an experimental session despite their food deprivation. Thus, pigeons refused to work for reward despite showing preserved consummatory responses.

These results are largely consistent with the literature on mammalian NAc lesions. Salamone et al. (1997, 2001) reported that 6-hydroxy-dopamine lesions in the NAc of rodents resulted in a reduced ability to overcome work-related response costs but did not directly influence the primary rewarding property of food. Moreover, dopamine depletion in the NAc did not affect performance in a variable interval schedule but did affect performance when combined with a fixed ratio requirement (Correa et al., 2002).

High response costs impose a delay between the behavioural requirement and the reward, hence the reduced ability to overcome work-related response costs might be the result of reduced tolerance to delayed reward delivery. Accordingly, lesions of the core of the NAc increase impulsive behaviour in rats (Cardinal et al., 2001) and chicks (Izawa et al., 2003) in an adjusting delay procedure, suggesting that the NAc may be necessary to maintain the subjective value of a reinforcer across a delay. As neural activity in the NAc is related to reward expectancy during a delay, Cardinal et al. (2001) and Schultz et al. (2000) proposed that the core of the NAc might play a role in the representation of behavioural goals across a delay. In the present study, the key-specific intervals represented relatively long delays between response onset and reward delivery. It is, therefore, possible that the NAc inactivation substantially decreased the pigeons' ability to uphold the reward value during the interval, resulting in the observed halt of behaviour. The subjects' choice pattern under baseline conditions was determined by reward anticipation on the SI key. If the NAc is important in generating reward expectation during a delay, it may have significantly contributed to the setting or shifting of behavioural goals in the course of the SI interval, such as focusing on the SI key or switching to the LI key. Accordingly, the NAc is part of the brain's dopamine system (Bardo, 1998) and dopaminergic influence on cortical activity has recently been related to goal setting, goal maintenance or goal changes, attentional focus and resistance against interference (Durstewitz & Seamans, 2002).

In contrast to our results, the cited lesion studies showed that lesions of the NAc did not completely abolish operant behaviour in the adjusting delay procedure (Cardinal et al., 2001; Izawa et al., 2003). However, those studies employed much shorter delays compared with the interval lengths in the present study. It is, therefore, likely that our intervals were too long to allow maintenance of the reward value across the interval during NAc inactivation. Moreover, all of the cited studies applied permanent lesions which might permit compensatory plastic processes. They may, therefore, have slightly different effects compared with the relatively short-lived and sudden neural inactivations caused by TTX.

Neural network guiding choice behaviour

These results provide evidence that dopamine may play an important role in generating the reward-related choice pattern. Mesocortical dopamine projections carry a temporal difference error between predicted and actual reward and process information about the ‘goodness’, probability and temporal occurrence of the reward (Schultz et al., 1997; Schultz, 1997, 2002; Waelti et al., 2001; Fiorillo et al., 2003). The origin of this time and context information about reward occurrence is unknown but climbing activity of neurons in the sensory thalamus is one potential candidate to explain the timepoint estimation (Komura et al., 2001; see also Durstewitz, 2003).
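For orientation, the temporal difference error referred to above is commonly written as shown below in the reinforcement-learning literature. The notation is the standard textbook one and is an assumption here, as the equation is not spelled out in the source: $r_t$ is the reward received at time $t$, $V(s)$ is the predicted future reward in state $s$ and $\gamma \in [0, 1]$ is a discount factor.

```latex
\delta_t = r_t + \gamma \, V(s_{t+1}) - V(s_t)
```

In this formulation $\delta_t$ is positive when the outcome is better than predicted, zero when it is fully predicted and negative when a predicted reward is omitted.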

In the avian brain, dopaminergic mesencephalic nuclei, such as ventral tegmental area and substantia nigra pars compacta, project to somato-motor and viscero-limbic striatum, including the NAc, via reciprocal, parallel and distinct pathways (Reiner et al., 1984; Metzger et al., 1996; Mezey & Csillag, 2002). The NAc is reciprocally connected to the thalamic anterior dorsomedial nucleus (Wild, 1987; Székely et al., 1994; Csillag, personal communication) which further projects to the intermediate and medial part of the hyperstriatum ventrale and the mediorostral neostriatum (Metzger et al., 1996). The intermediate and medial part of the hyperstriatum ventrale and the mediorostral neostriatum complex is one of the major afferent sources to the NCL (Kröner & Güntürkün, 1999). Hence, the NAc is able to influence NCL activity via a striato-thalamo-pallial connection. In addition, the ventral tegmental area sends direct dopaminergic projections onto the NCL (Durstewitz et al., 1999; Kröner & Güntürkün, 1999). The NCL is a multimodal association structure that is rich in D1 receptors and is reciprocally connected to the secondary sensory areas of all modalities in addition to the striatal and midbrain structures mentioned (Kröner & Güntürkün, 1999). It projects to the limbic and somato-motor parts of basal ganglia, to archistriatum and to amygdala (Durstewitz et al., 1999; Kröner & Güntürkün, 1999). Hence, the avian brain contains three major mesencephalic dopamine systems onto pallial structures and basal ganglia that are presumably homologous to the corresponding mammalian dopamine systems (Durstewitz et al., 1999): the nigro-striatal, meso-striato-thalamo-pallial and meso-pallial loop.
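The connectivity described above can be summarized as a small directed graph. This is a simplified sketch: the node labels are shorthand introduced here ("DMA" for the thalamic anterior dorsomedial nucleus, "HV/MNH" for the hyperstriatum ventrale and mediorostral neostriatum complex), only projections named in the paragraph are included, and the assignment of both mesencephalic nuclei to the same striatal targets is an assumption. The hypothetical `shortest_path` helper uses a breadth-first search to recover the indirect route from the NAc to the NCL.

```python
from collections import deque

# Directed projections named in the text; labels are shorthand, not
# anatomical nomenclature from the source.
PROJECTIONS = {
    "VTA": ["striatum", "NAc", "NCL"],
    "SNc": ["striatum", "NAc"],
    "NAc": ["DMA"],
    "DMA": ["NAc", "HV/MNH"],
    "HV/MNH": ["NCL"],
    "NCL": ["basal ganglia", "archistriatum", "amygdala"],
}


def shortest_path(graph, start, goal):
    """Breadth-first search; returns the shortest node path or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


indirect = shortest_path(PROJECTIONS, "NAc", "NCL")  # striato-thalamo-pallial route
direct = shortest_path(PROJECTIONS, "VTA", "NCL")    # meso-pallial route
```

Here `indirect` passes through the thalamic and pallial relay stations, whereas `direct` is a single projection, which mirrors the distinction between the meso-striato-thalamo-pallial and meso-pallial pathways.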

Based on our data and results from the functional and anatomical literature, we propose a network loop guiding choice behaviour in the CFI schedule. The direct or indirect dopamine projections onto the NCL, comprising the meso-pallial and meso-striato-thalamo-pallial pathways, carry information about reward expectancy, including the ‘goodness’, probability and timepoint of reward delivery. They dynamically tune the NCL to set, maintain or shift behavioural goals in correspondence to reward anticipation. The NCL integrates information about reward expectation to dynamically adapt goal sets to changing reward probabilities and enhances affinity to reward-related stimuli and actions when reward expectancy is high, i.e. resulting in an increasing attentional focus on the SI key with increasing proximity to reward delivery. In turn, reverse tuning of NCL neurons might account for the initiation of behavioural changes when reward expectancy is low, such as key switches immediately after reward delivery. Furthermore, NCL networks convert this information into categorical motor commands and feed forward the resulting action commands to motor structures, such as the CPu of the basal ganglia, for execution.

Acknowledgements

This study was supported by grants from the Deutsche Forschungsgemeinschaft through the priority program EXECUTIVE FUNCTIONS (SPP 1107). We would like to thank Dr Sabine Windmann for invaluable discussions and helpful comments.

Abbreviations

  • CFI, concurrent fixed interval
  • CPu, caudate/putamen
  • LI key, response key associated with the long interval
  • LI PRTH, perireward time histogram for rewards on the LI key
  • LPO, lobus parolfactorius
  • NAc, nucleus accumbens
  • NCL, neostriatum caudolaterale
  • PFC, prefrontal cortex
  • PRTH, perireward time histogram
  • SI key, response key associated with the short interval
  • SI PRTH, perireward time histogram for rewards on the SI key
  • switch PRTH, perireward time histogram for switches from SI to LI key related to rewards on the SI key
  • TTX, tetrodotoxin