Volume 57, Issue 2 e13471
ORIGINAL ARTICLE
Open Access

Stress effects on learning and feedback-related neural activity depend on feedback delay

Marcus Paul

Cognitive Psychology, Faculty of Psychology, Institute of Cognitive Neuroscience, Ruhr University Bochum, Bochum, Germany

Christian Bellebaum

Biological Psychology, Institute for Experimental Psychology, Heinrich-Heine University Düsseldorf, Düsseldorf, Germany

Marta Ghio

Biological Psychology, Institute for Experimental Psychology, Heinrich-Heine University Düsseldorf, Düsseldorf, Germany

Boris Suchan

Clinical Neuropsychology, Faculty of Psychology, Institute of Cognitive Neuroscience, Ruhr University Bochum, Bochum, Germany

Oliver T. Wolf

Corresponding Author

Cognitive Psychology, Faculty of Psychology, Institute of Cognitive Neuroscience, Ruhr University Bochum, Bochum, Germany

Correspondence

Oliver T. Wolf, Cognitive Psychology, Faculty of Psychology, Institute of Cognitive Neuroscience, Ruhr University Bochum, Universitätsstr. 150, 44780 Bochum, Germany.

Email: [email protected]

First published: 03 September 2019
Citations: 12

Funding information

Deutsche Forschungsgemeinschaft (DFG) project (122679504); Collaborative Research Centre 874 (Integration and Representation of Sensory Processes) Project B4

[Correction added on 26 October 2020, after first online publication: Projekt Deal funding statement has been added.]

Abstract

Depending on feedback timing, the neural structures involved in learning differ, with the dopamine system including the dorsal striatum and anterior cingulate cortex (ACC) being more important for learning from immediate than delayed feedback. As stress has been shown to promote striatum-dependent learning, the current study aimed to explore if stress differentially affects learning from and processing of immediate and delayed feedback. One group of male participants was stressed using the socially evaluated cold pressor test, and another group underwent a control condition. Subsequently, participants performed a reward learning task with immediate (500 ms) and delayed (6,500 ms) feedback while brain activity was assessed with electroencephalography (EEG). While stress enhanced the accuracy for delayed relative to immediate feedback, it reduced the feedback-related negativity (FRN) valence effect, which is the amplitude difference between negative and positive feedback. For the P300, a reduced valence effect was found in the stress group only for delayed feedback. Frontal theta power was most pronounced for immediate negative feedback and was generally reduced under stress. Moreover, stress reduced associations of FRN and theta power with trial-by-trial accuracy. Associations between stress-induced cortisol increases and EEG components were examined using linear mixed effects analyses, which showed that the described stress effects were accompanied by associations between the stress-induced cortisol increases and feedback processing. The results indicate that stress and cortisol affect different aspects of feedback processing. Instead of an increased recruitment of the dopamine system and the ACC, the results may suggest enhanced salience processing and reduced cognitive control under stress.

1 INTRODUCTION

The adaptation of behavior relies on the processing of feedback we receive. We frequently make decisions under stressful conditions, and the consequences of our actions can occur immediately or be delayed by seconds, minutes, or even months. Both factors, stress and the timing of feedback, can influence the learning from feedback and its neural correlates. While learning from immediate feedback is associated with feedback processing in medial frontal regions, particularly the dorsal anterior cingulate cortex (dACC; Cohen, Elger, & Ranganath, 2007; Gehring & Willoughby, 2002; Haber & Knutson, 2010; Kessler, Hewig, Weichold, Silbereisen, & Miltner, 2016; Peterburs, Kobza, & Bellebaum, 2016; Weismüller & Bellebaum, 2016), and in the striatum (Foerde & Shohamy, 2011a, but see Dobryakova & Tricomi, 2013), delayed feedback fosters a hippocampal involvement in learning (Foerde, Race, Verfaellie, & Shohamy, 2013; Foerde & Shohamy, 2011b), as revealed by functional neuroimaging findings from healthy participants and work in patients suffering from amnesia or Parkinson's disease. These structures are known to be susceptible to the influence of stress (Hermans, Henckens, Joëls, & Fernández, 2014; Joëls, Karst, & Sarabdjitsingh, 2018), which raises the question of how stress affects learning and feedback-related neural processes depending on the timing of feedback.

Differences in the processing of immediate and delayed feedback emerged in the ERP component feedback-related negativity (FRN). The FRN is a negative deflection between 220‒380 ms after feedback presentation in the ERP that is larger for negative compared to positive feedback (Miltner, Braun, & Coles, 1997). The source of the FRN has been located in the dACC (Gehring & Willoughby, 2002; Hauser et al., 2014), while other studies have also found a striatal contribution (Becker, Nitsch, Miltner, & Straube, 2014; Foti, Weinberg, Dien, & Hajcak, 2011). According to a prominent theory, neurons in the dACC are inhibited by bursts of dopaminergic activity following reward or positive feedback but disinhibited by negative feedback, with the dACC integrating these dopaminergic reinforcement signals from the midbrain with information about preceding actions to achieve behavioral adaptation (Holroyd & Coles, 2002). Accordingly, the FRN correlates with trial-by-trial adaptations of behavior and the updating of outcome expectations (Cohen & Ranganath, 2007; Van Der Helden, Boksem, & Blom, 2010). After delayed feedback, the FRN difference between positive and negative feedback is reduced, which is in line with the assumption of a shift away from medial frontal and striatal processing of feedback toward medial temporal engagement in learning (Peterburs et al., 2016; Weinberg, Luhmann, Bress, & Hajcak, 2012; Weismüller & Bellebaum, 2016).

Another event-related potential (ERP) component related to feedback processing is the P300, which is a positive deflection in the ERP between 300 and 500 ms after feedback presentation. It is often larger for positive compared to negative feedback and is thought to reflect an integration process of positive outcomes over many trials to maximize future rewards (Bellebaum & Daum, 2008; Bellebaum, Polezzi, & Daum, 2010; Kessler et al., 2016). Other studies, however, have found that the P300 is insensitive to feedback valence but sensitive to the magnitude of an outcome (Foti et al., 2011; Goyer, Woldorff, & Huettel, 2008; Yeung & Sanfey, 2004). One prominent interpretation is the context updating hypothesis, which states that the P300 reflects the revision of mental models of the current task (Donchin, 1981; Donchin & Coles, 1988). To interpret the sensitivity of the P300 to outcome magnitudes, many authors refer to the motivational salience of a stimulus (Duncan-Johnson & Donchin, 1977; Nieuwenhuis, Aston-Jones, & Cohen, 2005; for reviews see Polich, 2007; San Martín, 2012).

More recently, the analysis of time-frequency dynamics of the EEG has revealed that conflicts and negative behavioral outcomes elicit theta band oscillations over medial frontal electrodes (Cavanagh & Frank, 2014; Cohen, 2014; Cohen, Wilmes, & van de Vijver, 2011). As for the FRN amplitude, frontal theta power increases have been linked to the trial-by-trial adaptation of behavior, predominantly after negative outcomes (Cavanagh, Frank, Klein, & Allen, 2010; van de Vijver, Ridderinkhof, & Cohen, 2011). Functional, temporal, and topographical commonalities of theta power with the FRN following negative outcomes have led to the conclusion that theta oscillations play a central role in the generation of the FRN (Cavanagh, Zambrano-Vazquez, & Allen, 2012; Cohen et al., 2007; Glazer, Kelley, Pornpattananangkul, Mittal, & Nusslock, 2018). Despite these commonalities, frontal theta oscillations have been shown to make unique contributions to feedback processing. The dACC has been proposed to use oscillations in the theta range for communication with the dorsolateral prefrontal cortex (PFC) to realize behavioral adaptation after negative feedback and for the resolution of conflicts (Cohen, 2014; van de Vijver et al., 2011). In line with this, it has been shown with human intracranial EEG that the dACC generates theta oscillations to recruit the lateral PFC in the implementation of behavioral adaptation (Smith et al., 2015). Overall, theta oscillations appear to reflect a neural process that is not specifically linked to the evaluation of outcome stimuli but more generally to conflict and cognitive control (Cavanagh & Frank, 2014; Cohen, 2014).

The processing of feedback and the adaptation of behavior following feedback are sensitive to modulations by acute stress. Studies using fMRI have reported a reduction in the reward-related activity in the medial PFC (Ossewaarde et al., 2011) and the reward system after stress (Kruse, Tapia León, Stalder, Stark, & Klucken, 2018). A pharmacological study demonstrated that the stress hormone cortisol, which is released from the adrenal cortex as a result of an activation of the hypothalamus-pituitary-adrenal (HPA) axis by a stressor, decreased the neural activity in the reward system and the ACC (Kinner, Wolf, & Merz, 2016). A neuroimaging study has yielded further evidence for the notion that cortisol is a central mediator of stress effects on reward-related neural activity (Oei, Both, van Heemst, & van der Grond, 2014). On the behavioral level, stress has been shown to increase learning from positive feedback (Lighthall, Gorlick, Schoeke, Frank, & Mather, 2013) or decrease learning from negative feedback (Petzold, Plessow, Goschke, & Kirschbaum, 2010), while the overall learning performance was not affected.

Only a few studies have investigated the effects of stress on the electrophysiological correlates of feedback processing so far, and the available evidence is restricted to a modulation of the FRN and theta power. Concerning the FRN, stress was found to increase the amplitude difference between negative and positive outcomes in feedback-based learning tasks (Glienke, Wolf, & Bellebaum, 2015; Wirz, Wacker, Felten, Reuter, & Schwabe, 2017). This finding is consistent with the behavioral effects, as it indicates stronger processing differences between negative and positive feedback under stress. At the same time, the changes in amplitude for positive and negative feedback processing cannot directly be linked to changes in learning from positive and negative feedback, as other processes irrespective of feedback valence also contribute to the result pattern (e.g., Ferdinand, Mecklinger, Kray, & Gehring, 2012), and enhanced differences between negative and positive feedback processing can result from changes for only one type of feedback or both. Other studies using gambling tasks with random feedback and a noise stressor (which occurred in parallel to the task execution) demonstrated decreases of the difference between negative and positive feedback for the FRN. For frontal theta power, previous findings are also inconsistent, with both power enhancements for negative feedback (Paul et al., 2018) and reduced power differences for negative and positive feedback in the stress condition (Banis, Geerligs, & Lorist, 2014; Banis & Lorist, 2012), possibly mediated by differences in the type and timing of the stressor as well as the type of task. As for the FRN, there is no 1:1 relationship between theta power and behavioral accuracy during learning. For example, enhanced theta after negative feedback may indicate an enhanced tendency for behavioral adaptation, which can, if it is too strong, even be a disadvantage in probabilistic learning tasks.

In the current study, we investigated the effects of stress on learning from and the neural processing of immediate and delayed feedback. To test these effects, participants were subjected to either an acute laboratory stressor (stress group) or a control situation (control group) before they conducted a probabilistic reward learning task with immediate (500 ms) and delayed feedback (6,500 ms). Cortisol concentrations were determined from saliva to capture the stress-induced cortisol reactivity. We examined the effects of stress on feedback-locked ERPs (FRN, P300) and frontal theta power. Moreover, the relationship of the ERPs and theta power to the trial-by-trial subsequent behavioral accuracy was assessed using cross-trial regression analyses.

Based on our previous work with the same stress protocol and related learning paradigms (Glienke et al., 2015; Paul et al., 2018), we hypothesized that stress would increase FRN amplitudes and theta power for negative immediate feedback. Increased FRN amplitudes and theta power were expected to be accompanied by increases in the association of both components with trial-by-trial behavioral accuracy. Due to inconsistencies in previous findings, however, the opposite result pattern is also conceivable (Banis et al., 2014; Banis & Lorist, 2012). For delayed feedback, we expected that stress enhances the FRN and theta power for negative feedback relative to the control group even more strongly than for immediate feedback. Given that delayed feedback processing recruits the hippocampus (Foerde et al., 2013), which is compromised under stress, and that stress fosters striatal learning based on dopaminergic input (Schwabe & Wolf, 2012), the striatum and dACC may take over in an attempt to compensate for hippocampal dysfunction.

Whether the P300 is modulated by stress and feedback delay is currently unknown. Since the P300 has been linked to reward integration over time, a strong association of P300 amplitudes with accuracy on the single-trial level was not expected.

Finally, to investigate the role of the cortisol reactivity as one important mediator of stress effects on EEG correlates of feedback processing more directly, we performed additional linear mixed effects (LME) analyses. As we had assessed the cortisol level before, during, and after stress induction, we could use the cortisol increase to investigate linear relationships between this response measure and the FRN and P300 amplitudes and frontal theta power for immediate and delayed feedback in participants of the stress group. The LME analyses served to explore to what extent the observed stress effects were related to the effects of stress-induced cortisol increases on feedback processing.

2 METHOD

2.1 Participants

Fifty healthy male volunteers between 18 and 35 years of age (mean = 25.3 years, SD = 3.8 years) participated in this study. Prior to testing, participants were screened for the following exclusion criteria: smoking, previous or current psychiatric or neurological disorders, intake of medication, substance abuse, and a body mass index below 18 or above 29 kg/m². Additionally, they had to be naïve to the stressor (socially evaluated cold pressor test, SECPT, see below). All participants had normal or corrected-to-normal vision.

Participants were randomly assigned to the stress (n = 25) or the control condition (n = 25). Based on their cortisol reactivity (Δ cortisol), participants were classified as stress responders or nonresponders. Participants showing an increase in cortisol concentrations from baseline to peak of 1.5 nmol/l or higher were classified as responders (Miller, Plessow, Kirschbaum, & Stalder, 2013). Since cortisol has been identified as one important mediator of stress effects on reward processing (Kinner et al., 2016; Montoya, Bos, Terburg, Rosenberger, & van Honk, 2014), we excluded eight nonresponders from the stress group and four responders from the control group for inferential statistical analyses that compared the groups directly. For the investigation of the relationship between Δ cortisol and electrophysiological correlates in the stress group by means of LME analyses, we excluded one participant as an outlier for Δ cortisol (35.7 nmol/l) and three participants as outliers for either the FRN, P300, or frontal theta power values (these participants were also excluded from the group comparisons as they were nonresponders).
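The responder criterion described above can be illustrated with a minimal sketch. It assumes that the cortisol reactivity is simply the peak concentration minus the baseline concentration; the function names and the example values are hypothetical and are not taken from the study data.

```python
def cortisol_reactivity(baseline_nmol_l, peak_nmol_l):
    """Delta cortisol in nmol/l: peak minus baseline concentration."""
    return peak_nmol_l - baseline_nmol_l


def is_responder(baseline_nmol_l, peak_nmol_l, threshold=1.5):
    """True if the increase reaches the 1.5 nmol/l criterion."""
    return cortisol_reactivity(baseline_nmol_l, peak_nmol_l) >= threshold


# Hypothetical (baseline, peak) concentrations for three participants.
samples = {"p01": (8.0, 14.0), "p02": (9.5, 10.0), "p03": (7.0, 8.5)}
labels = {pid: is_responder(base, peak) for pid, (base, peak) in samples.items()}
```

In this sketch, a participant whose increase is exactly 1.5 nmol/l counts as a responder, in line with the "1.5 nmol/l or higher" wording.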

The final sample size for group comparisons was in accordance with the ad hoc power analysis (G*Power, version 3.1.9.4; Faul, Erdfelder, Lang, & Buchner, 2007) that revealed a required total sample size of 38 participants to achieve a power of 1-β = .95 to detect a 2 (between-subjects factor) by 4 (within-subject factors) interaction effect with an effect size of f = .329 (with α = .05 and an average correlation among repeated measures of r = .1). The effect size was expected based on previously reported stress effects on the FRN (Glienke et al., 2015).

The study was approved by the ethics board of the Faculty of Psychology at Ruhr University Bochum. All participants gave informed written consent before participation and were reimbursed with 12 €.

2.2 Experimental procedure

To control for the diurnal cycle of the endogenous cortisol concentrations (Kalsbeek et al., 2012), testing was conducted in the afternoon between 1 and 7 p.m. Participants were instructed to abstain from alcohol and excessive exercise the day before the testing and to refrain from anything but water 2 hr before testing.

After their arrival in the lab, participants gave written informed consent and EEG electrodes were prepared (Figure 1a). The first saliva sample (−1 min) and baseline cardiovascular measures were obtained before participants underwent the stress treatment. While the stress group was subjected to the stressful SECPT, the control group was subjected to a control situation. One minute after the stress manipulation, the second saliva sample was collected (+1 min), and post-treatment cardiovascular measures and subjective stress ratings were obtained. Twenty minutes after the stress manipulation, the third saliva sample was collected (+20 min), and participants conducted the reward learning task. After the reward learning task, the last saliva sample was taken (+56 min), and participants were debriefed and reimbursed.

FIGURE 1 (a) Time line of the experiment. After electrode preparation, participants were either subjected to the stressful SECPT or a control procedure. Twenty minutes after the treatment, participants were subjected to the reward learning task. Participants conducted one block with immediate feedback and one block with delayed feedback. Order of the blocks and timing of the onset of the immediate feedback condition were counterbalanced between the participants. Saliva samples were taken at four time points (−1, +1, +20, +56 min). The time is reported relative to the onset of the treatment (SECPT or control procedure). (b) Course of an example trial. Participants had to make a choice between two stimuli within 1,000 ms. The choice was highlighted for 200 ms, followed by a delay period that was 500 ms for the immediate feedback condition and 6,500 ms in the delayed feedback condition. After the delay, feedback was presented for 500 ms. Participants could either receive 20 cents or lose 10 cents. If no response was made within 1,000 ms after stimulus presentation, participants were reminded to respond faster.

2.3 Stress induction and assessment

During the SECPT, participants had to immerse their right hand in ice water (0‒2°C) for a maximum of 3 min while they were videotaped. Additionally, an unfamiliar female experimenter instructed and observed the participants. During the control situation, participants immersed their right hand in warm water (35‒37°C) and were neither videotaped nor observed by an experimenter.

To assess the effectiveness of the stress induction, subjective and physiological stress measures were obtained. Participants rated the stressfulness, pain, discomfort, and difficulty of keeping the hand immersed on scales increasing in steps of 10 from 0 (not at all) to 100 (very much). Systolic and diastolic blood pressure and the heart rate were obtained before, during, and after the treatment using the Dinamap system (Critikon, Tampa, FL). We obtained three measures of blood pressure and heart rate at each time point and averaged them. Salivettes (Sarstedt, Nümbrecht, Germany) were used to collect saliva at four time points (−1 min, +1 min, +20 min, and +56 min). After testing, saliva samples were stored at −18°C. To determine the cortisol concentrations, saliva was analyzed using a cortisol enzyme-linked immunosorbent assay (Demeditec, Kiel, Germany) with intra-assay coefficients of variation (CV) below 5% and interassay CVs below 15%.

2.4 Reward learning task

During the probabilistic reward learning task that was adapted from a previous study (Figure 1b; Weismüller & Bellebaum, 2016), participants learned to make beneficial choices between stimuli based on immediate and delayed monetary feedback. In each trial, participants made a choice between two Japanese characters that appeared on the left and right side of a computer screen and received 20 cents or lost 10 cents for their choice.

Participants conducted two blocks of 100 trials. In one block, participants received immediate feedback (500 ms), while in the other block feedback was delayed by 6,500 ms. The order of the blocks was counterbalanced between participants. Ten Japanese characters were used as stimuli, five of which were used in the block with immediate feedback and five in the block with delayed feedback (counterbalanced between participants). In both blocks, each of the five Japanese characters was associated with a fixed reward probability of 0%, 20%, 40%, 60%, or 80%. Each of the ten possible stimulus combinations was presented ten times within one block, with the assignment of stimulus to the side on the computer screen counterbalanced between trials.
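The block structure described above (ten stimulus pairs, ten presentations each, fixed reward probabilities) can be sketched as follows. This is only an illustration under stated assumptions: the shuffled trial order, the strict alternation of screen sides, and all names (`REWARD_PROBS`, `build_block`, `draw_feedback`) are hypothetical and not taken from the original task code.

```python
import itertools
import random

# Reward probability of each of the five stimuli in a block.
REWARD_PROBS = [0.0, 0.2, 0.4, 0.6, 0.8]


def build_block(n_repeats=10, seed=0):
    """Return 100 trials: each of the 10 stimulus pairs shown n_repeats times."""
    rng = random.Random(seed)
    trials = []
    for a, b in itertools.combinations(range(len(REWARD_PROBS)), 2):
        for rep in range(n_repeats):
            # Assumption: sides are balanced by alternating across repetitions.
            left, right = (a, b) if rep % 2 == 0 else (b, a)
            trials.append({"left": left, "right": right})
    rng.shuffle(trials)  # Assumption: random trial order within a block.
    return trials


def draw_feedback(chosen, rng):
    """Win 20 cents with the chosen stimulus's reward probability, else lose 10."""
    return 20 if rng.random() < REWARD_PROBS[chosen] else -10
```

With five stimuli there are exactly ten unordered pairs, so ten repetitions per pair yield the 100 trials per block reported above.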

Participants had to press the left or right Ctrl key on a standard computer keyboard to choose the left or right stimulus within 1,000 ms after stimulus presentation. The choice was highlighted for 200 ms. After that, a fixation cross was presented for either 500 ms (immediate feedback condition) or 6,500 ms (delayed feedback condition) before the feedback was presented for 500 ms. The length of the intertrial interval was jittered between 500 ms and 1,000 ms.

While the delayed feedback block lasted 18 min, the immediate feedback block took 8 min to complete. Since the timing of a task relative to the stressor is a crucial factor for the influence of stress on learning and PFC functioning (Arnsten, 2009; Pabst, Brand, & Wolf, 2013; Schwabe & Wolf, 2013; for a review, see Joëls, Fernandez, & Roozendaal, 2011), we minimized the differences in the timing of the task relative to the stressor between the blocks by varying the onset of the immediate feedback block between participants (Figure 1a). While the delayed-feedback block started 20 min (delayed feedback first) or 38 min (immediate feedback first) after the stressor, the onset of the immediate-feedback block was randomized between 20, 25, and 30 min after the stressor (when the immediate feedback was first) or between 38, 43, and 48 min after the stressor (when delayed feedback was first). During breaks between the blocks, participants remained seated in the EEG chamber and rested.

The response accuracy was determined separately for the immediate and delayed feedback condition. In line with previous studies (Bellebaum et al., 2016; Foerde et al., 2013; Weismüller & Bellebaum, 2016), responses were considered correct when participants chose the stimulus with the higher reward probability. To analyze differences between groups and conditions, the percentage of correct responses was determined.
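As a minimal sketch of this accuracy definition, a choice counts as correct when the chosen stimulus has the higher reward probability, regardless of the feedback actually received. The function names are hypothetical, and the sketch assumes that trials without a response have already been removed.

```python
def is_correct(chosen, other, reward_probs):
    """A choice is correct if the chosen stimulus has the higher reward probability."""
    return reward_probs[chosen] > reward_probs[other]


def percent_correct(choices, reward_probs):
    """choices: list of (chosen, other) stimulus indices for one block."""
    hits = sum(is_correct(chosen, other, reward_probs) for chosen, other in choices)
    return 100.0 * hits / len(choices)
```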

2.5 EEG recording and data processing

EEG was recorded from 30 passive Ag/AgCl electrodes, which were mounted on the head in an elastic cap (EasyCap, Herrsching, Germany). Electrodes were distributed according to the 10–20 system. Data were digitized at 500 Hz with a time constant of 10 s by a 32-channel BrainAmp Standard AC amplifier (Brain Products, Gilching, Germany). An electrode at the FPz position served as ground, and linked mastoid electrodes served as reference. Impedances were kept below 10 kΩ.

EEG data were analyzed using the FieldTrip toolbox (Oostenveld, Fries, Maris, & Schoffelen, 2011) and MATLAB R2016a (The MathWorks Inc., Natick, MA). Continuous data were segmented from 1,500 ms before to 3,000 ms after feedback presentation and filtered with a 0.5 Hz high-pass, zero-shift Butterworth IIR filter and a 48‒52 Hz band-stop filter for the elimination of line noise. Eyeblinks were removed using an independent component analysis. One component with a symmetrically frontal, positive topography was identified and removed from the data for each participant before the data were back-transformed. After the eyeblink correction, segments with residual artifacts, such as muscle artifacts and sharp edges, were removed by careful visual inspection.

For the ERP analyses, an additional 20 Hz low-pass, zero-shift Butterworth IIR filter was applied to the data. Averages were calculated for the four task conditions (immediate positive feedback, immediate negative feedback, delayed positive feedback, delayed negative feedback). The averages contained a mean of 43.5 (SEM = 1.0) immediate positive feedback trials, 50.7 (0.9) immediate negative feedback trials, 43.8 (1.2) delayed positive feedback trials, and 47.9 (1.7) delayed negative feedback trials. Afterward, the averages were baseline corrected using a −200 to 0 ms prefeedback baseline.

Time windows for the quantification of the ERP components were determined from the average ERP of all trials and all participants (across all experimental factors). The FRN was defined as mean amplitude between 215 and 315 ms relative to the feedback onset at electrode FCz. The time window was centered at the latency of the FRN peak in the average ERP. The P300 was defined as local maximum between 300 and 500 ms after feedback onset at electrode FCz.
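The two ERP measures can be sketched with NumPy as follows. This is a simplified illustration, not the FieldTrip pipeline used in the study: the function names are hypothetical, the window borders are treated as inclusive, and the P300 is taken as the largest value within the window rather than a formally defined local maximum.

```python
import numpy as np


def baseline_correct(erp, times_ms, start=-200, stop=0):
    """Subtract the mean of the prefeedback baseline (borders inclusive by assumption)."""
    mask = (times_ms >= start) & (times_ms <= stop)
    return erp - erp[mask].mean()


def mean_amplitude(erp, times_ms, start, stop):
    """Mean amplitude in a window, e.g. 215 to 315 ms for the FRN at FCz."""
    mask = (times_ms >= start) & (times_ms <= stop)
    return float(erp[mask].mean())


def peak_amplitude(erp, times_ms, start, stop):
    """Largest value and its latency in a window, e.g. 300 to 500 ms for the P300."""
    mask = (times_ms >= start) & (times_ms <= stop)
    idx = int(np.argmax(erp[mask]))
    return float(erp[mask][idx]), float(times_ms[mask][idx])
```

At a sampling rate of 500 Hz, `times_ms` would advance in steps of 2 ms, so a call such as `mean_amplitude(erp, times_ms, 215, 315)` averages the samples from 216 to 314 ms.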

To obtain the time-frequency spectra, data were convolved with a series of 59 linearly spaced complex Morlet wavelets ranging from 1 to 30 Hz. The wavelets each had a width of 5 cycles, resulting in a σ of 132.6 ms at 6 Hz, which was at the center of the frequency band of interest (4‒8 Hz). Power spectra were averaged over segments of the immediate positive feedback, immediate negative feedback, delayed positive feedback, and delayed negative feedback conditions, respectively. Afterward, the relative signal change was calculated with respect to the −400 to −100 ms prefeedback baseline. To compare the theta power between groups and conditions, we averaged the power between 200 and 600 ms after feedback onset and between 4 and 8 Hz.
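The wavelet parameters given above can be checked with a short NumPy sketch. It assumes the common definition of the temporal standard deviation of a Morlet wavelet, σ = n / (2πf) for n cycles at frequency f, which reproduces the reported 132.6 ms at 6 Hz. The wavelet normalization, the truncation at three standard deviations, and all function names are illustrative assumptions rather than the settings of the original analysis.

```python
import numpy as np

# 59 linearly spaced frequencies from 1 to 30 Hz, i.e. steps of 0.5 Hz.
freqs = np.linspace(1.0, 30.0, 59)


def temporal_sd(freq, n_cycles=5):
    """Temporal standard deviation (in s) of a Morlet wavelet with n_cycles."""
    return n_cycles / (2 * np.pi * freq)


def morlet_wavelet(freq, sfreq=500.0, n_cycles=5, n_sd=3):
    """Complex Morlet wavelet, truncated at +/- n_sd standard deviations."""
    sd = temporal_sd(freq, n_cycles)
    t = np.arange(-n_sd * sd, n_sd * sd + 1 / sfreq, 1 / sfreq)
    wavelet = np.exp(2j * np.pi * freq * t) * np.exp(-t ** 2 / (2 * sd ** 2))
    return wavelet / np.abs(wavelet).sum()  # Assumed normalization.


def wavelet_power(signal, freq, sfreq=500.0):
    """Power time course at one frequency (signal must be longer than the wavelet)."""
    analytic = np.convolve(signal, morlet_wavelet(freq, sfreq), mode="same")
    return np.abs(analytic) ** 2


def relative_change(power, times, base=(-0.4, -0.1)):
    """Relative signal change with respect to the prefeedback baseline."""
    mask = (times >= base[0]) & (times <= base[1])
    baseline = power[mask].mean()
    return (power - baseline) / baseline
```

For example, `temporal_sd(6.0)` evaluates to roughly 0.1326 s, matching the σ reported for the center of the theta band.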

To assess the relationship between the ERP and the time-frequency power on the one hand and trial-by-trial behavioral adaptation on the other hand, we performed cross-trial regression analyses. The aim of the analysis was to investigate whether the FRN and the frontal theta power of the current trial were related to the accuracy of the subsequent trial in which the chosen stimulus was presented again. We focused on the ERPs and time-frequency power after negative feedback in this analysis, since a link between FRN and theta power and the adaptation of behavior has been shown for negative feedback trials (Cavanagh et al., 2010; Cohen & Ranganath, 2007; van de Vijver et al., 2011; Van Der Helden et al., 2010). Similar to previous studies (Cohen, 2016), the EEG data of each trial and at each time or time-frequency point at electrode FCz were projected onto a design matrix that comprised one column for the intercept and one column containing the accuracy of the subsequent trial in which the chosen stimulus was presented again. The time series or time-frequency power (y) and the design matrix (X) were entered into the least squares equation $\beta = (X^{T}X)^{-1}X^{T}y$, where X is the design matrix, y is the data matrix, T denotes the transpose, and −1 denotes the matrix inverse. The equation was solved using the mldivide function in MATLAB, which returns the least squares solution of linear systems of the form Ax = B. As a result of this analysis, we obtained a time series or time-frequency map of β coefficients per condition for each participant. The β coefficients, which describe the relationship between the data (time series and time-frequency power) and the design matrix (accuracy), were z transformed afterward. Subsequently, time-frequency z values were averaged over the theta band (4‒8 Hz) and between 200 and 600 ms after feedback onset.
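The cross-trial regression can be sketched in NumPy. The example below uses `np.linalg.lstsq` in place of MATLAB's mldivide, so it illustrates the same least squares idea rather than reproducing the original code. The function name and array layout are assumptions, and the subsequent z transformation is omitted.

```python
import numpy as np


def cross_trial_betas(eeg, next_accuracy):
    """Regress single-trial EEG on the accuracy of the subsequent trial.

    eeg: array of shape (n_trials, n_points), one row per negative feedback trial.
    next_accuracy: array of shape (n_trials,), coded 0 (incorrect) or 1 (correct).
    Returns the accuracy slope for every time or time-frequency point.
    """
    next_accuracy = np.asarray(next_accuracy, dtype=float)
    # Design matrix: one intercept column and one accuracy column.
    X = np.column_stack([np.ones(len(next_accuracy)), next_accuracy])
    betas, *_ = np.linalg.lstsq(X, eeg, rcond=None)
    return betas[1]
```

Because the design matrix has only two columns, each returned slope equals the difference between the mean signal on trials followed by a correct choice and the mean signal on trials followed by an incorrect choice.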

2.6 Statistical analyses

Subjective stress ratings and cardiovascular measures were analyzed using multivariate analyses of variance (MANOVA) with the between-subjects factor group (stress, control). The analysis of the cardiovascular measures additionally included the within-subject factor time (pretreatment, during, post-treatment).

Differences in salivary cortisol concentrations were tested using repeated measures ANOVAs with the within-subject factor time (−1 min, +1 min, +20 min, +56 min) and the between-subjects factor group (stress, control).

Accuracy and the amplitudes of the FRN and P300 were analyzed with repeated measures ANOVAs with the between-subjects factor group (stress, control) and the within-subject factor feedback delay (immediate feedback, delayed feedback). The analysis of the FRN and P300 additionally included the within-subject factor feedback valence (positive feedback, negative feedback).

Significant interactions were resolved using post hoc t tests and repeated measures ANOVAs. Post hoc tests were corrected for multiple comparisons using the false discovery rate (FDR) correction (Benjamini & Hochberg, 1995). In all cases of violations to the sphericity assumption, the Greenhouse-Geisser correction was applied and ε values are reported. If unequal variances were detected, degrees of freedom of the t tests were corrected accordingly. The α level of .05 was applied to all parametric tests. Partial eta-squared values are reported as estimates of effect sizes of the MANOVAs and ANOVAs. Effect sizes of pairwise comparisons are reported as Cohen's d.
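The Benjamini-Hochberg step-up procedure behind the FDR correction can be sketched in plain Python. The function name is hypothetical and the sketch is not tied to any particular statistics package.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where the null hypothesis is rejected."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k whose sorted p value satisfies p_(k) <= (k / m) * alpha.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            k_max = rank
    # Reject all hypotheses with rank up to k_max.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= k_max:
            rejected[idx] = True
    return rejected
```

Note that a p value above its own rank threshold is still rejected if a larger p value further down the sorted list meets its threshold, which is what distinguishes the step-up procedure from a simple per-test cutoff.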

Frontal theta power was analyzed with the factors group, feedback delay, and valence. Since theta power increases are often observed over medial and lateral frontal electrodes for negative feedback (Cohen et al., 2011; van de Vijver et al., 2011), theta power was analyzed for all electrodes to determine the topographical specificity of the observed effects. The statistical analysis of the time-frequency data relied on nonparametric cluster-based permutation statistics to correct for the accumulation of alpha errors in multiple comparisons (Maris & Oostenveld, 2007). First, coherent spatial clusters of electrodes exceeding the statistical threshold of p < .05 were detected, and summed t values of each cluster were returned as test statistic. Subsequently, at each of 1,000 iterations during the permutation test, the group affiliations (stress, control) of a random subset of participants were swapped and the first step was repeated to create a null distribution. The test statistic was then compared to the null distribution. For each cluster reaching the cluster-based threshold, summed t values (tsum) and cluster p values are reported. For the statistical analysis of the regression analysis between the time-frequency spectrum and accuracy, a null distribution was created as described before, which was used to z transform the beta coefficients. P values were determined from z values averaged over the theta band (4‒8 Hz) and over the time window of 200‒600 ms after feedback presentation.
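The permutation logic can be illustrated with a deliberately simplified sketch: a label-permutation test on a single summary value per participant (for example, theta power at one electrode). It omits the cluster-forming step across neighboring electrodes that the cluster-based procedure of Maris and Oostenveld (2007) adds, and it reshuffles all group labels instead of swapping a random subset. The function names, the Welch t statistic, and the two-sided p value are assumptions.

```python
import numpy as np


def welch_t(a, b):
    """t statistic for two independent groups without assuming equal variances."""
    se = np.sqrt(a.var(ddof=1) / len(a) + b.var(ddof=1) / len(b))
    return (a.mean() - b.mean()) / se


def permutation_p(stress, control, n_perm=1000, seed=0):
    """Two-sided permutation p value for the group difference."""
    rng = np.random.default_rng(seed)
    observed = welch_t(stress, control)
    pooled = np.concatenate([stress, control])
    n_stress = len(stress)
    null = np.empty(n_perm)
    for i in range(n_perm):
        # Reassign group labels at random and recompute the statistic.
        shuffled = rng.permutation(pooled)
        null[i] = welch_t(shuffled[:n_stress], shuffled[n_stress:])
    # Add one to numerator and denominator so that p is never exactly zero.
    p_value = (np.sum(np.abs(null) >= abs(observed)) + 1) / (n_perm + 1)
    return observed, p_value
```

In the full cluster-based version, the statistic computed at each permutation would be the largest summed t value over spatially coherent electrodes rather than a single t value.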

Finally, we directly examined the effect of cortisol reactivity (Δ cortisol), which was defined as the increase in cortisol concentrations from baseline (−1 min) to peak (+20 min), and its interactions with the effects of feedback delay and valence on neural feedback processing in participants assigned to the stress group by means of LME analyses. LME analyses were performed using the lme4 statistical package (version 1.1-18) in the R environment (version 3.5.1). The LME analyses were conducted for 21 participants, including five nonresponders with very low cortisol increases or even cortisol decreases. Separately for the FRN, the P300, and the frontal midline theta power, we specified a model that included the categorical factors feedback delay (recoded as +1 = immediate, −1 = delayed feedback) and feedback valence (recoded as +1 = positive feedback, −1 = negative feedback), and the continuous factor Δ cortisol (mean-centered) as fixed effects predictors. We also modeled all the interactions between these factors. Participants were entered into the model as a random effects factor. Following the approach suggested by Luke (2017), we used the restricted maximum likelihood approach to estimate the model and the R package lmerTest (version 3.0-1; Kuznetsova, Brockhoff, & Christensen, 2017) to evaluate significance in the model by using Satterthwaite approximation for the degrees of freedom. Significant interactions were examined by applying follow-up simple slope analyses using the R package jtools (version 0.7.3).
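The predictor coding described above can be illustrated with a short sketch that builds the fixed-effects columns for such a model. It assumes one mean amplitude per participant and Feedback Delay × Valence cell; the variable names and cortisol values are hypothetical. In lme4 syntax, a model of this kind might be written as `amplitude ~ delay * valence * dcort + (1 | participant)`, again with assumed variable names.

```python
def build_design_rows(delta_cortisol):
    """Build one row of fixed-effects predictors per participant and cell.

    delta_cortisol maps a participant id to the cortisol increase
    from baseline to peak.
    """
    grand_mean = sum(delta_cortisol.values()) / len(delta_cortisol)
    rows = []
    for pid, dc in delta_cortisol.items():
        dc_centered = dc - grand_mean  # mean-centered continuous predictor
        for delay in (+1, -1):        # +1 = immediate, -1 = delayed feedback
            for valence in (+1, -1):  # +1 = positive, -1 = negative feedback
                rows.append({
                    "participant": pid,
                    "delay": delay,
                    "valence": valence,
                    "dcort": dc_centered,
                    # Interaction terms are products of the coded predictors.
                    "delay:valence": delay * valence,
                    "delay:dcort": delay * dc_centered,
                    "valence:dcort": valence * dc_centered,
                    "delay:valence:dcort": delay * valence * dc_centered,
                })
    return rows


# Hypothetical cortisol increases for two participants.
rows = build_design_rows({"s1": 2.0, "s2": 6.0})
```

With ±1 coding and a centered covariate, each main effect is estimated at the average of the other predictors, which keeps lower-order terms interpretable when interactions are present.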

3 RESULTS

3.1 Subjective stress response

The SECPT successfully elicited a subjective stress response (see Table 1; F(4, 33) = 24.98, p < .001, Wilks' Λ = 0.248, ηp² = .75). Participants rated the SECPT significantly higher concerning the discomfort, t(16.99) = 8.08, p < .001, d = 2.83, pain, t(16.69) = 7.89, p < .001, d = 2.77, stress, t(17.07) = 5.56, p < .001, d = 1.95, and difficulty to keep the hand immersed, t(16.63) = 6.20, p < .001, d = 2.18, compared to the control situation.

Table 1. Subjective stress measures and cardiovascular measures of the stress and the control group
Measure                               Control          Stress
Subjective stress response
  Discomfort                          1.90 (1.12)      54.12 (6.36)
  Pain                                0.95 (0.95)      52.94 (6.52)
  Stress                              1.90 (1.12)      36.47 (6.12)
  Difficulty to keep hand immersed    0.95 (0.95)      43.53 (6.80)
Systolic blood pressure (mmHg)
  Pretreatment                        124.35 (3.24)    122.73 (3.10)
  During treatment                    130.87 (3.49)    149.51 (3.41)
  Post-treatment                      121.03 (2.52)    123.20 (3.24)
Diastolic blood pressure (mmHg)
  Pretreatment                        65.24 (1.68)     62.61 (1.71)
  During treatment                    73.17 (1.86)     89.55 (2.03)
  Post-treatment                      63.75 (1.54)     62.55 (1.61)
Heart rate (BPM)
  Pretreatment                        65.54 (2.33)     69.63 (1.60)
  During treatment                    67.16 (2.28)     74.06 (2.20)
  Post-treatment                      66.10 (2.25)     67.12 (1.78)

Note

  • Differences in subjective ratings and cardiovascular responses were tested with FDR-corrected post hoc t tests. Values represent the means (± SEM).
  • ** p < .001;
  • * p < .05.

3.2 Cardiovascular measures

An increased activation of the sympathetic nervous system during the SECPT in the stress group was revealed by a significant Time × Group interaction of the MANOVA (Table 1; F(6, 31) = 16.91, p < .001, Wilks’ Λ = 0.234, ηp² = .77). Follow-up repeated measures ANOVAs indicated Time × Group interactions for the systolic blood pressure, F(2, 72) = 32.89, p < .001, ηp² = .48, the diastolic blood pressure, F(2, 72) = 56.23, p < .001, ηp² = .61, ε = .789, and the heart rate, F(2, 72) = 7.46, p = .001, ηp² = .17. The systolic, t(36) = 3.77, p = .001, d = 1.20, and diastolic blood pressure, t(36) = 5.95, p < .001, d = 1.89, were elevated in the stress group during the SECPT, while there was a trend toward an increased heart rate during the SECPT, t(36) = 2.15, p = .039, d = 0.68. The groups did not differ in cardiovascular measures pre- or post-treatment (all ts < 1.38, all ps > .18).
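As a plausibility check, the partial eta-squared values reported alongside the F statistics can be recovered from each F value and its degrees of freedom, using the standard identity that partial eta squared equals (F × df effect) / (F × df effect + df error). The function name below is an assumption for illustration.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Recover partial eta squared from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Time x Group interactions for the three cardiovascular measures.
systolic = round(partial_eta_squared(32.89, 2, 72), 2)
diastolic = round(partial_eta_squared(56.23, 2, 72), 2)
heart_rate = round(partial_eta_squared(7.46, 2, 72), 2)
```

The results match the reported effect sizes of .48, .61, and .17. The identity still holds when a Greenhouse-Geisser correction is applied, because ε scales both degrees of freedom by the same factor.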

3.3 Salivary cortisol concentrations

An elevation in salivary cortisol concentrations in the stress group indicated a successful activation of the HPA axis by the stressor (Figure 2; Time × Group interaction: F(3, 108) = 23.87, p < .001, ηp² = .40, ε = .650). Cortisol concentrations were increased 20 min, t(36) = 3.60, p = .001, d = 1.14, and 56 min after the stress induction, t(36) = 2.66, p = .012, d = 0.84, but did not differ between groups before (−1 min) and 1 min after the stress treatment (both ts < 1.11, both ps > .275).

Salivary cortisol concentrations are depicted relative to the time of the onset of the treatment (SECPT or control treatment). While there were no differences in salivary cortisol between the groups at baseline (−1) and 1 min after the treatment (+1), cortisol was elevated in the stress group 20 min after the SECPT (+20) and after the reward learning task (+56). Error bars represent the SEM. *p < .05

3.4 Behavior

To assess the influence of stress on the performance in the reward learning task, we determined the accuracy (Figure 3) during the learning from immediate and delayed feedback.

Accuracy in percent correct responses, averaged over the immediate and delayed feedback condition and for the control and the stress group. Accuracy in the stress group was relatively reduced in the immediate feedback condition, while it was larger than in the control group in the delayed feedback condition. The asterisk represents the significant Group × Feedback Delay interaction (p = .024). Error bars represent the SEM

The analysis revealed a Feedback Delay × Group interaction, F(1, 36) = 5.21, p = .024, ηp² = .13. Exploring the interaction with FDR-corrected within-group comparisons, we found more correct responses in the delayed feedback condition compared to the immediate feedback condition in the stress group, t(16) = 2.77, p = .014, d = 0.97, while no difference in accuracy between conditions was detected for the control group, t(20) = 0.51, p = .614, d = 0.13.

3.5 Stress and feedback-delay modulations of neural feedback processing

To test whether stress and feedback delay affected neural feedback processing, we analyzed the FRN, the P300, and frontal midline theta power. Additionally, cross-trial regression analyses were conducted to assess the relationship of the ERPs and frontal theta power with the subsequent behavioral accuracy on the single-trial level.

3.5.1 Feedback-related negativity

The analysis of the FRN revealed less positive (i.e., larger) amplitudes for negative compared to positive feedback (Figure 4; main effect valence: F(1, 36) = 14.06, p = .001, ηp² = .28). Furthermore, a Group × Valence interaction emerged, F(1, 36) = 5.97, p = .020, ηp² = .14. Follow-up FDR-corrected t tests demonstrated that, while FRN amplitudes were less positive for negative feedback compared to positive feedback in the control group, t(20) = −4.90, p < .001, d = 0.43, the FRN amplitudes for negative and positive feedback did not differ for the stress group, t(16) = −0.83, p = .421, d = 0.10.

Results from the ERP analysis. (a) Grand averages are presented for the immediate and delayed feedback conditions and for the control and the stress group. Shaded areas represent time intervals used for the mean amplitude of the FRN and peak detection of the P300. (b) Average amplitudes of the FRN (upper) and the P300 (lower) are depicted. The FRN was reduced in the stress group compared to the control group and in the delayed feedback compared to immediate feedback condition. In the delayed feedback condition, the P300 was larger for positive compared to negative feedback in the control group. The reduction of the FRN in the stress group tended to be larger in the delayed feedback condition. In the stress group, the P300 for positive feedback was reduced and did not differ between positive and negative feedback. There was no effect of group or valence on the P300 in the immediate feedback condition. Error bars represent the SEM

A Valence × Feedback Delay interaction indicated an overall decrease in the difference between negative and positive feedback for delayed feedback relative to immediate feedback, F(1, 36) = 17.43, p < .001, ηp² = .33. Accordingly, FRN amplitudes were significantly less positive for negative compared to positive immediate feedback, t(36) = −6.06, p < .001, d = 0.49, whereas for delayed feedback FRN amplitudes did not differ between negative and positive feedback, t(36) = −0.17, p = .865, d = 0.02. Furthermore, a Group × Valence × Feedback Delay interaction fell short of significance, F(1, 36) = 2.95, p = .095, ηp² = .08.

3.5.2 P300

The P300 was larger after positive compared to negative feedback (Figure 4; main effect valence: F(1, 36) = 6.51, p = .015, ηp² = .15). Furthermore, a Group × Feedback Delay × Valence interaction was found, F(1, 36) = 9.43, p = .004, ηp² = .21. Follow-up ANOVAs revealed a Group × Valence interaction for the P300 after delayed feedback, F(1, 36) = 9.68, p = .004, ηp² = .21, while this interaction was not observed in the immediate feedback condition, F(1, 36) = 0.69, p = .410, ηp² = .02. Pairwise comparisons demonstrated that the interaction for the delayed feedback was due to a larger P300 after positive feedback than negative feedback in the control group, t(20) = 3.22, p = .004, d = 0.53, while the stress group did not show this difference, t(16) = 1.12, p = .279, d = 0.16. Furthermore, the P300 amplitude after positive feedback was larger in the control group compared to the stress group, t(36) = 2.61, p = .013, d = 0.83.

3.5.3 Frontal midline theta power

Cluster-based permutation tests were applied to investigate the effects of stress and feedback delay on the frontal midline theta power (4‒8 Hz, 200‒600 ms postfeedback). A cluster of electrodes was detected that demonstrated a Group × Feedback Delay × Valence interaction effect, tsum(36) = 45.99, p = .004. This cluster also included the midfrontal electrode FCz, t(36) = 2.38, p = .004. Further permutation tests revealed that both groups showed stronger theta power for negative compared to positive feedback after immediate feedback (control: tsum(20) = 50.12, p = .006; FCz: t(20) = 3.99, p = .006; stress: tsum(16) = 16.38, p = .048; FCz: t(16) = 2.60, p = .048). This theta power difference between negative and positive feedback was larger in the control group compared to the stress group for immediate feedback (Figure 5a,b; Group × Valence interaction: tsum(36) = 27.32, p = .014). The theta power increase in the control group was detected at a cluster of electrodes with a lateral frontal distribution that did not include FCz, t(36) = 1.68, p > .9. In the delayed feedback condition (Figure 5c,d), no difference between the groups was detected for the contrast between negative and positive feedback (Group × Feedback Valence interaction: at all electrodes t ≤ 1.42, and all ps > .9).

Time-frequency plots and topographical maps of the difference between negative and positive feedback (negative–positive) are shown for the control group, the stress group, and the difference between groups (control–stress). Time-frequency maps show the relative power changes averaged over all electrodes included in a significant cluster. If no significant effect was observed, time-frequency power of electrode FCz is depicted. Topographical maps show the relative power change averaged over the theta band (4‒8 Hz) and over the 200‒600 ms postfeedback interval. Significant electrode clusters are highlighted with filled circles. Bar graphs show relative theta power changes (4‒8 Hz, 200‒600 ms postfeedback) at electrode FCz separately for positive and negative feedback. (a, b) Results of the immediate feedback condition. (c, d) Results of the delayed feedback condition. Results show a decrease in frontal theta power (4‒8 Hz and 200‒600 ms postfeedback) after stress in immediate feedback. In delayed feedback, frontal theta power is diminished and does not differ between groups

3.5.4 Cross-trial regression analyses of midfrontal EEG components and accuracy

To investigate whether the observed differences in midfrontal EEG components (FRN, theta power) were related to the behavioral accuracy on the single-trial level, we performed regression analyses between each time point or time-frequency point at midfrontal electrode FCz after negative feedback, and the accuracy in the subsequent trial in which the chosen stimulus of the current trial can be chosen again.
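The idea of a cross-trial regression can be sketched as a separate regression at every time point, with the single-trial amplitude as predictor and accuracy in the relevant subsequent trial as outcome. The sketch below uses ordinary least squares and a simple z transformation against a permutation null distribution. Both choices, along with all names and data, are assumptions for illustration and may differ from the estimator actually used in the study.

```python
from statistics import mean, stdev


def ols_slope(x, y):
    """Least-squares slope of y regressed on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx


def beta_time_series(epochs, next_correct):
    """Regress subsequent accuracy on amplitude at each time point.

    epochs[trial][t] is the amplitude at time point t;
    next_correct[trial] is 1 if the subsequent choice was correct, else 0.
    """
    n_times = len(epochs[0])
    return [ols_slope([ep[t] for ep in epochs], next_correct)
            for t in range(n_times)]


def z_from_null(beta, null_betas):
    """Express an observed beta relative to a permutation null distribution."""
    return (beta - mean(null_betas)) / stdev(null_betas)


# Hypothetical data: three trials, two time points.
epochs = [[1.0, 3.0], [2.0, 2.0], [3.0, 1.0]]
next_correct = [0, 0, 1]
betas = beta_time_series(epochs, next_correct)
z_first = z_from_null(betas[0], [-1.0, 0.0, 1.0])
```

Applying the same function to each time-frequency point instead of each time point yields the corresponding map of beta coefficients for theta power.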

We found a significantly stronger relationship between the ERP and subsequent accuracy within the time range of the FRN (286‒308 ms) in the control group compared to the stress group for immediate feedback (Figure 6a, z = −2.16, p = .031). A stronger relationship indicated that larger FRN amplitudes were related to better subsequent performance. The groups did not differ in the delayed feedback condition (Figure 6b, 215‒315 ms, z = −0.16, p = .87). The group difference for immediate feedback tended to be larger compared to delayed feedback (Group × Feedback Delay interaction at time interval 234‒248 ms: z = −1.79, p = .073).

Time series of beta coefficients of cross-trial regression analysis between the ERPs time-locked to the presentation of negative feedback and the accuracy in the subsequent trial in which the chosen stimulus was presented again. (a) Cross-trial regression beta coefficients for the immediate feedback condition. (b) Beta coefficients for the delayed feedback condition. Significant differences between control and stress group are highlighted by shaded areas

The frontal theta power (4‒8 Hz, 200‒600 ms postfeedback) had a positive relation with subsequent accuracy (Figure 7) depending on feedback delay and group (Group × Feedback Delay interaction: z = 2.47, p = .014). Theta power and subsequent accuracy had a positive relationship in controls for immediate feedback that was reduced in stressed participants (z = 2.27, p = .023). For delayed feedback, the relationship between theta and behavior did not differ between groups (z = 0.83, p = .408).

Time-frequency maps depict the beta coefficients that reveal the strength of the relationship between each time-frequency point and the accuracy in the subsequent trial in which the chosen stimulus was presented again. (a) Regression results for the immediate feedback condition. (b) Results for the delayed feedback condition

3.6 Associations between cortisol, feedback delay, and neural feedback processing

Figure 8 illustrates the relationship between the FRN (Figure 8a), P300 (Figure 8b), and frontal midline theta power (Figure 8c), and Δ cortisol depending on feedback delay and valence, which we examined with LME analyses. Online supporting information, Tables S1A, S1B, and S1C, provides a summary of the estimated mixed-effect models, with parameter-specific t tests for all effects for the FRN, the P300, and theta power, respectively. The description of the results below will be restricted to effects involving the factor Δ cortisol.

Correlation between the amplitude of (a) FRN, (b) P300, and (c) theta power and Δ cortisol depending on valence (negative in gray, positive in black) in the immediate feedback (left) and delayed feedback (right) condition. FRN for immediate feedback was larger with increasing cortisol responses, but the difference between negative and positive feedback tended to decrease with increasing cortisol responses. This association was not observed for delayed feedback. P300 amplitudes were overall negatively correlated with cortisol responses for immediate but not for delayed feedback. Theta power overall decreased with increasing cortisol responses for immediate and delayed feedback

3.6.1 Feedback-related negativity

The analysis for the FRN revealed a significant Feedback Delay × Δ Cortisol interaction, F(1, 57) = 14.573, p < .001. Follow-up simple slope analysis of this interaction revealed that the amplitude of the FRN was significantly modulated by Δ cortisol for immediate feedback (p = .02) but not for delayed feedback (p = .92). In the immediate feedback condition, FRN amplitudes became less positive and thus larger for larger values of Δ cortisol. The Δ cortisol main effect and all remaining interactions including Δ cortisol as a factor were not significant (all ps > .07).
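With feedback delay coded as +1 (immediate) and −1 (delayed), the simple slope of Δ cortisol in each condition follows directly from the fixed-effects coefficients: it is the Δ cortisol coefficient plus or minus the interaction coefficient. The sketch below shows only the point estimates. The coefficient values are hypothetical, and the standard errors and p values of the slopes would come from the fitted model.

```python
def simple_slopes(b_dcort, b_delay_by_dcort):
    """Slope of the cortisol predictor within each feedback-delay condition."""
    return {
        "immediate": b_dcort + (+1) * b_delay_by_dcort,  # delay coded +1
        "delayed": b_dcort + (-1) * b_delay_by_dcort,    # delay coded -1
    }


# Hypothetical coefficients, not the estimates from the study.
slopes = simple_slopes(b_dcort=-0.5, b_delay_by_dcort=-0.5)
```

In this made-up case, the cortisol slope is −1.0 for immediate feedback and 0.0 for delayed feedback, which is the qualitative pattern an interaction of this kind produces.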

3.6.2 P300

The P300 analysis revealed a significant main effect of Δ cortisol, F(1, 19) = 6.197, p = .022, which was further qualified by a significant Feedback Delay × Δ Cortisol interaction, F(1, 57) = 7.550, p = .008. Follow-up simple slope analysis of this interaction revealed that the amplitude of the P300 was significantly modulated by Δ cortisol only for immediate feedback (p < .001) but not for delayed feedback (p = .18). Resembling the pattern of the FRN, amplitudes were reduced (i.e., less positive) for increasing values of Δ cortisol. The Valence × Δ Cortisol interaction and the Feedback Delay × Valence × Δ Cortisol three-way interaction were not significant (both ps > .80).

3.6.3 Frontal midline theta power

The analysis of theta power revealed a significant main effect of Δ cortisol, with reduced theta power for larger cortisol increases, F(1, 19) = 15.225, p < .001. We did not find any significant interaction effect (all ps > .25).

4 DISCUSSION

The current study investigated the effects of stress on learning from immediate and delayed feedback and the underlying neural mechanisms of feedback processing. Participants that underwent the stress induction reported increased subjective stress and showed increased cardiovascular and cortisol responses relative to controls. In the stress group, the performance was increased for delayed feedback relative to immediate feedback, but this was not the case in the control group. The neural correlates of feedback processing were also influenced by stress. Stress overall decreased the difference between FRN amplitudes for negative and positive feedback. The P300 was decreased in the stress group relative to the control group for delayed feedback, while it did not differ between groups for immediate feedback. Stress reduced the P300 specifically for positive delayed feedback. As a consequence, the valence sensitivity of the P300 was diminished in the stress group. Frontal theta power was reduced by stress for immediate feedback but not for delayed feedback. Beyond stress-induced modulations of ERP amplitudes and theta power, we observed that stress changes the association of the FRN and frontal theta with future behavior. Cross-trial regression analyses revealed that stress decreased the associations of the FRN and frontal theta power with subsequent performance for immediate feedback trials. Learning from delayed feedback was unrelated to the FRN and frontal theta power in both groups. LME analyses showed that stress-induced cortisol increases were associated with increases in FRN amplitudes for immediate feedback, and the difference between negative and positive feedback tended to decrease with increases in cortisol. For delayed feedback, cortisol increases were not related to FRN amplitudes. Cortisol was related to decreases in P300 amplitudes for immediate, but not delayed, feedback. Theta power overall decreased with increasing cortisol responses.

The FRN and frontal theta oscillations both have been related to the processing of feedback and the subsequent behavioral adaptation (Cavanagh et al., 2010; Cohen & Ranganath, 2007). While imaging studies demonstrated stress-induced reductions in the activity of brain regions responsible for feedback processing (Kruse et al., 2018; Ossewaarde et al., 2011), investigations of stress effects on EEG correlates of feedback processing have yielded inconsistent findings.

With respect to the FRN, some studies reported an increasing effect of stress on the amplitude difference between negative and positive feedback (Glienke et al., 2015; Wirz et al., 2017) and on the functionally related error-related negativity (Dierolf et al., 2018). Other studies, however, demonstrated that FRN amplitude differences between negative and positive feedback are reduced by stress (Banis et al., 2014; Banis & Lorist, 2012).

The present FRN results are in line with the latter findings as the difference between negative and positive feedback was overall reduced by stress in the current study, and stress also decreased the association between the FRN and subsequent behavior for immediate negative feedback, suggesting that, under stress, feedback could not be used for behavioral adaptation.

LME analyses further revealed that stress-related cortisol reactivity was associated with larger FRN amplitudes for immediate feedback. Moreover, the difference between negative and positive feedback decreased with increasing cortisol responses, indicating that the FRN becomes less sensitive to feedback valence in participants characterized by a strong stress response. The stress effect on the FRN described above thus seems to be mainly driven by cortisol effects on feedback processing. This finding appears to contradict not only our previous finding (Glienke et al., 2015) but also our hypothesis that stress would enhance the FRN amplitude difference between negative and positive feedback, especially for delayed feedback. Our reasoning was that stress should promote incremental learning based on prediction error coding by dopamine neurons in the midbrain and their projections to the striatum and ACC, which should be reflected by the FRN amplitude difference between negative and positive feedback. It must be pointed out, however, that the learning paradigm and the analysis strategy in our previous study (Glienke et al., 2015) differed from the present study. There, we focused on the later period of the experiment and found an increased FRN amplitude difference between negative and positive feedback processing under stress only for a condition in which feedback was not contingent on the previous response so that learning was not possible.

According to a more recent view, the FRN reflects also a salience prediction error, possibly in addition to a reward prediction error, which would be in line with the theory that the ACC is primarily an action-outcome predictor and not specifically related to the processing of feedback valence (Alexander & Brown, 2011). Indeed, some studies found that the FRN is sensitive to both positive and negative unexpected outcomes (Ferdinand et al., 2012; Sambrook & Goslin, 2016; Talmi, Atkinson, & El-Deredy, 2013). In light of these findings, stress-induced increases in cortisol reactivity might have caused an increased saliency of feedback stimuli, irrespective of the feedback’s valence, which generally increased the FRN and caused a decreased sensitivity to feedback valence.

For feedback-locked theta modulations, previous results concerning effects of stress are also inconsistent. In a recent study, we found increased frontal theta power for negative feedback following stress (Paul et al., 2018). The current stress-related decrease of the frontal theta power during learning from immediate feedback contradicts this but is in line with a previous EEG study that found a stress-induced decline in frontal theta (Banis et al., 2014). The present result is also in line with previous imaging studies showing that stress reduces the BOLD signal in prefrontal brain regions during feedback processing (Kruse et al., 2018; Ossewaarde et al., 2011), as mediofrontal theta has been linked to prefrontal processes of cognitive control (Cavanagh & Frank, 2014; Cohen, 2014). Beyond the power changes, we found that stress attenuated the association of frontal theta power with subsequent behavioral accuracy. The stress effects on theta power were likely caused by cortisol, as the LME analysis revealed that cortisol increases in the current study were associated with overall decreases in frontal theta power. This is in line with previous findings showing that the administration of cortisol is associated with reduced activation of the dACC (Kinner et al., 2016). Accordingly, the current finding might reflect a reduced control of the medial PFC/dACC over behavior with increasing cortisol reactivity.

Inconsistencies of the theta findings with the results of previous studies may again, at least partially, be related to differences in the tasks that were used. In our previous study, for instance, we applied a category learning task and found an increasing effect of stress on frontal theta power only in a difficult task condition but not in an easy task condition (Paul et al., 2018). This result suggests that task difficulty and thus the amount of cognitive control needed for the task at hand might be critical for the stress effect on frontal theta power (see Cavanagh & Frank, 2014; Cohen, 2014). Increased theta power was interpreted in terms of compensatory cognitive processes to maintain performance under stress in the face of high task demands. In the current study, task difficulty can probably not account for the differences in theta power between immediate and delayed feedback. While learning from delayed feedback was in some studies found to be more difficult than learning from immediate feedback (Maddox, Ashby, & Bohil, 2003; Maddox & Ing, 2005), it was associated with overall decreased rather than increased theta power in the present study. Moreover, theta power was unrelated to subsequent behavioral accuracy for delayed feedback, suggesting that the role of theta oscillations for performance was reduced overall. Instead, theta time-locked to negative feedback may reflect a cognitive control process that indicates the need for behavioral adaptation especially for immediately preceding events, which was also suggested by a very recent related study (Weismüller, Kullmann, Hoenen, & Bellebaum, 2019). This process seems to be affected by stress—and cortisol, in particular. The reduced association between theta power and accuracy for delayed feedback on the single-trial level, however, may have been due to overall reduced theta power. 
Together, these results indicate that stress reduces medial frontal neural oscillations especially for immediate feedback that is associated with behavioral adaptation.

Similar to previous reports (Bellebaum & Daum, 2008; Bellebaum et al., 2010; Hajcak, Moser, Holroyd, & Simons, 2007), the P300 in the current study was larger for positive compared to negative outcomes. Although other studies found the P300 to be sensitive to outcome magnitude but not valence (Sato et al., 2005; Yeung & Sanfey, 2004) and yet another study reported that the P300 is sensitive to both valence and magnitude of an outcome (Wu & Zhou, 2009), there is consensus on the role of the P300 in categorizing and integrating feedback information to optimize behavioral strategies and obtain maximal gains (San Martín, 2012). The current finding that P300 amplitudes were unrelated to subsequent behavioral accuracy fits with the idea that the P300 reflects the integration of feedback information over time and not trial-by-trial behavioral adaptation (Glazer et al., 2018; Polich, 2007). While stress did not affect the P300 for immediate feedback, it reduced the P300 for delayed feedback specifically after rewards. This suggests that the P300 for delayed feedback reflects a stress-related attenuation of the sensitivity to feedback valence, which is in accordance with previous studies reporting that stress reduces reward sensitivity (Berghorst, Bogdan, Frank, & Pizzagalli, 2013; Bogdan & Pizzagalli, 2006). The association of reduced P300 amplitudes with increasing cortisol levels suggests that these stress effects on the P300 were mainly mediated by cortisol reactivity following stress.

Previous studies reported sex differences in stress effects on emotional learning (Andreano & Cahill, 2006; Merz & Wolf, 2017; Zoladz et al., 2015) and in the effects of cortisol on the reward system (Kinner et al., 2016). We controlled for potential sex differences by testing only men. Future studies need to explore potential sex differences in the current stress effects on learning from immediate and delayed feedback and the neural correlates of feedback processing.

Finally, we found evidence that learning from delayed feedback is enhanced relative to immediate feedback under stress, while in the control group no differences between feedback delays were seen. This is puzzling, as all measures of feedback processing appear to suggest a stress-induced impairment of feedback processing. It thus seems that this behavioral effect was driven by neural mechanisms that were not reflected in the EEG measures that we analyzed as dependent variables. While studies suggest a stronger hippocampal involvement in learning from delayed feedback (Foerde et al., 2013), an enhanced hippocampal involvement under stress seems unlikely as hippocampal processing has been suggested to be impaired by stress (Schwabe & Wolf, 2012). Furthermore, a stronger contribution of the dorsal striatum or the ACC to learning from delayed feedback under stress can be excluded, as this should be reflected by enhanced FRN amplitudes. At the same time, it is important to note that the dopaminergic system and the striatum are also involved in the processing of delayed feedback. For example, Dobryakova and Tricomi (2013) found striatal activations for feedback stimuli that followed a response after a delay of 25 min, and Weismüller et al. (2018) described a similar effect of reduced dopamine levels in Parkinson’s disease on learning from immediate and delayed feedback. What seems to differ between learning from immediate and delayed feedback is the integration of feedback with the preceding response, which is based more on the dorsal striatum for immediate and more on the hippocampus for delayed feedback processing (Foerde & Shohamy, 2011a). On the other hand, the ventral striatum, which has been linked more to learning stimulus-outcome rather than action-outcome associations (O’Doherty et al., 2004), has been described to be similarly involved for both types of feedback (Foerde & Shohamy, 2011a). 
It is thus conceivable that delayed feedback was processed more by the ventral striatum under stress and that, given the role of the ventral striatum in learning stimulus-outcome associations, the task was solved mainly by focusing on the relation between the stimuli and the outcomes. This would also explain why we did not see this enhanced feedback processing in the FRN, as the FRN reflects more strongly processes of action-outcome association (Oliveira, McDonald, & Goodman, 2007; Yeung, Holroyd, & Cohen, 2005). Nevertheless, this explanation is speculative, and it remains open as to how the feedback was integrated with the preceding event, stimulus, and/or response over a delay under stress.

Concerning feedback processing, there was only one aspect in the current results in which the pattern for stressed participants and delayed feedback processing differed from all other conditions. While the P300 was generally reduced by stress for delayed feedback, it did not distinguish between negative and positive feedback processing. The current finding may reflect a “more realistic” feedback processing in this condition, as it is consistent with the actual frequencies of the occurrence of negative and positive feedback. Based on the idea that the P300 reflects the integration of reward information over time (San Martín, 2012), this altered feedback processing indicated by the P300 may underlie enhanced task performance for delayed feedback by stressed participants.

In summary, the current study revealed that stress influences feedback learning and neural feedback processing, partially depending on the timing of feedback. The disruption of associations between frontal theta oscillations and the FRN with subsequent behavioral accuracy is a potential mechanism of stress-induced learning impairments for immediate feedback that was, however, compensated for by the stressed participants of the present study, so that overall learning was not impaired under stress. Instead, learning from delayed feedback was even enhanced after stress, although unrelated to neural feedback processes as reflected by the EEG measures. Our findings illustrate complex interactions between stress, feedback delay, and feedback valence. The observed behavioral effects cannot fully be explained by the EEG-derived measures of neural feedback processing. Future studies with different methodological approaches are needed in order to integrate the current findings into a formal model of feedback-based learning under stress.

ACKNOWLEDGMENTS

This research was supported by the Deutsche Forschungsgemeinschaft (DFG) project B4 of the Collaborative Research Centre 874 (Integration and Representation of Sensory Processes)—project number 122679504. We would like to thank Osman Akan, Julia Pietzko, and Svenja Quassowsky for help with data collection. Open access funding enabled and organized by Projekt DEAL.
