LINEAR-NONLINEAR-POISSON MODELS OF PRIMATE CHOICE DYNAMICS
Corresponding Author
Greg S. Corrado
HOWARD HUGHES MEDICAL INSTITUTE, STANFORD UNIVERSITY SCHOOL OF MEDICINE, AND MASSACHUSETTS INSTITUTE OF TECHNOLOGY
Howard Hughes Medical Institute and Stanford University School of Medicine.
Department of Neurobiology, Stanford University, D200 Fairchild Building, 299 Campus Drive West, Stanford, California 94309 (e-mail: [email protected]).
Leo P. Sugrue
Howard Hughes Medical Institute and Stanford University School of Medicine.
H. Sebastian Seung
Howard Hughes Medical Institute and Massachusetts Institute of Technology.
William T. Newsome
Howard Hughes Medical Institute and Stanford University School of Medicine.
Abstract
The equilibrium phenomenon of matching behavior traditionally has been studied in stationary environments. Here we attempt to uncover the local mechanism of choice that gives rise to matching by studying behavior in a highly dynamic foraging environment. In our experiments, two rhesus monkeys (Macaca mulatta) foraged for juice rewards by making eye movements to one of two colored icons presented on a computer monitor, each rewarded on dynamic variable-interval schedules. Using a generalization of Wiener kernel analysis, we recover a compact mechanistic description of the impact of past reward on future choice in the form of a Linear-Nonlinear-Poisson model. We validate this model through rigorous predictive and generative testing. Compared to our earlier work with this same data set, this model proves to be a better description of choice behavior and is more tightly correlated with putative neural value signals. Refinements over previous models include hyperbolic (as opposed to exponential) temporal discounting of past rewards, and differential (as opposed to fractional) comparisons of option value. Through numerical simulation we find that within this class of strategies, the model parameters employed by animals are very close to those that maximize reward harvesting efficiency.
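The three stages named in the abstract can be illustrated with a minimal sketch. It is not the fitted model: the abstract gives no functional forms or parameter values, so the kernel shape 1/(1 + lag/tau), the logistic nonlinearity, the per-trial Bernoulli draw standing in for the Poisson stage, and every name and number below (`hyperbolic_kernel`, `tau`, `sigma`, the reward histories) are assumptions chosen for illustration only.

```python
import math
import random


def hyperbolic_kernel(n_lags, tau):
    """Normalized hyperbolic weights for rewards 1..n_lags trials back.

    The form 1 / (1 + lag / tau) is an assumed hyperbolic shape.
    """
    raw = [1.0 / (1.0 + lag / tau) for lag in range(1, n_lags + 1)]
    total = sum(raw)
    return [w / total for w in raw]


def filtered_value(rewards, kernel):
    """Linear stage: weight past rewards (oldest first) by the kernel."""
    recent = rewards[::-1][:len(kernel)]  # most recent trial first
    return sum(w * r for w, r in zip(kernel, recent))


def choice_probability(rewards_a, rewards_b, kernel, sigma):
    """Nonlinear stage: map the value DIFFERENCE to P(choose A).

    A logistic function is assumed here; sigma sets its steepness.
    """
    diff = filtered_value(rewards_a, kernel) - filtered_value(rewards_b, kernel)
    return 1.0 / (1.0 + math.exp(-diff / sigma))


def sample_choice(p_a, rng):
    """Stochastic stage: one Bernoulli draw per trial (assumed stand-in)."""
    return "A" if rng.random() < p_a else "B"


# Hypothetical reward histories, oldest trial first (1 = reward earned).
kernel = hyperbolic_kernel(n_lags=20, tau=5.0)
rewards_a = [0, 1, 0, 1, 1]
rewards_b = [1, 0, 0, 0, 0]
p_a = choice_probability(rewards_a, rewards_b, kernel, sigma=0.05)
choice = sample_choice(p_a, random.Random(0))
```

Because option A was rewarded more recently and more often in this made-up history, its filtered value is larger and `p_a` exceeds 0.5. Taking the difference of the two values, rather than their ratio, corresponds to the differential comparison the abstract contrasts with fractional comparison.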