The primary value learned value (PVLV) model is a proposed explanation for the reward-predictive firing properties of dopamine (DA) neurons.[1] It simulates behavioral and neural data on Pavlovian conditioning, including the activity of midbrain dopaminergic neurons, which fire in proportion to unexpected rewards. It is an alternative to the temporal-difference (TD) learning algorithm.[2]

PVLV is used as a component of the Leabra framework.
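
The notion of a signal that scales with unexpected reward can be illustrated with a simple delta rule. The sketch below is not the PVLV model itself and is not taken from the cited sources; it is a minimal Rescorla-Wagner-style illustration in which the function name `simulate_prediction_error` and the `learning_rate` value are assumptions chosen for the example. It shows a prediction error that is large when a reward is first delivered, shrinks as the reward becomes expected, and turns negative when an expected reward is omitted.

```python
def simulate_prediction_error(rewards, learning_rate=0.2):
    """Return the per-trial prediction errors and the final reward expectation.

    Hypothetical illustration only: a single scalar expectation is updated
    with a delta rule. The actual PVLV model is more elaborate.
    """
    expectation = 0.0  # learned estimate of the reward that follows the cue
    errors = []
    for reward in rewards:
        delta = reward - expectation  # "unexpected" part of the reward
        errors.append(delta)
        expectation += learning_rate * delta  # move the estimate toward the reward
    return errors, expectation


# 30 rewarded trials followed by one trial on which the reward is omitted.
errors, expectation = simulate_prediction_error([1.0] * 30 + [0.0])
print(f"first trial error:    {errors[0]:+.3f}")
print(f"last rewarded error:  {errors[29]:+.3f}")
print(f"omitted-reward error: {errors[30]:+.3f}")
```

Under these assumptions the error starts at the full reward magnitude, decays toward zero over the rewarded trials, and dips below zero on the omission trial, which is the qualitative pattern that both TD and PVLV aim to account for.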

References

  1. O'Reilly, R.C.; Frank, M.J.; Hazy, T.E. & Watz, B. (2007). "PVLV: The Primary Value and Learned Value Pavlovian Learning Algorithm". Behavioral Neuroscience. 121 (1): 31–49. CiteSeerX 10.1.1.67.6739. doi:10.1037/0735-7044.121.1.31. PMID 17324049.
  2. "Leabra PBWM". CCNLab.