19 RPE can actually be defined in different ways. In the neuroscience literature, the definition may not be very different from reward loss. In reinforcement learning theory, a more sophisticated definition is typically used, based on what is called the temporal difference (TD) error, which we explain here. We consider the case where a single action is taken at each time step (as opposed to considering longer plans), and no discounting is used. For each time step, the RPE is then defined as RPE = reward - (V_before - V_after), where V is the state-value function (for the policy being followed, not necessarily the optimal one), evaluated in the state before the action was taken or in the state after it was taken, respectively (these could also be denoted by time indices t - 1 and t). The reward is the reward obtained for this particular action, in other words, at the particular time step for which we are computing the RPE. Note that the sign is flipped compared to the definition of reward loss, but this is just a technical convention with no deeper meaning.
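To make the definition concrete, here is a minimal sketch in Python; the function name and the numerical values are purely illustrative and not taken from any particular library or from the text above.

```python
def rpe(reward, v_before, v_after):
    """RPE = reward - (V_before - V_after), i.e. reward + V_after - V_before."""
    return reward - (v_before - v_after)

# Illustrative numbers: the agent expects the step to be worth V_before - V_after = 2.0.
print(rpe(reward=2.0, v_before=5.0, v_after=3.0))  # 0.0: reward exactly as expected
print(rpe(reward=0.5, v_before=5.0, v_after=3.0))  # -1.5: negative RPE, i.e. a reward loss
```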

The connection to reward loss can be seen by noting that in the state-value formalism, V_before - V_after can be interpreted as the expected reward. The reason is that, by definition, the state-value function gives the total reward expected when starting from a given state, so a reward equal to V_before - V_after should appear during the transition. Otherwise the two state-values would be inconsistent: the total expected reward starting from the state “before” must be equal to the total expected reward starting from the state “after” plus the expected reward obtained in the transition. So, the agent can expect that reward = V_before - V_after, and if that actually holds, the RPE is zero. If the agent gets less, there is a reward loss, which is here expressed as a negative RPE.
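A small numerical sketch of this consistency argument, with made-up numbers: if the state-values satisfy V_before = E[reward] + V_after, then the RPE is zero on average, even though individual steps can produce positive or negative RPEs.

```python
import random

v_after = 3.0
expected_reward = 2.0
v_before = expected_reward + v_after        # the consistency condition above

# Draw noisy rewards around the expected value and average the resulting RPEs.
rewards = [random.gauss(expected_reward, 1.0) for _ in range(100_000)]
rpes = [r - (v_before - v_after) for r in rewards]
print(sum(rpes) / len(rpes))                # close to 0.0
```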

Such an RPE signal is the foundation of reinforcement learning. It is more general than reward loss since it considers the whole future of rewards via the state-values, as explained next in the main text. There are also some small differences: the RPE has the opposite sign, corresponding to the negative of reward loss, and our definition of reward loss considers only the case where it is positive (i.e., the RPE is negative), since this is the part corresponding to suffering. Also, the discounting formalism of reinforcement learning is typically included in the definition, in which case V_after would be multiplied by a discount factor; omitting the discount factor is possible if we consider a finite time horizon. See Sutton and Barto (2018, Ch. 15) for more information.
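The following sketch combines the two points just mentioned: the standard discounted form of the TD error, and the restriction of reward loss to the negative-RPE case. The function names and numbers are again only illustrative assumptions, not a definitive implementation.

```python
def td_error(reward, v_before, v_after, gamma=1.0):
    # Discounted TD error: reward + gamma * V_after - V_before.
    # With gamma = 1 this reduces to the undiscounted RPE used in this note.
    return reward + gamma * v_after - v_before

def reward_loss(reward, v_before, v_after, gamma=1.0):
    # Reward loss keeps only the shortfall, i.e. the negative part of the RPE.
    return max(0.0, -td_error(reward, v_before, v_after, gamma))

print(td_error(0.5, v_before=5.0, v_after=3.0, gamma=0.9))     # -1.8
print(reward_loss(0.5, v_before=5.0, v_after=3.0, gamma=0.9))  # 1.8
```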