1 It is also essential in training animals to perform long sequences of actions; in that context it is called “shaping” (Krueger and Dayan, 2009; Ng et al., 1999). However, reinforcement learning is a much more general concept than just dividing a long sequence into smaller parts. In the case of the cleaning robot, the cleaning task may never end, since more dust appears constantly. The only meaningful goal for the cleaning robot may be simply to remove as much dust and dirt as possible, which is exactly what the reward formalism captures.
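
The point about a task without an end can be made concrete with a toy simulation. The following is only a minimal sketch under invented assumptions: the function name `run_cleaning_robot`, the parameters `dust_per_step` and `capacity`, and the choice of "units of dust removed" as the reward are all hypothetical and do not come from the text. It shows that a reward signal is well defined at every step even though the task has no terminal state.

```python
def run_cleaning_robot(steps, dust_per_step=2, capacity=3):
    """Simulate a cleaning task with no terminal state (hypothetical toy model).

    Returns the list of per-step rewards, where the reward at each step is
    the amount of dust the robot removed at that step.
    """
    dust = 0
    rewards = []
    for _ in range(steps):
        dust += dust_per_step          # new dust appears at every step
        removed = min(dust, capacity)  # the robot removes as much as it can
        dust -= removed
        rewards.append(removed)        # reward = dust removed at this step
    return rewards


# The loop length is an arbitrary cut-off for the simulation, not a goal state:
# the task itself never finishes, yet every step still yields a reward.
rewards = run_cleaning_robot(steps=10)
average_reward = sum(rewards) / len(rewards)
print(f"average reward per step: {average_reward}")
```

Comparing, say, `capacity=1` with `capacity=3` gives different average rewards, so two ways of cleaning can be ranked without ever defining a final "everything is clean" state.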