Maybe the dog just values an immediate reward more, even though it understands it could get a bigger one later? How would you control for that?