We can trust the feedback we give it based on the output it provides.
What kind of feedback are you giving? What's the reward function?