Well, what you said is:
"On the contrary, I believe in every verifiable domain RL must drive the agent to be the most intelligent (relative to RL reward) it can be under the constraints--and often it must become more intelligent than humans in that environment."
And I said it's not that simple: it's in no way demonstrated, unlikely with current technology, and, basically, nope.
Ah, you're worried about convergence issues? My (bad) understanding was that the self-driving car stuff is more about inadequacies of the models in which you simulate training and data collection than about convergence of the algorithms, but I could be wrong. I mean, that statement was just saying that I think you can get RL to converge to close to optimal--which I agree is a bit of a stretch, as RL is famously finicky. But I don't see why one shouldn't expect this to happen as we tune the algorithms.
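Just to make "converge to close to optimum" concrete: here's a minimal sketch of the textbook case where the convergence story actually holds--tabular Q-learning on a toy chain MDP I made up for illustration (the environment, function name, and hyperparameters are all my own, not anything from the argument above). With a sensibly tuned step size, the greedy policy reliably recovers the optimal "always go right" behavior. To be clear, this says nothing about deep RL at self-driving scale, where the finickiness actually bites.

```python
import random

def q_learning(episodes, alpha, epsilon=0.1, gamma=0.9, n=5, seed=0):
    """Tabular Q-learning on a 5-state chain.

    Action 1 moves right, action 0 moves left; reaching the last
    state ends the episode with reward 1, all other steps give 0.
    The optimal policy is to go right from every non-terminal state.
    """
    rng = random.Random(seed)
    Q = [[0.0, 0.0] for _ in range(n)]
    for _ in range(episodes):
        s = 0
        while s != n - 1:
            # epsilon-greedy action selection
            if rng.random() < epsilon:
                a = rng.randrange(2)
            else:
                a = max((0, 1), key=lambda x: Q[s][x])
            s2 = min(s + 1, n - 1) if a == 1 else max(s - 1, 0)
            r = 1.0 if s2 == n - 1 else 0.0
            # standard Q-learning update
            Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
            s = s2
    # greedy policy over the non-terminal states
    return [max((0, 1), key=lambda a: Q[s][a]) for s in range(n - 1)]

policy = q_learning(episodes=500, alpha=0.5)
print(policy)  # optimal policy: go right everywhere -> [1, 1, 1, 1]
```

In this tiny deterministic setting almost any reasonable (alpha, epsilon) works, which is exactly the point of contention: convergence guarantees like this are for the tabular case, and whether tuning gets you the same thing with function approximation in messy environments is the open question.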