> serving whichever instance is internally evaluated to have done the best. Then you're kind...

SkiFire13 • last Saturday at 8:49 PM • 0 replies • view on HN

> serving whichever instance is internally evaluated to have done the best. Then you're kind of performing meta reinforcement learning

I thought this didn't work? You basically end up fitting your AI models to whatever is the internal evaluation method, and creating a good evaluation method most often ends up having a similar complexity as creating the initial AI model you wanted to train.

alt Hacker News