westurner, yesterday at 9:40 PM

Task: play Tetris

Task: write and optimize a Tetris bot

Task: write a Tetris bot and safely optimize it online, with consideration for cost to converge
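To make the gap between the second and third task concrete, here is a rough sketch, not anyone's actual bot. It assumes a heuristic board scorer built on commonly used features (aggregate height, holes, bumpiness, full lines) and a budgeted hill climb over its weights. The names `features`, `score`, and `tune` are hypothetical, and "safe" here only means a worse candidate is never accepted, while "cost to converge" is reduced to a hard cap on evaluations.

```python
import random
from typing import Callable, List, Sequence, Tuple

Board = List[List[int]]  # rows listed top to bottom, 1 = filled cell


def column_heights(board: Board) -> List[int]:
    """Height of each column, measured from the floor to its topmost block."""
    rows = len(board)
    heights = []
    for c in range(len(board[0])):
        h = 0
        for r in range(rows):
            if board[r][c]:
                h = rows - r
                break
        heights.append(h)
    return heights


def count_holes(board: Board) -> int:
    """Empty cells that have at least one filled cell above them."""
    holes = 0
    for c in range(len(board[0])):
        seen_block = False
        for r in range(len(board)):
            if board[r][c]:
                seen_block = True
            elif seen_block:
                holes += 1
    return holes


def features(board: Board) -> List[int]:
    """[aggregate height, holes, bumpiness, full lines] (assumed feature set)."""
    heights = column_heights(board)
    bumpiness = sum(abs(a - b) for a, b in zip(heights, heights[1:]))
    full_lines = sum(1 for row in board if all(row))
    return [sum(heights), count_holes(board), bumpiness, full_lines]


def score(board: Board, weights: Sequence[float]) -> float:
    """Linear score of a board; a bot would pick the placement that maximizes it."""
    return sum(w * f for w, f in zip(weights, features(board)))


def tune(
    weights: Sequence[float],
    evaluate: Callable[[Sequence[float]], float],
    budget: int = 50,
    step: float = 0.1,
    seed: int = 0,
) -> Tuple[List[float], float, int]:
    """Hill climb over weights, capped at `budget` calls to `evaluate`.

    A candidate replaces the current weights only if it evaluates strictly
    better, so the kept weights never get worse under `evaluate`.
    """
    rng = random.Random(seed)
    best = list(weights)
    best_value = evaluate(best)
    used = 1
    while used < budget:
        candidate = [w + rng.uniform(-step, step) for w in best]
        value = evaluate(candidate)
        used += 1
        if value > best_value:
            best, best_value = candidate, value
    return best, best_value, used
```

In a real online setting `evaluate` would mean playing games with the candidate weights, which is exactly where the cost to converge comes from: every call spends real play time, and a noisy score makes the accept-only-if-better rule much weaker than it looks here.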

openai/baselines (7 years ago) was leading on RL, and then came AlphaZero and self-attention Transformer networks.

LLMs are trained with RL, but they aren't general-purpose, game-theoretic RL agents?