This doesn’t make sense. They are fundamentally different things, so an observation made about AlphaZero does not help you learn anything about LLMs.
I am not sure. Self-play with LLM-generated synthetic data is becoming a trendy topic in LLM research.