Hacker News

resiros | yesterday at 4:36 PM | 2 replies

Skimmed the paper quickly. This does not look like RL. It's a genetic algorithm. In a previous life I was working on compbio (protein structure prediction), and we built hundreds of such heuristic-based algorithms (Monte Carlo simulated annealing, GA, ...). The moment you have a good energy function (one that provides some sort of gradient) and a fast enough sampling function (LLMs), you can do lots of cool optimization with sufficient compute.

I guess that's now becoming true with LLMs.

Faster LLMs -> More intelligence
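As a rough illustration of the recipe above (an energy function plus a fast sampler), here is a minimal sketch of an evolutionary loop. Everything in it is an assumption made for illustration: the toy `energy` function, the Gaussian `propose` step standing in for an LLM sampler, and the hypothetical `evolve` helper with its parameters. It is not the method from the paper.

```python
import random

def energy(x):
    # Toy energy function (assumption): a smooth bowl with its minimum at x = 3.
    return (x - 3.0) ** 2

def propose(x, rng, step=0.5):
    # Stand-in for the "sampling function". In the setting discussed here,
    # an LLM would propose the variants instead of Gaussian noise.
    return x + rng.gauss(0.0, step)

def evolve(pop_size=20, generations=50, keep=5, seed=0):
    rng = random.Random(seed)
    population = [rng.uniform(-10.0, 10.0) for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: keep the lowest-energy candidates (elitism).
        survivors = sorted(population, key=energy)[:keep]
        # Variation: refill the population by mutating random survivors.
        children = [propose(rng.choice(survivors), rng)
                    for _ in range(pop_size - keep)]
        population = survivors + children
    return min(population, key=energy)

best = evolve()
print(best, energy(best))
```

Because the survivors are carried over unchanged, the best energy never gets worse from one generation to the next. More samples per generation (a faster sampler) buys more chances to improve within the same wall-clock time.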


Replies

UncleOxidant | yesterday at 5:42 PM

> This does not look like RL. It's a genetic algorithm.

Couldn't you say that, if you squint hard enough, GA looks like a category of RL? There are certainly a lot of similarities, the main difference being how each new population of solutions is generated. I would not be at all surprised if they're using a GA/RL hybrid.

vjerancrnjak | yesterday at 5:14 PM

Genetic algorithms are worse than gradient descent.

If variety is sought, why not beam search with a nice population statistic?
