> Where does the genome (genetic representation) you are evolving come from?
The instruction set of the program that is being searched for.
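To make that concrete: in linear genetic programming, a genome is just a sequence of opcodes drawn from the instruction set, and mutation swaps opcodes in place. A minimal sketch (the instruction set here is hypothetical, chosen for illustration, not taken from the thread):

```python
import random

# Hypothetical instruction set: each instruction transforms a small integer state.
INSTRUCTIONS = {
    "inc": lambda x: x + 1,
    "dec": lambda x: x - 1,
    "dbl": lambda x: x * 2,
    "neg": lambda x: -x,
}

def random_genome(length, rng):
    """A genome is a sequence of opcodes drawn from the instruction set."""
    return [rng.choice(list(INSTRUCTIONS)) for _ in range(length)]

def execute(genome, x=0):
    """Run the program: apply each instruction to the state in order."""
    for op in genome:
        x = INSTRUCTIONS[op](x)
    return x

def point_mutate(genome, rng, rate=0.1):
    """Replace each opcode with a random one with probability `rate`."""
    return [rng.choice(list(INSTRUCTIONS)) if rng.random() < rate else op
            for op in genome]

rng = random.Random(0)
g = random_genome(5, rng)
print(g, "->", execute(g))
print(point_mutate(g, rng))
```

The point is that the "search space" is the set of all such opcode sequences; fitness would come from running each program against a task.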
This is probably the best publicly available summary of the idea I am pursuing:
You're talking specifically about using genetic programming to create new programs, as opposed to gradient descent in LLMs minimizing a loss function, right?
How would you construct a genetic algorithm to produce natural language like LLMs do?
Forgive me if I'm misunderstanding, but in programming we have "tokens", which are minimal meaningful bits of code.
For natural languages it's harder. "Words" are not very meaningful on their own, I don't think (at least not as meaningful as a token of code), so how would you break down natural language for a genetic algorithm?
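One toy answer (my framing, not the commenter's) is the classic Dawkins "weasel" demonstration: treat the genome as a sequence of word-level tokens and define fitness as overlap with a target sentence. It sidesteps the real difficulty the question points at, because it assumes a known target; defining fitness for open-ended language is exactly the hard part. The vocabulary and target below are made up for illustration:

```python
import random

# Weasel-style GA: genome = sequence of word tokens, fitness = matches
# against a fixed target sentence (which real language generation lacks).
TARGET = "the quick brown fox".split()
VOCAB = sorted(set(TARGET + ["a", "lazy", "dog", "jumps", "over"]))

def fitness(genome):
    return sum(a == b for a, b in zip(genome, TARGET))

def evolve(rng, pop_size=50, rate=0.1, max_gens=500):
    pop = [[rng.choice(VOCAB) for _ in TARGET] for _ in range(pop_size)]
    for gen in range(max_gens):
        best = max(pop, key=fitness)
        if fitness(best) == len(TARGET):
            return best, gen
        # Next generation: mutated copies of the best individual.
        pop = [[rng.choice(VOCAB) if rng.random() < rate else tok
                for tok in best] for _ in range(pop_size)]
    return best, max_gens

rng = random.Random(0)
best, gens = evolve(rng)
print(" ".join(best), "after", gens, "generations")
```

With a subword vocabulary like an LLM tokenizer's, the same scheme applies mechanically; what it doesn't supply is a fitness function that rewards fluent, meaningful text.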
A program is composed of arbitrarily many instructions from your set. How are you accounting for this? Trying every possible program length? And you are considering the simpler case where the search space is discrete, unlike the continuous spaces in most machine learning problems.
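For what it's worth, the standard genetic-programming answer to the length question is to make length itself evolvable: alongside point mutation, use insertion and deletion mutations so offspring lengths drift and the search explores program sizes rather than fixing one in advance. A minimal sketch with a hypothetical opcode alphabet:

```python
import random

OPCODES = ["add", "sub", "mul", "load", "store"]  # illustrative alphabet

def mutate(genome, rng, p_point=0.05, p_insert=0.05, p_delete=0.05):
    """Variable-length mutation: point changes, insertions, and deletions,
    so program length evolves instead of being fixed in advance."""
    out = []
    for op in genome:
        if rng.random() < p_delete:
            continue                         # drop this instruction
        if rng.random() < p_point:
            op = rng.choice(OPCODES)         # replace it
        out.append(op)
        if rng.random() < p_insert:
            out.append(rng.choice(OPCODES))  # insert a new one after it
    return out or [rng.choice(OPCODES)]      # never return an empty program

rng = random.Random(0)
g = ["add"] * 10
for _ in range(20):
    g = mutate(g, rng)
print(len(g), g)
```

In practice a fitness penalty on length (parsimony pressure) is usually added too, since unchecked insertion tends to bloat programs.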
I think you need to think this through some more. You may find there is a reason nobody uses genetic algorithms for real-world tasks.