My own 100% hallucinated language experiment is very, very weird, and it still has thousands of lines of generated examples that work fine. When doing complex stuff you could see the agent bounce against the tests here and there, but it never produced non-working code in the end. The only examples available were those it had generated itself as it made up the language. It was capable of making things like a JSON parser/encoder, a TODO webapp, or a command-line kanban tracker for itself in one shot.