> Say you have this new language, with only a tiny number of examples out there. How do the SOTA labs train on your language?
Languages don't exist in isolation; they exist on a continuum. Your brand-new language isn't really brand new: it's built on the semantics and syntax of many languages that came before it. Most language designers operate under what's known as a "weirdness budget": keep your language within some small delta of existing languages, modulo a handful of genuinely new concepts. That's what keeps it comprehensible; otherwise you end up with projects like Hoon / Nock, where true is false and up is down and nobody can figure anything out.
Under a small weirdness budget, an LLM should be able to understand your new language despite never having been trained on it, if you just explain what's different about it. I've had great success with this so far, even on early LLM models. One thing you can do is give it the EBNF grammar and let it generate strings from that, but that method is prone to hallucinations.
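To make that concrete, here's a minimal sketch of what "give it the grammar and explain the weird part" can look like in practice. The toy grammar, the `<-` assignment quirk, and the model name are all made up for illustration, and the OpenAI client is just one example of a chat-style LLM API; this is a sketch of the prompting approach, not anyone's production setup.

```python
# Minimal sketch: prompt an LLM with the EBNF grammar of a made-up toy language,
# point out the one "weird" thing about it, and ask for example programs.
# The grammar, the "<-" quirk, and the model name are illustrative, not real.
from openai import OpenAI  # any chat-style LLM client would work here

GRAMMAR = """
program   = { statement } ;
statement = "let" ident "<-" expr ";" ;   (* "<-" instead of "=" is the one weird bit *)
expr      = term { ("+" | "-") term } ;
term      = ident | number ;
ident     = letter { letter | digit } ;
number    = digit { digit } ;
"""

prompt = (
    "Here is the EBNF grammar for a small new language. "
    "It is otherwise C-like, except assignment uses '<-' instead of '='.\n\n"
    f"{GRAMMAR}\n"
    "Write three short example programs that are valid under this grammar."
)

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment
resp = client.chat.completions.create(
    model="gpt-4o",  # placeholder model name
    messages=[{"role": "user", "content": prompt}],
)
print(resp.choices[0].message.content)
```

Because of the hallucination problem mentioned above, you'd still want to run whatever it produces through your actual parser rather than trusting that the output conforms to the grammar.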