Hacker News

robbrown451 yesterday at 7:51 PM

I used to think that, but I ended up going the other direction, partly because I don't have the wherewithal to build a model. Then I realized that, with existing models that can take more than a tiny amount of context, you can just let any model bootstrap itself with a good prompt sent by the system.

There are a ton of other tricks to it, but mostly it comes down to keeping the protocol simple for the AI so it can concentrate on coding logic and not on stuff like managing BS boilerplate, dependencies, etc. (For instance, I make extensive use of things like an abstract syntax tree library to help with surgical edits from the LLM.)
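
To give a rough idea of what an AST-assisted surgical edit can look like, here is a minimal sketch using Python's standard `ast` module. It is only an illustration under assumptions, not necessarily how this system works: the helper name `replace_function` and the sample source are hypothetical. The idea is that the LLM only has to return one replacement function, and the harness uses the parse tree to find the exact lines to swap.

```python
import ast


def replace_function(source: str, name: str, new_code: str) -> str:
    """Swap the top-level function `name` in `source` for `new_code`.

    Hypothetical helper for illustration. Every line outside the
    target function is left byte-for-byte untouched.
    """
    tree = ast.parse(source)
    for node in tree.body:
        is_func = isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
        if is_func and node.name == name:
            # Start at the first decorator if there is one, else at `def`.
            start = min([node.lineno] + [d.lineno for d in node.decorator_list])
            end = node.end_lineno  # last line of the function (Python 3.8+)
            lines = source.splitlines(keepends=True)
            replacement = new_code if new_code.endswith("\n") else new_code + "\n"
            edited = "".join(lines[: start - 1]) + replacement + "".join(lines[end:])
            ast.parse(edited)  # raises SyntaxError if the edit broke the file
            return edited
    raise KeyError(f"no top-level function named {name!r}")


SOURCE = '''import math

def area(r):
    return 3.14 * r * r

def perimeter(r):
    return 2 * math.pi * r
'''

# Pretend this snippet came back from the model.
NEW_AREA = '''def area(r):
    return math.pi * r ** 2
'''

edited = replace_function(SOURCE, "area", NEW_AREA)
```

The model never has to reproduce the whole file or count line numbers, and the final `ast.parse` call rejects an edit that leaves the file syntactically invalid.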

That said, I would be very open to collaborating with someone who builds such small models. I don't think the system strictly needs one, but it could have some extra power with one.


Replies

andai yesterday at 7:54 PM

> mine also makes extensive use of things like abstract syntax tree library to help with surgical edits from the LLM

Tell me more! This takes me way back. I did one like this in the GPT-4 days! (8k context window)

cyanydeez yesterday at 8:38 PM

I'm aware we're not there yet, but think of something like https://chatjimmy.ai/; at some point, you're going to be able to dynamically build the harness so it creates the necessary consistency and dynamism at a speed unheard of.

But yes, I'm aware no one's gotten anywhere near there, mostly because most of the focus is on exploding the context and parameters. I'm saying that phase is done.
