Hacker News

tokioyoyo · yesterday at 6:48 AM

That’s actually really cool, and makes sense in my head! This is somewhat how I imagined it, except my guess would be that someone would fine-tune a general-purpose LLM (somehow, as it’s much cheaper than training from scratch, idk?) to behave this way rather than prompting it into the behavior every time. And whoever develops the framework would package it with access to this fine-tuned LLM.

But yeah, what you guys are doing looks sweet! I need to get off my ass and see what people are doing in this sphere, as it sounds fun.


Replies

weitendorf · yesterday at 7:12 AM

> fine-tune a general-purpose LLM (somehow, as it’s much cheaper than training from scratch, idk?) to behave this way rather than prompting it into the behavior every time

I'd love to do that too, but afaik there are basically three ways to teach LLMs how to use it: with data created "in the wild" plus some degree of curation or augmentation; with full-on reinforcement learning/goal-oriented training; or with some kind of hybrid based on e.g. conformance testing, validating the LLM's output at a less sophisticated level (e.g. if it tries to call an API that isn't in the set it just saw during discovery, that's clearly wrong, so train it out of doing that). There's a sketch of that last check below.
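To make that hybrid option concrete, here's a minimal Python sketch of the conformance check, assuming a JSON tool-call format; the names (`conformance_reward`, `ToolCall`, the discovery set) are made up for illustration, not anything we've shipped. The idea: parse the model's tool call and penalize anything that's malformed or outside the set of APIs it just saw during discovery. The score could serve as an RL reward or as a filter when curating training data.

    # Hypothetical sketch of the conformance check described above. The
    # score could be an RL reward or a data-curation filter; all names
    # here are illustrative.
    import json
    from dataclasses import dataclass

    @dataclass
    class ToolCall:
        name: str
        arguments: dict

    def parse_tool_call(llm_output: str) -> ToolCall | None:
        """Parse a JSON tool call from raw model output; None if malformed."""
        try:
            payload = json.loads(llm_output)
            return ToolCall(name=payload["name"],
                            arguments=payload.get("arguments", {}))
        except (json.JSONDecodeError, KeyError, TypeError):
            return None

    def conformance_reward(llm_output: str, discovered_apis: set[str]) -> float:
        """+1 for a well-formed call to a discovered API, -1 otherwise."""
        call = parse_tool_call(llm_output)
        if call is None:
            return -1.0  # unparseable output: train away from it
        if call.name not in discovered_apis:
            return -1.0  # called an API that wasn't in the discovery set
        return 1.0

    # Usage: score sampled completions against what discovery returned.
    discovered = {"listWidgets", "getWidget", "createWidget"}
    print(conformance_reward('{"name": "getWidget", "arguments": {"id": 7}}', discovered))  # 1.0
    print(conformance_reward('{"name": "deleteAll"}', discovered))                          # -1.0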

The thing is, those approaches aren't really mutually exclusive, and LLM companies will do the training anyway to make their models useful if enough people use this or want to use it. That's already happened with e.g. MCP, skills, and many programming languages. Anyway, if prompting is enough to get a model to use it properly, that validates that the model can be trained to follow the same process, the same way it already knows how to work with React.
