Hacker News

paradite · yesterday at 11:51 AM · 3 replies

Ironically this is a goldmine for AI labs and AI writer startups to do RL and fine-tuning.


Replies

zipy124 · yesterday at 1:53 PM

That's not quite how that works, though. It may, for example, turn out that fine-tuning a model to avoid the styles described in the article causes the LLM to stop functioning as well as it otherwise would. It might just be an artefact of the architecture itself that, to be effective, it has to follow these rules. If it were as easy as providing data that the LLM would then 'encode' as a rule, we would be advancing much faster than we currently are.
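(For concreteness, here is a minimal sketch of what "just providing data" means in this context: building a preference dataset for DPO-style fine-tuning, where the model is shown paired outputs that avoid or exhibit a flagged style. The prompts, style examples, and file name are hypothetical, purely for illustration.)

```python
import json

# Hypothetical preference pairs: "chosen" avoids a flagged style tell,
# "rejected" uses it. A DPO-style trainer would push the model toward
# the chosen responses -- the question raised above is whether doing so
# degrades the model's other capabilities.
preference_pairs = [
    {
        "prompt": "Summarize the quarterly report.",
        "chosen": "Revenue rose 4% while costs held flat.",
        "rejected": "In today's fast-paced world, revenue rose a remarkable 4%.",
    },
]

# Serialize to JSONL, a common input format for preference trainers.
with open("style_preferences.jsonl", "w") as f:
    for pair in preference_pairs:
        f.write(json.dumps(pair) + "\n")
```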

einrealist · yesterday at 12:17 PM

In the case of those big 'foundation models': fine-tune for whom, and how? I doubt it is possible to fine-tune away things like this in a way that satisfies every audience and every training-set instance. Much of this is probably due to the training set itself containing a lot of propaganda (advertising) or just bad style.

kingstnap · yesterday at 1:15 PM

Seems more like the kind of thing you would build prompts from.

I can totally see someone taking that page, throwing it into whatever bot, and going, "Make up a comprehensive style guide that does the opposite of whatever is mentioned here."
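(That workflow is a one-liner in practice. A minimal sketch, assuming the OpenAI Python SDK; the model name and the input file of "AI tells" are placeholders.)

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Placeholder: the article's list of AI-writing tells, saved locally.
page_text = open("ai_writing_tells.txt").read()

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": (
            "Make up a comprehensive style guide that does the opposite "
            "of whatever is mentioned here:\n\n" + page_text
        ),
    }],
)
print(response.choices[0].message.content)
```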
