Instead of reacting directly to the issue at hand I suggest you ponder what failure mode is being ac...

carsareok • today at 8:14 AM • 0 replies • view on HN

Instead of reacting directly to the issue at hand I suggest you ponder what failure mode is being activated and why.

They are fundamentally not able to tell truth from fiction, but this also means they don't make errors like we do. They definitely create output we recognize as errors, but that's very different from our failure modes and you have to get used to it.

In my opinion it's better to branch off with an altered context that somehow avoids or mitigates the issue you're running into. Let's say they miss the mark. If you tell them "Don't do that" in the "conversation" this means the error is now and forever part of the context (assuming you stay within context limits and no compaction). Depending on their training this may or may not be detrimental to the quality of the rest of the conversation. You are now entering a section of their training where "error + someone swearing at them"-conversations have happened. I can't tell for sure, but my gut says this is not an advantageous place to be.

They are as I'm sure we all know completion engines and are in a very real way constantly cosplaying being productive "agents". They don't know if they are part of some type of modern Shakespearean play where sitting behind computers is part of the story or if they are in what we call "reality". By training on "conversations" they have become more likely to complete their input in a way that mimics what we call having a back and forth with some degree of technical accuracy.

In the extreme case you have a context that starts like "Please make all junior mistakes in this assignment. Make the code unreadable and be sure to include massive gotchas in subtle parts of the logic.". The results of this context won't be pretty. The other way around is not saying "Please make no errors", it's explaining in detail what you think is the right way. Coding style, if you care, architecture, etc. it all needs to be part of the context if you suspect it will substantially impact the completion. You have to imagine what real-life conversations have started with "Please make no errors". Again, I have no proof of course, but I have a strong feeling that human conversations that started with clearly and properly articulated specifications are qualitatively different from human conversations that started with "make no errors". In one you can see the pointy-haired boss and the other a seasoned engineer. Try to stay on the engineer side of their training.

I completely agree that they should be trained (or instructed) to react in a robotic tone stripped of all human pretense. We are trying to get at useful, general reasoning patterns latent in the data they trained on and, I regret to say, not the "human" parts which are usually a masterclass in cognitive biases and failures to reason.

Edit: the last sentence should be read in the voice of the Matrix's Architect.

alt Hacker News