I use mainly Opus 4.6. I did the same thing and created a skill for summarizing a troubleshooting ...

vanillameow • today at 12:40 PM • 1 reply • view on HN

I use mainly Opus 4.6.

I did the same thing and created a skill for summarizing a troubleshooting conversation. It works decently, as long as my own input in the troubleshooting is minimal. i.e. dangerously-skip-permissions. As soon as I need to take manual steps or especially if the conversation is in Desktop/Web, it will very quickly degrade and just assume steps I've taken (e.g. if it gave me two options to fix something, and I come back saying it's fixed, it will in the summary just kind of randomly decide a solution). It also generally doesn't consider the previous state of the system (e.g. what was already installed/configured/setup) when writing such a summary, which maybe makes it reusable for me, somewhat, but certainly not for others.

Now you could say, "these are all things you can prompt away", and, I mean, to an extent, probably. But once you're talking about taking something like this online, you're not working with the top 1% proompters. The average claude session is not the diligent little worker bee you'd want it to be. These models are still, at their core, chaos goblins. I think Moltbook showed that quite clearly.

I think having your model consider someone else's "fix" to your problem as a primary source is bad. Period. Maybe it won't be bad in 3 generations when models can distinguish noise and nonsense from useful information, but they really can't right now.

Replies

latand6 • today at 2:12 PM

Isn’t what you’ve just described - the context bloat problem, the part about the web?

I’m not sure I quite get the same experience as you with the “assumes steps it never took”. Do you think it’s because of the skills you’ve used?

I also disagree that having at least some solution to a similar problem is inherently bad. Usually it directs the LLM to some path that was verified, if we’re talking about skills

alt Hacker News

Replies