Salgat • yesterday at 10:21 PM • 3 replies

This is the biggest bottleneck for me. What's worse is that LLMs have a bad habit of being very verbose and rewriting things that don't need to be touched, so the surface area for change is much larger.

Replies

sheept • today at 2:50 AM

Not only that, but LLMs do themselves a disservice by writing verbose code and decorating lines with redundant comments, which wastes their context the next time they work with it.

mohsen1 • today at 9:29 AM

I highly recommend adding `/simplify` to your workflow. It walks back over-engineering quite often for me.

cyanydeez • yesterday at 10:37 PM

It's kind of weird; I jumped on the vibe coding opencode bandwagon, but using a local 395+ w/128 and qwen coder. Now, it takes a bit to get the first tokens flowing, and the cache works well enough to get it going, but it's not fast enough to just set it and forget it, and it's clear when it goes in an absurd direction and either deviates from my intention or simply loads some context where it should have followed a pattern, whatever.

I'm sure these larger models are both faster and more cogent, but it's also clear that what matters is managing their sidetracks and cutting them short. Then I started seeing the deeper problematic pattern.

Agents aren't there to increase the multiplier on production; their real purpose is to shorten context to manageable levels. In effect, they're basically trying to reduce the odds of longer-context poisoning.

So, if we boil down the probability of any given token triggering the wrong subcontext, it's clear that the greater the context, the greater the odds of a poison substitution.
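To put rough numbers on that claim, here's a minimal sketch. It assumes (my simplification, not something measured) that every token independently has the same small chance `p_token` of triggering the wrong subcontext, so the odds of at least one poisoning in a context of `context_len` tokens are `1 - (1 - p_token) ** context_len`. The function name and the sample values are made up for illustration.

```python
def poison_probability(p_token: float, context_len: int) -> float:
    """Chance that at least one of `context_len` tokens triggers the wrong subcontext.

    Assumes each token is an independent trial with the same probability
    `p_token`. Real models are not this simple; this is only a toy model.
    """
    if not 0.0 <= p_token <= 1.0:
        raise ValueError("p_token must be between 0 and 1")
    if context_len < 0:
        raise ValueError("context_len must be non-negative")
    # P(no token poisons) = (1 - p)^n, so P(at least one does) = 1 - (1 - p)^n
    return 1.0 - (1.0 - p_token) ** context_len


if __name__ == "__main__":
    # Hypothetical per-token probability, chosen only to show the trend.
    for n in (1_000, 10_000, 100_000):
        print(n, poison_probability(1e-5, n))
```

Under that assumption the odds only go up as the context grows, which is the point: keeping each context short is the lever you actually control.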

Then that's really the problem every model is going to contend with, because there's zero reality in which a single model is good enough. So now you're onto agents, breaking a problem into more manageable subcontexts and trying to put that back into the larger context gracefully, etc.

Then that fails, because there's zero consistent determinism, so you end up at the harness, trying to herd the cats. This is all before you realize that these businesses can't just keep throwing GPUs at everything, because the problem isn't compute-bound, it's contextual/DAG, the same way a brain is limited.

We all have intelligence and use several orders of magnitude less energy, doing mostly the same thing.
