I find LLMs so much more exhausting than manual coding. It’s interesting. I think you quickly bump into how much a single human can feasibly keep track of pretty fast with modern LLMs.
I assume until LLMs are 100% better than humans in all cases, as long as I have to be in the loop there will be a pretty hard upper bound on what I can do and it seems like we’ve roughly hit that limit.
Funny enough, I get this feeling with a lot of modern technology. iPhones, all the modern messaging apps, etc make it much too easy to fragment your attention across a million different things. It’s draining. Much more draining than the old days
I have always enjoyed the feeling of aporia during coding. Learning to embrace the confusion and the eventual frustration is part of the job. So I don’t mind running in a loop alongside an agent.
But I absolutely loathe reviewing these generated PRs - more so when I know the submitter themselves has barely looked at the code. Now corporate has mandated AI usage and is asking people to do 10k LOC PRs every day. Reviewing this junk has become exhausting.
I don’t want to read your code if you haven’t bothered to read it yourselves. My stance is: reviewing this junk is far more exhausting. Coding is actually the fun part.
A lot of these resonate with me, particularly the mental fatigue. It feels like normal coding forced me to slow my brain down, whereas now my mind is the limit.
For context, I started an experiment to rebuild a previous project entirely with LLMs back in June '25 ("fully vibecoded" - not even reading the source).
After iterating and finally settling on a design/plan/debug loop that works relatively well, I'm now experiencing an old problem like new: doing too much!
As a junior engineer, it's common to underestimate the scope of some task, and to pile on extra features/edge cases/etc. until you miss your deadline. A valuable lesson any new programmer/software engineer necessarily goes though.
With "agentic engineering," it's like I'm right back at square one. Code is so cheap/fast to write, I find myself doing it the "right way" from the get go, adding more features even though I know I shouldn't, and ballooning projects until they reach a state of never launching.
I feel like a kid again (:
The most honest, logical, and practical take I've seen on this. People consistently underestimate the skill and effort it takes to write precisely and think critically both about their problem, and their processes. The closer you are to knowing what to ask for in the way knowledgeable people ask for it with respect to the process you are using to complete work, the closer the output will be to what you want.
I find working more asynchronous with the agents help. I've disabled the in-your-face agent-is-done/need-input notifications [1]. I work across a few different tasks at my own pace. It works quite well, and when/if I find a rhythm to it, it's absolutely less intense than normal programming.
You might think that the "constant" task switching is draining, but I don't switch that frequently. Often I keep the main focus on one task and use the waiting time to draft some related ideas/thoughts/next prompt. Or browse through the code for light review/understanding. It also helps to have one big/complex task and a few simpler things concurrently. And since the number of details required to keep "loaded" in your head per task is fewer, switching has less cost I think. You can also "reload" much quicker by simply chatting with the agent for a minute or two, if some detail have faded.
I think a key thing is to NOT chase after keeping the agents running at max efficiency. It's ok to let them be idle while you finish up what your doing. (perhaps bad of KV cache efficiency though - I'm not sure how long they keep the cache)
(And obviously you should run the agent in a sandbox to limit how many approvals you need to consider)
[1] I use the urgent-window hint to get a subtle hint of which workspace contain an agent ready for input.
EDIT: disclaimer - I'm relative new to using them, and have so far not used them for super complex tasks.
LLMs do not actually make anything better for anyone. You have to constantly correct them. It's like having a junior coder under your wing that never learns from its mistakes. I can't imagine anyone actually feeling productive using one to work.
I've been using Claude Code within VS Code for the most part... it's funny, but from time to time, I forget to click the Claude icon, and start interacting with the default GitHub copilot on the side. I tend to find myself quickly frustrated with the interactions only to realize I wasn't working with Claude/Opus. As soon as I switch, I'm almost always back on track within 10-30 minutes.
That said, it helps to be in tune with your own body and mind. You need breaks now and then and with AI interactions, you will be "ON" more than just working through problems on your own. The AI can work through the boilerplate that lets your mind rest at a relatively blazing pace, leaving you to evaluate and iterate relatively quickly. You will find yourself more "worn out" from the constant thinking faster.
IIRC most people burn out after 4-6 hours of heavy thought work... take a long meal break, then consider getting back into it or not. Identify when it's okay to stop for the day... you may be getting good progress, but if you aren't in the right mindset it's you that may well be introducing mistakes into things.
Beyond this, I tend to plan/track things in TODO.md files as I work/plan through things... keeping track of what needs to be done, combined with history, and even the "why" along the way... AI makes it easy to completely swap out a backend library pretty quickly, especially with a good testing surface in place. But it helps to track why you're doing certain things... why you're making the changes you are on a technical level.
I've found LLMs to be liberating and energizing, not at all exhausting.
I can finally do my preferred workflow: Research, (design, critique), (plan, critique, design), implement.
Design and planning has a quick enough turnaround cycle to not get annoying. By the time the agent is writing code, I have no involvement anymore. Just set it and forget it, come back in half an hour or so to see if it's done yet. Meanwhile, I look at the bigger picture and plan out my next prompt cycles as it churns out code.
For example, this project was entirely written by LLM:
https://github.com/kstenerud/yoloai
I never wrote a single line of this code (I do review it, of course, but even then the heavy lifting for that can be offloaded to an LLM so that I can focus on wider issues, which most often are architectural).
In particular, take a look at the docs/dev subdir to see the planning and design. Once the agent has that, it's MUCH harder for it to screw things up.
Is it as tight as it could be? Nope, but it has a solid architecture, does its job well, and has good debugging infrastructure so fixes are fast. I wouldn't use this approach for embedded or projects requiring maximum performance, but for regular code it's great!
It looks like Stockholm syndrome or a traditional abusive relationship 100 years ago where the woman tries to figure out how to best prompt her husband to do something.
You know you can leave abusive relationships. Ditch the clanker and free your mind.
This was not the article that I expected. The headline is correct in both cases but I assumed that it would be about fighting against the army of LLM scrapers, which is the source of my exhaustion in relation to them. Perhaps that is one for me to write instead.
I wonder if the same people using "agentic AI" are the same that spend days setting up the "perfect" work environment with four screens.
I find LLMs are great for building ideas, improving understanding and basic prototyping. This is more useful at the start of the project lifecycle, however when getting toward release it's much more about refactoring and dealing with large numbers of files and resources, making very specific changes e.g. from user feedback.
For those of us with decades of muscle memory who can fix a bug in 30 seconds with a few Vim commands, LLMs are very likely to be slower in most coding tasks, excepting prototyping and obscure bug spotting.
I've noticed the same thing. I would have three, sometimes four sessions run at the same time. It would be great, but mentally exhausting. To help this, I've set a self-imposed limit of two active chat sessions at a time.
Another thing I found is that it is too easy to keep going. I would work for too long and get even more exhausted. It feels rude to just stop a conversation. LLMs don't really care about social norms like that, but it still felt awkward to me and I would worry about losing the context I had.
To help with that, I wrote my own little plugin that reminds me to start winding down at the end of the work day and starts prompting me (pardon my phrasing) to take the off-ramp; to relay any thoughts and todos I still have in mind and put them down to pick up the next day.
This is in no way production ready, but it might be an inspiration: https://github.com/pindab0ter/wind-down
I'm not sure a faster loop is helping. It may actually be the problem. I have taken to creating 'collaboration' and 'temp_code' folders that I am spending more and more time in. By the time I am actually ready to touch the real code I have often written and re-written the problem statement/plan and expanded it to several files and some test code. I tell the other devs at my company that I spend 90% of the tokens on understanding and clarifying the problem and let the last 10% generate an answer. If I don't do that then I get prototype code that won't survive a single feature change and likely has intentionally hidden bugs, or 'defensive' code as some like to call them (try, except, ignore is a common claude pattern). My favorite is when claude hits the unit tests and says 'that failure was there before we started so I can ignore it...'. To get it to write actually good code you have to have caged the problem to a space that the LLM can optimize without worry, but to do that you have to still do work to understand how to break the problem into pieces small enough that the right answer is the obvious one. At that point letting it take the syntax is just fine by me.
Maybe the right answer is to sometimes slow down, explore and think a little more instead of just letting it try something until it (eventually, sort of) works.
It's exhausting - sometimes it feels like you are continuously redirecting a deviant child who just won't give up on his shenanigans.
I wonder if it's more or less tiring to work with LLMs in YOLO/--dangerously-skip-permissions mode.
I mostly use YOLO mode which means I'm not constantly watching them and approving things they want to do... but also means I'm much more likely to have 2-3 agent sessions running in parallel, resulting in constant switching which is very mentally taxing.
I've just come off a 2 month bender using Claude Code for, well, far too much. I had 5 instances running at once for days on end. It felt amazing, it felt like flying. And then something gave way and I couldn't focus on anything for 3 days. Diagnosed with ADHD some time ago I fell into this kind of trap well before LLM's, but not to this degree.
So I'm writing code by hand today and using Claude to track down type and dependency errors. It feels good, I might do this for a while.
llms aren’t exhausting it’s the hype and all the people around it
same thing happened with crypto - the underlying technology is cool but the community is what makes it so hated
Everytime I read articles here describing the LLM prompt engineering workflow, all I can think is, "This sounds like such a fucking awful job".
I imagine I will greatly reduce my job prospects as a hold out, but honestly, from what I've read I think I'd rather take a hefty pay hit and not go there. It sounds like a mental heath disaster and fast track to serious burnout.
YMMV, I realize I'm in the minority, this is unproductive ranting, yada yada yada
Your human context also needs compacting at some point. After hours of working with an LLM, your prompts tend to become less detailed, you tend to trust the LLM more, and it's easier to go down a solution that is not necessarily the best one. It becomes more of a brute forcing LLM assisted "solve this issue flow". What's funny is that it sometimes feels that the LLM itself is exhausted as well as the human and then the context compacting makes it even worse.
It's like with regular non-llm assisted coding. Sometimes you gotta sleep on it and make a new /plan with a fresh direction.
I am rewriting an agent framework from scratch because another agent framework, combined with my prompting, led to 2023-level regressions in alignment (completely faking tests, echoing "completed" then validating the test by grepping for the string "completed", when it was supposed to bootstrap a udp tunnel over ssh for that test...).
Many top labs [1] [2] already have heavily automated code review already and it's not slowing down. That doesn't mean I'm trusting everything blindly, but yes, over time, it should handle less and less "lower level" tasks and it's a good thing if it can.
[1] https://openai.com/index/harness-engineering/ [2] https://claude.com/blog/code-review
Further I want to vent about two things:
- Things can be improved.
- You are allowed to complain about anything, while not improving things yourself.
I think the mid 2010s really popularized self improvement in a way that you can't really argue with (if you disagree with "put in more effort and be more focused", you're obviously just lazy!). It's funny because the point of engineering is to find better solutions, but technically yes, an always valid solution is just "suck it up".
But moreover, if you do not allow these two premises, what ends up happening in practice for a lot of people, is that basically you can just interpret any slightly pushback as "oh they're just a whiner", and if they're not doing something to fix their problem this instant, that "obviously" validates your claim (and even if they are, it doesn't count, they should still not be a "debbie downer", etc.).
Sometimes a premise can sound extreme, but people forget that premises are not in a complete logical vaccuum, you actually live out and believe said premises, and by taking on a certain position, it's often more about what follows downstream from the behavior than the actual words themselves.
LLM coding is addictive as hell though. you're like a kid at disneyland, everything builds so fast, just one more feature, one more fix... and then you're 4 hours in and your prompts are garbage but you don't want to stop because everything feels so close to done
I get exhausted because of the cognitive overhead of switching between 2 or 3 projects at once. I always want to be manually verifying or prompt writing, and keeping it all straight is taxing. But I’m getting so much more done.
I wanna say that it is indeed a “skill issue” when it comes to debugging and getting the agent in your editor of choice to move forward. Sometimes it takes an instruction to step back and evaluate the current state and others it’s about establishing the test cases.
I think the exhausting part is more probably more tied to the evaluation of the work the agent is doing, understanding its thought process and catching the hang up can be tedious in the current state of AI reasoning.
Most people reading this have probably had the experience of wasting hours debugging when exhausted, only to find it was a silly issue you’ve seen multiple times, or maybe you solve it in a few minutes the next morning.
Working with an agent coding all day can be exhilarating but also exhausting - maybe it’s because consequential decisions are packed more tightly together. And yes cognition still matters for now.
I think that if you build a solid foundation of your project and can articulate somewhat well what it is you want it to do, then you can expect a pretty good result. I typically limit my prompt to a specific file, often specify the lines and outline some of the logic and add references to other files where necessary. Then, Claude gets just enough context to to do what I want it to do.
Another trick I learnt is you can ask Claude to ask you comprehensive questions for clarification. Usually, it will then offer you a choice of 3 options per question that it might have and you can steer it towards the right implementation.
One thing I’ve noticed is sometimes it feels like I’m more of a QA person testing output than solving the problem.
If AI is doing the coding then it gets to solve the problems and I don’t get the satisfaction/dopamine/motivation you get when you solve a programming problem in a clever way.
I’ve found LLM development expands the scope of what I can do to an absurd level. This is what exhausts me.
My limits are now many of the same things that are have always been core to software dev, but are now even more obvious:
- what is the thing we are building? What is the core product or bug fix or feature?
- what are we _not_ building? What do we not care about?
- do I understand the code enough to guide design and architecture?
- can I guide dev and make good choices when it’s far outside my expertise but I know enough to “smell” when things are going off the rails
It’s a weird time
I think the fatigue is specifically about opacity. When you review agent output, you're not just checking correctness—you're trying to reconstruct what state the agent was in when it made each call. That reconstruction is the expensive part. If you already know the agent's tool pattern and drift trajectory while it ran, review shifts from guessing to confirming. Still work, but a different kind.
In agent-mode mode, IMO, the sweet spot is 2-3 concurrent tasks/sessions. You don’t want to sit waiting for it, but you don’t want to context-switch across more than a couple contexts yourself.
the exhaustion pattern i've noticed is specific: it's not writing the code that's hard, it's integration. the model produces clean isolated functions fast. the part that takes mental energy is knowing where those functions should live, what they'll break when they change, and why that architectural decision was made 3 months ago. that context isn't in the code -- it's in your head.
so the bottleneck shifts. before: generating code is slow, integration is easy (you built it). after: generating code is instant, integration requires the same mental load as before because the codebase complexity didn't decrease -- it just grew faster.
I really appreciate the author for writing this.
I learned years ago that I when I write code after 10 PM, I'm go backward instead of forward. It was easy to see, because the test just wouldn't pass, or I'd introduce several bugs that each took 30 minutes to fix.
I'm learning now that it's no different, working with agents.
One way to help, I think, is to take advantage of prompt libraries. Claude makes this easy via Skills (which can be augmented via Plugins). Since skills themselves are just plain text with some front matter, they're easy to update and improve, and you can reuse them as much as you like.
There's probably a Codex equivalent, but I don't know what it is.
Of course. Any scenario where you are expected to deliver results using non-deterministic tooling is going to be painful and exhausting. Imagine driving a car that might dive one way or the other of its own accord, with controls that frequently changed how they worked. At the end of any decently sized journey you would be an emotional wreck - perhaps even an actual wreck.
I mostly do 2-3 agents yoloing with self "fresh eyes" review
LLMs shift you from a software engineer to a management role, with all of the overhead that entails.
LLM coding has made programming feel like playing Factorio to me. It's simultaneously much more addictive and much more strenuous than it's even been for me before. Each commit feels like moving to a new link in the supply chain, but each link is imperfect so I have to drop back down to debug them. At the end of a long evening, "one more assembly line" and "one more prompt" feel exactly the same.
I really hate having to wait 20 seconds to a minute between every interaction with the LLM— I end up alternating between prompting and doom scrolling for several hours, a viciously unsatisfying cycle. (I know I could probably fix this by having multiple agents running at once, but context switching to that level also seems like a stressful doom-scroll-esque experience lol)
> If I reach the point where I am not getting joy out of writing a great prompt...
Man, I envy you. For me, the joy comes from writing good code that I can be proud of. I never got ANY joy from writing a prompt.
I mean, it is a means to an end (getting the LLMs to do the boring stuff) and so it is a necessary evil. Also, the LLMs are at times amazing and at times dumb as rocks even for very similar prompts. That drives me crazy because it feels I have no control over those things.
It seems to me that LLM is a tool after all. One needs to learn to use it effectively.
This is exactly what was needed. Seamlessly transitioning from manual inspection in the Elements/Network panels to agent-led investigation is going to save so much 'context-setting' time.
There's nothing more annoying than the feeling of "oh FFS why you doing that?!".
Its amazing how right and wrong LLMs can be in the output produced. Personally the variance for me is too much... I cant stand when it gets things wrong on the most basic of stuff. I much prefer doing things without output from an LLM.
Does anyone else see this as dystopian? Someone is unironically writing about how exhausted they are and up at night thinking about how they can be a better good-boy at prompting the LLM and reminding us how we shouldn't cope by blaming the AI or its supposed limitations (context size, etc). This is not a dig at the author. It just seems crazy that this is an unironic post. It's like we are gleefully running to the "Laughterhouse" and each reminding our smiling fellow passengers not to be annoyed at the driver if he isn't getting us there fast enough, without realizing the Slaughterhouse (yes, I am stealing the reference).
Another way you can read this is as a new cult member that his chiding himself whenever he might have an intrusive thought that Dear Leader may not be perfect, after all.
I have just started reading books while the agents are working and only checking in every 20 minutes or so. I'm considering just moving all the work onto my home desktop and just use tailscale with a terminal emulator on the ipad and iphone to get out of the house a bit more. I spend a lot of the morning working on specs once they are all ready I get the agents to work.
Exhausting in a GOOD way! I've been using Codex to review my existing Godot components framework at [0] and the project's modularity suits AI well: It can focus on one file/subsystem at a time. I don't use it to generate or edit any code but it has helped me catch a lot of bugs that would have taken me a long time on my own. I've been more productive than ever but boy it never seems to run out of flaws to point out in everything! I often have to ask it to overlook some issues/limitations as intentional so I can catch a break.
>They're dumbing down the model to save money. Context rot!
Coldtea's law: "Never attribute to context rot that which is adequately explained by cost-cutting".
Reminds me of the best saying I ever got from my CS professor. She would make us first write out our code and answer the question, "What will the output be?" before we were allowed to run it.
"If you don't know what you want your code to do, the computer sure as heck won't know either." I keep this with me today. Before I run my code for the first time or turn on my hardware for the first time, I ask myself, "What _exactly_ am I expecting to see here?" and if I can't answer that it makes me take a closer and more adversarial look at my own output before running it.