I feel like I must have plateaued and don't know what to do next to level up. I'm currently...

tunesmith • yesterday at 5:43 PM • 21 replies

I feel like I must have plateaued and don't know what to do next to level up. I'm currently on the $100/month codex plan and it seems fine using 5.5-xhigh all the time. I think of what to do next, have a chat session to determine exactly what to ask for up to the point of being ready to implement, and then codex churns on a commit-sized task, whereupon I briefly check it on my local dev server. If necessary I ask for a change. Then I ask it to commit and recommend the next step based on the spec. Oftentimes I have to "approve" an out-of-sandbox request anyway.

I haven't found anything that requires running all night. I could tell it to one-shot a big plan but given how often I realize I want an intermediary thing to be slightly different it seems like a waste of effort.

I'm guessing the next thing I should probably look into is some sort of VM I can tunnel my codex-gui requests to so I don't have to deal with the sandbox approvals (I don't want to give it "dangerous" access to my entire mac).

I don't understand what people are doing with their side projects that is leading them to churn through tokens so quickly, to the point of requiring two $200/month subscriptions and a bunch of token charges besides.

Replies

vitally3643 • yesterday at 7:57 PM

That's because you're treating the problem as an engineer instead of an "influencer" or "10xer" or whatever. You're treating it as a problem to be solved with engineering and AI is merely a tool to do so. It is, in my experience, vanishingly rare for an engineer to have a problem that needs to be solved with multiple hours of unattended AI code generation.

I've only found one single application where it makes even the slightest amount of sense to have an AI grind away for hours on end. I'm reverse engineering a widget which contains five separate firmware images. I've dumped the binary from the widget and I set the AI to decompile and reverse engineer these interrelated firmware projects. It's a complex task, but very well bounded. It's not complicated work, but it's a lot of work, and the end result is a C-shaped pile of text that is only informative; it never would be compilable on its own even if I did it by hand. The quality of the output is tightly bounded by the input assembly and the overall output artifact is documentation in the shape of code.

I don't have any qualms about letting an AI go ham on it unattended because the stakes are zero. But if the AI can beat the assembly into a recognizable C project, it's much easier for me to read and reason about. Easy win, I think.

albertgoeswoof • yesterday at 6:47 PM

I’ve watched a bunch of layman videos where they create stuff with AI, these people burning through 12 hour tasks are literally not reading the output or understanding what it’s doing. Like they’ll ask for a program, and then right after it’s been created they ask the AI how to run it. Then when there’s a bug, they ask the AI what went wrong, or scrap the entire thing and switch model/harness and try again.

Here’s an example https://m.youtube.com/watch?v=xc1296HY8Fw&ra=m

It’s completely different to a professional workflow (what you described). It’s a toy for consumers

gerdesj • yesterday at 11:51 PM

"I feel like I must have plateaued and don't know what to do next to level up."

Go out for a walk. Wherever you live, there will be a destination or an environment that will enrich your life just by visiting it. Go and take a look at it or experience it and then go back to worrying about tokens.

wrs • yesterday at 7:53 PM

>I think of what to do next

As everyone trying to do real work is finding, that's the actual bottleneck. If the system is keeping up with your thinking, you're doing fine. You can't "level up" your thinking by paying for more tokens. The people doing more automatic stuff are probably outpacing their own thinking, and that will bite them eventually.

calgoo • yesterday at 7:01 PM

I have downgraded my Claude to the $20 one, and basically only use it for the web chat right now. For coding, I use DeepSeek at API rates, configured in Claude Code. I have spent around $4.8 for 320,000,000 tokens. I always felt like I was not using the Claude plan enough, that I had to have the LLM working on something all the time to justify the price. Now with DeepSeek I don't think about it anymore. I don't feel bad when not using the subscription, and I don't worry about limits as I just pay more. Where I really felt this was in running things in parallel, as there are no hourly limits anymore!
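
For scale, the figures quoted above work out to a very low effective rate. A minimal sketch of the arithmetic, assuming the $4.8 and 320,000,000 tokens are totals blended across input, output, and any cached tokens (the comment does not break them down); the function name is made up for illustration:

```python
def cost_per_million_tokens(total_cost_usd: float, total_tokens: int) -> float:
    """Effective blended price in USD per one million tokens."""
    if total_tokens <= 0:
        raise ValueError("total_tokens must be positive")
    return total_cost_usd / (total_tokens / 1_000_000)


# Figures taken from the comment above.
rate = cost_per_million_tokens(4.8, 320_000_000)
print(f"${rate:.3f} per million tokens")  # $0.015 per million tokens
```

The same function can be pointed at any bill to compare a flat subscription against pay-as-you-go usage.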

gaflo • yesterday at 10:57 PM

Can I ask what exactly you are building? Your experience tracks for me when building a real product -- something I want other people to use. Most of my time on these projects is spent talking to my users and carefully refining my requirements and design.

For personal pet projects I can definitely see how you can blow through your token budget very quickly. If I just point my coding agent to iteratively come up with some heuristics for some NP-hard problem, it will read intermediary outputs and constantly make small changes "in the dark" until it either finds a small improvement or gives up. In a similar vein I found that you can burn many many tokens if you try to let the agent reverse engineer something where you don't have the source code. If you just give it a binary or some interface to work with and a vague task you can easily burn your entire budget with 1 prompt.

I wouldn't want anyone to use these fully vibe coded toy projects though; it is more of an exploratory curiosity for me where I learn more about some problems I'm interested in as well as gauge how good the agents are at tasks that I seem to have a much better intuition on how to approach.

kapperchino • yesterday at 11:35 PM

On the topic of access control, I’m building a coding agent with no shell access; it currently only supports Rust, though. https://github.com/Kapperchino/agent-joe

wincy • yesterday at 7:04 PM

I’m using the $200-a-month Codex plan to work on a game for my kids, for fun and curiosity: I’m a dev and I’ve played games, but I’ve never done game dev. I do have all-night tasks, but mostly they’re “spend time tending to and adding stuff to my 3D asset pipeline”. My RTX 5090 runs Trellis2 -> ultrashapes -> Trellis2 -> wiring up rigging and setting up animations.

But like 99% of that task is just Codex waiting for the output. So it’ll run for 12 hours but mostly it’s just setting lots of sleeps. I haven’t gotten close to running out of tokens. On the $100-a-month Codex plan I hit usage limits almost immediately: about 3 days into working like crazy with 10 agents going at once, mostly coding an asset pipeline, I ran into my weekly limit and upgraded. So with the $200-a-month plan at 4x more credits I haven’t hit any walls at all and can absolutely cook.
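
The "lots of sleeps" pattern is roughly a polling loop: long wall-clock time, very little actual work per iteration. A minimal sketch, assuming the pipeline signals completion by writing an output file; the function, its parameters, and the file name in the usage note are hypothetical and not how Codex itself waits:

```python
import time
from pathlib import Path
from typing import Callable


def wait_for_output(
    path: Path,
    poll_seconds: float = 60.0,
    timeout_seconds: float = 12 * 3600,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until `path` exists or the timeout elapses.

    Returns True if the file appeared, False otherwise. `sleep` is
    injectable so the loop can be exercised without really waiting.
    """
    deadline = time.monotonic() + timeout_seconds
    while time.monotonic() < deadline:
        if path.exists():
            return True
        sleep(poll_seconds)  # almost all of the elapsed time is spent here
    return path.exists()
```

A call such as `wait_for_output(Path("out/mesh.glb"))` would then block for up to 12 hours while doing almost nothing, which is consistent with a long-running task that barely touches the token budget.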

dnautics • yesterday at 5:45 PM

I have been on $100/mo claude and it has been churning out quite good software for months now. I estimate it has produced what would have taken me three-ish years, assuming I didn't burn out from failure (I would have). I only hit limits when I double fisted claude with my main project and my side project. Just the other day I noticed I had been stuck on 4.5 because I failed to update the npm package.

PeterStuer • yesterday at 6:13 PM

I'm on $100 Claude. I have a setup with bespoke services on my local LAN that mitigates some high-token-consumption scenarios. I screen MCPs and hooks for cache poisoning. I run 100% on Opus with max effort, and never came close to hitting 5-hour or weekly limits before the Fable release. I am in Claude Code at least 20hrs a week.

I see people just completely wasting tokens with ridiculous setups, 100% hitting cache misses as well as dumping huge files into context all the time.

Just learn how these things work, or pay the price I guess.

bthornbury • yesterday at 11:10 PM

promote yourself to PM only and use agents for authoring, verification, tests, checking the tests

an orchestrator -> parallel subagents setup (investigation, authoring, verification, and benchmarking subagents, with integration / final verification handled by the parent) has improved my productivity too.

I feel like from here it's agent swarms against a whole spec, but I haven't got there yet.

Still getting plenty of bugs in the more complex scenarios, but mostly (in some projects) i never have to look at the code and treat it like a black box
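
The orchestrator -> parallel subagents layout described in this comment can be sketched roughly as follows. This is only an illustration of the fan-out / fan-in shape: `run_subagent` is a hypothetical stub standing in for a real agent call, and the role names are taken from the comment rather than from any particular tool.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass

# Roles named in the comment above.
ROLES = ("investigation", "authoring", "verification", "benchmarking")


@dataclass
class Report:
    role: str
    summary: str


def run_subagent(role: str, task: str) -> Report:
    # Hypothetical stand-in: a real setup would invoke an agent CLI or API here.
    return Report(role=role, summary=f"[{role}] finished: {task}")


def orchestrate(task: str) -> list[Report]:
    """Fan the task out to one subagent per role, then integrate in the parent."""
    with ThreadPoolExecutor(max_workers=len(ROLES)) as pool:
        # pool.map returns results in the same order as ROLES.
        reports = list(pool.map(lambda role: run_subagent(role, task), ROLES))

    # Integration / final verification stays with the parent.
    missing = set(ROLES) - {report.role for report in reports}
    if missing:
        raise RuntimeError(f"no report from: {sorted(missing)}")
    return reports


for report in orchestrate("add retry logic to the uploader"):
    print(report.summary)
```

Keeping the final check in the parent means a subagent that silently fails shows up as a missing report instead of a missing feature.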

sheremetyev • yesterday at 5:58 PM

> I don't want to give it "dangerous" access to my entire mac

I'm running Claude/Codex inside native macOS sandbox, configured with a simple script - https://github.com/sheremetyev/sandfence

always in "bypass permissions" mode - it works until the task is solved, sometimes 1 hour or more (which includes running tests etc)

seviu • yesterday at 8:54 PM

I usually hit the limit when I am frustrated and I don’t want to understand what the problem is.

I am an engineer, and when I understand what’s going on, I never hit any limit.

aerhardt • yesterday at 7:46 PM

Well, if you believe the people who sell the tokens, you should be creating loops that keep yanking the bandit’s arm.

rsanek • yesterday at 10:14 PM

While it's a little unstable, I've found Docker's sbx to be a great sandbox to run agents with --dangerously-skip-permissions

tchock23 • yesterday at 6:32 PM

Same boat here. I’m able to get a lot done on CC at $100/mo and feel like I’m not being creative or productive enough somehow when I hear of people blowing past that in a day.

hedgehog • yesterday at 7:17 PM

Patches to existing sizable codebases and reverse engineering binaries both can run a long time and use a lot of tokens without wandering off into the weeds.

dyauspitr • yesterday at 8:15 PM

Before I go to bed, I usually tell it to run the full regression suite and all the simulator tests, install simulators and take a screenshot of every page on all applicable devices, and do comprehensive fuzzing and chaos testing. It usually takes at least 3-4 hours, often longer, especially the UI/simulator tests.

coldtea • yesterday at 8:43 PM

>I feel like I must have plateaued and don't know what to do next to level up.

Why do you need to "level up"? To have it shit out slop faster?

Just use it rationally for what you need to do.
