I am not doing any of this.
It becomes obsolete in literally weeks, and it also doesn't work 80% of the time. Like, why write an MCP server for custom tasks when I don't know if the LLM is going to reliably call it?
My rule for AI has been steadfast for months (years?) now. I write documentation for myself (templates, checklists, etc.) by hand, not with AI, because otherwise I spend more time guiding the AI than thinking about the problem. I give the AI a chance to one-shot the task in seconds; if it can't, I either review my documentation or just do it manually.
The ability of newer agents to develop plans that can be reviewed and, most importantly, to run a build-test-modify cycle has really helped. You can task an agent with some junior-programmer task and then go off and do something else.
An alternative is to view the AI agent as a new developer on your team. If existing guidance + one-shot doesn't work, revisit the documentation and guidance (i.e., the .md guidance file), see what's missing, improve it, and try again. Like telling a new engineer, "actually, here is how we do this thing." The engineer learns and gets it right next time.
I don't do MCPs much because of effort and security risks. But I find the loop above really effective. The alternative (one-shot or ignore) would be like hiring someone, then if they get it wrong, telling them "I'll do it myself" (or firing them)... But to each his own (and yes, AI are not human).
I agree. Software development is on an ascent to a new plateau. We have not reached that yet. Any skill that is built up now is at best built on a slope.
I've found that if it can't get it right within a few iterations, it's generally better to switch to writing with auto-complete, which is still quite quick compared to the days of old.
I think both are helpful:
1. starting fresh, because of context poisoning / long-term attention issues
2. lots of tools make the job easier if you give them a tool discovery tool (based on Anthropic's recent post); see the sketch below
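Roughly what I mean by a tool discovery tool, as a minimal sketch: the agent gets one search tool up front instead of every schema in context. The registry, keyword matching, and names (TOOLS, search_tools, call_tool) here are hypothetical, not Anthropic's actual implementation.

```python
# Minimal sketch of a "tool discovery tool": expose one search entry point
# instead of every tool schema. Registry and matching are hypothetical.

TOOLS = {
    "get_weather": {
        "description": "Return current weather for a city",
        "fn": lambda city: f"Sunny in {city}",
    },
    "create_ticket": {
        "description": "Open an issue in the bug tracker",
        "fn": lambda title: f"Created ticket: {title}",
    },
}

def search_tools(query: str, limit: int = 3) -> list[dict]:
    """The only tool exposed up front: lets the model discover other
    tools by keyword instead of loading every schema into context."""
    words = query.lower().split()
    hits = [
        {"name": name, "description": spec["description"]}
        for name, spec in TOOLS.items()
        if any(w in spec["description"].lower() for w in words)
    ]
    return hits[:limit]

def call_tool(name: str, **kwargs):
    """Dispatch a call once the model has picked a discovered tool."""
    return TOOLS[name]["fn"](**kwargs)

# e.g. the model first asks search_tools("weather"), sees get_weather,
# then issues call_tool("get_weather", city="Berlin").
```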
We don't have reliable ways to evaluate all the prompts and related tweaking. I'm working towards this with my agentic setup. I added time travel for sessions based on Dagger yesterday, with forking and cloning; a registry probably comes today.
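The kind of evaluation loop I mean, as a minimal sketch: score each prompt variant against fixed cases with simple checks. run_prompt and the cases are hypothetical stand-ins, not my actual Dagger setup.

```python
# Minimal sketch of scoring prompt variants against fixed test cases.
# run_prompt is a placeholder for whatever actually calls the model.

def run_prompt(prompt: str, case_input: str) -> str:
    # Placeholder: in a real setup this would call the model/agent.
    return f"{prompt}: {case_input}"

PROMPTS = {
    "v1": "Summarize in one sentence",
    "v2": "Summarize in one sentence, plain language, no jargon",
}

CASES = [
    {"input": "The build failed because of a missing header.",
     "check": lambda out: len(out) < 200},
    {"input": "Latency regressed after the cache change.",
     "check": lambda out: "latency" in out.lower()},
]

def score(prompt: str) -> float:
    """Fraction of cases whose check passes for this prompt variant."""
    passed = sum(1 for c in CASES if c["check"](run_prompt(prompt, c["input"])))
    return passed / len(CASES)

for name, prompt in PROMPTS.items():
    print(name, score(prompt))
```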
A perspective which has helped me is viewing LLM-based offerings strictly as statistical document generators, whose usefulness is entirely dependent upon their training data set plus model evolution, and whose usage is best modeled as a form of constraint programming[0] lacking a formal (repeatable) grammar. As such, and considering the subjectivity of natural languages in general, the best I hope for when using them is quick iterations consisting of refining constraint sentence fragments.
Here is a simple example which took 4 iterations using Gemini to get a result requiring no manual changes:
EDIT: For reference, a hand-written script satisfying the above (excluding comments for brevity) could look like:
[0] https://en.wikipedia.org/wiki/Constraint_programming