I have been using Claude to generate OpenSCAD for 3D printing. It works decently when the job is simple and easy to describe, but the description step makes it clear how little vocabulary an ordinary person has for painting an accurate picture of any real object that isn't just a basic shape. As with most complicated tasks you hand to an LLM, the trick seems to be that you already need to be an expert in the field.
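For instance, a part that maps cleanly from a one-sentence description (my own illustration, not an actual Claude output) might be a plain washer:

```openscad
// "A washer, 20 mm outer diameter, 5 mm hole, 3 mm thick"
difference() {
    cylinder(h = 3, d = 20, $fn = 64);       // outer disc
    translate([0, 0, -1])
        cylinder(h = 5, d = 5, $fn = 64);    // through-hole, extended past both faces
}
```

Anything much beyond primitives and boolean operations, though, quickly runs into the vocabulary problem.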
One workaround might be to have a multimodal model describe what it sees in an image, then employ another LLM to turn that textual description into code; multimodal models are already good at describing images.
Even a hand-drawn sketch could be a good starting point for that kind of image recognition.