I just tested this with Claude Code and Opus 4.6, with the following prompt:
"I have an arbitrary width rectangle that needs to be broken into smaller random width rectangles (maintaining depth) within a given min/max range. The solution needs to be highly performant from an algorithmic standpoint, well-tested using TDD and Red/Green testing, written in python, and not have any subtle errors."
It got the answer you ended up with (if I'm understanding you correctly) on the first attempt, in just over 2 minutes of work, and included a solid test suite covering edge cases and input validation.
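(For illustration only — this is not the code Claude produced, which isn't posted in this thread — here is a minimal sketch of one common approach to the problem in the prompt: pick a feasible piece count, then hand out the leftover width randomly while keeping every piece inside the min/max bounds. The function name, integer-width assumption, and overall structure are my own, not from either commenter.)

```python
import math
import random


def split_width(total, min_w, max_w, rng=None):
    """Split integer `total` into random integer pieces, each in [min_w, max_w]."""
    if min_w <= 0 or max_w < min_w:
        raise ValueError("require 0 < min_w <= max_w")
    rng = rng or random.Random()

    # Feasible piece counts n satisfy n * min_w <= total <= n * max_w.
    n_min = math.ceil(total / max_w)
    n_max = total // min_w
    if n_min > n_max:
        raise ValueError(f"no valid split of {total} with pieces in [{min_w}, {max_w}]")
    n = rng.randint(n_min, n_max)

    # Start every piece at min_w, then distribute the surplus piece by piece,
    # never exceeding max_w and always leaving enough headroom in the
    # remaining pieces to absorb whatever surplus is left.
    widths = [min_w] * n
    surplus = total - n * min_w
    headroom = max_w - min_w
    for i in range(n):
        later_headroom = (n - i - 1) * headroom
        extra = rng.randint(max(0, surplus - later_headroom),
                            min(surplus, headroom))
        widths[i] += extra
        surplus -= extra
    return widths  # sums to exactly `total`; O(n) time


if __name__ == "__main__":
    ws = split_width(100, 5, 20)
    assert sum(ws) == 100 and all(5 <= w <= 20 for w in ws)
    print(ws)
```

Note this sampler is O(n) but not uniform over all valid partitions (earlier pieces are drawn under slightly different constraints than later ones); the prompt didn't specify uniform randomness, so a simple feasible sampler like this is one plausible reading of "highly performant."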
How can we verify if you don't post the code?
I appreciate you testing, even though it's not a great comparison:
- My iterative feedback cycle of LLM prompting forced me to be more explicit with each call, which benefited your prompt: I had already distilled exactly what to look for, so far fewer nuances were left unstated.
- Maybe GPT 5.1 is simply older or kneecapped compared to newer GPT versions
- Maybe Opus/Claude is just a way better model :P
Please post the code!
Edit: Regarding "exactly what to look for": when solving a new problem, rarely is all the nuance available on the first iteration.