logoalt Hacker News

AlotOfReadingtoday at 1:48 AM1 replyview on HN

Something being simultaneously described as a "30 sheet, mind-numbingly complex Excel model" and "testable" seems somewhat unlikely, even before we get into whether Claude will be able to test such a thing before it runs into context length issues. I've seen Claude hallucinate running test suites before.


Replies

martinaldtoday at 1:55 AM

It compacted at least twice but continued with no real issues.

Anyway, please try it if you find it unbelievable. I didn't expect it to work FWIW like it did. Opus 4.5 is pretty amazing at long running tasks like this.

show 2 replies