
jumploops, yesterday at 6:03 PM

Indirectly related, but has anyone else found repeatable success with pure markdown skills?

I’ve built a similar workflow (but for system design/execution) and it works surprisingly well with the frontier models.

The skill includes scripts to ensure the work was actually done/followed, but I’ve been testing it without the scripts and it does a decent job.

Yesterday, however, in GPT-5.5 xhigh[0], I noticed some hallucinations: the model stated it had created files when in fact it hadn’t.

A small hiccup like this is usually fine, since the model realizes sometime later that the files don’t exist, but in this particular instance it claimed the files were created and then just continued on.
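For illustration, the kind of check that would catch this case can be tiny. This is a minimal sketch, not the actual scripts from my skill: `missing_files` and its arguments are hypothetical names, and it assumes the harness can hand over the list of paths the model claims to have written.

```python
from pathlib import Path


def missing_files(claimed, root="."):
    """Return the claimed paths that do not exist as regular files under root.

    `claimed` is a hypothetical list of paths the model says it created;
    `root` is the working directory the skill runs in (an assumption).
    """
    base = Path(root)
    # Keep the original order so the report matches the model's own list.
    return [p for p in claimed if not (base / p).is_file()]
```

Running something like this after each step and failing the step on a non-empty result turns "the model said so" into a check against the filesystem.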

tl;dr - I fell into the trap of trusting markdown-only workflows, only to be bitten by the models hallucinating steps.

[0] xhigh was on, but no reasoning was presented in this particular turn, so it may have been a degradation of the LLM/harness.