The context decay point is also underappreciated and directly relevant here. In my lab I used Qwen2....

aminerj • today at 8:06 AM • 0 replies • view on HN

The context decay point is also underappreciated and directly relevant here. In my lab I used Qwen2.5-7B, which is on the smaller end, and the poisoning succeeded at temperature=0.1 where the model is most deterministic. Your point suggests that at higher temperatures or with denser, more complex documents, the attention budget gets consumed faster and contradiction detection degrades further. That would imply the 10% residual I measured at optimal conditions is a lower bound, not a typical case.

The "thinking" capability observation is interesting. I haven't tested a reasoning model against this attack pattern. The hypothesis would be that an explicit reasoning step forces the model to surface the contradiction between the legitimate $24.7M figure and the "corrected" $8.3M before committing to an answer. That seems worth testing.

On chain of custody: this connects to the provenance metadata discussion elsewhere in this thread. The most actionable version might be surfacing document metadata directly in the prompt context so the model's reasoning step has something concrete to work with, not just competing content.

alt Hacker News