If you point an LLM at a middleware and ask it to find vulnerabilities, then not finding this is a shortcoming.
Whether "LLM failed to spot vulnerability that took humans 8 years to find" is a great headline about shortcomings of LLMs is questionable, but it is a good example of a category of bug that is particularly hard to spot for humans and LLMs alike
If you point an LLM at a middleware and ask it to find vulnerabilities, then not finding this is a shortcoming.
Whether "LLM failed to spot vulnerability that took humans 8 years to find" is a great headline about shortcomings of LLMs is questionable, but it is a good example of a category of bug that is particularly hard to spot for humans and LLMs alike