One fascinating aspect of LLMs is that they make out-in-the-wild anecdotes instantly reproducible, or at least directly comparable to results from others who get different outcomes.
A lot of our bad experiences with, say, customer support hotlines, municipal departments, or bad high school teachers are associated with a habit of speaking that adds flavor and vibes, or bends experiences into on-the-nose stories with morals, in part because we know they can't be reviewed or corrected by others.
Bringing that same way of speaking to LLMs can show us either (1) the gap between what the model did and how people describe what it did, or (2) that people really are being treated differently by the same LLMs. I think both are fascinating outcomes.
We're also seeing a new variant of Cunningham's law:
The best way to get the right answer from an LLM is not to ask it the right question; it's to post online that it got the wrong answer.
> One fascinating aspect of LLMs is they make out-in-the-wild anecdotes instantly reproducible
How? I would argue they do the exact opposite of that.
LLMs are definitely not instantly reproducible. The temperature setting adjusts randomness, and the models are frequently re-optimized and fine-tuned. You will get very different results depending on what is in your context, and with a tool like Microsoft Copilot you have no idea what is in the context. There are also bugs in the tools that wrap the LLM.
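To make that concrete, here's a minimal sketch using the OpenAI Python client (the model name, prompt, and seed are illustrative, not from the original discussion): even with temperature pinned to 0 and a fixed seed, the API only promises best-effort determinism, so two runs of the same prompt can still differ.

```python
# Minimal sketch, assuming the OpenAI Python client and an API key in the
# environment. Pinning temperature and seed reduces variance but does not
# guarantee it: backend updates and batching effects can still change output.
from openai import OpenAI

client = OpenAI()

def ask(prompt: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative; a pinned snapshot is more stable
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # minimize sampling randomness
        seed=42,        # best-effort determinism, not a guarantee
    )
    return resp.choices[0].message.content

# Run the same prompt twice; identical output is likely but not promised.
a = ask("Write a one-line bash script that counts files in the cwd.")
b = ask("Write a one-line bash script that counts files in the cwd.")
print(a == b)
```

And that's the easy case: with a chat product rather than the raw API, you control none of these knobs.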
Just because other people on here say “worked for me” doesn't invalidate the OP's claim. I have had similar experiences where an LLM tells me “here is a script that does X” and there is no script to be found.