I appreciate the directness of calling LLMs "Bullshit machines." This terminology for LLMs is well established in academic circles and is much easier for laypeople to understand than terms like "non-deterministic." I personally don't like the excessive hype on the capabilities of AI. Setting realistic expectations will better drive better product adoption than carpet bombing users with marketing.
If it bullshits so much, you wouldn't have a problem giving me an example of it bullshitting on ChatGPT (paid version)? Lets take any example of a text prompt fitting a few pages - it may be a question in science or math or any domain. Can you get it to bullshit?
I have still mixed feelings about LLMs.
If I take the example of code, but that extends to many domains, it can sometimes produce near perfect architecture and implementation if I give it enough details about the technical details and fallpits. Turning a 8h coding job into a 1h review work.
On the other hand, it can be very wrong while acting certain it is right. Just yesterday Claude tried gaslighting me into accepting that the bug I was seeing was coming from a piece of code with already strong guardrails, and it was adamant that the part I was suspecting could in no way cause the issue. Turns out I was right, but I was starting to doubt myself