
johnea · yesterday at 9:31 PM

I generally agree with the concerns of this article, and I wonder about the theory that LLMs have an innate inclination to generate bloated code.

Even in this article, though, I feel there is a lot of anthropomorphization of LLMs.

> LLMs and their limitations when reasoning about abstract logic problems

As I understand them, LLMs don't "reason" about anything. It's purely a statistical sequencing of words (or other tokens) as determined by the training set and the prompt. Please correct me if I'm wrong.
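To make concrete what I mean by "statistical sequencing", here's a toy sketch of next-token generation. This is nothing like a real transformer (the probability table is made up, and a real model computes probabilities from learned weights over a huge vocabulary), but the loop structure is the point: each token is sampled from a conditional distribution over the preceding tokens, with no explicit reasoning step anywhere.

```python
import random

def next_token_probs(context):
    # A trained model would derive these probabilities from learned
    # weights; this hard-coded table exists purely for illustration.
    table = {
        ("the",): {"cat": 0.5, "dog": 0.3, "code": 0.2},
        ("the", "cat"): {"sat": 0.7, "ran": 0.3},
        ("the", "dog"): {"barked": 0.6, "slept": 0.4},
    }
    return table.get(tuple(context[-2:]), {"<eos>": 1.0})

def generate(prompt, max_tokens=5):
    tokens = list(prompt)
    for _ in range(max_tokens):
        probs = next_token_probs(tokens)
        # Sample the next token in proportion to its conditional
        # probability -- likelihood given the context, nothing more.
        choices, weights = zip(*probs.items())
        token = random.choices(choices, weights=weights)[0]
        if token == "<eos>":
            break
        tokens.append(token)
    return tokens

print(generate(["the"]))  # e.g. ['the', 'cat', 'sat']
```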

Also, regarding the theory that the models may be biased to produce bloated code: I've posted this once already, no one has replied yet, and I still wonder:

----------

To me, this represents one of the most serious issues with LLM tools: the opacity of the model itself. The code (if provided) can be audited for issues, but the model, even if examined, is an opaque statistical amalgamation of everything it was trained on.

There is no way (that I've read of) to identify biases, or intentional manipulations of the model, that would cause the tool to yield certain intended results.

There are examples of DeepSeek generating results that refuse to acknowledge Tiananmen Square, etc. These show how the generated output can be intentionally biased, without any way to readily predict this general class of bias by analyzing the model data.

----------

I'm still looking for confirmation or denial on both of these questions...