Hacker News

rpdillon, last Thursday at 1:05 AM

My comment has been weirdly controversial, but I'm not sure why.

I get that LLMs have problems.

I was recently looking into the differences between a flash drive, an SSD, and an NVMe drive. Flash memory is one of the technologies I had in mind when I wrote my comment.

Flash has a bunch of problems. Each cell can only be erased and rewritten so many times before it dies. So it needs a wear-leveling layer: a controller that exposes a smaller, virtual storage space on top of the actual storage, spreads writes evenly across the physical cells, and avoids dead cells when they manifest.
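To make the wear-leveling idea concrete, here is a minimal sketch in Python. It is a toy model, not how any real controller firmware works: the class name `WearLevelingController`, the `max_erases` limit, and the "pick the least-worn free block" policy are all assumptions made for illustration.

```python
class WearLevelingController:
    """Toy model: logical blocks are remapped onto a larger pool of physical blocks."""

    def __init__(self, physical_blocks, logical_blocks, max_erases):
        if logical_blocks >= physical_blocks:
            raise ValueError("need spare physical blocks to level wear")
        self.logical_blocks = logical_blocks
        self.max_erases = max_erases              # hypothetical endurance limit
        self.erase_counts = [0] * physical_blocks
        self.data = [None] * physical_blocks
        self.mapping = {}                         # logical block -> physical block
        self.free = set(range(physical_blocks))   # healthy, unused physical blocks
        self.dead = set()                         # worn-out blocks, never reused

    def write(self, logical, value):
        if not 0 <= logical < self.logical_blocks:
            raise IndexError("logical block out of range")
        if not self.free:
            raise RuntimeError("no healthy free blocks left")
        # Always write to the least-worn free block, so wear spreads evenly.
        target = min(self.free, key=lambda b: (self.erase_counts[b], b))
        self.free.remove(target)
        self.data[target] = value
        old = self.mapping.get(logical)
        self.mapping[logical] = target
        if old is not None:
            self._erase(old)

    def _erase(self, block):
        self.data[block] = None
        self.erase_counts[block] += 1
        if self.erase_counts[block] >= self.max_erases:
            self.dead.add(block)                  # retire the block
        else:
            self.free.add(block)

    def read(self, logical):
        return self.data[self.mapping[logical]]


ctrl = WearLevelingController(physical_blocks=4, logical_blocks=2, max_erases=100)
for i in range(8):
    ctrl.write(0, f"version {i}")                 # hammer a single logical block
print(ctrl.read(0), ctrl.erase_counts)
```

Even though only logical block 0 is ever written, the erases end up spread across all four physical blocks instead of wearing out one of them.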

NVMe extends that with a protocol that supports a very high queue depth, which lets the controller reorder commands to maximize throughput, making NVMe drives more performant. Virtual address space + reordered operations = successful HDD replacement.
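A rough way to see why queue depth matters is the toy simulation below. It assumes a drive with several independent flash channels, where each command takes one tick and a channel can serve one command per tick. The function `ticks_to_complete` and the channel model are hypothetical simplifications, not the actual NVMe protocol or any real controller's scheduler.

```python
from collections import deque


def ticks_to_complete(commands, queue_depth):
    """Count ticks needed to finish all commands.

    Each command is just the id of the flash channel it targets. Per tick,
    the controller may issue at most one command per channel, chosen from
    the commands currently visible in its queue (up to queue_depth of them).
    """
    pending = deque(commands)
    window = []                       # commands the controller can currently see
    ticks = 0
    while pending or window:
        # The host refills the queue up to the allowed depth.
        while pending and len(window) < queue_depth:
            window.append(pending.popleft())
        busy = set()
        remaining = []
        for channel in window:
            if channel in busy:
                remaining.append(channel)   # channel taken this tick, wait
            else:
                busy.add(channel)           # issued out of order if needed
        window = remaining
        ticks += 1
    return ticks


workload = [0, 0, 1, 1, 2, 2, 3, 3]         # two commands for each of four channels
print(ticks_to_complete(workload, queue_depth=1))
print(ticks_to_complete(workload, queue_depth=8))
```

With a depth of 1 the controller only ever sees one command, so the channels sit idle most of the time. With a deeper queue it can pick commands that target different channels and run them in parallel.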

My point here is that LLMs are young, and that we're going to compose them into larger workflows that allow for predictable results. But that composition, and trial and error, take time. We don't yet have the remedies necessary to make up for the weaknesses of LLMs. I think we will as we explore more, but the technology is still young.

As for copyright infringement, I think copyright has been broken for a long time. It is too brittle in its implementation. Google did essentially the same thing as OpenAI when they indexed webpages, but we all wrote it off as fair use because traffic was directed to the website (presumably to aggregate ad revenue). Now that traffic is diverted from the website, everyone has an issue with the crawling. That is not a principled argument, but rather an argument centered around "Do I get paid?". I think we need to be more honest with ourselves about what we actually believe.