Hacker News

kgeist last Sunday at 2:57 PM

>The model uses this internal error signal (the gradient) as a mathematical equivalent of saying, "This is unexpected and important!" This allows the Titans architecture to selectively update its long-term memory only with the most novel and context-breaking information

So one can break a model by consistently feeding it random, highly improbable junk? Everything would register as a surprise and get stored, impacting future interactions.
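
To make the worry concrete, here is a rough sketch of that naive reading in PyTorch: a toy linear associative memory that gets written to whenever the gradient of a pair's prediction error is large. The memory shape, the threshold, and every name are invented for illustration; this is not the actual Titans code.

    # Naive reading of "gradient as surprise": store anything whose
    # prediction-error gradient is big. All names here are illustrative.
    import torch

    dim = 16
    M = torch.zeros(dim, dim)      # toy linear memory: maps keys to values
    threshold = 1.0                # arbitrary surprise cutoff

    def maybe_store(k, v, lr=0.1):
        global M
        M_var = M.clone().requires_grad_(True)
        loss = ((M_var @ k - v) ** 2).sum()     # prediction error for this pair
        (grad,) = torch.autograd.grad(loss, M_var)
        surprise = grad.norm()                  # "this is unexpected and important!"
        if surprise > threshold:
            M = M - lr * grad                   # write only "surprising" pairs
        return surprise.item()

    # Random, highly improbable junk always produces large errors, so under
    # this naive rule every junk pair gets written and crowds the memory.
    for _ in range(5):
        k, v = torch.randn(dim), torch.randn(dim)
        print(round(maybe_store(k, v), 2))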


Replies

andy12_ last Sunday at 5:33 PM

This is an oversimplification of what Titans does. The model performs nested learning, where the model learns during inference, and during training the model weights learn _how and what_ to learn during inference. If the input contains junk or irrelevant information, the model most likely learned during training to assign low-surprise query and key embeddings to those tokens, because learning those junk tokens would have hurt its overall ability to predict subsequent tokens (and thus would have increased the training loss).
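
Roughly, the nesting can be pictured like this: an outer, trained gate decides how strongly each incoming token is allowed to update the inner, test-time memory, so tokens the gate has learned to treat as junk barely touch it. The GatedMemory class, the single linear gate, and the learning rate below are all invented for illustration; the real Titans update rule (momentum, weight decay, learned projections) is considerably richer.

    # Schematic of nested learning: the gate's weights come from the outer
    # training loop; the memory M is updated by an inner, inference-time step.
    import torch
    import torch.nn as nn

    dim = 16

    class GatedMemory(nn.Module):
        def __init__(self):
            super().__init__()
            self.gate = nn.Linear(dim, 1)                     # learned offline
            self.register_buffer("M", torch.zeros(dim, dim))  # test-time memory

        def write(self, k, v, lr=0.1):
            # Inner step: gradient of this pair's prediction error = raw surprise.
            M = self.M.clone().requires_grad_(True)
            loss = ((M @ k - v) ** 2).sum()
            (grad,) = torch.autograd.grad(loss, M)
            # Outer loop's verdict: tokens it learned to ignore get a gate
            # near 0, so their "surprise" barely changes long-term memory.
            g = torch.sigmoid(self.gate(k)).detach()
            self.M -= lr * g * grad

    mem = GatedMemory()
    k, v = torch.randn(dim), torch.randn(dim)
    mem.write(k, v)

During training, gradients would flow through that whole write step so the gate learns which tokens are worth memorizing; here the gate is randomly initialized, so treat it purely as a schematic of the mechanism, not a faithful reproduction of it.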

pmichaud last Sunday at 3:00 PM

I’m guessing that this is the first thing they thought of and the problem only exists in the superficial gloss you’re responding to?

bethekidyouwant last Sunday at 4:00 PM

In what world can you not always break the response of an AI by feeding it a bunch of random junk?

idiotsecant last Sunday at 3:32 PM

This is the start of what I always thought an AI should have - a limbic system. Humans don't store memory based on novelty; they store it based on emotional content. This is where I was afraid of the tiger, this is where I smelled delicious food, this was what it felt like when I was victorious in the hunt.

AI needs an internal emotional state because that's what drives attention and memory. AI needs to want something.

falcor84 last Monday at 3:34 PM

I read that this works on humans too. Minds can break.

photochemsyn last Sunday at 5:08 PM

This is no different from what happens to humans who are locked into cult-programming situations: they'll start believing and regurgitating all kinds of nonsense if their information stream is tightly curated.

Practically, for use in a codebase development effort, if the model remembers the original design decisions and the discussions about costs and benefits, and can recall all of that much later in the process, it's going to get really good at thinking about what the next step is, or even at deciding when a major refactor is needed, etc.