Hacker News

CGMthrowaway, yesterday at 12:50 AM

They compressed the compression? Or identified an embedding that can "bootstrap" training with a head start?

Not a technical person, just trying to put it in other words.


Replies

mapontosevenths, yesterday at 5:08 AM

To use an analogy: imagine a spreadsheet with 500 smoothie recipes, one in each row, each with a dozen ingredients as the columns.

Now imagine you discover that all 500 are really just the same 11 base ingredients plus something extra.

What they've done here is use SVD (which is normally used for image compression and noise reduction) to find that "base recipe". Now we can reproduce those other recipes by only recording the one ingredient that differs.
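To make the analogy concrete, here is a minimal sketch in Python with NumPy. It is not the paper's method or code (which hasn't been released); the data is synthetic, and names like `shared_base`, `recipes`, and the noise level are assumptions made up for illustration. It builds 500 recipes that share 11 fixed ingredient amounts plus one varying extra, then uses SVD to recover a one-dimensional "base" from which every recipe can be rebuilt.

```python
import numpy as np

def shared_base(recipes, k):
    """Hypothetical helper: find a k-dimensional subspace shared by all rows.

    Returns the mean recipe, the top-k directions from SVD, the per-recipe
    coefficients along those directions, and all singular values.
    """
    mean = recipes.mean(axis=0)
    centered = recipes - mean
    # Thin SVD: centered = U @ diag(S) @ Vt
    U, S, Vt = np.linalg.svd(centered, full_matrices=False)
    basis = Vt[:k]                 # shape (k, n_ingredients)
    coeffs = centered @ basis.T    # shape (n_recipes, k)
    return mean, basis, coeffs, S

# Synthetic data (an assumption, mirroring the smoothie analogy).
rng = np.random.default_rng(0)
n_recipes, n_ingredients = 500, 12
base = rng.uniform(0.5, 2.0, size=n_ingredients)        # the common base amounts
recipes = np.tile(base, (n_recipes, 1))
recipes[:, -1] = rng.uniform(0.0, 3.0, size=n_recipes)  # the one "extra" that differs
recipes += 0.01 * rng.normal(size=recipes.shape)        # small measurement noise

mean, basis, coeffs, S = shared_base(recipes, k=1)

# Rebuild every recipe from the shared base plus one number per recipe.
approx = mean + coeffs @ basis
max_error = np.abs(recipes - approx).max()

stored_full = recipes.size                              # 500 * 12 numbers
stored_compact = mean.size + basis.size + coeffs.size   # 12 + 12 + 500 numbers
print(f"largest singular values: {S[:3]}")
print(f"max reconstruction error: {max_error:.4f}")
print(f"numbers stored: {stored_compact} instead of {stored_full}")
```

In this toy setup the first singular value dominates the rest, which is the signal that one shared direction explains almost all the variation between recipes. Real model weights would need a larger `k`, and how large is exactly the kind of question the paper is about.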

More interestingly, it might tell us something new about smoothies in general to know that they all share a common base. Maybe we can even build a simpler base using this info.

At least in theory. The code hasn't actually been released yet.

https://toshi2k2.github.io/unisub/#key-insights

vlovich123, yesterday at 1:16 AM

They identified that the compressed representation has structure to it that could potentially be discovered more quickly. It’s unclear whether that would also make it easier to compress further, but that’s possible.