Hacker News

visarga, today at 5:19 AM

> the interpretability research from Anthropic [1] suggests that structures corresponding to meaning do exist inside those bundles of numbers and that there are signs of activity within those bundles of numbers that seem analogous to thought

I did a simple experiment - took a photo of my kid in the park, showed it to Gemini and asked for a "detailed description". Then I took that description and put it into a generative model (Z-Image-Turbo, a new one). The output image was almost identical.

So one model converted the image to text, and the other reversed the process. The photo was completely new, personal, never put online. So it was not in any training set. How did these 2 models do it if not actually using language like a thinking agent?

https://pbs.twimg.com/media/G7gTuf8WkAAGxRr?format=jpg&name=...


Replies

happosai, today at 6:12 AM

> How did these 2 models do it if not actually using language like a thinking agent?

By having a gazillion other, almost identical pictures of kids in parks in their training data.