There was a discussion about a very similar model (Qwen3-based) a few weeks ago:
https://news.ycombinator.com/item?id=46319826
I found it particularly thought-provoking how a model trained on data from that time period completely lacks any context for, or understanding of, what it itself is. But then I realized that we are the same (at least for now).