The text rendering is quite impressive, but is it just me, or do all these generated 'realistic' images have a distinctly uncanny feel to them? I can't quite put my finger on what it is, but they just feel off to me.
The lighting is wrong, that's what's telling to me. They look too crisp. No proper shadows, everything looks crystal clear.
Everything is weightless. When real people stand and gesture, there’s natural muscle use, hair and clothing drape, and papers lie flat on surfaces.
At least for the real life pictures, there’s no depth of field. Everything is crystal clear like it’s composited.
Qwen has always suffered from its subpar RoPE implementation, and Qwen 2 seems to suffer from it as well. The uncanny feel is down to the sparsity of text-to-image tokens, and the higher in resolution you go, the worse it gets. It's why you can't take the higher end of the MP numbers seriously, no matter the model. At the moment there is no model that can go to 4K without problems; you will always get high-frequency artifacts.
Agree, looks like the same effect they are applying on YouTube Shorts...
For me, the only model that can really generate realistic images is nano banana pro (also known as gemini-3-pro-image). Other models are closing the gap, but this one is pretty meh at realistic images, in my opinion.
I agree. They make me nauseous. The same kind of light nausea as car sickness.
I assume our brains are used to stuff that we don't notice consciously, and reject even very mild errors. I've stared at the picture a bit now, and the finger holding the balloon is weird. The out-of-place snowman feels weird. If you follow the background blur around, it isn't at the same depth everywhere. Everything that reflects has reflections that I can't see in the scene.
I don't feel good staring at it now, so I had to stop.