Yes, the idea works and was explored with DreamBooth and Textual Inversion for image diffusion models.
https://dreambooth.github.io/ https://textual-inversion.github.io/
Both of those are of course out of date by now, and both require significant training for each new subject instead of just being fed a single image.
InstantID (https://replicate.com/zsxkib/instant-id) fixes that issue.