Not OP, but one example: recent vision-language (VL) models are more than sufficient for analyzing your local photo albums and generating metadata, descriptions, and captions to help you organize your library.
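For a rough idea of what that looks like in practice, here is a minimal sketch. It assumes you are serving a vision-capable model through a local Ollama instance on its default port; the prompt wording, the helper names, and the `your-vision-model` name are all placeholders I made up for illustration, so swap in whatever you actually run.

```python
import base64
import json
import urllib.request
from pathlib import Path

# Assumption: a local Ollama server on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_caption_request(image_bytes: bytes, model: str) -> dict:
    """Build the JSON payload: the image goes in as a base64 string."""
    return {
        "model": model,
        "prompt": "Describe this photo in one sentence, then list five keywords.",
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,  # ask for a single JSON reply instead of a stream
    }


def caption_image(path: Path, model: str) -> str:
    """Send one image to the local server and return the generated text."""
    payload = build_caption_request(path.read_bytes(), model)
    request = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]


def caption_album(folder: Path, model: str) -> dict:
    """Caption every .jpg/.jpeg/.png in a folder, keyed by file name."""
    captions = {}
    for path in sorted(folder.iterdir()):
        if path.suffix.lower() in {".jpg", ".jpeg", ".png"}:
            captions[path.name] = caption_image(path, model)
    return captions


if __name__ == "__main__":
    # "your-vision-model" is a placeholder, not a real model name.
    album = Path.home() / "Pictures"
    print(json.dumps(caption_album(album, "your-vision-model"), indent=2))
```

From there you could write the captions to sidecar files or a small database and search over them; nothing leaves your machine at any point.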
Any pointers on some local VLMs to start with?