When there is no alt text, do you have a solution for that? VLMs are really powerful, so I imagine they could be used to parse through the unlabeled images automatically if needed.