In response to the correction: the paper says "we propose a Soundscape-to-Image Diffusion model, a generative Artificial Intelligence (AI) model supported by Large Language Models (LLMs)", so presumably there's an LLM involved somewhere?
I'm sorry, you're correct, I missed that. I'll edit my edit!