Image output for 4o in the API would be very nice, but I'm not sure that's in the cards at all.
Audio output is in the API now, but you lose image input. Why? That's a shame.