Audio-described YouTube, please? That'd be so amazing! Even if I couldn't play Zelda yet, I could listen to a playthrough with Gemini describing it.
Hey, I just ran a simple test: I downloaded a 5-minute YouTube video and uploaded it to the Gemini app.
Source video title: Zelda: Breath of the Wild - Opening five minutes of gameplay
https://www.youtube.com/watch?v=xbt7ZYdUXn8
Prompt:
Please describe what happening in each scene of this video.
List scenes with timestamp, then describe separately:
- Setup and background, colors
- What is moving, what appear
- What objects in this scene and what is happening,
Basically make desceiption of 5 minutes video for a person who cant watch it.
The result is in a GitHub gist since there's too much text: https://gist.github.com/ArseniyShestakov/43fe8b8c1dca45eadab...
I'd say this is quite accurate.
BTW, I also asked for a detailed narrative description of another Zelda video, a pure benchmarking one, using 5-second snapshots:
Video: Zelda TOTK, R5 5600X, GTX 1650, 1080p 10 Minute Gameplay, No Commentary
https://www.youtube.com/watch?v=wZGmgV-8Rbo
The narrative description source and the command can be found here:
https://gist.github.com/ArseniyShestakov/47123ce2b6b19a8e6b3...
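For anyone who wants to try something similar, here is a rough sketch of what "5-second snapshots" could look like if it means one frame sampled every 5 seconds. This is not the command from the gist: the function names, the file name `video.mp4`, and the output pattern are made up for illustration, and the only real tool option assumed is ffmpeg's `fps` filter.

```python
# Sketch only: assumes "5-second snapshots" means one frame every 5 seconds.
# snapshot_times, fmt, and ffmpeg_args are hypothetical helper names.

def snapshot_times(duration_s, step_s=5):
    """Nominal snapshot timestamps in seconds: 0, 5, 10, ..."""
    return list(range(0, duration_s, step_s))

def fmt(t):
    """Format a number of seconds as mm:ss for a timestamped description."""
    return f"{t // 60:02d}:{t % 60:02d}"

def ffmpeg_args(src, step_s=5, pattern="frame_%04d.jpg"):
    """Argument list asking ffmpeg for one output frame per step_s seconds."""
    return ["ffmpeg", "-i", src, "-vf", f"fps=1/{step_s}", pattern]

times = snapshot_times(600)  # a 10-minute video
print(len(times), fmt(times[0]), fmt(times[-1]))
print(" ".join(ffmpeg_args("video.mp4")))
```

For a 10-minute video this gives 120 nominal timestamps, which is roughly how many frames the model would have to describe.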
Then I converted it into a narrated voice-over with Gemini 2.5 Pro TTS:
https://drive.google.com/file/d/1Js2nDtM7sx14I43UY2PEoV5PuLM...
It's somewhat out of sync with the original video, and the voice-over runs nine and a half minutes instead of the video's 10, but the description of what's happening on screen is quite accurate.
PS: I used a 144p video, so some details may also be wrong because of the poor quality. And of course I specifically asked for a narrative-style description.
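To put a number on the length mismatch, here is a small sketch of the arithmetic, treating the durations as roughly 570 s of audio and 600 s of video. The helper name is made up, and I did not actually apply this. ffmpeg's `atempo` audio filter takes a tempo factor, where values below 1.0 slow the audio down.

```python
# Sketch only: approximate durations taken from the comment above.
# atempo_factor is a hypothetical helper name.

def atempo_factor(audio_s, video_s):
    """Tempo factor that makes audio_s seconds of audio last video_s seconds."""
    return audio_s / video_s

audio_s = 9.5 * 60   # voice-over, about 9.5 minutes
video_s = 10 * 60    # original video, 10 minutes

factor = atempo_factor(audio_s, video_s)
print(f"gap: {video_s - audio_s:.0f} s")   # total difference in length
print(f"atempo={factor:.4f}")              # filter string for ffmpeg -af
```

That is a 30-second gap, or a 5% slowdown to match the total length. A uniform stretch like this only fixes the overall duration. It would not fix places where the narration drifts locally from what is on screen.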