Hacker News

GaggiX, today at 8:02 PM

This is not local, but Gemini models can process very long videos and provide descriptions with timestamps if asked.

https://ai.google.dev/gemini-api/docs/video-understanding#tr...


Replies

embedding-shape, today at 9:30 PM

Nor would it be describing things as they happen; it would need pre-processing instead. So in the end, very different :)