shtack • 10/02/2024 • 0 replies

With a dedicated GPU and some cleverness it can be relatively quick. I split the response on punctuation and generate smaller clips in a pipeline. I haven't taken the model apart to try streaming the frames coming out of ffmpeg yet, but that would probably help a lot.
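For anyone curious what "split on punctuation and pipeline the clips" might look like, here is a rough Python sketch. It is not my actual code: `synthesize` is a hypothetical callable standing in for whatever model generates a clip from a chunk of text, and `fake_synthesize` is a placeholder so the example runs on its own. The worker count is also just an assumption.

```python
import re
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterator, List


def split_on_punctuation(text: str) -> List[str]:
    """Split text into chunks after . ! ? ; : or , followed by whitespace."""
    parts = re.split(r"(?<=[.!?;:,])\s+", text.strip())
    return [p for p in parts if p]


def generate_clips(
    text: str,
    synthesize: Callable[[str], str],  # hypothetical: chunk of text -> clip
    max_workers: int = 2,              # assumed value, tune for your GPU
) -> Iterator[str]:
    """Yield clips in the original order while later chunks are still rendering."""
    chunks = split_on_punctuation(text)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map submits every chunk up front and yields results in input
        # order, so playback of the first clip can start before the rest finish.
        yield from pool.map(synthesize, chunks)


def fake_synthesize(chunk: str) -> str:
    """Placeholder for a real model call; returns a label instead of a clip."""
    return f"clip<{chunk}>"


if __name__ == "__main__":
    for clip in generate_clips("Hello, world. How are you?", fake_synthesize):
        print(clip)
```

The point of the short chunks is latency: the first clip is ready after one small generation instead of after the whole response, and the remaining clips render while the earlier ones play.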
