Hacker News

shtack 10/01/2024

Cool, I built a prototype of something very similar (face+voice cloning, no video analysis) using openly available models/APIs: https://bslsk0.appspot.com/

The video latency is definitely the biggest hurdle. With dedicated A100s I can get it down to under 2s, but it's pricey.


Replies

leobg 10/01/2024

This looks awesome. It didn't seem to hear me, but the video looks great. Can you share which models you are using? You say these are all open models.
