Hacker News

thot_experiment · last Sunday at 9:22 AM

anyone have a tl;dr on the best way to get the video comprehension stuff going? i use qwen-30b-vl locally all the time as my go-to model because it's just so insanely fast. the vision comprehension works great and i use it for OCR and classification all the time, so i'm curious to mess with the video stuff


Replies

xrd · yesterday at 10:29 PM

How much VRAM do you need for local usage, may I ask?