> Try 1Mbps and iterate from there.
From the article:
> “Just lower the bitrate,” you say. Great idea. Now it’s 10Mbps of blocky garbage that’s still 30 seconds behind.
Rejecting it out of hand isn't actually trying it.
10Mbps is still way too high a minimum. It's more than YouTube uses for full-motion 4K.
And it would not be blocky garbage; it would still look a lot better than JPEG.
Proper rate control for this kind of real-time streaming would also lower the framerate and/or resolution to maintain the best quality and latency it can under dynamic network conditions, with however little bandwidth is available. The fundamental issue is that they don't have this control loop at all, and are badly simulating it by polling JPEGs.
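By "control loop" I mean something shaped roughly like this. It's a toy sketch, not anyone's actual implementation: the ladder values, thresholds, and the `sample_link`/`reconfigure_encoder` hooks are all made up, and real stacks (WebRTC, Moonlight) drive this from RTCP / congestion-control feedback instead.

```python
# Toy sketch of the missing control loop: adapt the encode to measured network
# conditions instead of polling JPEGs. All values and hooks here are hypothetical.

import time
from dataclasses import dataclass

@dataclass
class LinkStats:
    throughput_kbps: float   # what the link actually delivered recently
    loss: float              # observed packet-loss fraction

# Resolution / framerate / target-bitrate ladder, best first (made-up numbers).
LADDER = [
    (1920, 1080, 60, 8000),
    (1920, 1080, 30, 4000),
    (1280,  720, 30, 2000),
    ( 854,  480, 30, 1000),
    ( 640,  360, 15,  500),
]

def rate_control_loop(sample_link, reconfigure_encoder, interval_s=1.0):
    """sample_link() -> LinkStats; reconfigure_encoder(w, h, fps, kbps) applies settings."""
    level = 0
    while True:
        stats = sample_link()
        headroom = stats.throughput_kbps / LADDER[level][3]
        if stats.loss > 0.02 or headroom < 1.1:
            level = min(level + 1, len(LADDER) - 1)   # congested: drop fps/resolution first
        elif stats.loss < 0.005 and headroom > 2.0:
            level = max(level - 1, 0)                 # plenty of headroom: step back up
        w, h, fps, kbps = LADDER[level]
        reconfigure_encoder(w, h, fps, kbps)
        time.sleep(interval_s)
```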
10Mbps is more than the maximum ingest bitrate allowed on Twitch. Granted, anyone who watches a recent game or an IRL stream there might tell you that it should go up to 12 or 15, but I don't think an LLM interface should have trouble. This feels like someone on a 4K monitor defeating themselves through their hedonic treadmill.
The problem, I think, is that they are using Moonlight, which is "designed" to stream games at very low latency. I very much doubt people need <30ms response times to watch an agent terminal or whatever they are showing!
When you try to use H.264 et al. at low latency you have to give up a lot of optimisations so frames can be encoded as quickly as possible. I also highly suspect the VAAPI encoder is not very good, especially at low bitrates.
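Concretely, this is the kind of trade-off I mean, using ffmpeg with software libx264 as the example (the capture source and UDP output are just placeholders):

```python
# Illustrative only: what "giving up encoder optimisations for latency" looks like
# with ffmpeg + libx264. Capture source and output URL are placeholders.

import subprocess

cmd = [
    "ffmpeg",
    "-f", "x11grab", "-framerate", "30", "-i", ":0.0",   # placeholder screen capture
    "-c:v", "libx264",
    "-preset", "veryfast",      # cheaper motion search etc., so encoding keeps up
    "-tune", "zerolatency",     # drops B-frames and lookahead so frames go out immediately
    "-f", "mpegts", "udp://127.0.0.1:5000",              # placeholder transport
]
subprocess.run(cmd, check=True)
```

`zerolatency` buys you latency precisely by throwing away the lookahead and frame-reordering tricks that make H.264 efficient, which is why quality per bit suffers.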
I _think_ Moonlight also forces CBR instead of VBR, which is pretty awful for this use case: imagine 9 seconds of nothing changing and then the window moves for 0.25 seconds. With VBR the encoder could send ~0kbit/sec apart from control metadata, then spike the bitrate up when the window moved (I'm simplifying for brevity; it's more complicated than this, but hopefully you get the idea).
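To make the CBR/VBR difference concrete, here's roughly how the two modes look as libx264 options (the numbers are made up, and this is not what Moonlight actually does internally, just the general shape of the two rate-control modes):

```python
# Sketch of CBR vs VBR in ffmpeg/libx264 terms. Values are illustrative only.

cbr_args = [
    # CBR-style: the stream is held at ~5 Mbit/s even when nothing on screen changes.
    "-c:v", "libx264", "-b:v", "5000k", "-minrate", "5000k",
    "-maxrate", "5000k", "-bufsize", "1000k",
    "-x264-params", "nal-hrd=cbr",
]

vbr_args = [
    # Quality-targeted VBR with a cap: near-zero bitrate on a static screen,
    # bursting up toward the cap when the window actually moves.
    "-c:v", "libx264", "-crf", "23",
    "-maxrate", "8000k", "-bufsize", "2000k",
]
```

Either list could be dropped into the capture command in the earlier sketch in place of its codec options.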
Basically they've used the wrong software entirely. They should look at xrdp with x264 as a start.