
iainmerrick • today at 9:35 AM

It's not quite as simple as that because the chunks should be self-contained; they need to start with an IDR keyframe, which fully resets the decoder. That allows the player to seek to the start of any chunk.
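To make the seeking point concrete, here's a minimal sketch of how a player might pick the chunk to fetch for a given seek time. The function name and the list of chunk start times are hypothetical; a real player would read the timings from the manifest.

```python
from bisect import bisect_right

def chunk_for_seek(chunk_starts, seek_time):
    """Return the index of the chunk that contains seek_time.

    chunk_starts is a sorted list of chunk start times in seconds
    (hypothetical input; real players derive it from the manifest).
    """
    # bisect_right counts the chunks starting at or before seek_time,
    # so the containing chunk is the last of those.
    index = bisect_right(chunk_starts, seek_time) - 1
    # Clamp seeks that land before the first chunk.
    return max(index, 0)

chunk_starts = [0.0, 4.0, 8.0, 12.0]  # assumed 4-second chunks
print(chunk_for_seek(chunk_starts, 9.5))  # 2, i.e. the chunk starting at 8.0
```

Because that chunk starts with an IDR frame, the decoder can begin there without needing any earlier chunk.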

That means when you're encoding the downscaled variants, the encoder needs to know where the segment boundaries will fall so it can insert IDR frames there. It's therefore common to do the encoding and segmentation in a single step (e.g. with ffmpeg's "dash" muxer).
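As a rough illustration of the single-step approach, here's a sketch that builds an ffmpeg command line for one variant. The helper name, file names, bitrate and the choice of libx264 are my assumptions, and the exact set of options you need may differ by ffmpeg version; treat it as a starting point rather than a known-good recipe.

```python
def dash_encode_args(src, out_mpd, fps, seg_seconds, height, video_kbps):
    """Build an ffmpeg argument list for one DASH variant (hypothetical helper).

    Assumes an integer frame rate, so the keyframe interval is a whole
    number of frames per segment.
    """
    gop = fps * seg_seconds  # frames between forced keyframes
    return [
        "ffmpeg", "-i", src,
        "-vf", f"scale=-2:{height}",           # downscale, keep aspect ratio
        "-c:v", "libx264", "-b:v", f"{video_kbps}k",
        "-g", str(gop), "-keyint_min", str(gop),  # fixed keyframe interval
        "-sc_threshold", "0",                  # no extra keyframes on scene cuts
        "-c:a", "aac",
        "-f", "dash", "-seg_duration", str(seg_seconds),
        out_mpd,
    ]

args = dash_encode_args("master.mp4", "out/manifest.mpd", 25, 4, 720, 3000)
print(" ".join(args))
```

The idea is that the keyframe interval (`-g`) matches the segment duration in frames, so every segment can begin on a keyframe.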

You can have variable-duration or fixed-duration segments. Supposedly some decoders are happier with fixed-duration segments, but it can be fiddly to get the ffmpeg settings just right, especially if you want the audio and video segments to have exactly the same duration (here's a useful little calculator for that: https://anton.lindstrom.io/gop-size-calculator/).
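The arithmetic behind matching audio and video segment durations can be sketched as follows. This is my own reconstruction of the idea, not the linked calculator's code. It assumes AAC audio with 1024 samples per frame, and it needs Python 3.9+ for `math.lcm`.

```python
from fractions import Fraction
from math import gcd, lcm

def smallest_aligned_duration(fps, sample_rate, samples_per_audio_frame=1024):
    """Shortest duration that is a whole number of video AND audio frames."""
    video_frame = 1 / Fraction(fps)
    audio_frame = Fraction(samples_per_audio_frame, sample_rate)
    # LCM of two fractions in lowest terms: lcm(numerators) / gcd(denominators)
    return Fraction(
        lcm(video_frame.numerator, audio_frame.numerator),
        gcd(video_frame.denominator, audio_frame.denominator),
    )

def aligned_segment(fps, sample_rate, target_seconds, samples_per_audio_frame=1024):
    """Pick the aligned duration closest to target_seconds.

    Returns (duration, video_frames, audio_frames).
    """
    unit = smallest_aligned_duration(fps, sample_rate, samples_per_audio_frame)
    count = max(1, round(Fraction(target_seconds) / unit))
    duration = unit * count
    video_frames = int(duration * Fraction(fps))
    audio_frames = int(duration * sample_rate / samples_per_audio_frame)
    return duration, video_frames, audio_frames

duration, gop, audio_frames = aligned_segment(25, 48000, 2)
print(float(duration), gop, audio_frames)  # 1.92 48 90
```

So for 25 fps video and 48 kHz audio, a target of about 2 seconds lands on 1.92 seconds: a GOP of 48 video frames and exactly 90 audio frames. For fractional rates you can pass e.g. `Fraction(30000, 1001)` as `fps`, though the aligned durations get much longer.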

For hosting, a typical setup would be to start with a single high-quality video file and run it through an encoder/segmenter pipeline that generates a bunch of video and audio chunks plus DASH (.mpd) and/or HLS (.m3u8) manifests. You then put all the chunks and manifests on S3 or similar. As long as all the internal links are relative, the files can be placed anywhere. The video player will start with the top-level manifest URL and locate everything else it needs from there.
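Here's a small sketch of why relative links make the output relocatable. The URLs and file names are made up, and a real player parses the manifest format rather than taking a plain list of links, but the resolution step works the same way.

```python
from urllib.parse import urljoin, urlparse

def is_relative(link):
    """True if the link has no scheme or host and is not root-relative."""
    parts = urlparse(link)
    # Root-relative links ("/videos/...") break if the directory moves,
    # so they are treated as non-relocatable here.
    return not parts.scheme and not parts.netloc and not link.startswith("/")

def resolve_chunks(manifest_url, links):
    """Resolve each link against the manifest URL, as a player would."""
    return [urljoin(manifest_url, link) for link in links]

links = ["video_720p/init.mp4", "video_720p/seg-1.m4s"]  # hypothetical names
assert all(is_relative(link) for link in links)

for url in resolve_chunks("https://bucket.example.com/videos/abc/manifest.mpd", links):
    print(url)
# https://bucket.example.com/videos/abc/video_720p/init.mp4
# https://bucket.example.com/videos/abc/video_720p/seg-1.m4s
```

Move the whole directory to a different bucket or path and the same manifest still resolves correctly, because only the top-level manifest URL changes.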
