logoalt Hacker News

newzinotoday at 7:40 PM1 replyview on HN

Zip stores its central directory at the end of the file. To find what's inside and where each entry starts, you need to read the tail first. That rules out issuing a single Range request to grab one specific asset.

Tar is sequential. Each entry header sits right before its data. If the JSON manifest in the Gwtar preamble says an asset lives at byte offset N with size M, the browser fires one Range request and gets exactly those bytes.

The other problem is decompression. Zip entries are individually deflate-compressed, so you'd need a JS inflate library in the self-extracting header. Tar entries are raw bytes, so the header script just slices at known offsets. No decompression code keeps the preamble small.


Replies

fluidcrufttoday at 7:54 PM

You can also read a zip sequentially like a tar file. Some info is in the directory only but just for getting file data you can read the file records sequentially. There are caveats about when files appear multiple times but those caveats also apply to processing tar streams.