
drbig · yesterday at 11:41 PM

An approach very close to one I've been thinking about lately.

My three cents: compact the journal once its size exceeds the actual data size, with thresholds or other knobs to tune when that happens. The point is that initial load time should be directly proportional to the amount of actual data; everything else/older becomes a backup.
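A minimal sketch of that trigger, assuming a JSON-lines journal where the last entry per key wins (the format and the names load_state / should_compact are illustrative, not from the post):

    import json
    import os

    def load_state(path):
        # Replay the journal: the last entry per key wins.
        state = {}
        with open(path) as f:
            for line in f:
                entry = json.loads(line)
                state[entry["key"]] = entry["value"]
        return state

    def should_compact(path, threshold=2.0):
        # Compact once the journal file exceeds `threshold` x the live data size.
        state = load_state(path)
        live_bytes = sum(
            len((json.dumps({"key": k, "value": v}) + "\n").encode("utf-8"))
            for k, v in state.items()
        )
        return os.path.getsize(path) > threshold * live_bytes

The threshold of 2.0 is one possible knob: it bounds the journal at twice the live data, so replay stays proportional to actual data size.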


Replies

addaon · today at 12:28 AM

The value of the journal having history (with comments and timestamps) is huge. What I'd prefer to see is a start sequence of: replay the journal, build the in-memory structure, optionally move the old journal to a backup name, and write out a minimal/compressed/comment-and-timestamp-stripped journal to a new file. The rewrite could optionally be based on the size delta, e.g. only write if the new journal is less than half the size of the old one. This keeps journals append-only while still giving access to full history. It does require some external management to avoid file usage growing even faster than with a single journal; but it reduces startup time, and it allows a management strategy as simple as deleting backup files older than a given date (once they're in cold backup, if needed).
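A sketch of that start sequence, under the same hypothetical JSON-lines format as above (startup_compact and the backup naming are illustrative):

    import json
    import os
    import time

    def startup_compact(path, ratio=0.5):
        # Replay: the last entry per key wins.
        state = {}
        with open(path) as f:
            for line in f:
                entry = json.loads(line)
                state[entry["key"]] = entry["value"]

        # Minimal journal: one entry per live key, comments/timestamps stripped.
        compacted = "".join(
            json.dumps({"key": k, "value": v}) + "\n" for k, v in state.items()
        )

        # Only rewrite when the new journal is less than `ratio` (here: half)
        # the size of the old one.
        if len(compacted.encode("utf-8")) < ratio * os.path.getsize(path):
            backup = f"{path}.{int(time.time())}.bak"  # hypothetical naming
            os.rename(path, backup)  # full history preserved, just moved aside
            tmp = path + ".tmp"
            with open(tmp, "w") as f:
                f.write(compacted)
                f.flush()
                os.fsync(f.fileno())
            os.rename(tmp, path)  # atomic on POSIX; new journal stays append-only

        return state

The write-to-temp-file-then-rename step keeps a crash from leaving a half-written journal in place, and the backup file carries the full history until external cleanup deletes it.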
