Hacker News

TacticalCoder · today at 1:37 AM · 2 replies

> Next I'm going to set it loose on 263 GB database of every stock quote and options trade in the past 4 years.

Options quotes alone for US equities (or things that trade as such, like ADS/ADR) represent 40 Gbit per second during options trading hours. There are more than 60 million trades (not quotes, only trades) per day. As the stock market is open approximately 250 days per year (a bit more), that's more than 60 billion actual options trades in 4 years. If we're talking about quotes for options, you can add several orders of magnitude to these numbers.
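The trade-count arithmetic above checks out; a quick sketch (the 60 million trades/day and ~250 trading days/year figures are from the comment; the per-trade byte size is a hypothetical lower bound, not a real record layout):

```python
# Back-of-envelope check of the options-trade counts quoted above.
trades_per_day = 60_000_000        # >60 million options trades per day (from the comment)
trading_days_per_year = 250        # US market is open ~250 days/year
years = 4

total_trades = trades_per_day * trading_days_per_year * years
print(f"{total_trades:,}")         # 60,000,000,000 -> 60 billion trades

# Even at an absurdly small, hypothetical 4 bytes per trade record,
# the trades alone would approach the claimed 263 GB total:
bytes_per_trade = 4
print(total_trades * bytes_per_trade / 1e9, "GB")  # 240.0 GB
```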

And I only mentioned options. How do you store "every stock quote and options trade in the past 4 years" in 263 GB!?


Replies

jtbaker · today at 1:42 AM

> And I only mentioned options. How do you store "every stock quote and options trade in the past 4 years" in 263 GB!?

I think this would be pretty straightforward with Parquet plus ZSTD compression and some smart ordering/partitioning strategies.

dataviz1000 · today at 2:58 AM

I see, I said "stock quote" when I meant "minute aggregates". You are correct that that data set is much larger; at ~1.5 TB a year [0], I did not download 6 TB of data onto my laptop. Every settled trade, options or stocks, isn't that big.

[0] https://massive.com/docs/flat-files/stocks/quotes