Hacker News

AshamedCaptain, yesterday at 11:45 PM

This is a multi-terabyte-sized die that is not at all random AND has most definitely copied the source code in question to begin with.


Replies

kelseyfrog, yesterday at 11:52 PM

The die is certainly not multi-terabyte. A more realistic number would be 32k-sided to 50k-sided if we want to go with a pretty average token vocabulary size.

Really, it comes down to encoding. Arbitrarily short UTF-8 encoded strings can be generated using a coin flip.
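
To make the encoding point concrete, here is a minimal sketch of both ideas in Python. It assumes a fair coin and a uniform die; the function names `coin_flip_string` and `roll_token_die` are hypothetical, and the 32,000 figure is just the low end of the vocabulary range mentioned above. A real language model would weight the die's faces by context rather than roll it uniformly, so this only illustrates the size of the die, not how a model picks a face.

```python
import random


def coin_flip_string(n_chars, rng):
    """Build a string from coin flips, 7 flips per character.

    Seven bits give a code point in 0..127 (ASCII), and every ASCII
    character is a valid one-byte UTF-8 sequence, so the result is
    always valid UTF-8.
    """
    chars = []
    for _ in range(n_chars):
        code_point = 0
        for _ in range(7):
            flip = rng.randint(0, 1)  # one fair coin flip: 0 or 1
            code_point = (code_point << 1) | flip
        chars.append(chr(code_point))
    return "".join(chars)


def roll_token_die(vocab_size, rng):
    """Roll a uniform die with one face per token id (0..vocab_size-1)."""
    return rng.randrange(vocab_size)


if __name__ == "__main__":
    rng = random.Random()  # unseeded; pass a seed for repeatable flips
    text = coin_flip_string(8, rng)
    print(repr(text), len(text.encode("utf-8")), "bytes")
    print("token id:", roll_token_die(32_000, rng))
```

One roll of a 32,000-sided die carries about 15 bits, so the same roll could be replaced by roughly 15 coin flips; the number of sides is a choice of encoding, not a measure of how much data sits behind the die.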
