Hacker News

rhipitr, today at 3:32 AM

Depending on quantization, I figure they need at least a p4, and likely a p5, EC2 instance (or a similar instance from another provider) for a model with that many parameters. Maybe they are hosting on bare metal, but I imagine not. Those instance types (assuming they are not using spot) are quite expensive to run.
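
For a rough sense of why quantization decides the instance class, here is a back-of-the-envelope sketch. The parameter count is a made-up placeholder (the actual number isn't stated here), and the GPU memory totals are the published per-node figures for p4d.24xlarge (8x A100 40 GB) and p5.48xlarge (8x H100 80 GB). It counts weights only, so KV cache, activations, and runtime overhead would push the real requirement higher.

```python
# Rough sizing sketch: memory needed just to hold the weights at a given precision.
# HYPOTHETICAL_PARAMS is an assumed placeholder, not the real model's size.
HYPOTHETICAL_PARAMS = 400e9

# Total GPU memory per node, in GB (8 GPUs each).
GPU_MEMORY_GB = {
    "p4d.24xlarge": 8 * 40,  # 8x A100 40 GB
    "p5.48xlarge": 8 * 80,   # 8x H100 80 GB
}


def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate weight footprint in GB (10^9 bytes); ignores KV cache and overhead."""
    return num_params * bits_per_param / 8 / 1e9


def instances_that_fit(weights_gb: float) -> list[str]:
    """Names of single nodes whose total GPU memory can hold the weights alone."""
    return [name for name, mem in GPU_MEMORY_GB.items() if mem >= weights_gb]


if __name__ == "__main__":
    for bits in (16, 8, 4):
        gb = weight_memory_gb(HYPOTHETICAL_PARAMS, bits)
        print(f"{bits}-bit: ~{gb:.0f} GB of weights, fits on: {instances_that_fit(gb)}")
```

Under that assumed size, halving the bits per parameter halves the footprint, which is the difference between needing one node, a bigger node, or several.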