this is the key tension imo. do you think labs are underinvesting in eval infra because scaling headlines are easier to sell?
also curious which would change your mind first: a clear algorithmic breakthrough, or just sustained cost/latency drops from systems work?