The framing in the headline is interesting. As far as I recall, spending 4x more compute on a model to improve performance by 7% is the move that has worked over and over again up to this point. 101% of GPT-4 performance (potentially at any cost) is what I would expect an improved routing algorithm to achieve.
(The submitted title was "93% of GPT-4 performance at 1/4 cost: LLM routing with weak bandit feedback")
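To make the inversion explicit, a quick back-of-the-envelope (using only the 93% and 1/4 figures from the submitted title):

```python
# Headline claim: 93% of GPT-4 performance at 1/4 of the cost.
relative_performance = 0.93  # fraction of GPT-4 performance
relative_cost = 0.25         # fraction of GPT-4 cost

# Flip the baseline: going from the cheap router back up to GPT-4
# means paying this much more for this much gain.
cost_multiplier = 1 / relative_cost              # 4.0x the cost
performance_gain = 1 / relative_performance - 1  # ~7.5% improvement

print(f"{cost_multiplier:.0f}x the cost for a {performance_gain:.1%} gain")
# prints "4x the cost for a 7.5% gain"
```

Same numbers, opposite framing: the router paper reads as "you lose 7% by paying 4x less," while the scaling story reads as "you gain 7% by paying 4x more."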