If someone is using these models, they probably can't or won't use the existing SOTA model...

crimsoneer • today at 3:36 PM

If someone is using these models, they probably can't or won't use the existing SOTA models, so not sure how useful those comparisons actually are. "Here is a benchmark that makes us look bad from a model you can't use on a task you won't be undertaking" isn't actually helpful (and definitely not in a press release).

Replies

constantcrying • today at 3:50 PM

Completely agree that there are legitimate reasons to prefer comparisons to, e.g., DeepSeek models. But that doesn't change my point: we both agree that the comparisons would be extremely unfavorable.
