ryeguy • today at 12:14 AM • 1 reply

Did you read the blog post? They compare against deepswe and call it out as the worst one for false positives (the solution failed, but the benchmark assessed it as correct). It also has less language variety.

Replies

CSMastermind • today at 5:23 AM

I mean, yes, that is what you'd say if you were writing a blog post about your new benchmark.
