ethan_l_shen • yesterday at 8:12 PM

Hey! We are able to outperform Devstral-Small-2-24B when specializing on repositories, and come well within the range of uncertainty with our best SERA-32B model. That being said, our model is a bit larger than Devstral 24B. Could you point out what in the paper gave the impression that we were smaller? If there's something unclear, we would love to revise it.

Replies

khimaros • yesterday at 8:52 PM

"SERA-32B is the first model in Ai2's Open Coding Agents series. It is a state-of-the-art open-source coding agent that achieves 49.5% on SWE-bench Verified, matching the performance of much larger models like Devstral-Small-2 (24B)" from https://huggingface.co/allenai/SERA-32B
