Hacker News

lodovic, today at 7:12 AM

I made something similar to this project and tested it against a few 3B and 8B models (Qwen and Ministral, both the instruction and the reasoning variants). I was pleasantly surprised by how fast and accurate these small models have become. I can ask one things like "check out this repo and build it", and with a Ralph strategy it will eventually succeed, despite the small context size.
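For anyone unfamiliar with the term, here is a rough sketch of what I mean by a Ralph strategy, assuming the usual reading of it: re-run the agent with the same prompt in a fresh context until some check passes, with all progress kept in the workspace rather than in the model's context. The names `ralph_loop`, `run_agent`, `is_done`, and the fake agent are made up for illustration; this is not my actual code or any particular tool's API.

```python
from typing import Callable


def ralph_loop(
    run_agent: Callable[[str], None],
    is_done: Callable[[], bool],
    prompt: str,
    max_iterations: int = 50,
) -> int:
    """Re-run the agent with the same prompt until is_done() passes.

    Returns the number of iterations used, or -1 if the limit was hit.
    Each call to run_agent is assumed to start with a fresh context;
    anything it achieves must be persisted in the workspace (files, git).
    """
    for attempt in range(1, max_iterations + 1):
        run_agent(prompt)      # hypothetical: one agent session, fresh context
        if is_done():          # hypothetical: e.g. "does the build succeed?"
            return attempt
    return -1


# Stand-in agent for demonstration: it makes one step of progress per run,
# and the "build" passes once three steps have accumulated in the workspace.
workspace = {"steps_done": 0}


def fake_agent(prompt: str) -> None:
    workspace["steps_done"] += 1


def build_succeeds() -> bool:
    return workspace["steps_done"] >= 3


attempts = ralph_loop(fake_agent, build_succeeds, "check out this repo and build it")
print(attempts)  # 3
```

The point is that a small context is less of a problem when no single run has to hold the whole task: each iteration only needs to read the current state and push it a bit further.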