I have been benchmarking many of my use cases, and the GPT Nano models have fallen completely flat on every single one except very short summaries. I would put them at 25% effectiveness at best.
Flash is not a small model; it's still over 1T parameters, and a hyper-sparse MoE, as I understand it.
I have yet to go back to small models. I'm waiting on the upstream feature, and the GPU provider has been seeing capacity issues, so I am sticking with the Gemini family for now.