Hacker News

hxii today at 12:19 AM

Any time I see one of these posts about models of this size, a quote comes to mind: "Your scientists were so preoccupied with whether or not they could, they didn't stop to think if they should."

Only a select few have the hardware required to run this to begin with, and even then the forecasted performance makes me wonder if it’s worth it at all.


Replies

segmondy today at 2:08 AM

Completely worth it. At 6 tk a second, if I can get 2 hrs of token generation, that's 2 hrs * 3600 secs * 6 tk = 43,200 tokens. At about 10 tk to a line of code, that's about 4,320 lines. Let's trim it further and cut it in half: that's 2,160 lines of code a day. Most professional programmers can't deliver that much consistently in a day.
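For anyone who wants to plug in their own numbers, here is a rough sketch of the same back-of-the-envelope math in Python. The function name `lines_per_day` and the `discount` parameter are made up for illustration; the 6 tk/s, 2 hrs, 10 tk per line, and the halving are the figures from the estimate above, not measurements.

```python
def lines_per_day(tokens_per_second, hours, tokens_per_line, discount=0.5):
    """Back-of-the-envelope estimate of generated lines of code per day.

    All inputs are assumptions supplied by the caller; `discount` is the
    fraction of raw output kept after trimming (0.5 = cut it in half).
    """
    total_tokens = tokens_per_second * hours * 3600  # seconds per hour
    raw_lines = total_tokens / tokens_per_line
    return total_tokens, raw_lines, raw_lines * discount


# Figures from the comment: 6 tk/s, 2 hrs of generation, ~10 tk per line.
tokens, raw, trimmed = lines_per_day(6, 2, 10)
print(f"{tokens} tokens, ~{raw:.0f} lines, ~{trimmed:.0f} after halving")
```

Changing `tokens_per_second` or `hours` shows how sensitive the estimate is to throughput; it says nothing about whether the generated lines are any good.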

The key to a model this large is to (1) use it to plan: generate lots of plans and farm them out to a smaller model, and (2) for very specific and complicated portions, prompt precisely for what you need.