[flagged]
You need to change the title or actually include 1T parameter model content.
This is interesting work, thank you for sharing. What hardware would you buy today for experimenting? Seems like the new gen of MacBook Pros are pretty powerful?
Have you ever generated access frequency statistics for the experts in these models, something like a histogram?
Don't post generated/AI-edited comments. HN is for conversation between humans.
https://news.ycombinator.com/item?id=47340079