
sailingparrot • yesterday at 3:27 PM

Just for training and for processing the existing context (the prefill phase). But when doing inference, token t has to be sampled before token t+1 can be, so generation is still sequential.
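To make the distinction concrete, here's a rough toy sketch in Python. Nothing in it is a real model or library API: `toy_next_token`, `prefill`, and `decode` are hypothetical names, and the "model" is just a deterministic hash of the prefix. The only point is the shape of the data dependencies: in prefill every input token is already known, while in decode each step consumes the token produced by the previous one.

```python
def toy_next_token(prefix, vocab_size=50):
    """Hypothetical stand-in for 'run the model on prefix, then sample greedily'."""
    h = 0
    for pos, tok in enumerate(prefix):
        h = (h * 31 + tok + pos) % 1_000_003
    return h % vocab_size


def prefill(prompt):
    """Predict the next token at every prompt position.

    All prompt tokens are known up front, so no position waits on another
    position's output. A real implementation could batch these into one
    parallel pass; the list comprehension here just stands in for that.
    """
    return [toy_next_token(prompt[: i + 1]) for i in range(len(prompt))]


def decode(prompt, n_new):
    """Generate n_new tokens one at a time.

    Step k cannot start until step k-1 has produced its token, because that
    token is part of the input to step k.
    """
    tokens = list(prompt)
    for _ in range(n_new):
        tokens.append(toy_next_token(tokens))
    return tokens[len(prompt):]


if __name__ == "__main__":
    prompt = [3, 14, 15, 9]
    print("prefill predictions:", prefill(prompt))
    print("generated tokens:  ", decode(prompt, 3))
```

The loop in `decode` is the part that can't be parallelized across time steps under this setup, which is the same constraint described above.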
