logoalt Hacker News

Der_Einzigetoday at 3:24 AM0 repliesview on HN

Btw samplers do in fact help with this. Random tokens deep in your output context are due to accumulated sampling errors from using shit samplers like top_p and top_k with temperature.

Use a full distribution aware sampler like p-less decoding, top-H, or top-n sigma, and this goes away

Yes the paper for this will be up for review at NeurIPS this year.