Btw samplers do in fact help with this. Random tokens deep in your output context come from accumulated sampling error when you use crude truncation samplers like top_p and top_k with temperature.
Use a distribution-aware sampler like p-less decoding, top-H, or top-n sigma, and this goes away.
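For anyone curious what top-n sigma actually does: it keeps only the tokens whose logit falls within n standard deviations of the max logit, then renormalizes and samples. A minimal sketch (my own paraphrase, not the authors' reference code; function name and defaults are made up):

```python
import numpy as np

def top_n_sigma_sample(logits, n=1.0, temperature=1.0, rng=None):
    """Top-n sigma sampling sketch: keep tokens whose logit is within
    n standard deviations of the maximum logit, drop the rest, then
    sample from the renormalized distribution."""
    rng = rng or np.random.default_rng()
    logits = np.asarray(logits, dtype=np.float64) / temperature
    # Statistical cutoff: anything more than n*sigma below the top logit
    # is treated as noise and masked out entirely.
    threshold = logits.max() - n * logits.std()
    masked = np.where(logits >= threshold, logits, -np.inf)
    # Softmax over the surviving tokens only.
    probs = np.exp(masked - masked.max())
    probs /= probs.sum()
    return int(rng.choice(len(probs), p=probs))
```

The point is that the cutoff adapts to the shape of the whole logit distribution instead of a fixed probability mass (top_p) or token count (top_k), so confident distributions get aggressively truncated while flat ones keep more candidates.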
Yes, the paper for this will be under review at NeurIPS this year.