
mhitza, last Sunday at 5:47 PM

I've been lightly using gpt-oss-20b, and what I've found is that for smaller (single-sentence) prompts it was easy to get it to loop infinitely. Since I'm running it with llama.cpp, I set a small repetition penalty and haven't encountered the issue since (I only use it a couple of times a day to analyze diffs, so I might have just gotten lucky).
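
For reference, a rough sketch of that kind of setup using the llama-cpp-python bindings (the model path, prompt, and the 1.1 value are placeholders, not the exact settings used; on the llama.cpp CLI the equivalent knob is --repeat-penalty):

    from llama_cpp import Llama

    # Load the GGUF model; the path is a placeholder.
    llm = Llama(model_path="gpt-oss-20b.gguf", n_ctx=8192)

    # repeat_penalty > 1.0 penalizes recently emitted tokens, which
    # discourages the model from looping on the same text; 1.1 is a
    # mild, commonly used value.
    out = llm(
        "Summarize this diff: ...",
        max_tokens=512,
        repeat_penalty=1.1,
    )
    print(out["choices"][0]["text"])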


Replies

nicolaslem, last Sunday at 7:16 PM

I had the same issue with other models: they would loop, repeating the same character, sentence, or paragraph indefinitely. It turns out the context size some tools set by default is 2k tokens, which is way too small.
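
With the Python bindings the fix is just passing a larger n_ctx when loading the model (8192 below is an arbitrary example, not a recommended value); with the llama.cpp CLI the equivalent is -c / --ctx-size:

    from llama_cpp import Llama

    # Request a context window larger than the 2k default some tools
    # use, so long prompts and outputs don't get truncated or wrap.
    llm = Llama(model_path="gpt-oss-20b.gguf", n_ctx=8192)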

ModelForge, last Sunday at 7:00 PM

I’ve been using the Ollama version (it uses about 13 GB of RAM on macOS) and haven’t had that issue yet. I wonder if it’s maybe an issue with the llama.cpp port?
