I just fixed this bug in a summarizer. Reasoning tokens were consuming the budget I gave it (1k), so there was only a blank response. (Qwen3.5-35B-A3B)
Most inference engines would return the reasoning tokens though, wouldn't you see that the reasoning_content (or whatever your engine calls it) was filled while content wasn't?
Most inference engines would return the reasoning tokens though, wouldn't you see that the reasoning_content (or whatever your engine calls it) was filled while content wasn't?