This is an interesting observation. So maybe it has nothing to do with the model itself, but everything to do with external configuration. Token-limit exceeded -> empty output. Just a guess, though.
> Token-limit exceeded -> empty output. Just a guess, though.
That'd be really non-obvious behavior. I'm not aware of any inference engine that works like that by default; usually you'd get everything up until the limit. Otherwise it kind of breaks the whole expectation of setting a token limit in the first place.
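To make the expectation concrete, here's a toy sketch of how a generation loop usually treats the limit. It isn't taken from any real engine: `generate`, the `"<eos>"` marker, and the `"length"`/`"stop"` labels are all made up for illustration (the labels just mimic the finish-reason convention some APIs expose).

```python
from typing import Iterable, List, Tuple


def generate(token_stream: Iterable[str], max_tokens: int,
             eos: str = "<eos>") -> Tuple[List[str], str]:
    """Collect tokens until EOS or until max_tokens have been emitted.

    Hypothetical helper: returns (tokens, finish_reason).
    """
    out: List[str] = []
    for tok in token_stream:
        if tok == eos:
            # The model finished on its own.
            return out, "stop"
        if len(out) >= max_tokens:
            # Limit hit: return what we have so far, not an empty result.
            return out, "length"
        out.append(tok)
    # Stream ended without an explicit EOS token.
    return out, "stop"


tokens, reason = generate(["a", "b", "c", "d", "<eos>"], max_tokens=3)
print(tokens, reason)
```

An empty result on overflow would have to come from something wrapped around a loop like this, e.g. a layer that discards responses flagged as truncated, which would fit the "external configuration" guess.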