A 4,096-token context window is pretty limiting. That's roughly 3,000 words: fine for "summarize this paragraph" but not enough for anything that needs real context. Still, zero cost and fully local is hard to beat for quick throwaway tasks. Does it handle streaming, or is it request-response only?
Try it and see