That's what I've been doing. I normally use crush. While the codebases are by no means huge, they're not tiny either.
Are you using it in an agentic workflow? Just reading the codebase will consume a lot of cached tokens, but z.ai seemingly counts these as normal input tokens for rate-limiting purposes.