For short texts, what I usually want most is a fast translation, and local models are actually great for this.
But for high-ish quality translations of substantive texts, you typically want a harness that's pretty different from Claude Code. You want a glossary of technical terms and special names, a structured summary of the wider context, a concise style guide, and you have to chop the text into chunks to ensure nothing is missed. Even with super long context models, if you ask them to translate too much at once, they just translate an initial portion of it and crap out.
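To make that concrete, here's a rough sketch of the kind of harness I mean, not a definitive implementation. Everything in it is an assumption for illustration: the function names, the prompt layout, the 1500-character budget, and especially `translate`, which is a hypothetical callable standing in for whatever model call you actually use.

```python
from typing import Callable, Dict, List


def chunk_paragraphs(text: str, max_chars: int = 1500) -> List[str]:
    """Group whole paragraphs into chunks of roughly max_chars each."""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks: List[str] = []
    current: List[str] = []
    size = 0
    for p in paragraphs:
        # Start a new chunk when adding this paragraph would exceed the budget.
        # A single oversized paragraph still gets its own chunk, unsplit.
        if current and size + len(p) > max_chars:
            chunks.append("\n\n".join(current))
            current, size = [], 0
        current.append(p)
        size += len(p)
    if current:
        chunks.append("\n\n".join(current))
    return chunks


def build_prompt(chunk: str, glossary: Dict[str, str], summary: str,
                 style_guide: str, target_lang: str) -> str:
    """Assemble one prompt: context summary, style guide, glossary, then text."""
    # Only pass glossary entries whose source term appears in this chunk.
    terms = "\n".join(
        f"- {src} -> {dst}"
        for src, dst in glossary.items()
        if src.lower() in chunk.lower()
    )
    return (
        f"Translate the text below into {target_lang}. Translate all of it.\n\n"
        f"CONTEXT SUMMARY:\n{summary}\n\n"
        f"STYLE GUIDE:\n{style_guide}\n\n"
        f"GLOSSARY:\n{terms or '(none)'}\n\n"
        f"TEXT:\n{chunk}"
    )


def translate_document(text: str, translate: Callable[[str], str],
                       glossary: Dict[str, str], summary: str,
                       style_guide: str, target_lang: str,
                       max_chars: int = 1500) -> str:
    """Translate chunk by chunk; `translate` is your own model call."""
    chunks = chunk_paragraphs(text, max_chars)
    outputs = [
        translate(build_prompt(c, glossary, summary, style_guide, target_lang))
        for c in chunks
    ]
    # One output per chunk, so each chunk can be checked or retried separately.
    return "\n\n".join(outputs)
```

Chunking on paragraph boundaries keeps each request small enough that the model has no excuse to stop early, and because every chunk maps to exactly one output you can spot-check or retry a single chunk instead of rerunning the whole document.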
Are you using it for localization or short strings of text in an app? I wonder what you can do to get better results out of smaller models. I'm confident there's a way.