I guess, in theory, the prior distribution of language would allow for improved performance in some cases, especially where input quality is low.
This is already used in OCR, tesseract uses that.
This is already used in OCR, tesseract uses that.