Call me when it can do Russian cursive.
Right, it can do modern writing, but give it anything older than a century (church records, censuses) and it produces garbage. Yandex Archives figured that out and got their CER (character error rate) into the single digits, but they have the resources to collect immense amounts of training data. I'm slowly building a dataset for fine-tuning a TrOCR model, and the best it can do is a CER of 18%... which is sort of readable.
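For anyone unfamiliar with the metric: CER is usually computed as the character-level edit distance between the model output and the ground-truth transcription, divided by the length of the ground truth. A minimal Python sketch, assuming plain Levenshtein distance and no text normalization (the function names here are my own for illustration, not from TrOCR or any library):

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: minimum number of single-character
    insertions, deletions, and substitutions turning hyp into ref."""
    # prev[j] holds the distance between the first i-1 chars of ref
    # and the first j chars of hyp.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate: edit distance divided by reference length."""
    if not ref:
        raise ValueError("reference transcription must not be empty")
    return edit_distance(ref, hyp) / len(ref)


# One wrong character out of five gives a CER of 0.2 (20%).
print(cer("abcde", "abxde"))
```

By that definition, a CER of 18% means roughly one character in every five or six needs an edit, which is why the output is only "sort of" readable.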
Seems to do an OK job:
https://g.co/gemini/share/e173d18d1d80
This is a random image from Twitter with no transcript or English translation provided, so it's not going to be in the training data.