How is the quality of model answers to your queries? Are they stable over time?
I am wondering how to measure answer quality and stability in the first place.