I've found them to be quite accurate when given enough context data: for example, feeding an article into the context window and asking questions about it. Relying on the LLM's internal trained knowledge seems less reliable.
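
A minimal sketch of that pattern, assuming the OpenAI Python SDK; the model name, file path, and question are just placeholders:

```python
# Sketch: ground the model's answer in a supplied article instead of
# relying on its trained knowledge. Assumes OPENAI_API_KEY is set.
from openai import OpenAI

client = OpenAI()

article = open("article.txt").read()  # the document to answer from

response = client.chat.completions.create(
    model="gpt-4o-mini",  # example model name, swap for whatever you use
    messages=[
        {
            "role": "system",
            "content": (
                "Answer using only the article provided. "
                "If the article does not contain the answer, say so."
            ),
        },
        {
            "role": "user",
            "content": f"Article:\n{article}\n\nQuestion: What is the article's main claim?",
        },
    ],
)

print(response.choices[0].message.content)
```

The system instruction is the important part: telling the model to answer only from the supplied text is what pushes it toward the in-context article rather than whatever it memorized during training.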