Have you considered giving your digital twin a jolly aspect? I've wondered if an AI video agent could be made to appear real-time, despite real processing latency, if the AI gave a hearty laugh before all of its responses.

> So Carter, what did you do this weekend?

> Ho ho ho, you know! I spent some time working on my pet AI projects!
I wonder if some standard set of personable mannerisms could be used to bridge the gap from 250ms to 1000ms. You don't need to think about what the user has said to realize they've stopped talking. Make the AI agent laugh, hum, or just say "yes!" before beginning its response.
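A minimal sketch of the idea, assuming an async agent loop (the `generate_response` stand-in and the filler list are hypothetical, not from any real system): the slow LLM call is kicked off immediately on end-of-speech, and a canned filler utterance is emitted right away to cover the wait.

```python
import asyncio
import random

# Hypothetical canned fillers that can be spoken within the ~250ms budget.
FILLERS = ["Hmm,", "Yes!", "Haha, good question.", "Right,"]

async def generate_response(prompt: str) -> str:
    # Stand-in for the real LLM call; simulate ~1s of latency.
    await asyncio.sleep(1.0)
    return f"Here's my answer to: {prompt}"

async def respond(prompt: str) -> str:
    # Start the slow LLM call as soon as the user stops talking...
    llm_task = asyncio.create_task(generate_response(prompt))
    # ...and pick a filler to "speak" immediately. In a real agent,
    # TTS playback of the filler would start here, masking the latency
    # of the awaited task below.
    filler = random.choice(FILLERS)
    answer = await llm_task
    return f"{filler} {answer}"

async def main():
    print(await respond("What did you do this weekend?"))

asyncio.run(main())
```

The key point is only that the filler is chosen (and would be played) before the `await`, so the user hears something well before the real answer arrives.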
This is definitely a good idea! I think the hard part is making it contextual and relevant to the last question/response, in which case the LLM comes back into the equation. Something we're looking at, though!
I think I recall that Google did exactly this with their telephone bot (Google assistant?), sneaking in very natural sounding "um"s here and there to mask processing/network latency.