logoalt Hacker News

simonw10/01/20241 replyview on HN

The new Realtime Websocket API appears to send back responses within less than a second. It might be just what you want.


Replies

bcherry10/01/2024

yes and you can use it in text-text mode if you want. a key benefit is for turn-based usages (where you have running back and forth between user and assistant) you only need to send the incremental new input message for each generation. this is better than "prompt caching" on the chat completions API, which is basically a pricing optimization, as it's actually a technical advantage that uses less upstream bandwidth.