Hacker News

PStamatiou, yesterday at 4:57 PM

Sesame | Full-time | SF/NYC/Bellevue | On-site | https://www.sesame.com/

Sesame believes in a future where computers are lifelike - with the ability to see, hear, and collaborate with us in ways that feel natural and human. With this vision, we're designing a new kind of computer, focused on making voice personal agents part of our daily lives. More details from Sequoia: https://www.sequoiacap.com/article/partnering-with-sesame-a-...

Our team brings together founders from Oculus and Ubiquity6, alongside proven leaders from Meta, Google, and Apple, with deep expertise spanning hardware and software.

Open Roles: https://jobs.ashbyhq.com/sesame

- ML Engineers

- Product Designers

- Product Managers

- iOS & Android Engineers

- ML Model Serving Engineer

- Embedded OS Architect

- Mechanical Engineer, Product Design

- Embedded Engineers

- Electrical Engineer

- Audio Systems Engineer


Replies

robrenaud, yesterday at 5:17 PM

What do y'all think about the latency/quality tradeoff with LLMs?

Human voices don't take 30 seconds to think, retrieve, research, and summarize a high-quality answer. Humans are calibrated in their knowledge: they know what they understand and what they don't. They can converse in real time without bullshitting.

Frontier real-time-ish, LLM-generated voice systems are still plagued by 2024-era LLM nonsense, like the inability to count the Rs in "strawberry". [1]

I'd personally love a voice interface that, constrained by the technology of today, takes the latency hit to deliver quality.

[1] https://www.instagram.com/reel/DTYBpa7AHSJ/?igsh=MzRlODBiNWF...
