Hacker News

d4rkp4ttern · yesterday at 1:16 PM · 5 replies

Built out the demo on my M1 Max MacBook and it was absolutely terrible. Around 10 seconds for each reply, and even then it was saying something totally unrelated.


Replies

d4rkp4ttern · yesterday at 1:25 PM

Also, in general I don't get what the appeal of a 7B full-duplex (speech-to-speech) model is: 7B can't be very intelligent on its own, and for anything useful you'd need tool calls, which speech-to-speech models can't do. This is also why ChatGPT voice mode annoys me by never doing a web search or reading a link (in fact it pretends to search or read, outright makes things up, and when pushed admits it can't really read web pages or do web searches).

There are probably use cases for this though; open to being educated on those.

mrkstu · yesterday at 6:59 PM

Quoted from linked article:

"PersonaPlex accepts a text system prompt that steers conversational behavior. Without focused instructions, the model rambles — it’s trained on open-ended conversation and will happily discuss cooking when asked about shipping.

Several presets are available via CLI (--list-prompts) or API, including a general assistant (default), customer service agent, and teacher. Custom prompts can also be pre-tokenized and passed directly.

The difference is dramatic. Same input — “Can you guarantee that the replacement part will be shipped tomorrow?”:

No prompt: “So, what type of cooking do you like — outdoor grilling? I can’t say for sure, but if you’re ordering today…”

With prompt: “I can’t promise a specific time, but we’ll do our best to get it out tomorrow. It’s one of the top priorities, so yes, we’ll try to get it done as soon as possible and ship it first thing in the morning.”"
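The quoted article says the model exposes presets via a CLI flag (--list-prompts) or API, and that custom prompts can be pre-tokenized and prepended to the conversation. A toy sketch of that prepending idea, assuming nothing about the real PersonaPlex API (the preset names match the article, but the `tokenize` and `build_context` helpers here are hypothetical stand-ins):

```python
# Toy illustration of system-prompt steering: the prompt's tokens are placed
# before the user's turn, so they condition every subsequent reply.
# NOTE: tokenize() and build_context() are hypothetical, not the real API.

PRESETS = {
    "default": "You are a helpful general assistant.",
    "customer_service": "You are a customer service agent. Stay on topic.",
    "teacher": "You are a patient teacher.",
}

def tokenize(text):
    # Stand-in for the model's real tokenizer.
    return text.split()

def build_context(user_turn, preset="default"):
    # System-prompt tokens come first, steering the model's behavior.
    return tokenize(PRESETS[preset]) + tokenize(user_turn)

ctx = build_context(
    "Can you guarantee that the replacement part will be shipped tomorrow?",
    preset="customer_service",
)
```

Without the preset tokens at the front of the context, the model is conditioned only on its open-ended conversational training, which matches the rambling "cooking" reply in the article's no-prompt example.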

jayavanth · yesterday at 10:15 PM

what is your context size?

scotty79 · yesterday at 3:14 PM

On something around an RTX 5070 it reacted faster than a human would.

butILoveLife · yesterday at 1:18 PM

[flagged]