Hacker News

mg, today at 11:40 AM

    you will be surrounded by an ecosystem of
    devices, none of which stand alone, but are
    more like portals to interact with your agents
I would be really happy with my phone + headphones as the device I use most. But only if I could use Gemini (or ChatGPT or Grok or any other chat agent) in voice mode and say "SSH into my GitHub Codespace soandso and implement feature soandso." And it replies "Did it. I told Copilot (or Codex or whatever coding agent lives on that VM) to implement the feature."

And then a minute later I could ask it "Is Copilot done yet?" and it replies "No, looks like it is still working on it." And then a minute later I ask again. It replies "Yes, it finished. It changed chart.py and styles.css. Do you want me to tell you what specific changes it made to the files?"

But it looks like none of the chat agents with a voice interface has such a connector at the moment? An SSH connector would be the most useful. But a "GitHub Codespace connector" or something like that would also do.

I wonder if that will be a missing piece for long. If so, I would build an agent with voice mode and an SSH connector myself. But I guess it should come out from the big guys any moment now?
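
To make the "SSH connector" idea concrete, here is a minimal sketch of what the tool side might look like, assuming the voice agent can call Python functions as tools and that the machine running it has a working `ssh` client with key-based access to the VM. The names `SshConnector`, `subprocess_runner`, `is_running` and `changed_files` are hypothetical, and how the coding agent on the VM is actually started or queried is left out because it depends on the agent.

```python
import shlex
import subprocess
from typing import Callable, List, Tuple

# Hypothetical runner type: takes an argv list, returns (exit_code, stdout).
Runner = Callable[[List[str]], Tuple[int, str]]


def subprocess_runner(argv: List[str]) -> Tuple[int, str]:
    """Run argv with the local ssh client and capture its stdout."""
    proc = subprocess.run(argv, capture_output=True, text=True, timeout=60)
    return proc.returncode, proc.stdout


class SshConnector:
    """Illustrative tool wrapper; not an existing library or product API."""

    def __init__(self, host: str, runner: Runner = subprocess_runner):
        self.host = host
        self.runner = runner

    def run(self, remote_command: str) -> Tuple[int, str]:
        # BatchMode=yes makes ssh fail instead of prompting for a password,
        # which matters when nobody is at a keyboard.
        argv = ["ssh", "-o", "BatchMode=yes", self.host, remote_command]
        return self.runner(argv)

    def is_running(self, pid: int) -> bool:
        # "kill -0" sends no signal; its exit code only reports whether the
        # process exists. Answers "is the coding agent done yet?".
        code, _ = self.run(f"kill -0 {int(pid)}")
        return code == 0

    def changed_files(self, repo_dir: str) -> List[str]:
        # Porcelain lines look like "XY path": two status characters,
        # a space, then the path. Renames ("old -> new") are not split here.
        code, out = self.run(
            f"git -C {shlex.quote(repo_dir)} status --porcelain"
        )
        if code != 0:
            raise RuntimeError("git status failed on the remote host")
        return [line[3:] for line in out.splitlines() if line.strip()]
```

A voice agent could expose `is_running` and `changed_files` as tools and turn their results into replies like "No, still working on it" or "It changed chart.py and styles.css". The runner is injectable so the connector can be exercised without a real VM.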


Replies

jazzypants, today at 1:35 PM

> Yes, it finished. It changed chart.py and styles.css. Do you want me to tell you what specific changes it made to the files?

A verbal diff sounds practically useless. Does it first read out the entire left-hand base, and then read out the entire right-hand target? Does it say loudly "REMOVING ... ADDING ... "? How would it read out something like Struct->Field? This seems lower fidelity than a visual confirmation, and I just don't think that voice commands make sense with this kind of work.

trumpdong, today at 2:09 PM

I can't tell if this is sarcasm.