Hacker News

Show HN: Showboat and Rodney, so agents can demo what they've built

79 points by simonw today at 5:52 PM | 42 comments

Comments

cadamsdotcom today at 10:04 PM

Great to see you doing red/green TDD Simon!

Passing tests in your repo are great documentation of the tool at a microscopic level. Rerunning tests also only burns tokens on failures (passed tests just print a dot), so it’s token-efficient too.

Some other neat tricks:

- For greater efficiency, configure your test runner to print nothing (not even a dot or filename) for passing tests. Agents don’t need progress dots, only the exit code and failure details.

- Have your agent implement a 10ms timeout per test; pytest has hooks to do this. The agent will see tests time out and mock out all I/O and third-party code. Why test what you assume third parties have already tested? Your test suite then becomes CPU-bound, with no shared database, no shared data, and no tests that interfere with or depend on each other, so tests can run in parallel.
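
A minimal sketch of both tricks, assuming the standard library's unittest as a stand-in for pytest so it runs with no dependencies (it does not use pytest's hook API). `BUDGET_SECONDS`, `QuietBudgetResult`, `run_quietly`, and `Examples` are hypothetical names, and the budget is checked after each test finishes rather than interrupting a slow test:

```python
import time
import unittest

BUDGET_SECONDS = 0.010  # the 10ms per-test budget suggested above


class QuietBudgetResult(unittest.TestResult):
    """Stays silent on success and fails any test that exceeds the budget."""

    def startTest(self, test):
        super().startTest(test)
        self._started = time.perf_counter()

    def addSuccess(self, test):
        elapsed = time.perf_counter() - self._started
        if elapsed > BUDGET_SECONDS:
            # Treat an over-budget pass as a failure so the agent sees it.
            message = (
                f"took {elapsed * 1000:.1f}ms, "
                f"budget is {BUDGET_SECONDS * 1000:.0f}ms"
            )
            self.failures.append((test, message))


def run_quietly(suite, stream):
    """Run the suite, write only failure details, return a shell-style exit code."""
    result = QuietBudgetResult()
    suite.run(result)
    for test, details in result.failures + result.errors:
        stream.write(f"FAIL {test.id()}\n{details}\n")
    return 0 if result.wasSuccessful() else 1


class Examples(unittest.TestCase):
    def test_fast(self):
        self.assertEqual(1 + 1, 2)

    def test_slow(self):
        time.sleep(0.05)  # stands in for unmocked I/O
```

Calling `run_quietly(unittest.defaultTestLoader.loadTestsFromTestCase(Examples), sys.stdout)` would write a single failure entry for `test_slow` and nothing for `test_fast`; the return value can be passed to `sys.exit`.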

samuelson today at 9:38 PM

I love your content, but I wish you'd make your blog theme responsive for wider screens/non-mobile. I prefer to read content like this on a large screen.

Showboat seems like it could actually be quite useful for humans too, just for making quick notes from a CLI without opening an editor. The "pop" command makes me wonder if there would be a benefit to also having an array-like interface in addition to the stack-like one. It seems like it would be fairly trivial to generate an index of markdown blocks so that they could be edited individually.
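
A rough sketch of what such an index could look like, assuming blocks are separated by blank lines and fenced code must stay in one piece. `index_blocks` and `replace_block` are hypothetical helpers, not part of Showboat:

```python
FENCE = "`" * 3  # the Markdown code-fence marker


def index_blocks(markdown: str) -> list[str]:
    """Split Markdown into blank-line-separated blocks, keeping fenced code whole."""
    blocks: list[str] = []
    current: list[str] = []
    in_fence = False
    for line in markdown.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence
        if not line.strip() and not in_fence:
            # A blank line outside a fence closes the current block.
            if current:
                blocks.append("\n".join(current))
                current = []
        else:
            current.append(line)
    if current:
        blocks.append("\n".join(current))
    return blocks


def replace_block(markdown: str, index: int, new_text: str) -> str:
    """Return the document with the block at `index` swapped for `new_text`."""
    blocks = index_blocks(markdown)
    blocks[index] = new_text
    return "\n\n".join(blocks) + "\n"
```

With this, `replace_block(doc, 1, "New intro.")` edits the second block in place, while a stack-like "pop" is just dropping `index_blocks(doc)[-1]`.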

I like the idea of Rodney, but I wonder if you might actually get better results by asking the agent to generate equivalent Selenium scripts instead. I'm specifically suggesting Selenium because it's been around so long that I assume there's a lot of Selenium in the LLMs' training data, but there are other options that might work too.

Hansenq today at 7:10 PM

I was a bit confused as to how everything works until I read it in detail. Really cool tools, but I think one thing that would help in the introduction is: saying explicitly that the generated .md document is for you (the user) to read through, observe the output of the CLI call, and ensure that the output matches what you would expect.

It's basically an automated test, but at a higher abstraction level and with manual verification--using CLI tools rather than a test harness. Really great work!

giancarlostoro today at 7:06 PM

I'll be sure to try these out. I've been building my own alternative to Beads with a concept called "gates" which do not let you close tasks as complete until a gate passes. Would love to throw these in as "gates" for my current workflow.
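
One way to picture the "gates" idea, as a sketch only: each gate is assumed to be a command whose exit code decides whether the task may close. `Task`, `gates`, and `close` are hypothetical names and are not taken from the commenter's tool or from Beads:

```python
import subprocess
import sys
from dataclasses import dataclass, field


@dataclass
class Task:
    title: str
    # Each gate is a command line; exit code 0 means the gate passes.
    gates: list[list[str]] = field(default_factory=list)
    done: bool = False

    def close(self) -> bool:
        """Mark the task done only if every gate command exits with 0."""
        for command in self.gates:
            if subprocess.run(command, capture_output=True).returncode != 0:
                return False
        self.done = True
        return True


# Stand-in gate commands; a real gate might run a demo or verification script.
PASSING_GATE = [sys.executable, "-c", "raise SystemExit(0)"]
FAILING_GATE = [sys.executable, "-c", "raise SystemExit(1)"]
```

A task built with `Task("ship demo", gates=[PASSING_GATE, FAILING_GATE])` stays open when `close()` is called, because the second gate fails.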

johnfn today at 7:39 PM

Out of curiosity, what is the advantage of using Rodney when Playwright has the same set of features and AI understands how to write a Playwright script very well?

Sharlin today at 8:52 PM

I can't wait for tools that allow agents to hold stand-ups, retrospectives and sprint planning sessions, all facilitated by an agentic scrum master.

simlevesque today at 9:05 PM

Rodney seems to be pretty much the same as agent-browser: https://github.com/vercel-labs/agent-browser

sNyZZzzz today at 8:35 PM

Using Markdown as both docs and executable output is cool, but I’m curious how it scales when agents hit more complex UIs.

eliben today at 6:35 PM

Very interesting! I encountered the problems these tools are trying to tackle just recently while trying to guide an agent into creating an in-browser tool for me. Closing the loop on a web interface isn't as simple as CLI-only tools. I should give this a try.

It's also interesting that you've shifted to Go for your agent-coded CLI tools, Simon.

mentalgear today at 8:06 PM

A bit like jupyter notebooks, isn't it?

tardismechanic today at 6:42 PM

See also (the confusingly named) playwright-cli

https://github.com/microsoft/playwright-cli

It is different from the CLI for running tests etc. that comes bundled with Playwright.

Sample use:

  playwright-cli open https://demo.playwright.dev/todomvc/ --headed
  playwright-cli type "Buy groceries"
  playwright-cli press Enter
  playwright-cli type "Water flowers"
  playwright-cli press Enter
  playwright-cli check e21
  playwright-cli check e35
  playwright-cli screenshot

nzoschke today at 7:11 PM

go-rod has been instrumental to my agentic coding loops too. Some uses:

- E2E testing of browser components

- Taking screenshots before and after and having Claude look at them to double check things

- Driving it with an API and CLI as a headless browser

Will definitely give Rodney a look.

water-drummer today at 8:02 PM

Wait, why shouldn't an LLM just write directly to the markdown file instead of going through the extra step of using a CLI tool which is basically `echo 'something' >> file.md`, but with templates that should really be in a prompt instead of being in a compiled binary? Did Claude come up with the idea for this as well?

Also, I am sure you must already know about Playwright MCP, so why this? If your goal isn't to make the CLI human-friendly, which is the only advantage CLIs have over MCPs doing the same thing, then why not just use the MCP? It doesn't even handle multiple sessions and has a single global state file. This is slop.

measurablefunc today at 7:14 PM

Google's Antigravity does this automatically by creating Task & Walkthrough artifacts.

saberience today at 6:39 PM

Sounds like both of these tools could be one-shotted by either Claude or Codex.

Or alternatively, just be a skill versus a tool.

My “agents” already demo stuff all the time by just being prompted to do so. I have notations in my standard Agents.md for how I want my documentation, testing etc.

toastal today at 7:01 PM

If agents can generate text so easily, why would they be limited to Markdown instead of reStructuredText, AsciiDoc, or LaTeX, which have rich features that help users understand text? I can understand developers refusing to adopt proper formats for documentation, but this seems odd for the bots. It doesn’t even generate the correct syntax block in Markdown: it uses “bash” instead of “sh-session”.
