Show HN: Nano PDF – A CLI Tool to Edit PDFs with Gemini's Nano Banana

57 points by GavCo today at 8:44 PM | 11 comments

The new Gemini 3 Pro Image model (aka Nano Banana) is incredible at generating slides, so I thought it would be fun to build a CLI tool that lets you edit PDF presentations using plain English. The tool converts the page you want to edit into an image, sends it to the model API together with your prompt to generate an edited image, then converts the updated image back into a PDF page and stitches it into the original document.

Examples:

- `nano-pdf edit deck.pdf 5 "Update the revenue chart to show Q3 at $2.5M"`

- `nano-pdf add deck.pdf 15 "Create an executive summary slide with 5 bullet points"`

Features:

- Edit multiple pages in parallel

- Add entirely new slides that match your deck's style

- Google Search enabled by default so the model can look up current data

- Preserves text layer for copy/paste and search

It can work with any kind of PDF but I expect it would be most useful for a quick edit to a deck or something similar.
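
The flow described above (render a page to an image, have the model edit it, convert the result back to a page, stitch it into the deck, with several pages handled in parallel) can be sketched roughly as follows. This is a minimal illustration, not the tool's actual code: `render`, `edit_image`, and `to_page` are hypothetical callables standing in for the PDF rendering, the model API call, and the image-to-PDF step, and pages are modeled as plain `bytes`.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

Page = bytes   # stand-in for one PDF page (assumption for this sketch)
Image = bytes  # stand-in for one rendered page image (assumption)


def edit_pages(
    pages: List[Page],
    prompts: Dict[int, str],                    # 1-based page number -> instruction
    render: Callable[[Page], Image],            # hypothetical: PDF page -> image
    edit_image: Callable[[Image, str], Image],  # hypothetical: model API call
    to_page: Callable[[Image], Page],           # hypothetical: image -> PDF page
) -> List[Page]:
    """Return a new page list with the requested pages replaced by edited ones."""

    def edit_one(number: int) -> Page:
        image = render(pages[number - 1])
        return to_page(edit_image(image, prompts[number]))

    numbers = sorted(prompts)
    # Each page edit is independent, so the slow model calls can overlap.
    with ThreadPoolExecutor() as pool:
        edited = dict(zip(numbers, pool.map(edit_one, numbers)))

    # Stitch: keep untouched pages as-is, swap in the edited ones by position.
    return [edited.get(i, page) for i, page in enumerate(pages, start=1)]
```

Threads are used here on the assumption that the per-page work is dominated by waiting on a network call; untouched pages pass through unchanged, so only the edited pages become images.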

GitHub: https://github.com/gavrielc/Nano-PDF


Comments

tecoholic today at 9:32 PM

> Converts an image to a single-page PDF with a hidden text layer using Tesseract. This is the 'State Preservation' step.

Does this mean the text-only PDF page is transformed into an image that covers the full page, with the text still underneath? So any machine-based extraction would still get the text but would probably lose all the bounding-box information, and regular users could no longer select text with their mouse?

lxe today at 9:03 PM

This is nuts and I absolutely love this. So you convert the PDF page into an image, edit the image, then convert the image back into a PDF.

treetalker today at 9:12 PM

I'd love to see clearer examples: a video, or original pdf / command / result pdf. Very cool!

shevis today at 9:59 PM

A side effect of replacing entire pages with images is that the file size will expand dramatically. Most PDFs only contain a couple of images.
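
To put a rough number on the file-size concern, here is a back-of-the-envelope sketch of how many bytes a full-page raster holds before compression. The page size and DPI in the usage note are illustrative assumptions, not values taken from the tool, and the size actually embedded in a PDF depends on the image compression used, so treat this as an upper bound.

```python
def raw_raster_bytes(width_in: float, height_in: float, dpi: int, channels: int = 3) -> int:
    """Uncompressed size in bytes of a page rendered as an 8-bit-per-channel image."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * channels
```

For example, `raw_raster_bytes(8.5, 11, 200)` gives 1700 × 2200 × 3 = 11,220,000 bytes, about 11 MB of raw pixels for one US Letter page in RGB, compared with the few kilobytes that a page of text and vector shapes typically needs.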

sultson today at 9:17 PM

How cool! It's frustrating how tedious many PDF workflows still are. I've been building something similar in this space[0], but web-based where you visually specify the area to edit. The biggest issue for now is the cost per edit as the Pro version amounts to roughly $0.15/image. However, with some finessing, the original Nano Banana seems to do a great job as well. Have you explored UI-based approaches yourself by any chance?

[0] https://docusera.com/

itsmevictor today at 9:26 PM

Very nice! I wonder whether this could be used to get LLMs to annotate PDFs. Say an "agentic" CLI like Claude Code or Gemini CLI reviews a PDF and finds typos: could it use this to annotate the PDF, for example by underlining them in red? That could be nice.

John7878781 today at 10:23 PM

Love this.

After several iterations of edits, would the image quality decrease?

mentalgear today at 9:27 PM

Nice - but consider adding an animated screengrab like: https://github.com/pythops/oryx

ThrowawayTestr today at 9:56 PM

I recently tried to change a single word in a PDF and nearly tore my hair out (thank you, LibreOffice). I'll definitely keep this in mind for next time, thank you.
