Hacker News

ksri, yesterday at 8:18 AM (2 replies)

I have been working on Extrasuite (https://github.com/think41/extrasuite). It is like Terraform, but for Google Drive files.

It provides a Git-like pull/push workflow for editing Sheets, Docs, and Slides. `pull` converts the Google file into a local folder of agent-friendly files. For example, a Google Sheet becomes a folder with a .tsv, a formula.json, and so on. The agent simply edits these files and `push`es the changes. Similarly, a Google Doc becomes an XML file that is pure content. The agent edits it and calls `push`; the tool figures out the right batchUpdate API calls to bring the document in sync.
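To make the sheet side of the pull/push idea concrete, here is a minimal sketch in Python. It is not Extrasuite's code: the function names (`column_letter`, `parse_tsv`, `diff_cells`) and the shape of the change records are assumptions made up for illustration. It compares the pulled .tsv with the edited one and lists the changed cells in A1 notation, which is the kind of information a push step would need before building any API calls.

```python
import csv
import io


def column_letter(index: int) -> str:
    """Convert a zero-based column index to A1-style letters (0 -> A, 26 -> AA)."""
    letters = ""
    index += 1
    while index > 0:
        index, remainder = divmod(index - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def parse_tsv(text: str) -> list[list[str]]:
    """Parse tab-separated text into a list of rows."""
    return list(csv.reader(io.StringIO(text), delimiter="\t"))


def diff_cells(before: str, after: str) -> list[dict]:
    """Return one record per cell whose value differs between the two TSV texts.

    Missing cells (shorter rows or fewer rows) are treated as empty strings.
    """
    old_rows, new_rows = parse_tsv(before), parse_tsv(after)
    changes = []
    for r in range(max(len(old_rows), len(new_rows))):
        old_row = old_rows[r] if r < len(old_rows) else []
        new_row = new_rows[r] if r < len(new_rows) else []
        for c in range(max(len(old_row), len(new_row))):
            old = old_row[c] if c < len(old_row) else ""
            new = new_row[c] if c < len(new_row) else ""
            if old != new:
                changes.append(
                    {"cell": f"{column_letter(c)}{r + 1}", "old": old, "new": new}
                )
    return changes


if __name__ == "__main__":
    pulled = "name\tqty\napple\t3\n"
    edited = "name\tqty\napple\t5\npear\t1\n"
    for change in diff_cells(pulled, edited):
        print(change)
```

A real tool would also have to deal with formulas, formatting, and concurrent edits, none of which this sketch attempts.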

None of the existing tools allow you to edit documents. Invoking batchUpdate directly is error-prone and token-inefficient. Extrasuite solves these issues.

Extrasuite also uses a unique service token that is mapped 1:1 to the user. This means that edits show up as "Alice's agent" in the Google Drive version history. This is secure: agents can only access the specific files or folders you explicitly share with them.

This is still very much alpha, but we have been using it internally for our 100-member team. Google Sheets, Docs, Forms, and Apps Script work great, all using the same pull/push metaphor. Google Slides needs some work.


Replies

lewisjoe, yesterday at 8:19 PM

Excellent project! I see that the agent modifies Google Docs using an interesting technique: convert the doc to HTML, let the AI operate on the HTML, then diff the original HTML against the AI-modified HTML and send the diff as a batchUpdate to Google Docs.
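The diff step can be sketched roughly as follows, assuming the document has already been split into a list of block strings (one per paragraph or node). The function name `diff_blocks` and the operation format are hypothetical, and the output is not a real batchUpdate request; the sketch only shows how Python's standard `difflib.SequenceMatcher` yields the replace/insert/delete spans that a translation layer would then turn into API calls.

```python
from difflib import SequenceMatcher


def diff_blocks(original: list[str], edited: list[str]) -> list[dict]:
    """Describe how to turn `original` into `edited` as a list of operations.

    Each operation refers to a half-open index range [start, end) in the
    ORIGINAL list and carries the new blocks that should occupy that range.
    """
    ops = []
    matcher = SequenceMatcher(a=original, b=edited, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # unchanged blocks need no operation
        ops.append({"op": tag, "start": i1, "end": i2, "content": edited[j1:j2]})
    return ops


if __name__ == "__main__":
    before = ["Title", "First paragraph.", "Second paragraph."]
    after = [
        "Title",
        "First paragraph, revised.",
        "Second paragraph.",
        "New closing paragraph.",
    ]
    for op in diff_blocks(before, after):
        print(op)
```

Because the indices refer to the original list, a consumer would typically apply the operations from last to first so that earlier positions stay valid.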

IMO, this is a better approach than the one used by the Anthropic docx editing skill.

1. Did you compare this one with other document editing agents? Did you have any other ideas on how to make AI see and make edits to documents?

2. What happens if the document is a big book? How do you manage context when loading big documents?

PS: I'm working on an AI agent for Zoho Writer (a Google Docs alternative) and I've landed on a similar HTML-based approach. The difference is that I ask the AI to use my minimal commands (addnode, replacenode, removenode) to operate on the HTML, and then I convert those commands into ops.

This works pretty well for me.
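As a rough illustration of the command-based variant, the sketch below applies addnode, replacenode, and removenode commands to a small well-formed markup tree using Python's standard `xml.etree.ElementTree`. Only the three command names come from the comment above; the command fields (`cmd`, `id`, `parent`, `markup`), the use of `id` attributes to address nodes, and the helper names are all assumptions for the example, not the actual Zoho Writer agent.

```python
import xml.etree.ElementTree as ET


def find_with_parent(root, node_id):
    """Return (parent, node) for the descendant whose id attribute is node_id."""
    for parent in root.iter():
        for child in parent:
            if child.get("id") == node_id:
                return parent, child
    raise KeyError(node_id)


def apply_command(root, command):
    """Apply one hypothetical edit command to the tree in place."""
    kind = command["cmd"]
    if kind == "addnode":
        # Append a new child under the element identified by "parent".
        if root.get("id") == command["parent"]:
            parent = root
        else:
            parent = find_with_parent(root, command["parent"])[1]
        parent.append(ET.fromstring(command["markup"]))
    elif kind == "replacenode":
        # Swap the addressed node for new markup at the same position.
        parent, old = find_with_parent(root, command["id"])
        index = list(parent).index(old)
        parent.remove(old)
        parent.insert(index, ET.fromstring(command["markup"]))
    elif kind == "removenode":
        parent, old = find_with_parent(root, command["id"])
        parent.remove(old)
    else:
        raise ValueError(f"unknown command: {kind}")


def apply_commands(root, commands):
    """Apply a sequence of commands in order."""
    for command in commands:
        apply_command(root, command)


if __name__ == "__main__":
    doc = ET.fromstring('<body id="root"><p id="p1">Hello</p><p id="p2">World</p></body>')
    apply_commands(doc, [
        {"cmd": "addnode", "parent": "root", "markup": '<p id="p3">New</p>'},
        {"cmd": "replacenode", "id": "p1", "markup": '<h1 id="p1">Hello!</h1>'},
        {"cmd": "removenode", "id": "p2"},
    ])
    print(ET.tostring(doc, encoding="unicode"))
```

Restricting the model to a few node-level commands keeps its output short and easy to validate, since an unknown command or a missing id fails loudly instead of silently corrupting the document.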

sothatsit, yesterday at 10:03 AM

We have been using something similar for editing Confluence pages. Download XML, edit, upload. It is very effective, much better than direct edit commands. It’s a great pattern.
