
kumar_abhirup today at 6:47 PM

The way imports work in DenchClaw is a bit unconventional: when you tell it to "import my HubSpot", the agent literally opens your browser (using the copied Chrome profile), navigates to HubSpot, triggers the export, and then ingests the downloaded files into the workspace DuckDB. So the bottleneck isn't really a fat in-memory ETL pipeline; it's more like processing a CSV/JSON export file on disk.

For the DuckDB side specifically: we shell out to the duckdb CLI binary for every query rather than embedding it in the Node process. So each operation gets its own memory space and dies when it's done; the web server at localhost:3100 stays lean regardless of what you're ingesting. DuckDB's out-of-core execution also means it can natively handle datasets larger than available RAM, which is one of the reasons we picked it over SQLite.

For really large exports (think full HubSpot instance with 100k+ contacts), the practical limit is more about the browser export step than DuckDB. HubSpot itself chunks its exports, and we process those chunks as they land. The DuckDB insert is the fast part.

Honestly, for CRM-scale data, even a large sales org's full HubSpot, DuckDB eats it for breakfast. Where it would get interesting is if someone tried to throw analytics-scale data at it, but that's not really the use case. Would love to hear how IndexedDB holds up for you at scale in AccIQ; different trade-offs for sure.


Replies

iamacyborg today at 7:13 PM

> The way imports work in DenchClaw is a bit unconventional, when you tell it to "import my HubSpot", the agent literally opens your browser (using the copied Chrome profile), navigates to HubSpot, triggers the export, and then ingests the downloaded files into the workspace DuckDB.

What’s stopping the agent from doing literally any other thing in HubSpot? You know, small stuff like editing/deleting records, sending emails, launching marketing campaigns, deleting reports, etc.
