Hacker News

chadwebscraper, yesterday at 7:12 PM (2 replies)

Here’s how it works:

1. Paste a URL in, describe what you want

2. Define an interval to monitor

3. Get real-time webhooks of any changes in JSON
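
To make step 3 concrete, here is a rough sketch of how changes between two extracted JSON snapshots could be turned into a webhook body. This is not the product's actual payload format or API: the function names (`flatten`, `diff_snapshots`, `build_webhook_payload`) and the `{"url", "changes"}` shape are all hypothetical, chosen only for illustration.

```python
import json
from typing import Any, Optional


def flatten(value: Any, prefix: str = "") -> dict[str, Any]:
    """Flatten nested dicts/lists into {path: leaf value} for comparison.

    Note: empty dicts and lists produce no entries in this simple sketch.
    """
    if isinstance(value, dict):
        items = [(f"{prefix}.{k}" if prefix else str(k), v) for k, v in value.items()]
    elif isinstance(value, list):
        items = [(f"{prefix}[{i}]", v) for i, v in enumerate(value)]
    else:
        return {prefix: value}
    flat: dict[str, Any] = {}
    for path, child in items:
        flat.update(flatten(child, path))
    return flat


def diff_snapshots(old: Any, new: Any) -> list[dict[str, Any]]:
    """Return one record per added, removed, or changed leaf path."""
    old_flat, new_flat = flatten(old), flatten(new)
    changes = []
    for path in sorted(old_flat.keys() | new_flat.keys()):
        if path not in old_flat:
            changes.append({"path": path, "op": "added", "new": new_flat[path]})
        elif path not in new_flat:
            changes.append({"path": path, "op": "removed", "old": old_flat[path]})
        elif old_flat[path] != new_flat[path]:
            changes.append({"path": path, "op": "changed",
                            "old": old_flat[path], "new": new_flat[path]})
    return changes


def build_webhook_payload(url: str, old: Any, new: Any) -> Optional[str]:
    """Hypothetical payload: JSON string, or None when nothing changed."""
    changes = diff_snapshots(old, new)
    if not changes:
        return None
    return json.dumps({"url": url, "changes": changes}, sort_keys=True)
```

Each monitoring interval would compare the previous extraction with the current one and send a webhook only when `build_webhook_payload` returns something other than `None`.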

Lots of customers are using this across different domains to get consistent, repeatable JSON out of sites and monitor changes.

Supports API + HTML extraction. Never write a scraper again!


Replies

codingdave, yesterday at 9:11 PM

Writing a scraper isn't the hard part; that is fairly trivial at this point. Pulling content into JSON from your scrape is also fairly trivial, since libraries exist that handle it well.

The harder parts are things like playing nicely so your bot doesn't get banned by sysadmins, detecting changes downstream from your URL, handling dynamically loaded content, and keeping that JSON structure consistent even as your sites change their content, their designs, etc. Also, scalability. One customer I'm talking to could use a product like this, but they have 100K URLs to track, and that is more than I currently want to deal with.
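
As a minimal sketch of the "playing nicely" part, assuming it means honoring robots.txt and spacing out requests per host, something like the following works with only the Python standard library. `PoliteGate`, its method names, and the one-second default delay are made up for illustration. `Crawl-delay` is a non-standard directive that only some sites set. The sketch leaves the actual HTTP fetching to the caller.

```python
import time
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser


class PoliteGate:
    """Hypothetical helper: robots.txt checks plus per-host request spacing."""

    def __init__(self, user_agent, min_delay=1.0, clock=time.monotonic):
        self.user_agent = user_agent
        self.min_delay = min_delay      # assumed floor between requests, in seconds
        self.clock = clock              # injectable so the logic can be tested
        self._robots = {}               # host -> RobotFileParser
        self._next_allowed = {}         # host -> earliest time of next request

    def load_robots(self, host, robots_txt):
        """Register robots.txt text that the caller already fetched for a host."""
        parser = RobotFileParser()
        parser.parse(robots_txt.splitlines())
        self._robots[host] = parser

    def allowed(self, url):
        """True if robots.txt permits the URL.

        Assumption: hosts with no robots.txt loaded are treated as allowed.
        """
        parser = self._robots.get(urlsplit(url).netloc)
        return parser is None or parser.can_fetch(self.user_agent, url)

    def wait_time(self, url):
        """Reserve the next request slot for the URL's host.

        Returns how many seconds the caller should sleep before fetching.
        """
        host = urlsplit(url).netloc
        now = self.clock()
        delay = self.min_delay
        parser = self._robots.get(host)
        if parser is not None:
            crawl_delay = parser.crawl_delay(self.user_agent)
            if crawl_delay:
                delay = max(delay, float(crawl_delay))
        ready_at = max(self._next_allowed.get(host, now), now)
        self._next_allowed[host] = ready_at + delay
        return ready_at - now
```

A caller would check `gate.allowed(url)`, then `time.sleep(gate.wait_time(url))`, then fetch. At 100K URLs the per-host bookkeeping matters more than raw throughput, because many of those URLs tend to share a small number of hosts.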

I can absolutely see the use case for consistent change data from a URL. I'm just not seeing enough content in your marketing to know whether you really have something here, or if you vibe coded a scraper and are throwing it against the wall to see if it sticks.

tmaly, yesterday at 9:45 PM

This must wreck their Google Analytics stats.
