>We should give up on the idea of databases which are 'open' to the public, but where you have to pay for access, reproduction isn't allowed, records cost pounds per page, and bulk scraping is denied. That isn't open.
How about rate limited?
The issue with that is people can then flood everything with huge piles of documents. That's bad enough if it's all clean OCR'd digital data that you can quickly download in its entirety, but if you're stuck having to wait between downloading documents, you'll never find out what they don't want you to find out.
It's like being made to search through sand: it's bad enough when you can use a sieve, but then they tell you that you can only use your bare hands, and your search efforts become useless.
This is not a new tactic btw and pretty relevant to recent events...
Systems running core government functions should be set up to execute those functions efficiently at scale, so I'd say they should only restrict extreme load, i.e. DoS attacks.
If the rate limit is reasonable (allows full download of the entire set of data within a feasible time-frame), that could be acceptable. Otherwise, no.
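To put a number on "feasible time-frame", here is a rough back-of-the-envelope sketch. The function name, the 10 million document count and the request rates are all made up for illustration; plug in the real figures for whatever database you care about.

```python
import math

SECONDS_PER_DAY = 86_400


def days_to_mirror(total_documents: int,
                   requests_per_second: float,
                   docs_per_request: int = 1) -> float:
    """Days needed to download every document under a fixed rate limit.

    Hypothetical helper: assumes the limit is a flat requests-per-second
    cap and that every request returns a full batch of documents.
    """
    if total_documents < 0:
        raise ValueError("total_documents must be non-negative")
    if requests_per_second <= 0 or docs_per_request <= 0:
        raise ValueError("rate and batch size must be positive")
    requests_needed = math.ceil(total_documents / docs_per_request)
    return requests_needed / requests_per_second / SECONDS_PER_DAY


if __name__ == "__main__":
    # Illustrative numbers only, not taken from any real database.
    for rate in (1.0, 0.1):
        days = days_to_mirror(10_000_000, rate)
        print(f"{rate} req/s: {days:.1f} days")
```

With those made-up numbers, one request per second means roughly 116 days for a full copy, which is arguably workable. One request every ten seconds means over three years, which is a paywall in everything but name.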
No. Open is open. Beyond DDoS protections, there should be no limits.
If load on the server is a concern, make the whole database available as a torrent. People who run scrapers tend to prefer that anyway.
This isn't someone's hobby project run from a $5 VPS - they can afford to serve 10k qps of read-only data if needed, and it would cost far less than the salary of one staff member.