Something is either public record - in which case it should be on a government website for free, and the AI companies should be free to scrape to their hearts' desire...
Or it should be sealed for X years and then public record. Where X might be 1 in cases where you don't want to hurt an ongoing investigation, or 100 if it's someone's private affairs.
Nothing that goes through the courts should be sealed forever.
We should give up on the idea of databases which are 'open' to the public, but you have to pay to access, reproduction isn't allowed, records cost pounds per page, and bulk scraping is denied. That isn't open.
The story is about a tool that allows journalists to get advance warning of court proceedings so they can choose to cover things of public interest.
It's not about any post-case information.
>We should give up on the idea of databases which are 'open' to the public, but you have to pay to access, reproduction isn't allowed, records cost pounds per page, and bulk scraping is denied. That isn't open.
How about rate-limited?
> Something is either public record - in which case it should be on a government website for free, and the AI companies should be free to scrape to their hearts' desire... Or it should be sealed for X years and then public record.
OR humans should be allowed to access the public record, with fees charged to scrapers.
I don't know what the particular issue is in this case, but I've read about what happens with Freedom of Information (FOI) requests in England: apparently most of the requests are from male journalists/writers looking for salacious details of sex crimes against women, and the authorities are constantly using the mental health of family members as an argument for refusing to disclose material. Obviously there are also a few journalists using the FOI system to investigate serious political matters such as human rights, and one wouldn't want those serious investigations to be hampered, but there is a big problem with (what most people would call) abuse of the system. There _might_ perhaps be a similar issue with this court reporting database.
England has a genuinely independent judiciary. Judges and court staff do not usually attempt to hide from journalists stuff that journalists ought to be investigating. On the other hand, if it's something like an inquest into the death of a well-known person which would only attract the worst kind of journalist they sometimes do quite a good job of scheduling the "public" hearing in such a way that only family members find out about it in time.
A world government could perhaps make lots of legal records public while making it illegal for journalists to use that material for entertainment purposes, but we don't have a world government: if the authorities in one country were to provide easy access to all the details of every rape and murder in that country, then so-called "tech" companies in another country would use that data for entertainment purposes. I'm not sure what to do about that, apart, obviously, from establishing a world government (which arguably we need anyway in order to handle pollution and other things that are a "tragedy of the commons", but I don't see it happening any time soon).
One of the problems with open access to these government DBs is that it gives out a lot of information that spammers and scammers use.
E.g. if you create a business, then that email address/phone number is going to get phished and spammed to hell and back again. It's all because the government makes that info freely accessible online. You could be a one-man self-employed business and the moment you register you get inundated with spam.
>and the AI companies should be free to scrape to their hearts' desire...
Why? They generate massive traffic. Why should they get access for free?
Yes. This should be held by the London Archives in theory with the rest of the paper records of that sort.
They have the ability to seal documents until set dates and deal with digital archival and retrieval.
I suspect some of this is that it's a complete shit show and they want to bury it quickly, or avoid having to pay up for an expensive vendor migration.
The idea that an individual being able to look up any case they want is the same thing as a bot being able to scrape and archive an entire dataset forever is just silly.
One individual could spend their entire life going through cases one by one, recording them, and never get through the whole dataset. A bot farm could sift through it in an hour. They are not the same thing.
I want information to be free.
I don't think all information should be easily accessible.
Some information should be in libraries, held for the public to access, but have that access recorded.
If a group of people (citizens of a country) have data stored, they ought to be able to access it, but others maybe should pay a fee.
There is data in "public records" that should be very hard to access, such as evidence of a court case involving the abuse of minors that really shouldn't be public, but we also need to ensure that secrets are not kept to protect wrongdoing by those in government or in power.
> Nothing that goes through the courts should be sealed forever.
What about family law?
Open to research, yes.
Free to ingest and make someone's crimes a permanent part of AI datasets, resulting in forever-convictions? No thanks.
AI firms have shown themselves to be playing fast and loose with copyrighted works. A teenager shouldn't have their permanent AI profile become "shoplifter" because they committed a crime at 15 that would otherwise have been expunged after a few years.