
ghm2199, yesterday at 3:19 PM

In medicine there is a concept of reporting adverse effects of medications or interventions, which are then collectively studied for public health [MedWatch][VAERS][EudraVigilance] and in academia. We should have something like that for all coding agents (and agents in other fields too), given how widely they are deployed and their effect on "health" in general (not only human). Call it the AI "health" of things benchmark.

I would imagine a sort of hybrid, with the qualities of volunteer efforts like Wikipedia, new problems like Advent of Code, and benchmarks like this one. The goal? To collectively study the effects of usage across the many areas where AI is used.

[MedWatch](https://www.fda.gov/safety/medwatch-fda-safety-information-a...)

[VAERS](https://www.cdc.gov/vaccine-safety-systems/vaers/index.html)

[EudraVigilance](https://www.ema.europa.eu/en/human-regulatory-overview/resea...)