Giving LLM agents direct, autonomous access to a real production databases with write access seems insane to me.
NO ONE, agent or human, should have direct write access to production databases outside of emergency break glass scenarios. This is why we have stored routines and API layers to pre-define what writes are allowed. The facts that agents CAN autonomously write to a database does not imply that they should.
For the point about query optimization, again your agents should not be issuing random queries against a production database. We have had the concept of separate analytics databases with different architectures to support exporatory queries for decades.
> autonomous access to a real production databases
remember that filesystems are just sophisticated databases
rm -rf /<unexpectedly-missing-value>I totally i agree! even read access specialty when databases has some sensitive/personal information about users.
How does that even work in compliance-relevant scenarios where the audit trail shows some LLM messed with the data? Who, if anyone, is on the hook?
I’m a DBRE. I spend a good portion of my day with a shell into one or more prod databases. The schema definitions in code are scattered between ORM model definitions, Alembic migrations, and Liquibase migrations, so the only reliable way I have of understanding a schema as it exists is to view it. Plus, I am very comfortable with SQL, and the various system catalogs of both MySQL and Postgres, so it’s a ton easier to work with.
Truly sensitive customer information is encrypted, and on an isolated DB cluster that no one has regular access to. I also operate with a read-only grant, because manual writes to a prod DB is generally a terrible idea.
LLM agents are unlocking demand and supply for applications that wouldn't have been possible before due to time constraints though. There's a growing demand for single user or smaller scoped apps where giving LLM agents direct access means velocity. The failure/rollback model is much easier with these as long as we have good backup hygiene.
I agree and hope this is the case for anything serious enough. I also don't see this changing any time soon.
There are ways to give safe access to the data, at least read-only, that don't involve production risk and don't sacrifice privacy. For example, database branches with anonymization. Instead of accessing the prod/staging db, the agent creates a branch and has read/write access to that.
(disclaimer: I work at Xata, where we offer copy-on-write branches for Postgres, and the agent use-cases are the most popular right now)