Hacker News

epistasis yesterday at 2:31 PM (13 replies)

I think one thing that people are sleeping on is that they're passing a ton of secrets to OpenAI, Anthropic, or OpenRouter by keeping a .env file or other secrets on disk in the repo, even when they're not checked in.

Your LLM will happily read the entire file, ship it off to be training data for future versions of ChatGPT, and not raise any flags, because let's be fair, it was an OK thing to check whether all the env vars were set, or whether you had set up the database password for the app.

It's time for orgs to audit and rotate secrets wherever they are stored on disk or in logs, and switch to SOPS or Vault or whatever to keep these out of plaintext except exactly when needed.
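
A rough way to start that audit, sketched in Python: walk a checkout, find dotenv-style files, and list variable names that look like credentials and still hold plaintext values. The name pattern, the skipped directories, and the function names are assumptions made for illustration, and the `ENC[` check assumes SOPS-style encrypted values. This is a heuristic sketch, not a replacement for a real secret scanner.

```python
import os
import re
from pathlib import Path

# Assumed heuristics for illustration only; tune for your own repos.
SECRET_NAME = re.compile(
    r"(SECRET|TOKEN|PASSWORD|PASSWD|API_?KEY|PRIVATE_?KEY)", re.IGNORECASE
)
SKIP_DIRS = {".git", "node_modules", ".venv", "__pycache__"}


def is_env_file(name):
    """Match names such as .env, .env.local, or prod.env."""
    return name == ".env" or name.startswith(".env.") or name.endswith(".env")


def find_plaintext_secrets(root):
    """Return (path, line_number, variable_name) for suspicious assignments.

    Only variable names are reported, never the values themselves.
    """
    findings = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune directories we do not want to descend into.
        dirnames[:] = [d for d in dirnames if d not in SKIP_DIRS]
        for name in sorted(filenames):
            if not is_env_file(name):
                continue
            path = Path(dirpath) / name
            try:
                text = path.read_text(encoding="utf-8", errors="replace")
            except OSError:
                continue  # unreadable file: skip it rather than crash
            for lineno, line in enumerate(text.splitlines(), start=1):
                line = line.strip()
                if not line or line.startswith("#") or "=" not in line:
                    continue
                key, _, value = line.partition("=")
                key = key.removeprefix("export ").strip()  # Python 3.9+
                value = value.strip().strip("'\"")
                if not value or value.startswith("ENC["):
                    continue  # empty, or assumed to be SOPS-encrypted already
                if SECRET_NAME.search(key):
                    findings.append((str(path), lineno, key))
    return findings
```

Calling `find_plaintext_secrets(".")` from a repo root returns a list of tuples to review and rotate. Because it reports names and line numbers only, the output itself is safe to paste into a ticket.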


Replies

mooreds yesterday at 2:41 PM

Agreed. Static, long-lived credentials are a real problem. Kudos to AWS and the other hyperscalers for building the tooling to move away from them, and for providing some gentle and not-so-gentle nudges away from them too.

But not everyone is where they need to be. For instance, Railway doesn't let you access AWS resources via roles/OIDC. I filed a ticket[0] but haven't seen movement.

0: https://station.railway.com/feedback/allow-for-integration-w...

nijave yesterday at 11:44 PM

In fairness, the .env file in your development tree shouldn't contain very important secrets. They should be limited-access dev secrets, and any secrets that go to "production" systems like an OpenAI dev environment should be limited, where possible.

Besides leaking, it's easy to oopsie and DoS a system or send malformed requests in the course of testing and development. You don't want a surprise $1k bill because someone was working on some test automation and accidentally sent thousands of real results in the process.

nrub yesterday at 3:33 PM

I no longer keep my dotenv files in plaintext. I use `sops` to keep an encrypted env file around, and you can use tools like direnv to make the values available to your shell while you're working. Obviously the LLM could still print any of these secrets, but it's less likely. Additionally, I find that at least Claude seems to avoid reading the dotenv file. And lastly, don't make any local secrets that important: limited scope, dev accounts, etc.
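
For anyone who wants the same effect from a script rather than a shell hook, here is a minimal Python sketch: decrypt with `sops --decrypt`, parse the result in memory, and hand the values only to the child process that needs them. The function names and the file name `secrets.enc.env` are made up for illustration, and it assumes sops prints plain `KEY=VALUE` lines for that file. Treat it as a sketch under those assumptions, not as how sops or direnv work internally.

```python
import os
import subprocess


def parse_dotenv(text):
    """Parse simple KEY=VALUE lines, skipping blanks, comments, and junk."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")  # split on the first "=" only
        env[key.strip()] = value.strip().strip("'\"")
    return env


def run_with_sops_env(encrypted_path, command):
    """Run `command` with decrypted values in its environment.

    The plaintext lives only in this process's memory and the child's
    environment; nothing is written back to disk.
    """
    result = subprocess.run(
        ["sops", "--decrypt", encrypted_path],
        check=True,
        capture_output=True,
        text=True,
    )
    child_env = {**os.environ, **parse_dotenv(result.stdout)}
    return subprocess.run(command, env=child_env, check=True)


# Hypothetical usage (requires sops and a configured key):
# run_with_sops_env("secrets.enc.env", ["python", "manage.py", "runserver"])
```

This keeps plaintext off disk, but anything running inside that child process can still read and print its environment, so the point above about limited-scope dev credentials still applies.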

strbean today at 12:52 AM

Plug for my buddy's project: http://agentsh.org/

Block agents from misbehaving at the OS level instead of asking them to behave.

doctoboggan yesterday at 2:55 PM

I've noticed recently that at least Claude will try its best not to read your env files. You really need to push it in the prompt if you want it to read and access your DB for example.

cozzyd yesterday at 3:06 PM

it seems crazy to "trust" an LLM with any secrets. Anyone running one as their normal user account with access to all files is playing with fire...

philipwhiuk yesterday at 2:40 PM

Sure but like, no AI was needed here. Regular human stupidity is still pretty potent.

j0ej0ej0e yesterday at 10:24 PM

[Cursor appears to at least be trying...](https://cursor.com/docs/reference/ignore-file#why-ignore-fil...)

> Cursor automatically ignores files in .gitignore

...

> While Cursor blocks ignored files, complete protection isn't guaranteed due to LLM unpredictability.

[Antigravity appears to just _do_, not _try_](https://antigravity.google/docs/strict-mode)

theozero yesterday at 4:20 PM

Get everything out of plaintext!

Varlock is a great and flexible way to do this.

giancarlostoro yesterday at 2:55 PM

Claude told me to revoke an API key I accidentally pasted (it was for a side project I was getting on its legs); it just flat out did not want it. I have a feeling that if it needs something out of an env file it will grep for the specific line.

yieldcrv yesterday at 3:21 PM

probably but a ton of services have popped up in the last 6 months specifically to help mitigate that

localhost reading env from the cloud and other solutions

to me it suggested that I’m already late on that idea, but I can understand how that puts me deeper in a bubble than others

doctorpangloss yesterday at 4:27 PM

what exactly is the threat model?

user data is always paraphrased for training. what do you mean, not raise any flags?

look... Google is running your browser, Apple your messenger, Amazon your backend. They already have all these keys in the same way; are they misusing them? Why doesn't it raise any flags then?

jonnyasmar yesterday at 11:57 PM

[flagged]
