Man, I’ve been there. Tried throwing BERT at enzyme data once—looked fine in eval, totally flopped in the wild. Classic overfit-on-vibes scenario.
Honestly, for straight-up classification? I’d pick SVM or logistic any day. Transformers are cool, but unless your data’s super clean, they just hallucinate confidently. Like giving GPT a multiple-choice test on gibberish—it will pick something, and say it with its chest.
Lately, I just steal embeddings from big models and slap a dumb classifier on top. Works better, runs faster, less drama.
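Roughly this shape, in case it helps anyone (a minimal sketch -- the encoder name and toy data are just placeholders, not my actual pipeline):

    # Frozen embeddings from a pretrained encoder, plain classifier on top.
    from sentence_transformers import SentenceTransformer
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    texts = ["great product", "terrible service", "works fine", "total junk"]  # placeholder data
    labels = [1, 0, 1, 0]

    encoder = SentenceTransformer("all-MiniLM-L6-v2")  # any pretrained encoder works here
    X = encoder.encode(texts)                          # (n_samples, embedding_dim); never fine-tuned

    X_tr, X_te, y_tr, y_te = train_test_split(X, labels, test_size=0.5, random_state=0)

    clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)  # or an SVM, same idea
    print(clf.score(X_te, y_te))

The big model only does feature extraction, so you pay one forward pass per example and the part you actually train fits in seconds.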
Appreciate this post. Needed that reality check before I fine-tune something stupid again.
> Lately, I just steal embeddings from big models and slap a dumb classifier on top. Works better, runs faster, less drama.
You may know this but many don't -- this is broadly known as "transfer learning".
Ironically, this comment reads like it was generated by a Transformer (ChatGPT, to be specific).
>Lately, I just steal embeddings from big models and slap a dumb classifier on top. Works better, runs faster, less drama.
Sure, but this is still indirectly using transformers.
I’m not sure anyone I know could make an em dash with their keyboard off the top of their head.
[meta] Here’s where I wish I could personally flag HN accounts.
What kind of data did you run this on?
> Like giving GPT a multiple-choice test on gibberish—it will pick something, and say it with its chest.
If I gave a classroom of undergrad students a multiple-choice test where no answers were correct, I can almost guarantee nearly all the tests would be filled out.
Should GPT and other LLMs refuse to take a test?
In my experience it will pick the closest answer, even if none of the options are remotely correct.
Transformers will ace your test set, then faceplant the second they meet reality. I've also done the "wow, 92% accuracy!" dance only to realize later I just built a very confident pattern-matcher for my dataset quirks.
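The thing that finally caught it for me was holding out whole sources instead of random rows. A quick sketch (sklearn; the features, labels, and group ids below are made up for illustration):

    # Hold out entire groups (e.g. lab / batch / collection) so the model
    # can't lean on quirks that leak across a random split.
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import GroupShuffleSplit

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 16))          # stand-in features
    y = rng.integers(0, 2, size=200)        # stand-in labels
    groups = rng.integers(0, 5, size=200)   # which source each row came from

    splitter = GroupShuffleSplit(n_splits=1, test_size=0.3, random_state=0)
    train_idx, test_idx = next(splitter.split(X, y, groups))

    clf = LogisticRegression(max_iter=1000).fit(X[train_idx], y[train_idx])
    print("held-out-source accuracy:", clf.score(X[test_idx], y[test_idx]))

If that number falls off a cliff compared to a random split, congrats, you built a dataset-quirk detector.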