
dartos • 11/08/2024

> Sure but it's not high at all.

It depends. For a sysadmin maybe not, but for data scientists, the bar would be pretty high just to understand the math jargon.

> If perplexity can tell you exactly what to do 90% of the time without error

That “if” is carrying a lot of weight. Anecdotally, I haven’t seen any LLM be correct 90% of the time. IIRC SOTA on SWE-bench (which tbf isn’t a great benchmark) is around 30%.

> flawless write-eval loops with the help of cline, cline is a pretty good programmer.

I’m not really sure what you mean by “flawless”, but having a rubber duck is always more helpful than harmful.

> A lot of things AI is helping with also have good, easy to observe / generate, real-time metrics you can use to judge excellence.

Like what?

Replies

kajecounterhack • 11/08/2024

> A lot of things AI is helping with also have good, easy to observe / generate, real-time metrics you can use to judge excellence.

Exactly what I illustrated earlier: your developer productivity metrics. If you're turning code around faster, setting up your network better, turning around insights faster, the AI is working.

> It depends. For a sysadmin maybe not, but for data scientists, the bar would be pretty high just to understand the math jargon.

Why does an AI coding agent need to understand math jargon? It just helps you write better code. Are you even familiar with what data scientists do? It seems not, because if you were, you'd see clearly where the tool would be applied and whether it does a good or bad job.

Reminder: we're talking about evaluating whether Codebuff / alternatives are "pretty good" at X. Just go play with the tools. tgtweak expressed their opinion on how well the tool does at some tasks {sysadmin, data engineering, cloud architecture} and your response was to question how someone could have an opinion about it. The obvious answer is that they used the tools and found them useful for those tasks. It may only be _subjectively_ good at what they're using it for, but it's also a rando's opinion on the internet. As another rando, I very much agree with what the person you responded to is saying. You're not going to get more rigor from this discourse - go form a real opinion of your own.
