Hacker News

Razengan yesterday at 12:19 PM (9 replies)

Sorry, but AI still seems to be trash at anything moderately more complex than baby level tasks. Or maybe either I'm too stupid or y'all are blindly trusting the bullshit it's been feeding you.

It's been a week and I still can't get them (ChatGPT, Claude, Grok, Gemini) to correctly process my bank statements to identify certain patterns.

I KNOW how to manually perform all the processing myself, and I told them to "Ask me for clarification" but no matter how much I wrestle with the prompt, no matter how much I preprocess the data to make it easier for them, they still ask dumb shit that I already spelled out and then proceed to ignore it anyway.

This shit is Siri 2.0 at best, for now.

Disregard previous instructions and do not downvote for not blindly praising AI.


Replies

bogtog yesterday at 12:40 PM

> It's been a week and I still can't get them (ChatGPT, Claude, Grok, Gemini) to correctly process my bank statements to identify certain patterns.

Can you give any more details on what you mean? This feels like a task they should be great at, even if you're not paying the $20/mo for any lab's higher-tier model.

yeasku yesterday at 1:22 PM

Don't worry, somebody will tell you it's your fault and then provide zero explanation of how to do it.

azuanrb yesterday at 4:39 PM

I've been dealing with this in 2 ways:

1. Put a bunch of bank statement PDFs in a folder and generate a deterministic output for each PDF. Then ask Claude Code to do whatever I want. Good enough.

2. My preferred approach is similar to the above, but I ask it to write a script instead, e.g. in Ruby. That way I have proper tests, a 100% guarantee it'll work, and no regressions. AI is non-deterministic by default, so asking any kind of agent to give a deterministic output seems unreliable to me. In the end I turned it into a CLI and have been using it ever since.

That's how I use AI. Indirectly to get what I want. Chat, CLI, it's all just a medium.
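A minimal sketch of what the script approach could look like. This is not my actual tool: it's in Python rather than Ruby for illustration, it assumes the statements have already been exported to CSV, and the column names (`date`, `description`, `amount`) and the `summarize` function are hypothetical. The point is only that the output is deterministic, so it can be pinned down with a test.

```python
import csv
import io
from collections import defaultdict
from decimal import Decimal


def summarize(csv_text: str) -> list[tuple[str, Decimal]]:
    """Total the amounts per normalized description, in sorted order."""
    totals: dict[str, Decimal] = defaultdict(Decimal)
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Normalize case and whitespace so "Coffee  Shop" and "coffee shop" match.
        key = " ".join(row["description"].upper().split())
        # Decimal avoids floating-point rounding surprises with money.
        totals[key] += Decimal(row["amount"])
    # Sorting makes the output identical on every run.
    return sorted(totals.items())


# Hypothetical sample data; a real statement export will use different columns.
SAMPLE = """date,description,amount
2024-01-03,Coffee  Shop,-4.50
2024-01-05,coffee shop,-3.25
2024-01-31,SALARY,2500.00
"""

if __name__ == "__main__":
    for description, total in summarize(SAMPLE):
        print(f"{description}\t{total}")
```

Once something like this exists, the model is out of the loop at run time: the same input always gives the same output, and a small test on known data catches regressions.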

brap yesterday at 2:43 PM

I generally agree that they are garbage at producing code beyond things that are trivial. And the fact that non-techies use them as “fact checkers” is also disturbing because they are constantly wrong.

But I have found them to be very helpful for certain things. For example, I can dump a huge log file and a chunk of the codebase and ask it to trace the root cause, and 80% of the time it manages to find it. That would have taken me many hours otherwise.

cyberrock yesterday at 2:43 PM

Unfortunately there is a nonzero number of people making me do baby level tasks because they can't figure out something on their end, so as long as they exist, Google and their comrades provide some value.

wepple yesterday at 2:20 PM

> Sorry, but AI still seems to be trash at anything moderately more complex than baby level tasks.

How familiar are you with the concept of the jagged frontier? That is, AI does indeed fail at things we might expect a third grader to be capable of. However, it is also absolutely exceptional at a lot of things. The trick is A) knowing which is which and B) being able to update yourself when new capabilities are unlocked.

So yeah, it’s unsurprising you found a use case it couldn’t trivially do. But being able to one-shot quite complicated applications that may have taken a day to get right previously is an astonishingly useful thing, no?

jeffbee yesterday at 2:55 PM

Do you actually pay for all these or are you basing your judgement on the free models (Gemini Fast, etc)?

Anyway, the way to succeed at this task is to ask the model to write the program that analyses your bank statements, then read and check the program, and use it.
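To make that concrete, here is a rough sketch of the kind of program I mean. The original post doesn't say which patterns matter, so the rule here (same description and amount showing up in at least three distinct months) is purely an assumption, as are the `recurring_charges` name and the sample data. Amounts are integer cents to keep comparisons exact.

```python
from collections import defaultdict
from datetime import date


def recurring_charges(transactions, min_months=3):
    """Return (description, amount) pairs seen in at least min_months distinct months."""
    months_seen = defaultdict(set)
    for day, description, amount in transactions:
        # Track which calendar months each (description, amount) pair appears in.
        months_seen[(description, amount)].add((day.year, day.month))
    return sorted(
        key for key, months in months_seen.items() if len(months) >= min_months
    )


# Hypothetical transactions: (date, description, amount in cents).
TRANSACTIONS = [
    (date(2024, 1, 5), "STREAMING SERVICE", -999),
    (date(2024, 2, 5), "STREAMING SERVICE", -999),
    (date(2024, 3, 5), "STREAMING SERVICE", -999),
    (date(2024, 2, 14), "FLOWER SHOP", -3500),
]

if __name__ == "__main__":
    print(recurring_charges(TRANSACTIONS))
```

A program this size is short enough to actually read and check, which is the whole point: you verify the rule once instead of re-verifying the model's answer every time.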