Claude's actually pretty great at this! I used to use Claude A LOT to answer interesting questions (which I'll be writing up!). More generally, Claude is palpably different from most other agents. I'd recommend these models – especially Opus – without qualification.
But there's a process risk here based on their current practices. I'm hoping those practices change so that I can recommend Claude to everyone I know, but as of now, there's existential risk exposure here that's greater than Google's.
Anthropic's automated systems can and will ban you for pretty arbitrary things, and you won't get human support – or Claude – even if you're an enterprise paying through the nose. And there's zero redress unless you go viral on social media. Or know someone who knows someone. See: https://x.com/Whizz_ai/status/2051180043355967802 https://x.com/theo/status/2045618854932734260
And I say that as someone who likes how Anthropic has been training Claude and Opus. I just don't think they're prepared to be the trillion-dollar company they've become. They are – in a very real way – suffering from success. Which is extremely inconvenient to be on the receiving end of when you're on a deadline.
Pretty great at what? I work in the insurance industry, specifically Medicare. All I see is salespeople and other managers slopping out AI dashboards off of spreadsheets galore. Not only is it terrible for protecting PHI/PII, it also doesn't do things like RBAC very well. Now, instead of preventing a person from externally sharing a file, I have to make sure they didn't egress the file to Supabase or some other platform.
Here are some of the horrible things I've seen. A frontend dashboard with PHI/PII deployed via Vercel/Next because AI told them how to get their site online. The login is hardcoded into the frontend, so anyone with Inspect can find the password.
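A minimal sketch of what that flaw looks like (the password and function names here are invented, not from the actual dashboard). The whole "check" ships in the client bundle, so the secret is readable by anyone:

```javascript
// Hypothetical reconstruction of a hardcoded frontend login.
// This constant is compiled into the JS bundle the browser downloads,
// so View Source / Inspect reveals it in plain text.
const HARDCODED_PASSWORD = "letmein123"; // invented example value

function login(input) {
  // Purely client-side comparison: no server ever verifies anything.
  return input === HARDCODED_PASSWORD;
}

// An attacker reads the constant out of the bundle and "logs in":
const attackerLoggedIn = login("letmein123");
```

The only real fix is to move authentication server-side; anything that lives in the client is attacker-readable by definition.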
Another "fixed" dashboard deployed the same way. This time they added Firebase Auth, so they got Sign in with Google, supposedly restricted to our domain. Wait, how would they be able to create a token scoped to our domain? They didn't: the frontend just blocks other domains from calling firebase.auth, but Firebase itself doesn't care. So simply calling the function in the console lets me log in with any Gmail account...
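To make the bypass concrete, here's a toy model of that pattern (all names invented; `firebaseSignIn` stands in for the real Firebase SDK call, which accepts any Google account regardless of the app's wishes):

```javascript
// Stand-in for the real SDK: Firebase doesn't know or enforce
// the company's "our domain only" policy.
function firebaseSignIn(email) {
  return { token: "jwt-for-" + email }; // invented token shape
}

// The app's wrapper blocks outside domains -- but only in the UI layer.
function uiSignIn(email) {
  if (!email.endsWith("@ourdomain.com")) {
    throw new Error("blocked by frontend");
  }
  return firebaseSignIn(email);
}

// From the devtools console, skip the wrapper and call the SDK directly:
const session = firebaseSignIn("anyone@gmail.com"); // succeeds anyway
```

The wrapper is decoration: any check that only runs in the browser can be skipped by calling one layer down. Domain restriction has to be enforced server-side (or via the provider's own tenant/domain settings), not in UI code.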
They were also showing me their RBAC with Firebase. Again, they don't have access to our Organization/Directory/Groups, so I wondered how they did it... wouldn't you guess, it's a hardcoded list of approved users. You can literally call firebase.auth and sign in anonymously. Again, only the frontend checks the email addresses. So now that I have a Firebase auth token, all the backend Firebase Functions just check that you've auth'd, so I can make any request I want to the backend. The frontend simply won't show me the code.
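The RBAC gap above reduces to this shape (a hedged sketch; the list contents, handler, and request objects are invented stand-ins, not the real system):

```javascript
// The allowlist lives only in the frontend.
const APPROVED_USERS = ["alice@ourdomain.com", "bob@ourdomain.com"];

// Frontend "RBAC": only decides what the UI renders.
function frontendAllows(user) {
  return APPROVED_USERS.includes(user.email);
}

// Backend function: the only real gate, and it checks the *presence*
// of auth, not the caller's identity or role.
function backendHandler(request) {
  if (!request.auth) throw new Error("unauthenticated");
  return { data: "every record in the dashboard" };
}

// Anonymous sign-in yields a valid auth token with no email at all...
const anonUser = { auth: { uid: "anon-123" }, email: null };
// ...which the frontend would reject, but the backend happily serves:
const uiBlocked = !frontendAllows(anonUser);
const response = backendHandler({ auth: anonUser.auth });
```

The fix is to repeat the authorization decision inside the backend handler (check the verified email or role claims on the token against the allowlist there), because the frontend check only controls pixels, not access.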
I could go on and on about the stupidity levels I'm facing but I don't feel like crashing out.
All I can say is this tool is only useful if you already know how to correctly implement these things. Does it save me time? Sure, but I have to tell it it's wrong and explain why not to do things. Honestly, I feel like Claude is good for people who like to gamble. When it gets it right it feels great, but I don't want to roll the dice 30 times to get it correct.
> and you won't get human support – or Claude – even if you're an enterprise paying through the nose. And there's zero redress unless you go viral on social media.
Sadly, this sounds like par for the course in tech. Too many messages and requests for help depend on knowing someone in the right Slack groups.
They aren't even close to a $1T company; they're valued at under $400B, and that's at something like a 20x–30x multiple. They can probably raise money at a higher valuation, but it's literally just value based on hype, not revenue.
Before AI, shipping code to production used to be a two-person task: one person writes the code, another reviews it. Now, with AI writing the code, the developer who was supposed to write it only has to review it. And this is because they are responsible for the code they ship.
Code review has become unbearable, because before AI, developers were reviewing code as they wrote it in the first place. Granted, that was never perfect, which is why a second person reviewing the code was (is?) a best practice. But effectively, some level of code review was always happening as developers wrote code.
I fear it is far more boring to review financial and medical documents written entirely by AI than it is to write (and simultaneously review) them yourself. And far more dangerous to ship mistakes there than in most software.