Hacker News

After outages, Amazon to make senior engineers sign off on AI-assisted changes

185 points by ndr42 today at 1:31 PM | 253 comments

https://www.ft.com/content/7cab4ec7-4712-4137-b602-119a44f77... (https://archive.ph/wXvF3)

https://twitter.com/lukolejnik/status/2031257644724342957 (https://xcancel.com/lukolejnik/status/2031257644724342957)


Comments

cobolcomesback · today at 3:50 PM

This “mandatory meeting” is just the usual weekly company-wide meeting where recent operational issues are discussed. There was a big operational issue last week, so of course this week will have more attendance and discussion.

This meeting happens literally every week, and has for years. Feels like the media is making a mountain out of a molehill here.

happytoexplain · today at 3:56 PM

>Junior and mid-level engineers can no longer push AI-assisted code without a senior signing off

Review by a senior is one of the biggest "silver bullet" illusions managers suffer from. For a person (senior or otherwise) to examine code or configuration with enough granularity to verify that it even approximates what their own level of experience would have produced, even only in terms of security/stability/correctness, takes an amount of time approaching what it would have taken to just do it themselves.

I.e. senior review is valuable, but it does not make bad code good.

This is one major facet of probably the single biggest problem of the last couple decades in system management: The misunderstanding by management that making something idiot proof means you can now hire idiots (not intended as an insult, just using the terminology of the phrase "idiot proof").

philip1209 · today at 9:39 PM

I think the deeper need is a "self-review" flow.

People push AI-generated code as if they wrote it. In the past, "wrote it" implied "reviewed it." With AI, that's no longer true.

I advocate for GitHub and other code review systems to add a "Require self-review" option, where authors must attest that they reviewed and approved their own code. The change might seem symbolic, but it sets clear workflows and expectations.
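A sketch of what such a gate could look like. Hypothetical throughout: GitHub has no built-in "Require self-review" option today, and the attestation string and comment format below are invented for illustration.

```python
# Hypothetical self-review gate: scan PR comments (as a CI script might
# fetch them from a code-review API) for an attestation left by the PR
# author themselves.

ATTESTATION = "i have reviewed this code myself"

def has_self_review(comments: list[dict], author: str) -> bool:
    """True if the PR author posted a comment containing the attestation."""
    return any(
        c["user"] == author and ATTESTATION in c["body"].lower()
        for c in comments
    )

# A CI step would fail the build when the attestation is missing:
comments = [
    {"user": "alice", "body": "LGTM"},
    {"user": "bob", "body": "I have reviewed this code myself."},
]
print(has_self_review(comments, "bob"))    # True  (author attested)
print(has_self_review(comments, "alice"))  # False (no attestation by alice)
```

The mechanism matters less than the record it leaves: an explicit attestation makes "pushed without reading it" an act of dishonesty rather than an ambiguity.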

rglover · today at 9:53 PM

The amount of time and money being wasted chasing this dragon is unreal.

prakhar897 · today at 5:17 PM

From the Amazon I know, people only care about (a) not getting fired and (b) promotions. For devs, the metrics look like this:

1. Shipping: deliver tickets or be pipped.

2. Having fewer comments on their PRs: for some drastically dumb reason, a thoroughly reviewed PR is taken as a sign of bad quality. L7s and above use this metric to PIP folks.

3. Docs: write docs and get them reviewed to show you're high-level.

Without AI, an employee is worse off in all of the above compared to folks who will cheat to get ahead.

I can't see how "requesting" that folks forego their own self-preservation will work, especially when you've spent years pitting people against each other.

cmiles8 · today at 4:49 PM

The optics here are really bad for Amazon. The continuing mass departures of long-tenured folks, second-rate AI products, and a string of bad outages paint a picture of current leadership overseeing a once-respected engineering organization flying off the rails.

News from the inside makes it sound like things are getting pretty bad.

sdevonoes · today at 4:28 PM

Reviewing AI-generated code at PR time is a bottleneck. It cancels most of the benefit senior leadership thinks AI offers (delivery speed).

There's also an implicit imbalance engineers typically don't like: it takes me 10 minutes to submit a complete feature thanks to Claude... but for the human manually reviewing my PR, it will take 10-20 times that.

Edit: in the end, real engineers know that what takes effort is (a) knowing what to build and why, and (b) verifying that what was built is correct. Currently AI doesn't help much with either of these two points.

The in-betweens are needed, but they are a byproduct. Senior leadership doesn't know this, though.

lokar · today at 4:01 PM

If this is true, it misunderstands the primary goals of code review.

Code review should not be (primarily) about catching serious errors. If there are always a lot of errors, you can’t catch most of them with review. If there are few it’s not the best use of time.

The goal is to ensure the team is in sync on design, standards, etc. To train and educate Jr engineers, to spread understanding of the system. To bring more points of view to complex and important decisions.

These goals help you reduce the number of errors going into the review process; that should be the actual goal.

ndr42 · today at 1:44 PM

I think the problem of responsibility will come for many more companies sooner rather than later. Some of the alleged efficiency gains from using AI may not look so big anymore once someone has to be accountable for the output.

ritlo · today at 3:54 PM

The only way to see the kinds of speed-ups companies want from these things, right now, is to do far too little review. I think we're going to see a lot of failures across sectors where companies set goals for reduced hours based on what they expected from LLM speed-ups, and it will turn out the only way to hit those goals was to spend far too little time reviewing LLM output.

They're torn between "we want to fire 80% of you" and "... but if we don't give up quality/reliability, LLMs only save a little time, not a ton, so we can only fire like 5% of you max".

(It's the same in writing: these things are only a huge speed-up if it's OK for the output to be low quality, while good output using LLMs only saves a little time versus writing entirely by hand. So far, anyway; these systems are changing by the day, but this specific limitation has held for about four years now without much improvement.)

Lalabadi · today at 3:46 PM

I'm not sure the sustainable solution is to treat an excess of lower-quality code output as the fixed thing to work with, and operationalize around that, but sure.

sethops1 · today at 3:56 PM

> The response for now? Junior and mid-level engineers can no longer push AI-assisted code without a senior signing off.

So basically: kill the productivity of senior engineers, kill the ability of junior engineers to learn anything, and ensure those senior engineers hate their jobs.

Bold move, we'll see how that goes.

AlexeyBrin · today at 1:47 PM

I wonder how this will work in practice. Say I'm a senior engineer and I myself produce thousands of lines of code per day with the help of LLMs, as mandated by the company. I presumably still need to read and test the code I push to production. When will I have time to read and evaluate similar amounts of code produced by a junior or mid-level engineer?

AlotOfReading · today at 3:36 PM

I'm not surprised by the outages, but I am surprised that they're leaning into human code review as a solution rather than a neverending succession of LLM PR reviewers.

I wonder if it's an early step towards an apprenticeship system.

andsoitis · today at 1:34 PM

> Amazon’s website and shopping app went down for nearly six hours this month in an incident the company said involved an erroneous “software code deployment.” The outage left customers unable to complete transactions or access functions such as checking account details and product prices.

The environment breathed a little.

julienchastang · today at 5:30 PM

> best practices and safeguards are not yet fully established

The way I work with AI agents (Codex) these days is to have the AI generate a spec as a series of MD documents, where the implementation of each document is a bite-sized chunk that can be tested and evaluated by a human before moving to the next step, and roughly corresponds to a commit in version control. The version control history then reflects the logical progression of the code. In this manner, I have a decent knowledge of the code, and I'm more comfortable with it than with one-shotting.

znpy · today at 9:47 PM

"Make senior engineer sign off ai-assisted changes" sounds incredibly weird.

First thing that comes to mind: it reminds me of those movies where a dictatorship starts to crumble and the dictator gets tougher and tougher on the generals, not realizing that the whole endeavor is doomed, not just the current implementation.

Then again, as a former Amazon (AWS) engineer: this is just not going to work. Depending on how you define "senior engineer" (L5? L6? L7?), this is less and less feasible.

L5 engineers are already supposed to work pretty much autonomously, maybe with L6 sign-off when changes are a bit large in scope.

L6 engineers already have their own load of work, and a fairly large number of engineers "under" them (anywhere from 5 to 8). Properly reviewing changes from all of them, and taking responsibility for that, is going to be very taxing on such people.

L7 engineers work across teams and might have anywhere from 12 to 30 engineers (L4/5/6) "under" them, or more. They are already scarce in number, and they already spend most of their time doing reviews (which is proving insufficient, it seems). Mandating sign-off, and mandating assumption of responsibility for breaking changes, means these people will basically do nothing but reviews and will get stricter and stricter[1] with the engineers under them.

L8 engineers barely do any engineering at all, from what I remember. They mostly review design documents, in my experience not always expressing sound opinions or having a proper understanding of the issues at hand.

In all this, considering the low morale (layoffs), the reduced headcount (layoffs) and the rise in expectations (engineers trying harder to stay afloat[2] due to... layoffs)... It's a dire situation.

I'm going to tell you, this stinks A LOT like a rotting Day 2 mindset.

----

1. keep in mind you can't, in general, determine the absence of bugs

2. Also cranking out WAY MORE code due to having gen-AI tools at their fingertips...

throwaw12 · today at 4:41 PM

If seniors are going to review every piece of GenAI-generated code, how do they keep up with the volume of changes?

So you have 2 tiers of engineers: Sr- and Sr+

1. Both should write code to justify their work and impact

2. Sr- code must be reviewed by Sr+

What happens:

a. Sr+ output drops because review takes up more and more of their time

b. Sr+ just blindly accepts because the volume is too high, and they also have their own work to do

c. Sr+ asks Sr- to slow down; then Sr- can get bad performance reviews for their output, because on average Sr+ will produce more code

I think (b) will happen
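The arithmetic behind (a) and (b) is easy to sketch. All numbers below are made-up assumptions for illustration, not figures from the article:

```python
# Toy model: daily senior review hours implied as junior output scales up.

def review_hours_needed(n_juniors: int, prs_per_day: float,
                        hours_per_review: float) -> float:
    """Daily review load from n_juniors each shipping prs_per_day PRs."""
    return n_juniors * prs_per_day * hours_per_review

# Pre-AI: 6 juniors, 1 PR/day each, half an hour per careful review.
print(review_hours_needed(6, 1, 0.5))  # 3.0 hours/day: fits alongside own work
# AI-assisted: same juniors, 4 PRs/day each, same care per review.
print(review_hours_needed(6, 4, 0.5))  # 12.0 hours/day: rubber-stamping follows
```

Once the required review hours exceed the working day, either review depth drops (scenario b) or junior throughput is capped (scenario c); the inputs can only trade off against each other.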

LogicFailsMe · today at 4:43 PM

For the good of the company's future, all code should be reviewed by L10s going forward before they are accepted. They're the only ones with enough skin in the game to know what really matters after all.

And from their sagely reviews, we shall train a large language model to ultimately replace them, because the most fungible thing at Amazon is the leadership.

mhogers · today at 6:10 PM

a .agentignore/.agentnotallowed file

force agents not to touch mission-critical things; fail in CI otherwise

let them work on frontends and things at the frontier of the dependency tree, where it is worth the risk
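Neither `.agentignore` nor `.agentnotallowed` is an existing standard; a minimal CI-side sketch of the idea, with the glob patterns and file paths invented for illustration:

```python
# Fail CI when an agent-authored change touches a protected path.
from fnmatch import fnmatch

def violations(changed_files: list[str], protected: list[str]) -> list[str]:
    """Return changed files matching any protected glob pattern."""
    return [f for f in changed_files
            if any(fnmatch(f, pat) for pat in protected)]

protected = ["payments/**", "infra/*.tf"]             # parsed from .agentignore
changed = ["frontend/app.tsx", "payments/ledger.py"]  # from the agent's diff

bad = violations(changed, protected)
if bad:
    print("agent touched protected paths:", bad)  # in CI: exit non-zero here
```

Note `fnmatch` globs are not path-aware the way gitignore patterns are (`*` crosses `/`); a production version would want gitignore-style matching, but the gate itself stays this simple.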

m3kw9 · today at 9:42 PM

A year later, they will require AI to sign off engineer changes.

kmg_finfolio · today at 1:53 PM

The accountability problem is real but I think it's slightly different from what's being described. The issue isn't just "who signs off"; it's that the reasoning behind a change becomes invisible when AI generates it. A senior engineer can approve output they don't fully understand, and six months later when something breaks, nobody can reconstruct why that decision was made. Human review works when the reviewer can actually interrogate the logic. At LLM-assisted velocity, that bar gets harder to clear every month.
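One low-tech way to keep that reasoning reconstructable is to reject commits whose messages carry no rationale. A sketch only, not Amazon's actual process; the section names are an invented convention:

```python
# Commit-message check: require the message to record why a change was made
# and what risk was accepted, so the reasoning outlives the author's memory.

REQUIRED_SECTIONS = ("why:", "risk:")  # hypothetical team convention

def missing_rationale(commit_message: str) -> list[str]:
    """Return the required sections absent from the commit message."""
    body = commit_message.lower()
    return [s for s in REQUIRED_SECTIONS if s not in body]

good = "Tune retry backoff\n\nWhy: p99 spikes under load\nRisk: low, flag-gated"
print(missing_rationale(good))         # []
print(missing_rationale("fix stuff"))  # ['why:', 'risk:']
```

It can't force the rationale to be honest, but it does make "nobody can reconstruct why" a policy failure instead of an inevitability.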

zcw100 · today at 6:25 PM

I just met a guy from Amazon this past weekend who was bragging, "We've got unlimited access to LLMs and our developers have 10 agents going at a time." I tried telling him it wasn't all unicorns and rainbows, but I didn't get the impression he cared; he just kept crapping out skittles.

butILoveLife · today at 4:59 PM

Maybe it's just my one buddy who works at Amazon, but they seemed extremely slow to adopt LLMs. Big ships take a long time to turn, but this seemed hostile.

I'm still seeing this mindset with AI agents. I imagine they will slowly realize they need to use this stuff to be competitive, but being slow to adopt AI seems like it could have been the source of this.

tcbrah · today at 5:41 PM

the funniest part is amazon literally started tying AI usage to performance reviews like 6 months ago and now they're doing damage control. you can't simultaneously pressure every engineer to use more AI AND be shocked when AI-assisted code breaks prod. pick one lol

Insanity · today at 4:36 PM

It's only going to get worse with the brain drain resulting from the layoffs, which will increase the use of AI-assisted coding and, with it, the number of related outages.

Imagine having to debug code that caused an outage when 80% of it was written by an LLM and you now have to start actually figuring out the codebase at 2am... :)

smy20011 · today at 4:53 PM

An outage can cost Amazon millions to tens of millions. Most of the time, we want the junior to learn from the outage and fix the process. With an AI agent, we can only update the agent.md and hope it never happens again.

dedoussis · today at 5:35 PM

How do they determine whether a PR is AI-assisted and therefore requires senior review? A junior engineer could still copy-paste AI-generated code and claim it as their own.
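One partial answer is commit trailers: some AI tools (Claude Code, for example) add a `Co-authored-by` trailer to commits they help write. A heuristic sketch follows, with an illustrative marker list; note it only catches honest tooling, which is exactly the commenter's point about copy-paste:

```python
# Heuristic: flag commits whose trailers name a known AI co-author.

AI_MARKERS = ("claude", "copilot", "codex")  # illustrative, not exhaustive

def looks_ai_assisted(commit_message: str) -> bool:
    """True if any Co-authored-by trailer mentions a known AI tool."""
    for line in commit_message.lower().splitlines():
        if line.startswith("co-authored-by:") and any(
            m in line for m in AI_MARKERS
        ):
            return True
    return False

msg = "Fix checkout bug\n\nCo-authored-by: Claude <noreply@anthropic.com>"
print(looks_ai_assisted(msg))             # True
print(looks_ai_assisted("Fix checkout"))  # False
```

Anyone pasting from a chat window leaves no trailer at all, so in practice a policy like Amazon's has to rely on self-declaration plus heuristics, not detection.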

newobj · today at 8:24 PM

Speed of code-writing was never the issue at Amazon or AWS. It was always wrong-headed strategic directions, out-to-lunch PMs, a dogshit testing environment, stakeholder soup, high turnover, bureaucracy, a pantheon of legacy systems, insane operational burdens, garbage tooling, and last but not least, designing for inter-system failure modes, which, let's be real, AI has no chance of having context for... and so on.

Imagine if the #1 problem of your woodworking shop is staff injuries, and the solution that management foists on you is higher RPM lathes.

dragonelite · today at 3:59 PM

Expect a shitload of AI-powered code review products in the next 18 months.

teeray · today at 9:00 PM

> Junior and mid-level engineers can no longer push AI-assisted code without a senior signing off

So what incentive is there for juniors to look at the code at all? Seniors are now just another CI stage for their slop to pass.

letitgo12345 · today at 4:52 PM

Worth noting that this happened when they used Amazon's own AI product, not when using Claude Code or Codex.

mattschaller · today at 4:21 PM

Anyone worked with Kiro before? As I understood it, it was held as an INTERNAL USE ONLY tool for much longer than expected.

dlev_pika · today at 6:20 PM

A few days ago, after some very weird failed purchase attempts (payment couldn't be validated or smth), I received an even weirder email from Amazon saying they had detected suspicious activity; all my devices got logged out and I was forced to change my password. I did, after verifying it was a legit email (even though it looked sketchy af: pure text, unstyled, but sender verified and confirmed by in-app behavior), and the next thing I know, all my orders and browsing history had disappeared. 15+ years of history, gone.

Over the next few days my account history came back, except for purchases made in Q1 2026. Those are still missing. A few substantial purchases I made are nowhere to be found anymore.

I had attributed this to Iranian missiles hitting some of their infrastructure in the EU, as had been reported.

Now I'm not sure if it was blast radius from missiles or from AI mishaps. Lmao, couldn't happen to a worse company...

CodingJeebus · today at 1:58 PM

I'm at a small company struggling with this problem. Fundamentally, we have a limited context and AI is capable of generating tremendous amounts of output that exceed our ability to deeply process.

I find myself context-switching all the time and it's pretty exhausting, while also finding that I'm not retaining as much deep application domain knowledge as I used to.

On the surface, it's nice that I can give my LLM a well-written bug ticket and let it loose since it does a good job most of the time. But when it doesn't do a good job or it's making a change in an area of the codebase I'm not familiar with, auditing the change gets tiring really fast.

rvz · today at 8:19 PM

Hope this happens at GitHub too, since there are constant outages across the entire platform.

mikkupikku · today at 9:02 PM

lgtm

skeledrew · today at 4:06 PM

> the affected tool served customers in mainland China

Thought this blurb the most interesting. What's the between-the-lines subtext here? Are they deliberately serving something they know to be faulty to the Chinese? Or is it the case that the Chinese use it with little to no issue/complaint? Or...?

10xDev · today at 6:00 PM

With AI it makes sense to have leaner teams. Being able to go faster requires greater responsibility.

oxqbldpxo · today at 4:19 PM

Not fun to work at amazon.com it seems.

bigbuppo · today at 4:01 PM

Ugh. The Great Oops has never been closer.

MDGeist · today at 4:18 PM

A former colleague of mine recently took a role that has largely turned out to be "greybeard who reviews the AI slop of the junior engineers". In theory it sounds workable, but the volume of slop makes thoughtful review impossible. Seems like most orgs will just pressure the slop generators to produce more, pressure the approvers, and then scapegoat the approvers if necessary?

dude250711 · today at 3:52 PM

I knew this would happen.

Take a perfectly productive senior developer and instead make him responsible for the output of a bunch of AI juniors, with the expectation of 10x output.

AlexandrB · today at 5:05 PM

"We want you to use AI for everything!"

"No, not like that though!"

fredgrott · today at 5:04 PM

Curious question: how many Amazon engineers flunk basic CS?

If you know CS you know two things:

1. AI cannot judge code; whether it's noise or signal, AI can't tell.

2. CS-wise, we use static analysis to judge good code from bad.

How much time does it take to run the basic static analysis tools that exist for most computer languages over AI output?

Some juniors need firing outright
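The cheap-mechanical-checks point can be made concrete with nothing but the standard library. A toy static check follows (one rule only, for illustration; real linters and type checkers apply hundreds):

```python
# Toy static analysis with stdlib `ast`: parse Python source and flag one
# classic smell, a bare `except:` that swallows every exception.
import ast

def quick_static_check(source: str) -> list[str]:
    """Return basic static findings for a Python source string."""
    try:
        tree = ast.parse(source)
    except SyntaxError as e:
        return [f"syntax error: {e.msg} (line {e.lineno})"]
    return [
        f"bare except at line {node.lineno}"
        for node in ast.walk(tree)
        if isinstance(node, ast.ExceptHandler) and node.type is None
    ]

code = "try:\n    risky()\nexcept:\n    pass\n"
print(quick_static_check(code))  # ['bare except at line 3']
```

Checks like this run in milliseconds per file, so the time cost of a mandatory static-analysis pass on AI output is negligible next to human review.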

th2o34i3432897 · today at 4:36 PM

First Microsoft and now Amazon (e.g. their Rufus AI is useless compared to the old comment search!)

Has Seattle now become the code-slop capital? Or is SFO still on top?

josefritzishere · today at 3:37 PM

The excessive exuberance of AI adoption is all part of the bubble.

throw_m239339 · today at 4:54 PM

Yet another example of vibe coding at scale. You'll have to hire a lot of seniors out of retirement to fix that mess of gigantic proportions... and don't blame "the juniors"; they didn't make the decision to allow those tools in the first place.
