Question Details

No question body available.

Tags

code-reviews teamwork artificial-intelligence

Answers (5)

February 18, 2026 Score: 2 Rep: 47,220

Preface: most of my answer is purely speculative. My employer doesn't use AI yet, but there are plans to adopt it in the not-so-distant future, so I'm probably going to be in your shoes very soon. This is how I would approach the problem.


With AI, however, it costs me much more time to do the review and to type the comment, compared to a few seconds needed to copy-paste my comments to AI.

That's not a bug; that's a feature. And at least you don't need to spend time copying and pasting these comments into the chatbot; someone else does.

The premise of AI (also the promise) is to dramatically reduce the time it takes to write code. That means the human will be delegating more to the machine. If we are to believe the hype (and the machine delivers on this hype) then we, as software engineers, will play more of a supervisory role to agentic AI.

I think this basically means "human does code review."

Since AI is not deterministic, I can see a period of experimentation becoming necessary before agentic AI can be fully realized. Humans need to learn how to work with the machine to maximize output, and perhaps the "vibe coder" at this point can be seen as an intermediate step in this process.

From an engineering perspective, pretend the human isn't there and consider the code mostly AI-generated. It sounds like you have someone with experience supervising this thing, so you've got that going for you. If the human isn't providing much benefit, then the next thing I'm thinking about is how to eliminate the extraneous human in the loop and more fully embrace agentic AI.

I'm not advocating for firing this person (labor laws notwithstanding). Instead, you will probably need to sip the Kool-Aid just a little. This extra "human in the loop" might be a suitable candidate to supervise other AI agents, making this a force multiplier. Ok, maybe that's a bit more than a sip of the Kool-Aid, but I did warn you this answer was speculative.

I just think we are in the "throw everything at the wall and see what sticks" phase of deploying artificial intelligence. It's an experiment: how much human oversight is required to use AI effectively? Consider that you are evaluating a tool rather than the human submitting the code review, because the tool has not been granted full access to all the auxiliary IT systems necessary to complete the task independently.

February 18, 2026 Score: 2 Rep: 85,748

Code reviews are next to impossible without defining what you are checking against.

Here, your real problem is the annoyance that you might as well write the code yourself and cut out the middleman.

But if you have a list of things that should be checked in a code review, then you can check that list of things regardless of whether or not AI is writing the code.

  1. Correctness

    Tests must pass. Run the test suite automatically and fail the PR on any test failure.

  2. Test coverage

    Tests must exist. Check coverage automatically and fail below a threshold percentage.

  3. Coding style.

    Run a linter and automatically fail on its complaints.

  4. Performance

    Add performance tests/static analysis and run automatically.

  5. Match the spec.

    Human check: read the spec, run the code, think. I would ask the submitter to give instructions in the PR, e.g. a link to the ticket, the page that changed, what to click to demonstrate the feature, etc.

  6. Whatever else you want.

If you have some rules like this for PRs, then evaluating a PR is a fixed task that takes the same amount of time regardless of how the code is written.

If the AI writes code that passes the tests then great. If it doesn't then you just have to say (or hopefully automatically say) "Fails rule 2" or whatever.
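The automatable rules above (1–4) can be wired into a gate that reports failures in exactly this "Fails rule N" form. Here is a minimal, hypothetical sketch; the rule names, thresholds, and `CheckResult` structure are my own illustrative assumptions, not any particular CI tool's API:

```python
# Hypothetical PR gate: given the results of automated checks, report
# which review rules the submission fails. Thresholds are illustrative.
from dataclasses import dataclass

@dataclass
class CheckResult:
    tests_pass: bool          # rule 1: correctness
    coverage_percent: float   # rule 2: test coverage
    lint_errors: int          # rule 3: coding style
    perf_regression: bool     # rule 4: performance

def evaluate_pr(result: CheckResult, min_coverage: float = 80.0) -> list[str]:
    """Return the failed rules; an empty list means the PR proceeds to
    the human 'match the spec' review (rule 5)."""
    failures = []
    if not result.tests_pass:
        failures.append("Fails rule 1: tests do not pass")
    if result.coverage_percent < min_coverage:
        failures.append(
            f"Fails rule 2: coverage {result.coverage_percent:.0f}% "
            f"< {min_coverage:.0f}%"
        )
    if result.lint_errors > 0:
        failures.append(f"Fails rule 3: {result.lint_errors} linter complaint(s)")
    if result.perf_regression:
        failures.append("Fails rule 4: performance regression detected")
    return failures
```

For example, a PR with passing tests but 65% coverage would come back with a single `Fails rule 2` message, which the CI system can post as a review comment with no human involved.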

Better yet, you can say "Have you run through the PR checks yourself?" You should only be getting failing PRs when there is some justifiable problem that needs a human override, e.g.

  • I know it fails performance but we have a pass for this feature
  • The linter fails on this style but it's correct for the library we are using
  • Code coverage fails, but it's because I refactored and all the old code is being counted as new

You never have to ask "Why IsNullOrWhiteSpace?" because either a performance or unit test will fail, the test that should cover it won't exist (failing rule 2), or the choice simply won't matter.

If you find you are constantly replying "Does not match spec", then call the person up and say "Hey, this doesn't seem to match the spec? What about feature X?" They won't be able to copy and paste your voice into an AI, and you will soon work out whether they don't understand the spec or are just being lazy.

February 18, 2026 Score: 2 Rep: 31,202

A timely discussion, I think. Here's where I see things in early 2026: we have some really interesting new technologies that are being presented as being far more capable than they really are. To be clear, I am not saying these tools are not useful, I think we've only scratched the surface of what can be done with them. But they are definitely not operating as rational sentient 'agents' as so many people seem to believe.

It's that last part which is the issue. People with expertise in specific skills tend to judge LLMs to be less capable in those skills and more capable in areas where they (the person) have limited expertise. That is, for example, a programmer will tend to think that LLMs are better at medicine than programming and a doctor will tend to believe that the LLMs are better at programming than medicine. This is sort of inevitable because of the way they work: they produce plausible outputs based on probabilistic processes. I'm likely not telling you anything you don't already know here. I just want to establish some context.

What this means for you is that, if the management does not have deep technical skill, they may be convinced that these tools are more capable than they really are. I mean Sam Altman says they are genius-level entities, so, what's the problem?

Experienced, skilled developers are in a bit of a bind here. Pointing out that agentic AI isn't all it's cracked up to be is often perceived as gatekeeping: the nerds just want to protect their guild. A lot of developers feel compelled to embrace the technology, even if they don't believe it actually has a net benefit.

The situation you describe is sort of a worst-case scenario, IMO. You are in the position of supporting/enabling a fundamentally flawed approach. This 'developer' is non-productive. I'd go as far as to say their productivity is negative: they are creating work for you. I'd wager you could write the code yourself, at better quality, in less time than you are spending babysitting this 'developer'. In other words, they can only appear to succeed because of you and other people. Management may perceive this as proof that the approach is working. AI slop is a big problem for many types of work these days and I'm not sure it's going to get any better soon.

Given the reality of the situation, the best thing you could do here, IMO, is to let them POC an isolated solution without assistance. If they and their management are so confident that LLMs and unskilled operators are the future, then just do it. Show and prove. If the agents are wonderful, why do they need your help? Unfortunately, supporting this seems to be your job responsibility and it may be that their work is part of a larger solution. I think, ethically, as a consultant, you should try to help the management understand the reality of the situation. If it's truly the case that they really can't fire this person, they should probably be looking for some way to give them a role where they can't cause trouble and waste the time of skilled people. This person, based on your description, is a sunk cost employee and they should be minimizing the employee's drag on the organization, not giving them tools to help maximize their counterproductive output.

February 18, 2026 Score: 2 Rep: 12,769

Obviously you don't need to "prove" what the person is doing - he's doing exactly what the company recently said he should do, which is use AI, so it cannot be beyond countenance that he is in fact doing so.

If a number of reviews demonstrate that your colleague is being habitually inefficient or careless with submissions to review, you would invoke the same procedure as if your colleague were being inefficient or careless without AI - namely, a behaviour or competency procedure.

Whether you want to invoke that procedure is another question, since such behaviour increases the demand for software experts like yourself to explain to incompetent/devious developers how their AI slop is wrong. Here we see how AI can actually have a negative labour-productivity effect.

It's also another question whether there even is such a procedure to invoke. Perhaps the assumption in the past was that developers submitting poor-quality work would get worn down by knock-backs at review and redoing the work, but AI may allow your colleague to keep serving up slop for longer, until perhaps you crack first and get sick of pointing out the defects.

Perhaps the next step is opening a conversation with management about how to deal with the particular situation.

It's worth noting that review processes themselves are largely a result of employing people who hand-generate slop. Most businesses do not apply two pairs of eyes to code in order to increase software quality, but so that they can cut back on the grade of developers they employ without quality immediately crashing to the level it would reach if only one low-grade developer were applied to the task.

In properly functioning teams consisting of professionals, there may be general supervision of each other's work, but you wouldn't have everyone checking every piece of code twice in a routine way, any more than you would have two accountants reading every entry the other makes, or two lawyers reading every single email the other writes.

February 18, 2026 Score: 0 Rep: 110,413

What are my options?

  1. Try to get them fired. As you mention, this is likely impossible in your particular situation. But low code quality is low code quality, regardless of the tools used to generate it.
  2. Timebox your code review time. It's easy to get DoSed by AI-generated PRs, so limiting your review time will protect your flow-time as well as serve as backpressure on their slop. Unfortunately, this will usually lead to someone else not doing code review.
  3. Flood them with PRs. Fight fire with fire. If they're busy reviewing your code, then they're generating less slop. If they complain, then you point out all the work you're doing to review their vibes.
  4. Don't do code review. Just ship that slop to prod and let them suffer the consequences of their work-product.
  5. Talk with the engineer and persuade them to behave differently. Ideally, your manager should be doing this - that's their job after all. But sometimes it will fall to engineers to lead other engineers away from evil.