I made a post on LinkedIn a few days ago saying that a lot of the top engineers are now just using AI to code.
It reached hundreds of people and drew quite a few heated opinions. The space is clearly split on this, and the people against it mostly see it as outsourcing a whole project to a system that can't build reliable software.
I didn't have time to reply to every comment, but I think there's a fundamental misunderstanding about how you should use AI to build today. It might surprise you that a lot of it is still engineering, just on a different level than before.
So let's walk through how this space has evolved, how to plan before using AI, why judgement and taste still matter, which AI coding tools are winning, and where the bottlenecks still are.
Because software engineering might be changing, but it doesn't seem to be disappearing.
The space is moving fast
Before we get into how to actually build with these tools, it's worth understanding how fast things have changed.
Cursor became the first real AI-assisted IDE breakout in 2024, even though it launched in 2023, but getting it to produce something good without leaving a trail of errors behind was hard.
I struggled a lot with it even last summer.
A lot of us also remember the Devin fiasco, the so-called "junior AI engineer" that couldn't really finish anything by itself (though this was a while ago).
The past few months have been different, and we've seen this on socials too.
Spotify publicly claimed its top developers haven't written a single line of code manually since December. Anthropic's own internal team reportedly has 80%+ of all deployed code written with AI assistance.
And Andrej Karpathy said that programming changed more in the last two months than it had in years.
Anthropic also found that Claude Opus 4.6 discovered 22 novel vulnerabilities in Firefox in two weeks, 14 of them high-severity, roughly a fifth of Mozilla's entire 2025 high-severity fix count.
The people who use these tools daily already know they're getting better. But "getting better" doesn't mean the engineering work is gone.
You plan, the AI codes
So if the tools are this capable, why can't you just say what you want and have it built? Because the planning, the architecture, and the system thinking is still the hard part.
Think of AI as an assistant, not the architect. You are still the one directing the project, and you need to think through how it should be built before you start delegating.
The better your overview of the different layers (i.e. frontend, backend, security, infrastructure), the easier it is to instruct it accurately.
If you don't mention what you want, you often don't get it.
This might mean using one agent to research different approaches first: tech stack options, cost and performance tradeoffs, or why you'd pick one language or framework over another.
If you're building authentication, go do research. Get a brief review of whichever tool you're considering, whether that's Cognito, Auth0, or something else, and check whether it actually supports what you need.
This does mean you have to learn some of it on your own.
If you're storing user data, you may need a CRUD API for it. One agent can build it, document it properly, and then another agent can use that documentation inside another application.
This works much better when you already know how APIs should be structured, how cloud CDKs work, or how deployment pipelines fit together.
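To make the handoff idea concrete, here is a minimal sketch of the kind of CRUD surface you might have one agent build and document. The class and method names are hypothetical; the point is that the docstrings are the contract a second agent would read when integrating.

```python
import uuid


class UserStore:
    """In-memory CRUD store for user records.

    Each method's docstring doubles as the documentation
    a second agent would consume when integrating.
    """

    def __init__(self):
        self._users = {}

    def create(self, data: dict) -> str:
        """Create a user record and return its new id."""
        user_id = uuid.uuid4().hex
        self._users[user_id] = dict(data)
        return user_id

    def read(self, user_id: str):
        """Return the user record, or None if it doesn't exist."""
        return self._users.get(user_id)

    def update(self, user_id: str, changes: dict) -> bool:
        """Merge changes into an existing record; False if missing."""
        if user_id not in self._users:
            return False
        self._users[user_id].update(changes)
        return True

    def delete(self, user_id: str) -> bool:
        """Remove the record; False if it was never there."""
        return self._users.pop(user_id, None) is not None
```

The real version would sit behind HTTP routes, but the lesson is the same: well-documented boundaries are what let one agent's output feed cleanly into another's.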
The less you specify upfront, the more painful it gets later when you're trying to steer the agent, saying things like "not like that" and "this doesn't work like I thought it would." (I'm guilty of being this lazy.)
Now, you might look at this and think it still sounds like a lot of work.
And honestly, yes, it is still work. A lot of these parts can be outsourced, and that makes things significantly faster, but it is still engineering of some kind.
Boris Cherny, who works on Claude Code, talked about his approach: plan mode first, iterate until the plan is right, then auto-accept execution.
His insight that keeps getting quoted in the tech community is, "Once the plan is good, the code is good."
So, you think. The AI agent builds.
Then maybe you evaluate it, redirect it, and test it too.
Maybe we'll eventually see better orchestrator agents that can help with system design, evaluation, and wireframing, and I'm sure people are already working on this.
But for now, this part still needs a human.
On judgement and taste
People talk about judgement a lot, and taste too, and how this just can't be delegated to an AI agent. This is really about knowing what to ask, when to push back, what looks risky, and being able to tell if the result is actually any good.
Judgement is largely pattern recognition you build from having been close to the work, and it usually comes with some form of experience.
People who've worked close to software tend to know where things break. They know what to test, what assumptions to question, and can often tell when something is being built badly.
This is also why people say it's ironic that a lot of the people against AI are software engineers. They have the most to gain from these tools precisely because they already have that judgement.
But I also think people from other spaces, whether that's product development, technical design, or UX, have developed their own judgement that can transfer over into building with AI.
I do think people who have an affinity for system-level thinking and who can think in failure modes have some kind of upper hand too.
So, you don't have to have been a developer, but you do have to know what good looks like for the thing you're trying to build.
And if everything is new, learn to ask a lot of questions.
If you're building an application, ask an agent to do a preliminary security audit of it, grade each area, give you a short explanation of what each area does, and explain what kind of security breach could occur.
If I'm working in a new space, I make sure to check several agents against each other so I'm not completely blind.
So, the point is to work with the agents rather than blindly outsourcing the entire thinking process to them.
If judgement is knowing what to question, what to prioritize, what's risky, and what is good enough, taste is more your quality bar. It's sensing when the UX, architecture, or output quality feels off, even if the thing technically works.
But none of this is fixed. Judgement is something you build, not something you're born with. Taste is perhaps a bit more innate, but should get better with time too.
As I'm self-taught myself, I'm pretty optimistic that people can jump into this space from other areas and learn fast if they have the affinity for it.
They may also be motivated by different things, which can come in handy.
Which AI-assisted tools are winning
I've now front-loaded everything before getting to the actual AI tools themselves, so let's run through them and see which one seems to be winning.
Cursor was released in 2023 and held the stage for a long time. Then OpenAI, Anthropic, and Google started pushing their own tools.
Look at the volume of mentions of Claude Code, Cursor, and Codex across tech communities for the past year below. This pretty much sums up how the narrative has shifted over the past year.
If you go to Google Trends and do some digging, it will show similar trends, though it doesn't show the Cursor trend dropping in the middle of last summer.
The standout is clearly Claude Code. It went from a side project inside Anthropic to the single most discussed developer tool in under a year.
The volume of conversation around it dwarfs Cursor, Copilot, and Codex combined in the communities this one tracks.
It's fascinating how the platforms that own the LLMs can just grab a space they want to win and pretty much crush their competitors (of course while subsidizing their own tool at a rate no third-party IDE can match).
But beyond the subsidized token economics of these tools, people shifted from writing code blocks and parts of their codebase to just saying "I stopped opening my IDE."
So these tools are now letting us go from assisted coding to delegated coding.
The fundamental difference people keep pointing to versus the other tools (like Cursor) is that Claude Code works in your codebase like a colleague you hand work to, rather than inside your editor suggesting code.
People also keep discovering that Claude Code is useful for things that aren't programming.
I have a friend who runs his entire 15-person company out of VS Code with Claude Code. None of it is actually code; he just uses the IDE for organisation.
Now, the rate limits are a constant thing, with Claude Code being the fastest you'll exhaust of the weekly quotas. I often run out by Thursday and have to wait until Monday.
This is why many of us have several subscriptions, like Codex as well.
Now maybe it's a taste thing, but most people I talk to go to Claude Code for most of their work, with Codex as the sidekick.
Claude Code Skills
Let's briefly mention Skills here too, along with Claude Code.
I think it was made for people to write internal, project-based instructions, where you encode the lessons into a skill file and hand it to Claude before it starts working.
These are markdown files (along with scripts, assets, data) that live in your project and can cover anything from how to structure APIs to what your deployment pipeline expects to how to handle edge cases in a specific framework.
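For a concrete picture, a minimal skill file might look something like this. The `name` and `description` frontmatter fields follow Anthropic's published Skill format as I understand it; the conventions in the body are made up for illustration.

```markdown
---
name: api-conventions
description: Conventions for building REST endpoints in this project.
---

# API conventions

- Version every route under `/v1/`.
- Return errors as `{ "error": { "code": ..., "message": ... } }`.
- Every new endpoint needs an integration test before merge.
```

Claude reads the description to decide when the skill applies, then pulls in the body as working instructions.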
But I have found it to be a neat way to transfer knowledge. Say you're a developer who needs to build a mobile application and you've never touched React Native.
If you can find a Skill with best practices built by someone who actually knows what they're doing, you'll have an easier time building that project. It's like you're borrowing someone else's experience and injecting it into your workflow.
Same thing with frontend design, accessibility standards, system architecture, SEO, UX wireframing, and so on.
Now, I have tried to build some of these with AI (without being an expert in the domain) with mixed success.
Maybe this pattern will grow, though, where we'll be better able to instruct the agents beforehand, maybe even selling skills to one another, so we don't have to learn as much. Who knows.
Let’s cover bottlenecks too
I should cover the problems as well. This is not all rainbows and sunshine.
LLMs can be unreliable and cause real damage, we're not in control of model drift, and then there's the question of how judgement gets built if we're not coding.
The other day I was pulling my hair out because an integration wasn't working. I'd asked Codex to document how to use an API from another application, then sent that documentation to Claude Code.
It took a few minutes to build the integration and then an hour for me to debug it, thinking it was something else entirely. But essentially Claude Code had made up the base URL for the endpoint, which should have been the first thing I checked but didn't.
I kept asking it where it got that from, and it said, "I can't really say."
You know the deal.
So it makes sense that it can get pretty bad when you give these agents real power. We've heard the stories by now.
In December, Amazon's AI coding agent Kiro inherited an engineer's elevated permissions, bypassed two-person approval, and deleted a live AWS production environment. This caused a 13-hour outage.
I know they have now made it mandatory to approve AI-generated code.
But I doubt manual review can be the main control layer if AI is writing this much code. So I wonder if the answer is better constraints, narrower blast radius, stronger testing, and better system-level checks somehow.
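On the constraints point, Claude Code does support a permissions config in its settings file. A sketch of what narrowing the blast radius might look like is below; the exact rule syntax should be checked against the official settings reference, and the specific paths and commands here are only illustrative.

```json
{
  "permissions": {
    "allow": [
      "Bash(npm run test:*)",
      "Edit(src/**)"
    ],
    "deny": [
      "Bash(terraform *)",
      "Bash(rm -rf *)",
      "Read(.env)"
    ]
  }
}
```

A deny list on destructive commands and secrets is exactly the kind of system-level check that doesn't depend on a human reviewing every diff.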
It will be interesting to see what the future holds here.
There are more stories like this, of course.
Such as Claude Code wiping a developer's production database via a Terraform command, nuking 2.5 years of records (though Claude did warn him beforehand), or OpenAI's Codex wiping a user's entire F: drive through a character-escaping bug.
There is also model drift, which we just don't have control over as users. This means the tools can degrade, maybe due to new releases, cost-cutting fixes, etc.
Having the model just not work like it used to one day is more than a bit of a nuisance.
This isn't new, and people have built their own monitoring tools for it.
Marginlab.ai runs daily SWE-bench benchmarks against Claude Code specifically to track degradation. Chip Huyen open-sourced Sniffly for tracking usage patterns and error rates.
The fact that the community felt the need to build all of this tells you something. We're relying on these tools for serious work, but we're not in control of how they perform.
Then there's the whole judgement thing.
Anthropic ran a controlled trial with 52 mostly junior software engineers and found that the group using AI scored 17% lower on comprehension tests, roughly two letter grades worse than the group that coded by hand.
When you outsource the code-writing part, you start losing the intuition that comes from working close to the code; the question is how much of a problem this will be.
This list is not exhaustive; there's also the question of what these tools will actually cost once the subsidies disappear.
Rounding Up
This conversation is neither about not needing software engineering experience nor about AI being useless.
What I think is actually happening is that engineering in this space is shifting. System thinking, engineering experience, curiosity, breadth across domains, and analytical thinking will matter more than the ability to write the code by hand.
Maybe this means engineering is moving up a layer of abstraction, with AI shifting value away from hand coding and toward system judgement.
But I don't think AI removes the need for engineering itself. Right now it is a new way to engineer software, one that is clearly much faster, but not without a lot of risks.
We've seen the progress exceed anything we expected, so it's hard to say how far this goes.
But for now, a human still has to drive the project, take responsibility, and judge what is good and what is not.
This is my first opinion piece; I usually write about building in the AI engineering space.
But since we've been building software lately just using AI with Claude Code, it seemed fitting to write a bit on the subject.
This is still the basics of vibe engineering; I know people have gone further than me, so there will probably be another piece in the future about how naive I was here and how things have changed since then.
Alas, that's just the way it is, and when you write you need to swallow your pride and be okay with feeling silly.
Connect with me on LinkedIn to share your thoughts, and check out my other articles here, on Medium, or on my website.
❤
