2025 marked a major turning point in AI usage, far beyond simple individual use.
Since 2020, we've moved from autocomplete to industrialization:
- 2021 with GitHub Copilot: individual use, essentially focused on advanced autocomplete.
- then browser-based use for more complex tasks, requiring multiple back-and-forths and copy-pasting
- 2025 with Claude Code, Windsurf and Cursor: use on the developer's workstation through code assistants
As we gradually move from a few autocompleted lines to applications where over 90% of the code is written by AI assistants, dev teams have no choice but to industrialize the practice, at the risk of major disappointments.
More than that: as soon as the developer's job changes, it's actually the entire development team that must evolve with it.
It's no longer just a simple tooling issue, but an industrialization issue at the team scale, just as automated testing frameworks changed how software was created in the early 2000s.
(We obviously tested software before the 2000s, but the way we automated those tests through xUnit frameworks, the advent of software factories (CI/CD), etc., is more recent.)
In this article, we'll explore how dev teams have adapted, through testimonials from several tech companies that contributed to it, covering:
- Context Driven Engineering, the new paradigm
- Spec/Plan/Act: the reference workflow
- The AI Rules ecosystem
- Governance and industrialization
- Human challenges
Context Driven Engineering
While the term vibe coding became popular in early 2025, we now more readily speak of Context Driven Engineering or agentic engineering. The idea is no longer to fire off a prompt, but to provide a complete context including the intention AND the constraints (coding guidelines, etc.).
Context Driven Engineering aims to reduce the non-deterministic part of the process and ensure the quality of what is produced.
With Context Driven Engineering, specs, which haven't always been well regarded, become first-class citizens again and now come before code.
> Separate your process into two PRs:
> - The PR with the plan.
> - The PR with the implementation.
>
> The main reason is that it mimics the classical research-design-implement loop. The first part (the plan) is the RFC. Your reviewers know where they can focus their attention at this stage: the architecture, the technical choices, and naturally their tradeoffs. It's easier to use an eraser on the drawing board than a sledgehammer at the construction site.
Source: Charles-Axel Dein (ex CTO Octopize and ex VP Engineering at Gens de confiance)
We find this same logic here at Clever Cloud:
> Here is the paradox: when code becomes cheap, design becomes more valuable. Not less. You can now afford to spend time on architecture, discuss tradeoffs, commit to an approach before writing a single line of code. Specs are coming back, and the judgment to write good ones still requires years of building systems.
Source: Pierre Zemb (Staff Engineer at Clever Cloud)
or at Google:
> One common mistake is diving straight into code generation with a vague prompt. In my workflow, and in many others', the first step is brainstorming a detailed specification with the AI, then outlining a step-by-step plan, before writing any actual code.
Source: Addy Osmani (Director on Google Cloud AI)
In short, we now find this method everywhere:
- spec
- plan
- act

Spec/Plan/Act
Spec: The specification brings together the use cases: the intentions expressed by the development team. Depending on context and company, it may be called an RFC (Request for Comments), an ADR (Architecture Decision Record), or a PRD (Product Requirements Document).
This is the starting document for any development with an AI. The spec is usually reviewed by product experts, whether they are developers or not.
AI use is not uncommon at this stage either (see later in the article).
But context is not limited to that. To curb unwanted AI initiatives, you also need to provide constraints: development standards, tools to use, docs to follow. We'll come back to this point later.
Plan: The implementation plan lists all the steps needed to implement the specification. The list must be exhaustive, and each step must be achievable autonomously by an agent given the necessary and sufficient context. The plan is usually reviewed by seniors (architect, staff, tech lead, etc., depending on the company).
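To make this concrete, here is a minimal sketch of what such a plan file might look like; the feature, commands and file names are purely illustrative, not a template drawn from the teams quoted here:

```markdown
# Plan: CSV export of invoices (spec: specs/invoice-export.md)

## Step 1: Export endpoint
- Create the endpoint in `server/api/` using the generated OpenAPI client
- Constraints: follow `rules/backend.md`, no direct `$fetch`
- Done when: the endpoint streams a CSV and is covered by an integration test

## Step 2: Export button in the invoice list
- Component: `components/InvoiceList.vue`
- Constraints: follow `rules/frontend.md`, UI copy in English
- Done when: the button triggers the download and is covered by a component test

## Step 3: Documentation
- Update `docs/invoices.md` and the changelog
```

Each step carries its own constraints and "done when" criteria, so an agent can pick it up in isolation with the necessary and sufficient context.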
Act: This is the implementation step and can be distributed to agentic sessions.
In many teams, this step can be carried out in one of two ways:
- copilot/pair programming mode with validation of each modification one by one
- agent mode, where the developer gives the intention then verifies the result (we'll see how later)
There are of course variations, such as at Ilek, which breaks the Act phase down further:
> We are in the first phase of industrialization, which is adoption. The goal is that by the end of the quarter all devs rely on this framework and that the use of prompts/agents is a reflex. So we're aiming for 100% adoption by the end of March. Our workflow starts from the need and breaks down into several steps that aim to challenge devs in the thinking phases up to validation of the produced code. Here's the list of steps we follow:
> 1. elaborate (challenges the need and questions edge cases, technical choices, architecture, etc.)
> 2. plan (proposes a technical breakdown; this plan is provided as output in a Markdown file)
> 3. implement (agents carry out the plan steps)
> 4. assert (an agent validates that the final result meets expectations: lint, tests, guidelines)
> 5. review (agents do a technical and functional review)
> 6. learn (context update)
> 7. push (MR creation on GitLab)
>
> This whole process is done locally and piloted by a developer.
Cédric Gérard (Ilek)
While this three-phase method seems to be the consensus, we see quite a few experiments to frame and strengthen these practices, particularly with two tools that come up regularly in discussions: BMAD and Spec Kit.
Having tested both, it's quite easy to end up with verbose over-documentation and a slower dev cycle.
I have the intuition that we need to avoid digitally reproducing human processes that were already shaky. Do we really need all the roles proposed by BMAD, for example? I felt like I was doing SAFe in solo mode and it wasn't a good experience :)
What is certain is that if the spec is to reign again, a spec written for an AI must be simple and unambiguous. Verbosity can harm the effectiveness of code assistants.
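As an illustration, a short, unambiguous spec for the same hypothetical invoice-export feature used in the plan sketch above could be as simple as this:

```markdown
# Spec: CSV export of invoices

## Intent
Accounting users can download the invoices of a given period as a CSV file.

## Constraints
- Columns: invoice number, date, customer, amount (EUR, 2 decimals)
- Periods longer than 12 months are rejected with an explicit error message
- No new dependency; reuse the existing export utilities

## Out of scope
- PDF export
- Scheduled exports
```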
The AI Rules ecosystem
While agent mode seems to be taking over copilot mode, it comes with additional guardrails to guarantee quality. We absolutely want to make sure:
- that the implementation respects the spec
- that the produced code respects the team's standards
- that the code uses the right versions of the project's libraries
To ensure the quality produced, teams provide the necessary context to inform the code assistant of the constraints to respect. Paradoxically, despite vibe coding's bad reputation and its use previously reserved for prototypes, Context Driven Engineering puts the usual good engineering practices (test harness, linters, etc.) back in the spotlight. Without them, it becomes impossible to ensure code and architecture quality.
In addition to all the classic good practices, most agent systems come with their own concepts: the general context file (agents.md), skills, MCP servers, agents.
agents.md
A code assistant will read several files in addition to the spec you provide it. Each code assistant has its own file: CLAUDE.md for Claude Code, .cursorrules for Cursor, .windsurfrules for Windsurf, etc.
There is an attempt at harmonization via agents.md, but the idea is always broadly the same: a sort of README for AI. This README can be used hierarchically: a file at the root, then one file per directory where it's relevant.
.
├── agents.md
└── application 1/
    └── agents.md
This file contains instructions to be followed systematically, for example:
- the interface must be in English
- in the nuxt backend (`server/api/`), we ALWAYS use the generated OpenAPI client (`~~/server/utils/openapi`), never direct `$fetch`
It can also reference other files, for example:
- If you have to work in Kotlin, always load the rules/kotlin.md file
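Putting these pieces together, a minimal root agents.md could look like the sketch below; the project, commands and file names are illustrative, not a recommended template:

```markdown
# agents.md

## Project
Nuxt application, TypeScript. Run `npm run lint` and `npm run test` before any commit.

## Rules (always apply)
- The interface is in English.
- In the Nuxt backend (`server/api/`), ALWAYS use the generated OpenAPI client
  (`~~/server/utils/openapi`), never direct `$fetch`.

## Conditional rules
- If you have to work in Kotlin, load `rules/kotlin.md` first.

## Off limits
- Never edit generated files or database migrations by hand.
```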
Having multiple files allows each agent to work with a reduced context, which improves its efficiency (not to mention the cost savings).
Skills, agents, MCP servers
Depending on the tools used, we find several notions that each have different uses.
A skill explains to an AI agent how to perform a type of operation.
For example, we can give it the commands to use to call certain code generation or static verification tools.
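The exact format depends on the assistant; in Claude Code, for example, a skill is a markdown file with a short frontmatter that tells the agent when to use it. A minimal sketch, with purely illustrative commands:

```markdown
---
name: regenerate-openapi-client
description: Regenerate the OpenAPI client whenever the API schema changes
---

1. Run `npm run openapi:generate` from the repository root.
2. Never edit the generated files by hand.
3. Run `npm run typecheck` and fix compilation errors in calling code only.
```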
An agent can be involved to take charge of a specific task. We can for example have an agent dedicated to external documentation with instructions regarding the tone to adopt, the desired organization, etc.
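Here again the format varies from one tool to another; in Claude Code, a subagent is also declared in a markdown file, with a frontmatter naming it and (optionally) restricting its tools. A sketch of the documentation agent described above, with illustrative content:

```markdown
---
name: docs-writer
description: Writes and updates the end-user documentation once a feature is merged
tools: Read, Grep, Write
---

You maintain the end-user documentation in `docs/`.
- Tone: factual, second person, no marketing vocabulary.
- Structure: one page per feature, starting with a "Getting started" section.
- Never document internal APIs or experimental configuration flags.
```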
MCP servers enrich the AI agent's toolbox. This can be direct access to documentation (for example the Nuxt docs), or tools to look up test account info, like Stripe's MCP server.
It's still too early to say, but we may see a notion of technical debt appear, linked to the stacking of these tools, and it's likely that refactoring and testing techniques for them will emerge in the future.
Governance and industrialization
With the appearance of these new tools comes a question: how do we standardize the practice and let everyone benefit from each other's good practices?
As Benjamin Levêque (Brevo) says:
> The idea is: instead of everyone struggling with their own prompts in their corner, we pool our discoveries so everyone benefits.
Corporate Marketplaces
One of the first answers to this pooling question relies on the notion of a corporate marketplace:
> At Brevo, we just launched an internal marketplace with skills and agents. It allows us to standardize code generated via AI (with Claude Code), while respecting standards defined by "experts" in each domain (language, tech, etc.). The 3 components in Claude Code: we transform our successes into Skills (reusable instructions), Subagents (specialized AIs) and Patterns (our best architectures). Don't reinvent the wheel: we move from "feeling-based" use to a systematic method.
Benjamin Levêque and Maxence Bourquin (Brevo)
> At ManoMano we also initiated a repository to transpose our guidelines and ADRs into a machine-friendly format. We then create agents and skills that we install in Claude Code / opencode. We have an internal machine bootstrap tool, and we added this repo to it, which means all the company's tech people are equipped. It's then up to each person to reference the rules or skills that are relevant to their services.
>
> We have integration-type skills (using our internal IaC to add X or Y), others that are practices (doing a code review, how to do React at ManoMano) and commands that cover broader orchestrations (tech refinement, feature implementation with review).
>
> We also observe that it's difficult to standardize MCP installations for everyone, which is a shame when we see the impact some of them have on the quality of what we can produce (Serena was mentioned, and I'll add sequential-thinking). We're at the point where we're wondering how to guarantee an identical environment for all devs, or at least how to make it consistent for everyone.
Vincent Aubrun (ManoMano)
> At Malt, we also started pooling commands / skills / AGENTS.md / CLAUDE.md. Classically, the goal of the initial versions is to share a body of knowledge so the agent doesn't start from scratch. Proposals (typically via MR) are reviewed within guilds (backend / frontend / AI). Note that at the engineering scale we're still very much feeling our way. It's particularly hard to know whether a shared element is really useful to most people.
Guillaume Darmont (Malt)
Note that public marketplaces also exist. Be careful however: you must review everything you install…
Among deployment methods, many have favored custom tools, but François Descamps from AXA points to another solution:
> For sharing primitives, we're exploring APM (agent package manager) by Daniel Meppiel. I really like how it works: it's quite easy to use and handles dependency management much like npm.
CI/CD Integration
Despite all the instructions provided, some of them are regularly ignored, and others are ambiguous and misinterpreted. This is where teams necessarily put tools in place to keep the AIs in check:
- linting
- test harness
- code reviews
While the human eye remains mandatory for every participant we interviewed, these tools themselves can partially rely on AIs.
AIs can indeed write tests. The human then verifies the relevance of the proposed tests.
Several teams have also created agents specialized in review with very specific scopes: security, performance, etc.
Others use automated tools, some directly connected to CI (or to GitHub).
(I'm not citing them but you can easily find them).
Related to this notion of CI/CD, a question that often comes up:
> It's also very difficult to know whether an "improvement" (a modification of the CLAUDE.md file, for example) really is one. Will the quality of the responses really be better after the change?
Guillaume Darmont (Malt)
Can I evaluate a model? If I change my guidelines, does the AI still generate code that passes my security and performance criteria? Can we treat prompts/context like code (unit testing of prompts)?
To this Julien Tanay (Doctolib) tells us:
> About the question "does this change to the skill make it better or worse", we're going to start looking at promptfoo and Braintrust (used in prod for product AI with us) to do evals in CI. (...) For example with promptfoo, you'll verify, in a PR, that for the 10 variants of a prompt "(...) set up my env" the env-setup skill is indeed triggered, and that the output is correct. You can verify the skill call programmatically, and the output either via "human as a judge" or, more likely, "LLM as a judge" in the context of a CI.
All discussions seem to indicate that the subject is still being explored, but that there are already concrete avenues of work.
Costs
> We had a main KPI which was to obtain 100% adoption for these tools in one quarter (...) At the beginning our main KPI was adoption, not cost.
Julien Tanay (Staff engineer at Doctolib)
Cost indeed often comes second. The classic pattern is adoption, then optimization.
To control costs, there is on the one hand session optimization, which involves:
- keeping session windows short, by breaking the work down into small independent steps
- using the /compact command to keep only the necessary context (or flushing this context into a file to start a new session)
These tips are, for example, among those shared by Alexandre Balmes on LinkedIn.
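To illustrate the second tip, the context flushed before opening a fresh session can be as simple as a short markdown note; a purely hypothetical sketch:

```markdown
# Session handoff: invoice CSV export, step 2 of 3

## Done
- Step 1: export endpoint merged, integration tests green

## In progress
- Step 2: export button in `components/InvoiceList.vue`; the download works,
  the component test still fails on the date format

## Next session
- Fix the date-format assertion, then move on to step 3 (documentation)
- Reminder: follow `rules/frontend.md`, UI copy in English
```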
On the other hand, cost control can be centralized with enterprise licenses.
This switch between an individual key and an enterprise key is sometimes part of the adoption process:
> We have a progressive strategy on costs. We provide an API key for newcomers, to track their usage and pay as close to actual consumption as possible. Beyond a threshold we switch them to Anthropic enterprise licenses, as we estimate it's more cost-effective for daily usage.
Vincent Aubrun (ManoMano)
On the monthly cost per developer, the various discussions allow us to identify 3 categories:
| Teams in adoption process and teams with "best effort" usage | Strong adoption and usage pushed at all levels | Advanced usage, multi agents, AI integrated throughout CI |
|---|---|---|
| about $20/month | about $200/month | from $200 to $1,000/month, with outliers observed well beyond |
The vast majority oscillates between category 1 and 2.
Documentation, a positive side effect of AI
When we talk about governance, documentation, which has become the new programming language, returns to being a first-class citizen.
We find it in the markdown specs stored in the project, in ADRs/RFCs, etc. These docs are now maintained at the same time as the code is produced.
> So we declared that markdown was the source of truth. Confluence in shambles :)
Julien Tanay (Doctolib)
Documentation is no longer a minor event in the product dev cycle, handled because it has to be and then shelved. The most mature teams now evolve the doc in order to evolve the code, which avoids the familiar syndrome of piles of obsolete company documents lying around on a shared drive.
This has many advantages: the doc can be used by specialized agents to write end-user documentation, or fed into a RAG to serve as a knowledge base for customer support, onboarding newcomers, etc.
> The integration of this framework impacts the way we manage incidents. It offers the possibility to debug our services with specialized agents that can rely on logs, for example. It's possible to query the code and the memory bank, which acts as living documentation.
Cédric Gérard (Ilek)
Intellectual Property
One of the major subjects that comes up is obviously intellectual property. It's no longer a matter of copy-pasting a hand-picked context into a browser, but of giving access to the entire codebase.
This is one of the great motivations for switching to enterprise licenses which contain contractual clauses like "zero data training", or even "zero data retention".
In 2026, the EU AI Act and ISO 42001 certification should also come into play, auditing how data is collected and processed.
In enterprise usage we also note setups via partnerships like the one between Google and Anthropic:
> On our side, we don't need to allocate an amount in advance, nor buy licenses, because we use Anthropic models deployed on Vertex AI from one of our GCP projects. Then you just need to point Claude Code to Vertex AI. This configuration also addresses intellectual property issues.
On all these points, another avenue seems to be local models. We can mention Mistral (via Pixtral or Codestral), which offers to run these models on private servers to guarantee that no data crosses the company firewall.
I imagine this would also be possible with Ollama.
However, I met only one company exploring this avenue during my discussions. We can expect the rise of local models to be more of a 2026 or 2027 topic.
Human impacts
Recruitment
While AI is now solidly established in many teams, its impacts now go beyond the framework of development alone.
We notably find reflections around recruitment at Alan:
> Picture this: You're hiring a software engineer in 2025, and during the technical interview, you ask them to solve a coding problem without using any AI tools. It's like asking a carpenter to build a house without power tools, or a designer to create graphics without Photoshop. You're essentially testing them on skills they'll never use in their actual job. This realization hit us hard at Alan. As we watched our engineering teams increasingly rely on AI tools for daily tasks — with over 90% of engineers using AI-powered coding assistants — we faced an uncomfortable truth: our technical interview was completely disconnected from how modern engineers actually work.
Emma Goldblum (Engineering at Alan)
Junior training
One of the big subjects concerns the training of juniors, who can quickly be put at risk by AI use. They are indeed less productive for now, and don't always have the experience needed to properly challenge the produced code or write good specifications. A large share of the tasks previously assigned to juniors is now taken over by AIs (boilerplate code, form validation, repetitive tasks, etc.).
However, all teams recognize the necessity to onboard juniors to avoid creating an experience gap in the future.
Despite this awareness, I haven't yet seen specific initiatives aimed at adapting junior training.
Welcoming newcomers
Finally, welcoming newcomers is also disrupted by AI, particularly because it's now possible to guide them through discovering the product:
> Some teams have an onboarding skill that helps set up the env, takes a tour of the codebase, makes an example PR... People are creative
Julien Tanay (Doctolib)
As a side effect, this is considered easier thanks to the changes induced by AI, particularly the fact that documentation is updated more regularly and that all guidelines are very explicit.
Supporting a changing profession
One of the least-discussed topics remains supporting developers through the transformation of their profession.
> We're moving the value of developers from code production to business mastery. This requires taking a big step back. Writing code and practices like TDD are part of the pleasure we take in our work. AI disrupts that, and some may not be able to thrive in this evolution of our profession.
Cédric Gérard (Ilek)
The question is not whether the developer profession is coming to an end, but rather to what extent it's evolving and what are the new skills to acquire.
We can compare these evolutions to what happened in the past during transitions between punch cards and interactive programming, or with the arrival of higher-level languages. With AI, development teams gain a level of abstraction, but keep the same challenges: identifying the right problems to solve, finding what are the adequate technological solutions, thinking in terms of security, performance, reliability and tradeoffs between all that.
Still, this evolution is not necessarily easy for everyone, and teams need to help people look at development from a different angle in order to rediscover what makes the job interesting.
Cédric Gérard also warns us against other risks:
> There's a risk that the quality of what we produce declines. AI not being perfect, you have to pay close attention to the generated code. But reviewing code is not like producing code: review is tedious, and it's very easy to let things slide. Added to this is a risk of deskilling. Reading is not writing; we can expect to develop our evaluation skills, but to lose, little by little, in creativity.
Conclusion
2025 saw the rise of agentic programming, 2026 will undoubtedly be a year of learning in companies around the industrialization of these tools.
One thing I'm pleased about is the forceful return of systems thinking. "Context Driven Engineering" forces us to become good architects and good product designers again. If you can't explain what you want to do (the spec) and how you plan to do it (the plan), AI won't save you; it will just produce technical debt at industrial speed.
Another unexpected side effect could be the end of ego coding: the progressive disappearance of emotional attachment to the code we produce, which sometimes made discussions complicated, for example during code reviews. Let's hope this makes us more critical and less reluctant to throw away unused code and features.
In any case, the difference between an average team and an elite team has never rested so much on "old" skills. Knowing how to challenge an architecture, set good development constraints, have good CI/CD, anticipate security flaws, and maintain living documentation will be even more critical than before. And from experience, these skills are far from universal.
Questions remain: we'll have to learn to pilot a new ecosystem of agents while keeping control. Between sovereignty issues, questions around local models, the ability to test reproducibility and prompt quality, exploding costs, and the mutation of the junior role, we're still in a full-on learning phase.