Vibe Coding and the Loss of Gemba: The Risks of AI-Driven Development

April 8, 2026

There’s a new pattern emerging in software development that people are calling vibe coding.

Vibe coding is the practice of describing what you want an AI agent to build, letting it generate the code, and then iterating on the results. It’s effective, and honestly kind of addictive. Anybody who has been through a long debugging session knows that writing code can be a labor of love. Skipping some of the syntax-level details of coding with an AI agent does feel like a step change in productivity. I've seen small projects that would have taken a couple of weeks now take just a couple of days. However, looking at it through a Lean lens, I keep coming back to a question:

What happens when we stop going to the gemba in software development?

What is the ‘Gemba’ in Software Development?

In Lean, gemba is the place where the work actually happens. It's the factory floor, a hospital operating room, a kitchen in a restaurant, etc. It’s where you go to understand how value is created.

In software, that “place” isn’t as obvious because it isn’t physical. It’s:

  • the backlog of ideas
  • the codebase
  • peer review
  • the build and deployment pipeline
  • the way the system actually behaves in production

And working in the code isn’t just about getting features done. It’s an ecosystem that you learn and grow with. When you spend time in a codebase (or any system), you start to notice things you can’t see from the outside. You learn which features are surprisingly hard to build, where bugs tend to come from, and which parts of the system are fragile or painful to work in. Ask any frontline developer and they can draw the parallel between their world and walking a production line to see where work piles up or gets reworked.

The best engineers are constantly making changes within the codebase to improve the developer experience, and the best way to understand the challenges within a codebase is to try to build within it.

What Vibe Coding Changes

Vibe coding changes your relationship with a codebase. Instead of working in the code, you sit a layer removed. You describe intent, the agent churns through an implementation, and you evaluate the output, usually in an iterative cycle. Instead of working through specifics and syntax, your priority shifts to exercising good judgment about how you direct the agent's design and evaluate its output. In many cases, you’re producing code faster than ever. But you’re now a step removed from the system that’s producing it, and we know from decades of good Lean operations that problem-solving happens best close to where the work is done.

In Lean terms, it feels a bit like optimizing output without really understanding the process.

Writing code is a lot like writing an essay. When you write an essay, you labor over words, phrasing, structure, and your ability to convey an idea. I think that laboring is critically helpful for understanding what you're writing about, and the same thing happens when writing code. It's learning by doing, and writing forms understanding in a different way from reading. I can tell my two-year-old that she shouldn't eat sand, but she just won't remember that lesson the way she will when she decides to try it for herself.

When we outsource writing code, we rely on our ability to read and test the generated code to understand the new state of the system. That places an enormous demand on an engineer's ability to read code well enough to fully comprehend how the system is changing. As the quantity of new code increases, a kind of cognitive debt accumulates, and it's amplified when engineers no longer have the act of writing to help solidify their mental models of the codebase.

To be fair, there’s a lot to like here:

  • iteration is faster
  • prototyping is dramatically cheaper
  • more people can build things
  • you can explore ideas you wouldn’t have had time for before
  • sometimes the agents write better code than you would have

That’s all real. But we’re also spending less time at the gemba.

The Bet We’re Making

There’s an implicit bet in all of this: Models will improve faster than our codebases expand or degrade.

Right now, that bet doesn’t seem crazy: models can already generate solid code, navigate unfamiliar codebases, fix bugs, and even suggest improvements. It’s not hard to imagine that continuing. In the past, it's been crucial to write well-structured code so engineers can read, decipher, and improve it. Perhaps in a few years (or sooner), models will be good enough that engineers can live comfortably at a higher level of abstraction, and human readability will no longer be a constraining factor for code. I don’t think that’s science fiction, but we’re not fully there yet. And even if we get there, how we get there probably matters, because in the meantime the structure of the system still matters, a lot. It's far easier for a new developer to contribute to a codebase that is well designed and maintained, and the same applies to the efficacy of AI agents. Engineers still bear the responsibility and accountability to take care of a codebase and fix things when they go wrong.

The Codebase is a System

One way I’ve found helpful to think about this is: a codebase isn’t just output, it’s a production system.

If you map Lean ideas over:

  • flow → developer experience, build/test cycle
  • waste → tech debt, duplication, unnecessary complexity
  • rework → bugs, regressions
  • standard work → patterns, conventions, architecture

And gemba is the code itself. Refactoring the code is like rearranging a production line. Good abstractions are like better tooling. These things don’t show up immediately as features, but they make everything downstream better.

Codebases are less like machines and more like gardens. They grow and evolve, and if you don’t take care of them, weeds show up as technical debt, paths get harder to follow, things take longer to do, and weird bugs start appearing in brittle parts of the application. A lot of codebases have a “DO NOT TOUCH” section that nobody wants to go near because they’re afraid of what might happen if they make a change. Vibe coding makes it incredibly easy to plant new features. But who’s trimming? Who’s shaping the structure? Who’s making sure things don’t just sprawl? I find the models are quick to add complexity, and without deliberate instruction they rarely prune it.

A Quick Detour: The Mythical Man-Month


There’s a book from 1975 called The Mythical Man-Month by Frederick Brooks Jr. It’s a collection of essays about building large software systems written in a completely different era.

Some of the essays are a little dated, but I think it holds a lot of wisdom that's applicable in modern development. The title essay, The Mythical Man-Month, talks about Brooks's law: "Adding manpower to a late software project makes it later." I think a lot of folks have been on either end of a conversation asking if we can get the project done sooner if we just add a few more developers. Brooks opens the book with an essay called The Tar Pit, and I think it's an apt description of how it can feel to complete a complicated software project. The goal of his book is to provide some boardwalks to traverse the tar pit without falling in.

With software development undergoing a phase shift, now is a good time to revisit the essays that have stood the test of time. They're worth a read for anybody close to software development.

One of the ideas Brooks comes back to a lot is conceptual integrity. Conceptual integrity boils down to: does the system feel like it was designed with a coherent vision? This is already hard with teams of humans. With AI, it gets even more interesting. AI is really good at solving the problem in front of it; it’s much less obvious that it’s good at maintaining a consistent system-wide design over time. So you end up with a system that works but feels like a collection of locally good decisions rather than a cohesive whole, a bit of a Frankenstein. I don’t think this is unsolvable. In fact, it’s possible future models will get very good at this. But right now, I notice the Frankenstein nature of AI-heavy codebases pretty quickly. I think great engineers feel a sense of accountability for the quality of their codebase, and it's hard to instill that sense of accountability in an agent.

On the other hand, Brooks also says you should plan to throw the first system away, because you'll learn so much during implementation that many of the initial designs will be invalidated. With vibe coding, that first system is basically free. That’s a huge upside: it allows us to try more ideas, prototype faster, and learn far more quickly.

From a Lean perspective, that’s what you want: iterative, rapid experimentation. But layer in AI and there’s a trap. We start keeping the prototype, because it’s easy to forget that it was ever a prototype. And if we’re not careful, we end up building long-term systems on top of something that was never meant to last.

So How Will We Continue Learning?

The biggest thing I keep coming back to is learning. When you work directly in a system, you build intuition for:

  • what’s hard
  • what’s fragile
  • what should be improved

If you stay one layer removed by prompting and validating, you’ll likely lose some of that. That matters, because you can’t improve what you don’t deeply observe. Now, I don’t think the conclusion here is “don’t use vibe coding.” If anything, I think the opposite is true. It’s incredibly powerful, it’s going to get better fast, and it’s going to change how we build software. So we shouldn’t ignore or avoid it. I think it’s plausible that in a few years models can do all of the things Brooks mentions like help maintain conceptual integrity, continuously refactor systems, and actively manage codebase health. At that point, some of these concerns might fade.

Until then, I think we still have to ask engineers and AI agents to deliberately observe the code, go to the gemba, and resist managing from afar.

What This Means (Especially for Lean Folks)

If you think about this through a continuous improvement lens, this is a familiar situation. We’ve introduced a powerful new capability into a system. We don’t stop at “Does it work?” We move on to “What does it do to the system?”

So I’ll leave you with a few reminders:

  • Stay connected to the codebase. Even if AI is generating code, spend time understanding it. AI tools can help with this!
  • Invest in standard work. Instruction files are making it easier to prescribe conceptual goals for agents to follow while writing code. Best practices for agent files are constantly changing so it's important to treat these as living documents.
  • Treat codebase health as real work. Refactoring, cleanup, and developer experience improvements matter. This doesn't change, and if anything coding agents can lower the cost of this kind of work.
  • Be intentional about design. Someone (or something) needs to own conceptual integrity.
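As a concrete illustration of "standard work" for agents, here's a minimal sketch of what an agent instruction file might look like. The filename and every line of content here are hypothetical, meant only to show the shape of the idea; the specifics should come from your own codebase and conventions, and should be revisited as agent tooling evolves:

```markdown
# AGENTS.md (hypothetical example)

## Conceptual goals
- Preserve the existing layering: UI -> services -> data access.
  Never query the database directly from UI code.
- Prefer extending an existing pattern over inventing a new one.

## Standard work
- Run the full test suite before proposing a change.
- Match the naming conventions of the surrounding module.

## Codebase health
- If a change touches a known-fragile area, flag it for human review.
- When you see dead code, propose removing it rather than working around it.
```

Treating a file like this as a living document, reviewed the way you'd review any standard work, is one way to keep conceptual integrity on the agenda even when an agent is doing much of the typing.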

Closing thought

Vibe coding lets us move faster than we ever have. One day it may even let us build and maintain systems in ways that weren’t possible before. Still, the Lean thinking ingrained in me keeps pulling me back to the same idea: Understanding the work is what enables improvement.

We may be able to build software without going to the gemba. The question is whether we can keep building systems we understand well enough to improve. We might be able to vibe our way to working software, but we still need real engineering to create effective systems.

Xander Hathaway

Vice President of Product Development, MoreSteam

Xander Hathaway leads MoreSteam’s software development team, shaping the strategy, architecture, and technical foundation behind the company’s platform. His work sits at the intersection of software engineering, operational excellence, and applied AI, where he focuses on building systems that help organizations solve problems more effectively at scale.

Before joining MoreSteam, Xander worked as a senior software architect in consulting, where he designed and implemented cloud-based solutions for a wide range of clients. This experience informs his practical, systems-oriented approach to product development today.

Xander is an active contributor to the operational excellence community, having presented at leading industry conferences including POMS, IISE Annual Conference, and ASQ events. He holds a Master of Information and Data Science from University of California, Berkeley and a B.S. in Computer Science from University of Notre Dame.

Continuous Improvement · Process Improvement · Lean · AI · Quality Control · Software Development

