The Unbearable Lightness of Coding


A month ago, I built a full retrieval system with embeddings, hybrid search, and a GUI in about 25 hours. Last weekend, I spent two days trying to fix a bug in it, only to realize I had no idea how my own software worked.

Let’s be honest: I have pushed a GitHub repo without having written a single line of code. Do I feel bad about it? Kind of. The amount of technical debt weighs heavier on my shoulders than I’m used to. Will I regret it? Possibly. Will you?

I wanted to share my story here because I believe this is something many developers are going through right now, and many more will experience it in the coming years.

Because let’s face it: you can have a code of honor and be proud of your craftsmanship, but nothing beats the speed of GitHub Copilot & Co. If your colleague on AI steroids ships features and pushes updates twice as fast as you, who do you think is closer to the company’s door when budgets tighten?

The productivity gains are real, even if you only use these tools for documentation. And it’s a tiny step from:

“…”

to

“…”

That tiny prompt step skyrockets you into a very different realm of productivity.

But here is my very personal story, what I learned, and where I think this leaves us as developers.

The project: building my own NotebookLM (but stricter)

For background, I set out to build a RAG-style text retrieval system in the spirit of NotebookLM, except stricter. The system takes a private PDF library, processes it, and then retrieves answers verbatim from that corpus. No paraphrasing, no hallucinated sentences, just “give me the exact passage that answers my question so I can find it in the original PDF again.”

Admittedly, this is a very scientific, slightly paranoid way of using your literature. But I’m probably not the only one who’s tired of fact-checking every LLM response against the source.

The architecture of the software was fairly straightforward: 

  • A robust ingestion pipeline: walking directory trees, extracting text from PDFs, and normalizing it into paragraphs and overlapping chunks (see the chunking sketch after this list).
  • Hybrid Storage & Retrieval: a storage layer combining standard SQL tables, an inverted-index full-text search engine (for exact keyword matches), and a vector database (for semantic understanding).
  • A Reranking Strategy: some logic to pull a wide candidate pool via lexical search, then rerank the results using dense vector similarity to get the best of both worlds.
  • A Full UI: a dashboard to manage the PDF library, monitor ingestion progress, and display results with deep links back to the source text.
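
For illustration, the overlapping-chunk step amounts to something like the sketch below. This is a minimal stand-in under assumed names and sizes, not the code the AI actually wrote:

```python
def chunk_paragraphs(paragraphs, chunk_size=800, overlap=200):
    """Merge paragraphs into one text stream and cut it into overlapping chunks."""
    text = "\n".join(paragraphs)
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + chunk_size, len(text))
        chunks.append(text[start:end])
        if end == len(text):
            break
        start = end - overlap  # step back so neighbouring chunks share context
    return chunks
```

The overlap exists so that a sentence cut at a chunk boundary still shows up intact in at least one chunk.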

On paper, this is all quite straightforward. Python, Streamlit, SQLite+FTS5, FAISS, a sentence-transformer model, everything wrapped in a Docker container. No exotic cloud dependencies, just a private NotebookLM-ish tool running on my machine.
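Stripped of all the plumbing, the retrieval core boils down to a two-stage lookup like the sketch below. This is a minimal illustration under assumed names (a `chunks` FTS5 table, the `all-MiniLM-L6-v2` model), not the code that actually got generated; with a candidate pool this small, the rerank can happen in memory rather than through the FAISS index the real tool keeps for its vector store.

```python
import sqlite3
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")   # assumed embedding model
con = sqlite3.connect("library.db")               # assumed SQLite+FTS5 database

def retrieve(query: str, pool: int = 50, top_k: int = 5):
    # Stage 1: lexical search pulls a wide candidate pool via FTS5 / BM25.
    rows = con.execute(
        "SELECT rowid, text FROM chunks WHERE chunks MATCH ? "
        "ORDER BY bm25(chunks) LIMIT ?",
        (query, pool),
    ).fetchall()
    if not rows:
        return []

    # Stage 2: rerank the candidates by dense cosine similarity to the query.
    texts = [text for _, text in rows]
    doc_vecs = model.encode(texts, normalize_embeddings=True)
    query_vec = model.encode([query], normalize_embeddings=True)[0]
    scores = doc_vecs @ query_vec

    ranked = sorted(zip(rows, scores), key=lambda pair: pair[1], reverse=True)
    # Return the verbatim chunk text plus its rowid, which backs the deep link to the PDF.
    return [(rowid, text, float(score)) for (rowid, text), score in ranked[:top_k]]
```

Everything else (the Streamlit dashboard, the ingestion jobs, the FAISS index for purely semantic queries) is plumbing around this two-stage idea.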

The documentation-first approach

I didn’t start with code, but with the documentation. I already had my usual project skeleton from a cookiecutter template, so the structure was there: a place for requirements, for design decisions, for how to deploy and test, all neatly sitting in a docs folder waiting to be filled.

I wrote down the use case, sketched the architecture, the algorithms to implement, the requirements. I described goals, constraints, and major components in a few bullet points, then let genAI help me expand the longer sections once I had the rough idea in place. Step by step, I moved from a basic idea to more detailed documents describing the tool. The result wasn’t the best documentation ever, but it was clear enough that, in theory, I could have handed the whole bundle to a junior developer and they would have known what to build.

Releasing my AI coworker into the codebase

Instead, I handed it to the machine.

I opened the doors and let my GitHub Copilot colleague into the codebase. I asked it to create a project structure as it saw fit and to fill in the required script files. Once a basic structure was in place and the tool seemed to work with one algorithm, I asked it to generate the pytest suite, execute the tests, and iterate whenever it ran into errors. Once that was done, I kept asking it to implement further algorithms and cover some edge cases.
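
To give a flavor of what such a suite checks, here is a hand-written stand-in for one of the simpler tests. The corpus, the table name, and the bare FTS5 query are illustrative assumptions (and require a Python build whose SQLite includes the FTS5 extension); it is not the suite Copilot produced.

```python
import sqlite3
import pytest

@pytest.fixture
def tiny_corpus():
    # In-memory FTS5 table standing in for the real ingestion output.
    con = sqlite3.connect(":memory:")
    con.execute("CREATE VIRTUAL TABLE chunks USING fts5(text)")
    con.executemany(
        "INSERT INTO chunks(text) VALUES (?)",
        [
            ("The mitochondrion is the powerhouse of the cell.",),
            ("FAISS builds vector indexes for similarity search.",),
        ],
    )
    return con

def test_lexical_search_returns_verbatim_chunk(tiny_corpus):
    rows = tiny_corpus.execute(
        "SELECT text FROM chunks WHERE chunks MATCH ? ORDER BY bm25(chunks)",
        ("mitochondrion",),
    ).fetchall()
    # The whole point of the tool: the stored passage comes back verbatim.
    assert rows[0][0] == "The mitochondrion is the powerhouse of the cell."
```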

In essence, I followed my usual approach to software development: start with a working core, then extend it with additional features and fix things whenever the growing build runs into major issues. Is this a globally optimal architecture? Probably not. But it’s very much in the spirit of the Pragmatic Programmer: keep things simple, iterate, and “ship” frequently, even if the shipment is purely internal and only to myself.

And there’s something deeply satisfying about seeing your ideas materialize into a working tool in a day. Working with my AI coworker felt like being the project lead I always wanted to be: even my half-baked wishes were anticipated and implemented within seconds as mostly working code.

When the code wasn’t working, I copy-pasted the stack trace into the chat and let the agent debug itself. If it got stuck in a self-induced rabbit hole, I switched models from GPT5 to Grok or back again and let them debug each other like rival siblings.

Following their thought process and seeing the codebase grow so quickly was fascinating. I only kept a very rough time estimate for this project, since it was a side experiment, but it was definitely no more than 25 hours to produce >5000 lines of code. That is quite an achievement for a relatively complex tool that would otherwise have occupied me for several months. It’s still far from perfect, but it does what I intended: I can experiment with different models and summarization algorithms on top of a retrieval core that returns verbatim answers from my own library, together with the exact source, so I can jump straight into the underlying document.

And then I left it alone for a month.

The technical debt hangover

When I came back, I didn’t want to add a major feature. I just wanted to containerize the app in Docker so I could share it with a friend.

In my head, this was a neat Saturday morning task. Instead, it became a weekend-long nightmare of Docker configuration issues, paths not resolving correctly inside the container, embedding caches and FAISS indexes living in places I hadn’t clearly separated from the code, and tests passing on my local machine but failing (or never running properly) in CI/CD.
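
In hindsight, the path problem is the kind of thing I would normally bake in from the start: resolve every data location from a single configurable root, so the code behaves the same locally and inside a container. A minimal sketch of that idea, with the `RAG_DATA_DIR` variable and default locations as my own illustrative assumptions:

```python
import os
from pathlib import Path

# Resolve all data locations from one configurable root so the same code works
# locally and inside a container (where the root is typically a mounted volume).
DATA_ROOT = Path(os.environ.get("RAG_DATA_DIR", Path.home() / ".local/share/ragtool"))

DB_PATH = DATA_ROOT / "library.db"            # SQLite + FTS5 tables
FAISS_INDEX_PATH = DATA_ROOT / "index.faiss"  # vector index, regenerable from the DB
EMBED_CACHE_DIR = DATA_ROOT / "emb_cache"     # embedding cache, safe to delete

DATA_ROOT.mkdir(parents=True, exist_ok=True)
EMBED_CACHE_DIR.mkdir(parents=True, exist_ok=True)
```

Inside the container you can then mount a volume and point `RAG_DATA_DIR` at it, which keeps code, caches, and indexes cleanly apart.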

Some of these issues are entirely on me. I happily assumed that my CI/CD pipeline (also generated by AI) would “take care of it” by running tests on GitHub, so that cross-platform inconsistencies would surface early. Spoiler: they didn’t.

The breaking point came when Copilot suggested a seemingly simple fix: “Just add a reference to the working directory here.” Instead of letting it touch the code, I wanted to stay in control and only ask for directions. I didn’t want it to wreak havoc in a codebase I hadn’t looked at for weeks.

That’s when I realized how much I had outsourced.

Not only did I not understand why the error occurred in the first place, I couldn’t even locate the file, let alone the passage, I was supposed to make the change in. I had no idea what was going on.

Compare that to a different project I did with a colleague three years ago. I can still recall how certain functions were intertwined, and the silly bug we spent hours hunting, only to find that one of us had misspelled an object name.

The uncomfortable truth

I saved an enormous amount of development time by skipping the low-level implementation work. I stayed in charge of the architecture, the goals, and the design decisions.

But not the details.

I effectively became the tech lead on a project whose only developer was an AI. The result feels like something a very fast, very opinionated contractor built for me. The code has unusually good documentation and decent tests, but its mental models never entered my head.

Would I be able to fix anything if I needed to make a change and the internet was down? Realistically: no. Or at least no faster than if I had inherited this codebase from a colleague who left the company a year ago.

Despite the better-than-average documentation, I still stumble over “WTF” code pieces. To be fair, this happens with human-written code as well, including my own from just a few months back. So is GenAI making this worse? Or just faster?

So… is vibe coding good or bad?

Honestly: both.

The speed is insane. The leverage is real. The productivity gap between people who use these tools aggressively and those who don’t will only widen. But you’re trading implementation intimacy for architectural control.

You move from craftsman to conductor. From builder to project lead. From knowing every screw in the machine to trusting the robot that assembled the car. And maybe that’s simply what software engineering is quietly turning into.

Personally, I now feel much more like a project lead or lead architect: I’m responsible for the big picture, and I’m confident I could pick the project up in a year and extend it. But at the same time, it doesn’t feel like “my” code. In the same way that, in a classic setup, the lead architect doesn’t “own” every line written by their team.

It’s my system, my design, my responsibility.

But the code? The code belongs to the machine.
