I Directed an AI to Ship Real Software

It's a mistake for businesses to think they should use AI to clone the mature SaaS solutions they currently use. It is a waste of time and focus to build your own CRM, analytics platform, or email delivery system unless that's your product. It almost doesn't matter how much money you might save; saved money isn't revenue. Furthermore, it is easy to underestimate the ongoing operation and maintenance costs of custom internal software.

However, there is custom software that you should have AI build for you: the interfaces that streamline and simplify your workflows, and the glue between the different systems in your business. These are productivity solutions that reduce tedium and reduce the opportunities for errors. They make it easier to get a handle on the most important activities and metrics in your business.

Over a few sessions, I directed an AI agent to build a working macOS Stream Deck plugin. Nine actions, a smart file-opener, dials that drive tmux windows and panes, Safari tab jumping, window scrolling, app and window switching, and a document-switching dial for BBEdit. Around 182 automated tests, all passing, and feedback I dictated into my microphone.

I didn't even name it. Claude did: Switchboard.

Although I'm an engineer by training, my work focuses on the process and organizational level, not writing code — I sat in the director's chair, not the editor's. The plugin is the easy thing to point at. What's worth your attention is what changed underneath it — how the work got done. That's the leverage most teams miss: they bolt AI onto the surface and never change the machinery beneath it.

Don't Type, Direct

Lead with the feature list, and you'd file this under "developer side project" — fair enough. A terminal utility someone builds for themselves is a hobby project.

I didn't open an editor and type. I described what I wanted, set the constraints, reviewed what came back, and refused work that wasn't tested. The AI did the typing; I did the deciding. The afternoon produced rough-edged but production-grade software, not because the model is a genius, but because the operating model around it was disciplined.

AI doesn't change much when you treat it as a faster way to do the same work. It changes a lot when it changes who can direct what kind of work, and how that work gets reviewed.

Change How the Work Gets Done

Think of a business as a tree: the canopy is what everyone watches — the metrics, the roadmap, the visible output — and the roots, underground, decide whether it grows and survives the occasional storm. Most AI adoption decorates the canopy. Switchboard was a chance to change the roots instead.

Three things changed underground.

Who could direct the work? A year ago, a tool like this meant hiring a contractor, writing a spec, waiting, iterating across weeks. If I decided to do it myself, I might spend hours on a prototype just learning how Elgato's plugin architecture worked. The bottleneck was never the idea; it was the translation from intent to code. That layer is now cheap. I held the intent and the standards in my own head and drove the work directly, without becoming the person who writes the loops. For a CEO, that's a throughput story, not a coding one.

Quality discipline. We've all seen impressive prototypes that never reached the quality needed for serious work. When working with AI, I made tests non-negotiable, not because I love test counts, but because tests are how a director holds quality without writing the code himself. When I discover a bug, I ask Claude to first write a test that demonstrates the bug, then fix it, and then document any lessons learned. That discipline is what lets me say "this does what I asked, and it will keep doing it." Without a process, "the AI built it" is a sentence that should make a serious operator nervous.
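To make the bug-then-test loop concrete, here is a minimal sketch in Python. It is an illustration, not Switchboard's code: the function `next_index`, the dial scenario, and the bug it describes are all hypothetical. The idea is that the regression test is written first and fails against the buggy version, and only then is the fix applied.

```python
def next_index(current: int, delta: int, count: int) -> int:
    """Return the index a dial lands on after turning by `delta` ticks.

    Hypothetical helper for illustration only.
    """
    if count <= 0:
        raise ValueError("count must be positive")
    # The imagined buggy version returned `current + delta` unchanged,
    # which ran off either end of the list. Modulo wraps in both directions.
    return (current + delta) % count


def test_turning_left_from_first_item_wraps_to_last() -> None:
    # Written before the fix: against the buggy version this failed with -1.
    assert next_index(0, -1, 5) == 4


def test_turning_right_from_last_item_wraps_to_first() -> None:
    assert next_index(4, 1, 5) == 0


if __name__ == "__main__":
    test_turning_left_from_first_item_wraps_to_last()
    test_turning_right_from_last_item_wraps_to_first()
    print("regression tests pass")
```

Once the test exists, it stays in the suite, so the same mistake cannot quietly return in a later change.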

Where the leverage lived. The model was the easy part — fast, fluent, and occasionally wrong. The leverage was in the scaffolding: the clear specification added to with each refinement, automated tests, the review pass on every change, and the validated build before I'd call anything done. The harness, not the horsepower.

Build for Focus, Not Just Convenience

The custom tool is worth building because it protects your attention, not just your time. Faster low-value work is just faster waste.

My Switchboard configuration keeps me on track. The Stream Deck Plus sits on my desk, and my highest-priority work is one labeled button away — strategic priorities kept in my field of view. When the next interesting-but-irrelevant thing tries to pull me sideways, the path back to the work that matters is a single press. The tool encourages discipline through convenience and the lack of distractions.

No off-the-shelf product was going to do that. The leverage came from building something shaped exactly to how I work and what I've decided matters — and you only get that when the person who knows the work directs the build.

AI is very good at making you faster at everything, including the work that doesn't matter. Your inbox is usually the least important thing in your business, and AI will happily help you tear through it all day.

Tools and context enable AI to keep you and your team on the strategic inputs that need your unique skills and judgment — the decisions only you can make — and off the merely urgent and interesting. A custom tool can hold that line in a way a generic one never will. That's the difference between productivity and value. Productivity is about output per unit time. Value is about the return on your investment of attention.

Expect It to Be Confidently Wrong

Claude and ChatGPT will tell you the work's done and ready to ship. I saw silly, easily avoidable errors, and more subtle failures. The agent was, at times, confident and fast and pointed in the wrong direction — and the only reason those mistakes didn't ship is that someone reviewed the work with an eye for what should be true, not just what the agent claimed was true.

Just as with human engineers, the highest-quality work requires another set of eyes. AI output is a multiplier on the first 80% of any task. The last 20% — where judgment lives — needs a reviewer who knows the difference between looks done and is done.

You should feel skeptical hearing "AI built it, all the tests pass." A leader who asks who wrote the tests, who reviewed the failures, who decided what done means understands the seeds of quality. The discipline that made a weekend tool trustworthy is the same discipline needed for a commercial product—just on a smaller scale.

Off-the-Shelf Only Scratches the Surface

A company's whole AI strategy is often a shopping list: a copilot license here, a summarizer there, a chatbot bolted onto the product. The dashboard shows adoption, spend, and activity all up. Six months later, the efficiency that was supposed to reach the bottom line still hasn't arrived.

Off-the-shelf AI has its place, but ad hoc prompting is the most superficial form of adoption — level one. Copilot, Claude, and ChatGPT are shaped to the average customer, not to your business, so the results often look more like first drafts.

Many AI-skeptical engineers see this and feel vindicated when the LLM makes an error transcribing a log into an Excel file or takes longer to perform a task than a human would. But you need to work with AI to build context and tooling before you declare victory for the human worker.

Switchboard demonstrates the second path, building context and tooling rather than prompting ad hoc: it is a set of user interfaces I directed AI to build. It helps me use AI and my Mac more effectively, and it automates the parts of my business workflows that don't require my judgment, so I can spend that judgment on the parts that do.

Where to Start

You don't need to build a plugin. Pick one real piece of work and run it the way I ran this one:

Sit in the director's chair, not at the keyboard. Your job is to specify, steer, and review, not to become the bottleneck you were trying to remove.
Make the review the work. Decide up front what "done" means and how you'll verify it. With AI in the loop, the review is the quality system. Tests, checks, a second agent with fresh context — pick a contract and hold it.
Build, don't just buy. Off-the-shelf makes the average process faster. The leverage is in directing AI to build a solution specific to your business — especially one that keeps you and your team focused on the work that actually matters.
Assume it will be confidently wrong sometimes, and design for catching it. The human earns his place by applying judgment to the 20% that matters, not by going faster.
Measure the root, not the canopy. Don't ask "are we using AI." Ask "what work can we now direct that we couldn't before — and how do we know it's good."

A working tool is easy to admire and easy to misread. The canopy is what you see; the roots are what hold. What changed in that afternoon was the way the work got done: directed rather than typed, held to a standard that survived contact with an agent that makes mistakes, shipped without a single line written by the person who decided what shipped meant. That's the move I help leaders make on the work that actually decides their growth. If that's the conversation you're having on your own team, that's the one I'm built for.

The plugin itself, Switchboard, is being released as open source — shared as-is, as evidence, not as a product. The code is the footnote. The operating model is the essay.

John M. P. Knox
