Your AI Agent Belongs in a Container

In the previous post I told you to skip the devcontainer setup. That advice stands for your first steps. This post is about why you shouldn't skip it forever, because the devcontainer is where Mx CLI goes from "interesting tool" to something I'd genuinely call agentic engineering.

First, the Uncomfortable Part

We talk a lot about what AI agents can build. We talk far too little about what they can reach.

An agent like Claude Code runs with shell access on your machine. That means it can read your SSH keys. Your environment variables, including the API keys in them. Your browser profiles, your other projects, every local file your user account can touch. Most of the time nothing goes wrong, and that's exactly the problem: nothing going wrong is not the same as nothing being able to go wrong. The agent doesn't have to be malicious; a confidently wrong rm or a prompt injection hidden in something it reads is enough.
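
To make that reach concrete, here is a small sketch you could run in the same shell your agent uses. The function names list_secret_like_env and report_readable are mine, made up for illustration, and the name pattern is a rough heuristic, not a real secret scanner. It prints only variable names, never values.

```shell
# Print the NAMES (never the values) of environment variables whose
# names look like they might hold a secret. Heuristic only.
list_secret_like_env() {
  awk 'BEGIN { for (name in ENVIRON) print name }' \
    | grep -Ei 'key|token|secret|passw' \
    | sort
}

# Report whether the current user (and so the agent) can read a path.
report_readable() {
  if [ -r "$1" ]; then
    echo "readable: $1"
  else
    echo "not readable: $1"
  fi
}
```

Calling list_secret_like_env and then report_readable "$HOME/.ssh" on your laptop, and again inside a container, is a quick way to see the difference this post is about.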

The answer is old-fashioned and boring: isolation. Give the agent its own machine. A devcontainer is the cheapest way to do that, and it consists of two files in a .devcontainer folder:

devcontainer.json — which extensions, ports, and commands the environment gets
Dockerfile — which OS the container runs and which dependencies are installed

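Install the Dev Containers extension in VS Code, open the project folder, and VS Code asks whether you want to reopen it inside the container. From that moment the agent works in a disposable Linux environment that sees your project folder and not the rest of your home directory. Your SSH key files stay outside. One caveat: by default the Dev Containers extension forwards a running local SSH agent and your Git credential helper into the container, so the keys can remain usable from inside even though the files themselves are out of reach.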
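
If you have never seen those two files, the following is a hand-written minimal pair, wrapped in a shell function so you can drop it into a scratch folder. This is not what mxcli init generates. The function name write_devcontainer_sketch, the container name, port 8080, and the choice of base image are my assumptions for illustration.

```shell
# Write a minimal, illustrative .devcontainer folder under the given directory.
# Assumption: this is a generic sketch, not the Mx CLI generated setup.
write_devcontainer_sketch() {
  target_dir="$1/.devcontainer"
  mkdir -p "$target_dir"

  # devcontainer.json: extensions, forwarded ports, and setup commands.
  cat > "$target_dir/devcontainer.json" <<'JSON'
{
  "name": "agent-sandbox",
  "build": { "dockerfile": "Dockerfile" },
  "forwardPorts": [8080],
  "customizations": { "vscode": { "extensions": [] } },
  "postCreateCommand": "echo sandbox ready"
}
JSON

  # Dockerfile: the base OS image plus the dependencies to install.
  cat > "$target_dir/Dockerfile" <<'DOCKERFILE'
FROM mcr.microsoft.com/devcontainers/base:ubuntu
RUN apt-get update \
 && apt-get install -y --no-install-recommends postgresql-client \
 && rm -rf /var/lib/apt/lists/*
DOCKERFILE
}
```

Running write_devcontainer_sketch /tmp/scratch-project and opening that folder in VS Code should be enough to trigger the "reopen in container" prompt.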

What Mx CLI Puts in the Box

Here's the nice part: you don't have to configure any of this yourself. The mxcli init command from the previous post also generates the .devcontainer setup, and what's in it shows that the maintainers understand what an agent actually needs:

mxcli on the PATH, plus MxBuild and a JDK, so the project can be validated and built
Node.js and Playwright, for browser-based checks
A PostgreSQL client, for poking at the database
Claude Code, preinstalled
And the one that ties it all together: Docker-in-Docker

That last one sounds like overkill until you see what it's for. It means the agent can start containers inside its container, and that unlocks the whole loop.

The Loop

This is the part that made me sit up. With the sandbox in place, an agent can do this end to end:

# Validate the model
mxcli docker check -p app.mpr

# Build it with mxbuild, in a container
mxcli docker build -p app.mpr

# Run it: Mendix runtime plus a Postgres 17 database, via docker compose
mxcli docker run -p app.mpr --wait

That mxcli docker run command spins up a real Mendix runtime against a real Postgres database, with demo users and configuration handled for you. Your generated app is now actually running, on localhost, inside the sandbox. The agent can query live data with OQL:

mxcli oql -p app.mpr "SELECT * FROM Shop.Product"

And then it can verify its own work in a browser:

mxcli playwright verify tests/ -p app.mpr

The Playwright step matters more than it might seem. Mendix apps are React single-page applications, so an HTTP 200 on a page URL tells you nothing: the widgets render client-side, and a button that exists in your model can still be missing from the DOM. Mx CLI's verify command runs test scripts in headless Chromium against the running app, captures a screenshot when something fails, and can emit JUnit XML for your CI pipeline.
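
The commands in this section chain naturally into one script. Here is a rough sketch using only the mxcli invocations shown above. The wrapper function agent_loop and its exit codes are my own invention, and I am assuming each mxcli command exits non-zero on failure.

```shell
# Run validate -> build -> run -> browser verify, stopping at the first failure.
# Usage: agent_loop <project.mpr> <tests-dir>
# Return codes (hypothetical): 1 check, 2 build, 3 run, 4 verify.
agent_loop() {
  # Validate the model
  mxcli docker check -p "$1" || { echo "check failed" >&2; return 1; }

  # Build it with mxbuild, in a container
  mxcli docker build -p "$1" || { echo "build failed" >&2; return 2; }

  # Start the runtime and database, and wait until the app is up
  mxcli docker run -p "$1" --wait || { echo "run failed" >&2; return 3; }

  # Verify the running app in a headless browser
  mxcli playwright verify "$2" -p "$1" || { echo "verify failed" >&2; return 4; }

  echo "loop passed"
}
```

With that defined, agent_loop app.mpr tests/ gives the agent a single command whose exit code tells it which stage to look at.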

Why This Is the Point

Put the pieces together and you get something I find genuinely exciting. The agent writes MDL, validates it, builds the app, boots it with a database, seeds it with data, clicks through the UI, and reads back the results. When the Playwright check fails, the agent sees the failure and the screenshot, and tries again. That's a closed feedback loop, and a closed feedback loop is the difference between an agent that generates plausible-looking output and an agent that ships something verified.
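
The "tries again" part is just a bounded retry around the check. A minimal sketch, assuming the wrapped command exits non-zero on failure. The helper retry_until_pass is hypothetical, and in a real session the agent's revision of the model happens between attempts, which a shell function can only mark with a comment.

```shell
# Run a command up to MAX times, stopping at the first success.
# Usage: retry_until_pass MAX CMD [ARGS...]
retry_until_pass() {
  max_attempts="$1"
  shift
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    if "$@"; then
      echo "passed on attempt $attempt"
      return 0
    fi
    echo "attempt $attempt failed" >&2
    # This is where the agent would read the failure and screenshot
    # and change the model before the next attempt.
    attempt=$((attempt + 1))
  done
  return 1
}
```

For example, retry_until_pass 3 mxcli playwright verify tests/ -p app.mpr caps the loop at three attempts, so a stuck agent stops instead of spinning forever.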

And the whole loop runs inside a box that can't touch your keys, your credentials, or the rest of your machine. That combination, autonomy because of isolation rather than despite it, is what agentic engineering should look like in my view. Not an agent loose on your laptop with your production credentials in reach, but an agent with real power inside walls you chose.

One practical note for organizations that can't use Docker Desktop because of licensing: Mx CLI supports Podman as a drop-in alternative. Run mxcli init --container-runtime podman and the generated configuration uses Podman-in-Podman instead.

That wraps up the setup part of this series. Everything from here on is about what you actually build with it.
