How We Actually Make Architecture Calls in OpenClaw

Updated Mar 28, 2026

Why architecture decisions feel way harder than they look

A couple of years ago, I spent three evenings trying to add what looked like a “simple” feature to OpenClaw: a per-project rate limiter for webhooks. Three evenings. Nothing I tried slotted in cleanly. Every change touched five files. Every “fix” broke some integration we’d half-forgotten about from 2021.

That was the moment I realized our problem wasn’t missing tests or lazy refactors. It was that we hadn’t been deliberate enough about architecture decisions. We had code that worked, but it didn’t like being changed.

If you contribute to OpenClaw (or you’re thinking about it), I want to walk you through how we actually make architecture calls now. Not the idealized “we drew a box diagram” version. The real stuff: the tradeoffs, the “we’ll regret this but it’s worth it,” and the places where we intentionally keep things boring.

The real job of architecture: make change cheap

I don’t care how pretty the diagrams are. If making a change costs you an entire weekend and a migraine, your architecture is lying to you.

In OpenClaw, we’ve started using a very simple test for architecture decisions:

Does this make the next change cheaper?
Does this make debugging faster than it is today?

That’s it. Not “is this pattern correct” or “will this scale to 10 million users.” We don’t have 10 million users. We do have maintainers who burn out when adding a small feature means reading 2,000 lines of unrelated code.

Example: in August 2024 we introduced the ExecutionPlan abstraction in the job runner. Before that, adding a new execution step meant:

Editing the core scheduler (bad idea #1)
Touching 3 different enums (bad idea #2)
Updating two places that built SQL queries by hand (bad idea #3)

We bit the bullet and created ExecutionPlan as a separate module. Yes, it was a big diff (around 1,200 lines changed). Yes, it broke a bunch of internal scripts. But now it’s one file to understand, one place to plug in a new step, and the scheduler doesn’t know or care about the details.
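To make that shape concrete, here’s a minimal sketch of the idea, not the actual OpenClaw code. The method names (add_step, run, submit, run_pending) and the dict-based job context are assumptions made up for illustration. The point it shows: steps get registered on the plan, and the scheduler only ever calls run.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Assumed step shape: a function that takes the job context and returns a new one.
Context = Dict[str, object]
Step = Callable[[Context], Context]


@dataclass
class ExecutionPlan:
    """Ordered, named steps. The one place that knows what a job actually does."""

    steps: List[Tuple[str, Step]] = field(default_factory=list)

    def add_step(self, name: str, step: Step) -> "ExecutionPlan":
        self.steps.append((name, step))
        return self  # returning self allows chained add_step calls

    def run(self, context: Context) -> Context:
        for _name, step in self.steps:
            context = step(context)
        return context


class Scheduler:
    """Decides when plans run. It never inspects the steps inside a plan."""

    def __init__(self) -> None:
        self._pending: List[Tuple[ExecutionPlan, Context]] = []

    def submit(self, plan: ExecutionPlan, context: Context) -> None:
        self._pending.append((plan, context))

    def run_pending(self) -> List[Context]:
        results = [plan.run(context) for plan, context in self._pending]
        self._pending.clear()
        return results


# Adding a new step touches only the plan definition, not the scheduler.
plan = (
    ExecutionPlan()
    .add_step("fetch", lambda ctx: {**ctx, "rows": [1, 2, 3]})
    .add_step("count", lambda ctx: {**ctx, "count": len(ctx["rows"])})
)
scheduler = Scheduler()
scheduler.submit(plan, {"job": "demo"})
print(scheduler.run_pending())
```

In this sketch, a third step is one more add_step call. Nothing in Scheduler changes, which is the property we were paying for.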

Was it “perfect architecture”? Definitely not. We’ve already reworked parts of it twice. But every change since then has been smaller, safer, and easier to review. That’s the only score I’m really looking at.

How we decide where to put a feature (and when to say no)

Architecture in open source is especially weird because you’re not just fighting complexity, you’re also fighting expectations. Feature requests come in with strong opinions about where things “should” live.

So in OpenClaw, we lean on three simple questions when deciding where a new feature belongs:

What actually owns the data? (code should live close to its data)
Who debugs this when it breaks? (keep that person’s mental model in mind)
Can someone delete this in a year without reading the whole repo?

Let me show you a concrete example.

In March 2025, someone opened an issue asking for “inline Lua scripting inside pipeline definitions.” Very cool idea. Very dangerous for the codebase. There were three obvious ways to do it:

Inline scripts inside the YAML parser (tempting, but cursed)
Add a scripting layer inside the core engine (very cursed)
Treat scripts as plugins with a tight interface (more boring, more work)

If we’d glued it into the YAML parser, it would have shipped faster, but then every little change to the configuration syntax would risk breaking users’ scripts in weird ways. That’s a time bomb for future maintainers.

We went with the plugin-style interface layered on top of the pipeline execution engine. That meant:

A small Lua runtime module, with zero knowledge of YAML
A clear boundary: Config → Engine → ScriptAdapter
Two config flags to turn the feature off in deployments that don’t want it

It took longer. The initial PR was #1897, merged on April 9, 2025, after four rounds of review. But now if someone wants to add “inline JS” or “inline WASM,” there’s a very obvious place to hook in. We paid for a clean seam once; we get to reuse it over and over.
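Here’s a rough sketch of what a seam like that can look like. It is not the code from the PR. The flag names (enabled, allow_inline), the run signature, and the TemplateAdapter stand-in (used here in place of a real Lua runtime so the example runs with the standard library only) are all assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Protocol


class ScriptAdapter(Protocol):
    """Assumed interface: source text plus plain inputs in, a string result out."""

    def run(self, source: str, inputs: Dict[str, str]) -> str: ...


class TemplateAdapter:
    """Stand-in for a real script runtime. It knows nothing about YAML."""

    def run(self, source: str, inputs: Dict[str, str]) -> str:
        return source.format(**inputs)


@dataclass(frozen=True)
class ScriptingConfig:
    # Hypothetical flag names; both must be on for inline scripts to run.
    enabled: bool = False
    allow_inline: bool = False


class Engine:
    """Sits between parsed config and adapters: Config -> Engine -> ScriptAdapter."""

    def __init__(self, config: ScriptingConfig, adapters: Dict[str, ScriptAdapter]) -> None:
        self._config = config
        self._adapters = adapters

    def run_script(self, language: str, source: str, inputs: Dict[str, str]) -> str:
        if not (self._config.enabled and self._config.allow_inline):
            raise PermissionError("inline scripting is disabled in this deployment")
        adapter = self._adapters.get(language)
        if adapter is None:
            raise ValueError(f"no script adapter registered for {language!r}")
        return adapter.run(source, inputs)


engine = Engine(
    ScriptingConfig(enabled=True, allow_inline=True),
    {"template": TemplateAdapter()},
)
print(engine.run_script("template", "hello {name}", {"name": "claw"}))
```

Supporting another language in this sketch means registering one more adapter under a new key. The config parser and the engine’s control flow stay untouched.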

Saying no is also an architecture decision. We’ve closed issues with “this belongs in a sidecar service, not in OpenClaw” more than once. That’s not us being stubborn; that’s protecting the core so it stays understandable.

Patterns we use on purpose (and the ones we avoid)

There’s a museum of patterns you can drag into a project. Most of them don’t belong in OpenClaw.

The patterns we actually lean on, repeatedly:

Ports and adapters for integrations and IO
Event-driven internals where we know we’ll add more listeners later
“Configuration at the edges” with typed code in the middle
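“Configuration at the edges” is the least self-explanatory of the three, so here’s a small sketch of what I mean. The RetryPolicy type, its field names, and the defaults are hypothetical, not OpenClaw’s real config. The idea is that untyped config gets validated once at the boundary, and everything past that point works with a typed value.

```python
from dataclasses import dataclass
from typing import Any, List, Mapping


@dataclass(frozen=True)
class RetryPolicy:
    max_attempts: int
    backoff_seconds: float


def parse_retry_policy(raw: Mapping[str, Any]) -> RetryPolicy:
    """Edge: turn untyped config (YAML, env vars, JSON) into a validated value."""
    max_attempts = int(raw.get("max_attempts", 3))
    backoff_seconds = float(raw.get("backoff_seconds", 1.0))
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    if backoff_seconds < 0:
        raise ValueError("backoff_seconds must not be negative")
    return RetryPolicy(max_attempts, backoff_seconds)


def retry_delays(policy: RetryPolicy) -> List[float]:
    """Middle: typed code with no string keys. The delay doubles before each retry."""
    return [policy.backoff_seconds * (2 ** i) for i in range(policy.max_attempts - 1)]


policy = parse_retry_policy({"max_attempts": "4", "backoff_seconds": 0.5})
print(retry_delays(policy))
```

A typo in a config key or a nonsense value fails in parse_retry_policy, at startup, instead of somewhere deep in the retry loop.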

The ports-and-adapters pattern shows up everywhere in our codebase now:

Storage: StoragePort with Postgres, S3, and filesystem adapters
Messaging: QueuePort with Redis and NATS adapters
Auth: AuthPort with OIDC and static token adapters

The benefit is simple: when someone came along in late 2025 and wanted MinIO support, it was a ~180-line adapter instead of “rewrite every call site that touches storage.” That’s the kind of tradeoff I’ll happily make.
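As a sketch of why a new backend stays small: the port below is a guess at the general shape, with put and get as assumed method names, and an in-memory adapter standing in for a networked backend so the example runs anywhere. The real StoragePort may look different.

```python
import tempfile
from pathlib import Path
from typing import Dict, Protocol


class StoragePort(Protocol):
    """Assumed port: the only storage surface that call sites are allowed to see."""

    def put(self, key: str, data: bytes) -> None: ...

    def get(self, key: str) -> bytes: ...


class FilesystemStorage:
    def __init__(self, root: Path) -> None:
        self._root = root

    def put(self, key: str, data: bytes) -> None:
        path = self._root / key
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)

    def get(self, key: str) -> bytes:
        return (self._root / key).read_bytes()


class InMemoryStorage:
    """A second adapter. Adding it required no changes to any call site."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._blobs[key] = data

    def get(self, key: str) -> bytes:
        return self._blobs[key]


def archive_log(storage: StoragePort, job_id: str, log: str) -> str:
    """Call site: depends on the port, never on a concrete backend."""
    key = f"logs/{job_id}.txt"
    storage.put(key, log.encode("utf-8"))
    return key


with tempfile.TemporaryDirectory() as tmp:
    fs = FilesystemStorage(Path(tmp))
    print(fs.get(archive_log(fs, "job-1", "done")))

mem = InMemoryStorage()
print(mem.get(archive_log(mem, "job-2", "done")))
```

archive_log is identical for both backends. A new adapter is a new class plus wiring, and that is the whole change.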

Things we mostly avoid:

Deep inheritance hierarchies. We favor data + functions.
Overly generic abstractions. If the generic name is harder to understand than “PostgresStorage,” we made it too clever.
Global singletons. Configuration and services are passed explicitly most of the time. It’s slightly annoying, but debugging is way easier.
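Here’s the explicit-passing style in miniature. Settings, LeaseManager, and the lease_seconds field are hypothetical names for illustration only. The thing to notice is that both the configuration and the clock arrive through the constructor, so nothing is read from module-level state.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Settings:
    lease_seconds: float


class LeaseManager:
    """Receives everything it needs up front; reads no globals."""

    def __init__(self, settings: Settings, now: Callable[[], float]) -> None:
        self._settings = settings
        self._now = now

    def new_lease_expiry(self) -> float:
        return self._now() + self._settings.lease_seconds


# Production wiring passes a real clock.
manager = LeaseManager(Settings(lease_seconds=30.0), time.monotonic)
print(manager.new_lease_expiry())

# A test or a debugging session passes a fixed clock instead.
frozen = LeaseManager(Settings(lease_seconds=30.0), lambda: 100.0)
print(frozen.new_lease_expiry())
```

The extra constructor arguments are the “slightly annoying” part. The payoff is that you can see every input to a piece of code by reading its signature.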

A specific example of “too clever” that we ripped out: the old “UniversalBackendManager” from early 2023. It wrapped:

Storage
Queue
Auth
Caching

All behind one mega-interface. It looked nice at first. Then we tried to change just the cache implementation. That required editing the manager, the DI wiring, and half the tests. In mid-2024 we deleted it and replaced it with four small interfaces. More boilerplate, better life.
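A sketch of why the small interfaces are easier to live with, using the cache as the example. CachePort’s method names, the two implementations, and cached_lookup are all invented for illustration. Code that needs caching depends on the cache interface alone, so swapping the implementation touches one argument at the wiring site.

```python
from typing import Callable, Dict, Optional, Protocol


class CachePort(Protocol):
    """Assumed narrow interface: just what a caller needs from a cache."""

    def get(self, key: str) -> Optional[str]: ...

    def set(self, key: str, value: str) -> None: ...


class DictCache:
    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)

    def set(self, key: str, value: str) -> None:
        self._data[key] = value


class NullCache:
    """Caching turned off: never stores anything, always misses."""

    def get(self, key: str) -> Optional[str]:
        return None

    def set(self, key: str, value: str) -> None:
        pass


def cached_lookup(cache: CachePort, key: str, compute: Callable[[str], str]) -> str:
    """Depends on CachePort only. Storage, queue, and auth are not in scope here."""
    hit = cache.get(key)
    if hit is not None:
        return hit
    value = compute(key)
    cache.set(key, value)
    return value


calls = []


def slow_upper(key: str) -> str:
    calls.append(key)  # record each real computation
    return key.upper()


cache = DictCache()
cached_lookup(cache, "abc", slow_upper)
cached_lookup(cache, "abc", slow_upper)
print(len(calls))  # the second lookup was served from the cache
```

Replacing DictCache() with NullCache() is the entire diff for changing the cache implementation in this sketch. That’s the change that used to drag the manager, the DI wiring, and half the tests along with it.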

How we record decisions (without drowning in process)

Architecture docs can rot faster than the code they describe, so we keep ours intentionally lightweight. You’ll see three main things in the OpenClaw repo:

ADRs (Architecture Decision Records) in docs/adr/
“Why” comments above weird-looking code paths
PR descriptions that talk about tradeoffs, not just “what changed”

Our ADRs are short. The one for the new scheduler (ADR-0007, dated 2024-11-02) is basically:

Context: old scheduler too tied to HTTP layer
Decision: pull scheduling into a separate service module
Alternatives: keep as-is, or move to an external queue entirely
Consequences: slightly more config, but better isolation and easier scaling

That’s about a page of text. You can read it in under two minutes. But when a new contributor jumps in and asks “why is the scheduler its own thing?”, we have an answer that isn’t just “because Kai felt like it.”

Same for PRs: if you’re changing something architectural, we really want to see:

What you considered but didn’t do
What will be easier or harder after this lands
Any “weird” choices you made on purpose

You don’t need a 10-page design doc. A few honest paragraphs are enough.

FAQ

Q: I’m a new contributor. How do I avoid making a “bad” architecture change?

Start small and start loud. Open a draft PR or a GitHub Discussion with your idea before you touch too many files. Show a tiny prototype, explain the tradeoffs you see, and ask “where would this belong?” You’ll get feedback faster, and you won’t spend a weekend building something we’re going to suggest moving anyway.

Q: Is it okay to add a new dependency or service to support a feature?

Yes, but we’re picky. New dependencies should either make things dramatically simpler or implement something we absolutely shouldn’t roll ourselves (crypto, serious parsing, etc.). New services are fine if they have a clear API boundary and don’t secretly depend on reaching into OpenClaw’s internals. If it feels like “just one more helper,” it probably belongs inside the existing modules instead.

Q: Do I need to write an ADR for every non-trivial change?

No. ADRs are for decisions that change how people think about the code: new modules, new cross-cutting patterns, deprecating old approaches. If you’re rearranging internal code but the mental model stays the same, a solid PR description is enough. When in doubt, ask in the #dev-architecture channel and we’ll tell you if it deserves an ADR.

🕒 Published: March 28, 2026

Written by Jake Chen

Developer advocate for the OpenClaw ecosystem. Writes tutorials, maintains SDKs, and helps developers ship AI agents faster.
