This is a submission for the Notion MCP Challenge
What I Built
OpenClaw just got an Amazon Lightsail blueprint. No more Mac Minis. No more Raspberry Pis sitting on your desk acting as your AI agent server. Click deploy and you have an agent platform running in the cloud.
AWS samples also has an experimental (non-production) implementation that runs OpenClaw as per-user serverless containers on AgentCore Runtime. The serverless version is early, but the direction is clear.
That means OpenClaw can now run in different places. A Raspberry Pi on my desk. A Lightsail instance in the cloud. Serverless containers on AgentCore, or even an EC2 instance. Pick a flavor. (I didn't buy a Mac Mini.)
I run 18 agents on mine. These aren't toy demos. They solve problems I got tired of solving by hand.
After re:Invent last year, every expo vendor on the floor started emailing me. Booth scans, follow-ups, drip campaigns. Unsubscribing from each one is death by a thousand clicks. So I built an unsubscribe agent. I don't give it access to my personal mailbox. I forward vendor spam to OpenClaw's own email inbox. It parses the email, finds the unsubscribe link, clicks it, and confirms. I set up one mail rule and forgot about it. 47 vendor lists cleared in two weeks.
Then there's the train monitor. After peak hours, the next train home is an hour away. Miss it and you're standing on a cold platform for 60 minutes. The problem is trains don't always behave. Sometimes it arrives a minute early. Sometimes it switches platforms with no announcement. I was refreshing the train app constantly. The agent polls live train data and pushes me a notification when something changes. Platform switch, early arrival, cancellation. I get the update instead of checking.
OpenClaw even built me a full SaaS-like newsletter platform, "The Agentic Engineer". I wanted a weekly newsletter covering the agentic AI content I care about, plus a platform with subscriber management, double opt-in, click tracking, A/B subject lines, SEO-friendly archive pages, threaded comments, the works. Instead of stitching together Substack or Beehiiv or whatever, I handed the ask to OpenClaw and let it go. CDK stacks, Lambda functions, DynamoDB tables, SES integration, a CloudFront distribution: it scaffolded the entire thing. Then another agent writes and publishes the issues. The platform runs on autopilot. I haven't touched it in weeks. It has more features than most newsletter SaaS tools I've paid for, and it costs me about $2/month in AWS bills. A true taste of the SaaSpocalypse.
Now multiply that by 18 agents, all running on cron schedules, and you hit the real problem: migrating or cloning your agentic work at 10x scale.
The Agent Migration Problem
Managing 18 agents was already a mess. SSH into a server. No single view beyond the OpenClaw dashboard. No way to pause an agent without editing config files or messaging OpenClaw over Telegram. No full history of what ran, what failed, or how many tokens got burned.
But with three deployment targets, a new problem showed up: how do you move your agents between them along with their identity and history?
Each agent has a custom prompt, a personality file, tool configurations, cron schedules. My unsubscribe bot has mail parsing rules. My train monitor has API polling configs. 18 agents worth of state that lives in files on disk.
Migrating that from a Raspberry Pi to a Lightsail blueprint by hand? Copying config files, re-editing cron tabs, testing each agent one by one? I'd rather stand on that cold train platform for an hour.
I needed a control plane that was portable. Something that could snapshot my entire fleet, move it to a new instance, and bring everything back up. And I didn't want to run a database for it.
So I built AgentOps. And I built it on Notion.
AgentOps turns Notion into the control plane for an entire OpenClaw agent fleet. Four Notion databases form the backbone:
- Agent Registry. 18 OpenClaw agents, each a row. Name, type, status, schedule, config, last heartbeat. Change status to "paused" in Notion and the runtime stops dispatching to it.
- Task Queue. Every task with priority, status, assigned agent. Create a row in Notion, the OpenClaw runtime picks it up automatically.
- Run Log. Every execution recorded. Input, output, duration, tokens used, errors. 78 runs tracked so far.
- Alerts. Failures surface immediately. Acknowledge them with a checkbox click.
The key design decision: Notion IS the database. No Postgres. No MongoDB. Every read and write goes through the Notion API. You control your OpenClaw agents by editing Notion pages.
On top of that, AgentOps includes:
- Token analytics. Per-agent breakdown, daily trends, top consumers. 128K+ tokens tracked across all OpenClaw agent runs.
- Workspace sync. Push your OpenClaw agent configuration files (prompts, personality, tools) to Notion. Edit them there. Pull changes back to your OpenClaw instance.
- Agent tuning. Bidirectional prompt sync. Edit an OpenClaw agent's prompt in Notion, apply it live with one click.
- Full backup. Snapshot your entire OpenClaw agent fleet to a Notion page. Workspace files, prompts, cron definitions, agent registry. Restore anytime.
- Fleet cloning. Export your OpenClaw agent fleet as a portable JSON bundle. Import it on a fresh instance. Your entire AI operation, portable.
All of this data lives directly in your Notion workspace. Agent Registry, Task Queue, Run Log, Alerts, Backups, Agent Prompts. No external database. Open Notion and you see everything.
Three built-in agents ship with it (summarizer, code reviewer, sentiment analyzer) that work end-to-end through Notion without any external AI API keys. Create a task, watch it get dispatched, see results land in the Run Log.
Video Demo
The Code
awsdataarchitect / agentops
Notion-powered control plane for OpenClaw AI agents: monitor, dispatch, tune, backup, and clone your agent fleet
🤖 AgentOps: Notion-Powered Control Plane for OpenClaw Agent Fleets
Notion MCP Challenge Entry: Use Notion as the human-in-the-loop command center for managing OpenClaw AI agents.
AgentOps turns your Notion workspace into a fully functional agent operations control plane. Monitor your OpenClaw fleet, dispatch tasks, track token usage, tune agent prompts, and back up your entire configuration, all through Notion.
Humans stay in control. Every agent, task, and configuration lives in Notion. Edit a page to pause an agent. Change a priority by updating a select field. Notion is the database.
📸 Screenshots
Dashboard
Real-time overview of your OpenClaw agent fleet: 18 agents, success rate, token usage, pipeline health, and recent activity.
Agent Registry
All 18 OpenClaw agents with status, schedules, and one-click pause/resume. Filter by type: cron, monitor, heartbeat, subagent, or demo agents.
Task Queue
Priority-based task queue with status tracking. Create tasks manually or let the…
Stack: Node.js, Express, React 19, Vite, Tailwind CSS v4, @notionhq/client
Architecture:
┌─────────────────────────────────────────────┐
│                    Notion                   │
│   ┌──────────┐  ┌──────────┐  ┌──────────┐  │
│   │  Agent   │  │   Task   │  │   Run    │  │
│   │ Registry │  │  Queue   │  │   Log    │  │
│   └────┬─────┘  └────┬─────┘  └────┬─────┘  │
│        │             │             │        │
│   ┌────┴─────────────┴─────────────┴─────┐  │
│   │           Notion API (MCP)           │  │
│   └──────────────────┬───────────────────┘  │
└──────────────────────┼──────────────────────┘
                       │
           ┌───────────┴───────────┐
           │    AgentOps Server    │
           │  ┌─────────────────┐  │
           │  │  Agent Runtime  │  │
           │  │  (10s polling)  │  │
           │  └────────┬────────┘  │
           │  ┌────────┴────────┐  │
           │  │   Demo Agents   │  │
           │  │  • Summarizer   │  │
           │  │  • Code Review  │  │
           │  │  • Sentiment    │  │
           │  └─────────────────┘  │
           │  ┌─────────────────┐  │
           │  │ OpenClaw Fleet  │  │
           │  │ (14 cron jobs)  │  │
           │  └─────────────────┘  │
           └───────────┬───────────┘
                       │
           ┌───────────┴───────────┐
           │    React Dashboard    │
           │  • Fleet overview     │
           │  • Token analytics    │
           │  • Workspace sync     │
           │  • Agent tuning      │
           │  • Backup & clone     │
           └───────────────────────┘
How I Used Notion MCP
Notion MCP is the entire persistence and control layer for OpenClaw agents. There is no other database. Here's how each piece works.
Agent Registry (notion-create-pages, notion-update-page, notion-query-database-view)
Every OpenClaw agent is a Notion database row. The runtime queries for active agents before dispatching. Pause an agent by changing its status select property. The runtime reads it on the next 10-second poll and skips it. Resume by switching back to "active." Zero config files touched.
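In payload terms, pausing and resuming is just a select property flip. Here is a minimal sketch of the two payloads involved: the query the runtime could send to `notion.databases.query` to find active agents, and the `notion.pages.update` body that pauses one. The property name `Status` and the status values are assumptions based on the screenshots, not confirmed field names from the repo.

```javascript
// Query payload for notion.databases.query: fetch only agents whose
// "Status" select is "active" (property name is an assumption).
function activeAgentsQuery(databaseId) {
  return {
    database_id: databaseId,
    filter: {
      property: 'Status',
      select: { equals: 'active' },
    },
  };
}

// Payload for notion.pages.update: flip the same select to pause or
// resume an agent. The runtime reads the change on its next 10s poll.
function setAgentStatus(pageId, status) {
  return {
    page_id: pageId,
    properties: {
      Status: { select: { name: status } },
    },
  };
}
```

No local state is needed: the select field in Notion is the source of truth, and the poll loop simply re-reads it.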
Task Queue (notion-create-pages, notion-query-database-view)
Tasks are Notion rows with status, priority, and agent type. The runtime queries for pending tasks sorted by priority, matches them to active OpenClaw agents, updates status to "running," executes, then marks "completed" or "failed." You can create tasks directly in Notion and the system picks them up.
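The dispatch decision itself can be expressed as a small pure function over the two queries above. This is a sketch under stated assumptions: the article only says tasks are sorted by priority and matched to active agents by type, so the three-level priority scale and the field names here are illustrative, not the repo's actual schema.

```javascript
// Assumed priority scale; the real system may use different values.
const PRIORITY = { high: 0, medium: 1, low: 2 };

// One dispatch decision per poll: take the task rows and agent rows
// (already pulled from Notion) and return the next (task, agent)
// pairing, or null if nothing is dispatchable.
function nextDispatch(tasks, agents) {
  const pending = tasks
    .filter((t) => t.status === 'pending')
    .sort((a, b) => PRIORITY[a.priority] - PRIORITY[b.priority]);
  for (const task of pending) {
    const agent = agents.find(
      (a) => a.status === 'active' && a.type === task.agentType
    );
    if (agent) return { task, agent }; // caller then marks the task "running"
  }
  return null; // no pending task has a matching active agent this poll
}
```

Because a paused agent simply never matches, pausing in Notion starves its tasks without any extra coordination logic.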
Run Log (notion-create-pages)
Every OpenClaw agent execution writes a detailed record: input, output, duration in milliseconds, tokens consumed, error messages. This feeds the token analytics dashboard and provides full audit history.
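A run record maps naturally onto a `notion-create-pages` body. The sketch below builds one from the fields the article lists (input, output, duration, tokens, errors); the property names and types are my guesses at a plausible schema, not the repo's actual column names.

```javascript
// Build a Run Log row as a pages.create payload. Property names such as
// "Duration (ms)" are illustrative assumptions.
function runLogPage(databaseId, run) {
  return {
    parent: { database_id: databaseId },
    properties: {
      Name: { title: [{ text: { content: `${run.agent} @ ${run.startedAt}` } }] },
      Agent: { rich_text: [{ text: { content: run.agent } }] },
      'Duration (ms)': { number: run.durationMs },
      Tokens: { number: run.tokens },
      // A run with any error message is logged as failed.
      Status: { select: { name: run.error ? 'failed' : 'completed' } },
      Error: { rich_text: run.error ? [{ text: { content: run.error } }] : [] },
    },
  };
}
```

Because tokens land as a number property, the analytics dashboard can aggregate them with ordinary database queries instead of a separate metrics store.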
Alerts (notion-create-pages, notion-update-page)
When an OpenClaw agent fails, an alert row is created automatically. The "Acknowledged" checkbox lets operators dismiss alerts from Notion or the dashboard.
Workspace Sync (notion-create-pages, notion-update-page)
OpenClaw agent configuration files (personality, tools, prompts) are pushed to Notion as formatted pages. The markdown-to-blocks converter handles headings, paragraphs, lists, code blocks, and bold/italic annotations. Secrets are automatically redacted before sync.
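Two pure helpers in the spirit of that sync step, sketched under assumptions: the real converter and redaction rules live in the repo, so the secret-key patterns and the markdown subset here are illustrative only.

```javascript
// Redact obvious secret-looking values before anything leaves the box.
// Which key names count as secrets is an assumption.
function redactSecrets(text) {
  return text.replace(
    /^(\s*(?:\w*_?(?:KEY|TOKEN|SECRET|PASSWORD))\s*[=:]\s*).+$/gim,
    '$1[REDACTED]'
  );
}

// Convert a tiny subset of markdown (H1-H3 headings and paragraphs)
// into Notion block objects, the shape notion-create-pages accepts.
function markdownToBlocks(md) {
  return md
    .split('\n')
    .filter((line) => line.trim() !== '')
    .map((line) => {
      const h = line.match(/^(#{1,3})\s+(.*)$/);
      if (h) {
        const type = `heading_${h[1].length}`; // heading_1..heading_3
        return { type, [type]: { rich_text: [{ text: { content: h[2] } }] } };
      }
      return {
        type: 'paragraph',
        paragraph: { rich_text: [{ text: { content: line } }] },
      };
    });
}
```

Redacting before conversion, rather than after, means the secret never exists in any Notion-bound data structure at all.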
Agent Tuning (notion-create-database, notion-create-pages, notion-fetch)
A dedicated "Agent Prompts" database stores each OpenClaw agent's prompt. Edit in Notion's rich editor, pull changes back to disk, and apply live to the running OpenClaw instance. Bidirectional sync with diff detection.
Backup (notion-create-pages, notion-fetch)
Full OpenClaw fleet snapshots stored as Notion pages with toggle blocks containing workspace files, prompts, cron definitions, and agent registry data. Restore writes files back to disk from Notion content. Export as JSON for cloning to a fresh OpenClaw instance.
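The portable bundle can be as simple as one versioned JSON object. The field names below are illustrative, not the repo's actual bundle format; the point is that a version check on import is what makes the bundle safe to carry between OpenClaw instances.

```javascript
// Export the fleet as a portable JSON bundle (field names assumed).
function exportFleet(agents, prompts, cronDefs) {
  return JSON.stringify(
    {
      version: 1,
      exportedAt: new Date().toISOString(),
      agents,  // registry rows: name, type, schedule, config
      prompts, // { agentName: promptText }
      cron: cronDefs,
    },
    null,
    2
  );
}

// Import on a fresh instance: validate the version before trusting
// the contents. The caller writes files back to disk and re-registers
// each agent.
function importFleet(bundleJson) {
  const bundle = JSON.parse(bundleJson);
  if (bundle.version !== 1) throw new Error('unsupported bundle version');
  return bundle;
}
```

Because the same object round-trips through Notion toggle blocks or a plain file, the Pi-to-Lightsail move described later needs no shared storage beyond the bundle itself.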
Why This Matters
The human-in-the-loop problem for AI agents is real. Most agent systems are black boxes. You deploy them and hope. Notion MCP turns Notion into a transparent control surface where non-technical operators can monitor, pause, configure, and audit OpenClaw agents using an interface they already know. No SSH. No config files. No dashboards that only engineers can read.
But the portability angle is what I didn't expect to matter this much.
OpenClaw is spreading. Lightsail blueprints. AgentCore serverless containers. Raspberry Pis. People are running their claws on different platforms, and they will keep moving between them as the options get better. The agents, prompts, schedules, and configs need to travel with them.
AgentOps makes Notion the portable layer. Backup your Pi claw to Notion. Spin up a Lightsail blueprint. Import. Done. All 18 agents, their prompts, schedules, and configs. Moved in minutes, not hours.
18 agents. All runs logged. All tokens tracked. Four Notion databases. Zero external databases. Three deployment platforms. One control plane.
Your Notion workspace becomes the operating system for your claw. 🦞
Top comments (18)
Wow
The prompt drift issue Hamza raised is real and it gets worse as you scale. I run a multi-agent setup on OpenClaw (3 agents with distinct roles) and the drift problem showed up at week 2 when agents started subtly shifting their own behavior in ways that weren't caught until something broke in production.
What actually solved it: identity-first agent design. Each agent has a SOUL.md file that defines not just their role, but their decision-making framework, values, voice, and explicit list of what they don't do. The SOUL file is loaded at the top of every session before any task context.
The key insight: prompt drift isn't random; agents drift toward whichever behavior got reinforced most recently. SOUL.md is the stabilizer that pulls them back to identity on every run. When you have 18 agents, you need 18 SOULs, not 18 system prompts.
The other thing worth adding to your Notion schema: a "prohibited actions" field per agent. Not just what they can do, but explicit tombstones for behaviors you've permanently banned. I call this the DECISION_LOG pattern. Without it, agents in long-running cron setups will eventually re-discover and re-implement deleted functionality, sometimes 3-4 times. The tombstone prevents the loop.
Cool build. The human-readable audit trail alone justifies the Notion approach.
Managing 18 agents is exactly where sub-agent costs start to bite. We found that each sub-agent spawn was costing 5-7x what a single-agent call costs because of duplicated system prompts and tool descriptions.
The fix that made the biggest difference: running sub-agents on Sonnet instead of Opus. One config line cut our sub-agent spend by 60%, with negligible quality drop for retrieval tasks.
Wrote about the full cost breakdown here: dev.to/helen_mireille_47b02db70c/y...
Using Notion as the actual database rather than just a UI layer is a bold architectural choice, and for this use case it makes sense. The portability story -- snapshot to Notion, restore on a new instance -- is genuinely useful when you're dealing with agents scattered across Lightsail, Raspberry Pis, and serverless containers. Most agent orchestration tools assume a single deployment target and fall apart the moment you need to migrate.
Interesting build, but shared substrate = implicit coordination. Any time multiple agents read/write the same Notion database, you get race conditions, stale reads, implicit signaling, cross-agent inference, and unintentional task propagation. That's coordination, whether you call it that or not. Notion wasn't designed with concurrent agent writes in mind, and a 10-second polling loop doesn't resolve the consistency problem; it just makes the window smaller. The "humans stay in control" framing also assumes the human sees a consistent state. Do they? So, what really happened operationally?
Valid concerns for a distributed agent mesh, but this is a cron scheduler with one poller and temporally isolated workloads so there's nothing to race against.
Appreciate the clarification, but it creates a problem. The article frames this as a multi-agent control plane with automatic dispatch, task queues, cross-agent orchestration, and fleet coordination. "Cron scheduler with temporally isolated workloads" describes a fundamentally different system. If temporal isolation is the actual design constraint, that's load-bearing architectural information that belongs in the article, not in a reply when the substrate is challenged. The original framing and the defense cannot both be accurate.
Appreciate the review. A control tower doesn't stop being a control tower because planes land one at a time. The core focus here is portable agent migration along with identity, config, backup, and sync across OpenClaw instances. And it extends naturally to managing multiple fleets from separate Notion pages, same way you'd manage multiple Kubernetes clusters.
Worth noting the progression here: the original article framed this as a multi-agent control plane with automatic dispatch, cross-agent orchestration, and fleet coordination. When the substrate was challenged, it became "just a cron scheduler with temporally isolated workloads." Now it's a control tower; the maximal framing is back, defended by analogy rather than architecture, with a quietly shifted primary purpose of portable migration, not orchestration.
Three system definitions across three replies, each optimized for the challenge in front of it rather than consistent with the others. That's not clarification. That's retroactive scope management.
This pattern of overclaim, retreat, analogical reframe, and purpose shift isn't unique to this thread. It's the same epistemic drift that derails AI safety debates, agentic governance discussions, platform accountability arguments, and legal-tech risk modeling. The system definition moves to protect the person. Not to illuminate the system.
That matters beyond this article. Governance frameworks that rely on self-reporting are structurally insufficient when the definition of the system shifts under pressure. Regulatory filings, safety disclosures, and liability arguments all depend on definitional consistency. This thread is a small example of why that consistency has to be enforced externally.
Oh dear, don't get personal. It's a hackathon project, not some regulatory filing that needs to be challenged by a regulator.
I didn't get personal. I got accurate. Those aren't the same thing.
Accurate would be building something in public and showing how you'd solve it differently. This is just commentary.
I did show how to solve it differently: by naming the substrate-layer failure, the coordination gap, and the governance requirements. That's analysis, not commentary, and analysis is how systems get built correctly before anyone writes a line of code.
Three replies deep and you've pivoted from distributed systems critique to AI governance theory on a weekend hackathon thread. That's not analysis β that's a language model running out of domain-specific things to say. Good luck with the next prompt.
You're reading intent where there is none. I named the architectural inconsistencies because that's the work I do. If you prefer to treat this as a weekend project, that's fine, but shifting definitions under challenge is still a pattern worth noting. I'll leave it there.
Five replies, zero PRs, zero architecture diagrams, zero alternatives. You critiqued the vocabulary, not the engineering. Meanwhile my system runs 18 agents, ships backups to Notion, and migrates across instances, none of which your 'analysis' or 'architecture inconsistencies' address. Build something real or move on to troll someone else now.
Loooool, 18 agents, scheduled tasks, a queue, a dashboard, just for checking the status of the train, aggregating content for agents' implementations, or finding other slop content 🤣🤣 isn't it nonsense?
No, there are several others managing a dev platform and other research work. Train status is just for starters.