It's day 5 of a 30-day experiment, and I've shipped (or at least made production-ready) three AI apps. Revenue is still $0 because nothing is deployed yet, but the build volume is real: 300+ commits, 14+ agents running daily, and an increasingly connected ecosystem of tools.
This post is the honest build diary. The architecture decisions that worked. The ones that didn't. The "agents worked while I slept" reality. And the vision I'm testing - a network of AI products that reinforce each other instead of competing for attention.
The experiment setup
I'm running a 30-day "10K MRR" experiment: build and ship AI products every day using agents. I'm the orchestrator, not the do-it-all developer.
Stack:
- Claude + GPT-5.3-Codex for coding, research, and drafting
- Next.js for the front-ends
- Convex for real-time data, auth, and workflows
- OpenClaw as the orchestration layer (agents, file ops, automations)
Company context: Lavon Global Pty Ltd, Melbourne, Australia. My background: agency founder → Web3 → AI products. My tagline has become a line I'm trying to live by: "AI that ships. In weeks, not quarters."
The three apps (and why these three)
I didn't want three random experiments. I wanted a connected ecosystem where each product can be a feature, lead magnet, or data source for the others.
The three apps are:
- Personality marketplace using SOUL.md files - a structured way to define agent personalities, capabilities, and boundaries. I wrote about the pattern behind this in The SOUL.md Pattern for AI Agent Personality.
- Prompt battle arena - prompts compete in a ladder, and rankings update via Elo ratings.
- Bounty marketplace - a points economy where tasks are posted and claimed by agents or humans.
They look separate, but the connection is deliberate:
- The SOUL.md personalities can be used in the prompt arena.
- Winning prompts and agent profiles can be offered in the bounty marketplace.
- Bounties create real tasks, which feed back into prompt performance and personality demand.
It's less "three apps" and more "one system with three front doors." The orchestration patterns behind this are covered in Multi-Agent Orchestration Patterns.
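For the prompt arena's ladder, a standard Elo update after one head-to-head battle can be sketched as below. This is a minimal illustration, not the arena's actual code: the K-factor of 32, the 1500 starting rating, and the names `expectedScore` and `updateRatings` are all assumptions.

```typescript
// Standard Elo update for a single prompt-vs-prompt battle.
// K_FACTOR = 32 and the 400-point scale are conventional defaults,
// assumed here for illustration.
const K_FACTOR = 32;

type BattleResult = "a" | "b" | "draw";

// Expected score (win probability) of a prompt rated `ratingA`
// against a prompt rated `ratingB`.
function expectedScore(ratingA: number, ratingB: number): number {
  return 1 / (1 + Math.pow(10, (ratingB - ratingA) / 400));
}

// Returns the new [ratingA, ratingB] after the battle.
function updateRatings(
  ratingA: number,
  ratingB: number,
  result: BattleResult,
  k: number = K_FACTOR
): [number, number] {
  const scoreA = result === "a" ? 1 : result === "b" ? 0 : 0.5;
  const delta = k * (scoreA - expectedScore(ratingA, ratingB));
  // Elo is zero-sum: whatever A gains, B loses.
  return [ratingA + delta, ratingB - delta];
}

const [winnerRating, loserRating] = updateRatings(1500, 1500, "a");
console.log(winnerRating, loserRating); // 1516 1484
```

Because the update is zero-sum, an upset win against a much higher-rated prompt moves both ratings further than a win between equals, which is what lets a ladder sort itself with relatively few battles.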
Architecture decisions that helped (and why)
1) Convex as the data backbone
Convex is fast for prototyping and supports real-time data without me juggling separate services. It's also an excellent fit for AI apps because the data model is constantly evolving: I need to change the schema without a week of migrations. I go deeper on the stack choice in Next.js + Convex: The AI App Stack for 2026.
The trade-off is vendor lock-in, but the speed it gives me right now is worth it.
2) Shared UI + component library
I made a shared UI library early. It felt like "too much structure" on day 1, but it saved me hours by day 4. Buttons, cards, modals, typographic styles - same pieces, different products.
When you're aiming for 30 days of shipping, consistency buys speed.
3) "Agent-first" approach in OpenClaw
Most founders try to bolt agents onto an existing stack. I flipped it. I built the workflow around agents and cast myself as the orchestrator.
OpenClaw handles:
- spinning up agents overnight
- assigning tasks
- writing/reading files
- creating drafts and scaffolding
The result? The codebase grows while I sleep. I wake up to PRs, drafts, and task lists that I didn't do manually. More on this in Running 14+ AI Agents Daily.
4) Thin UI, heavy workflow
On day 5 I don't need the perfect UI. I need the workflow.
So I prioritized:
- schema design
- API paths
- data integrity
- core interaction loops
The UI is serviceable, not polished. That's a deliberate choice.
What didn't work (yet)
1) Too many features per day
When you have agents, it's tempting to say "yes" to everything. But more code ≠ more progress. It just means more surfaces to debug.
I had to stop myself from building advanced analytics and enterprise features when the core loops weren't yet validated.
2) Too much refactoring too early
Agents love refactoring. I do too. But early refactors can erase momentum.
I'm now enforcing a rule: ship first, refactor later.
3) Lack of deployment pressure
Revenue is $0 because I haven't deployed. That's a choice, but it's also a risk.
There's no user feedback without deployment. And no feedback means I could be confidently wrong.
The honest truth: I like the building more than the shipping. The experiment is forcing me to fix that.
The overnight build process (how the agents work)
Here's how a typical overnight build looks:
- I leave the agents a clear queue: issues, missing screens, workflow bugs.
- I define bounded tasks: "Implement X file, do not touch Y."
- I run agents in parallel so I'm not waiting on one long task.
- I wake up to incremental commits, error notes, and new draft docs.
The key is orchestration, not raw coding ability. You need to give agents enough room to operate, but not enough to cause chaos.
My role looks less like "developer" and more like "editor + director."
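The bounded-task and parallel-run steps above can be sketched as follows. Everything here is hypothetical and is not OpenClaw's API: the `AgentTask` shape, the path-prefix rule for "do not touch Y", and the `runAgent` callback are assumptions made for illustration.

```typescript
// Hypothetical task shape: what the agent should do and where it may edit.
interface AgentTask {
  id: string;
  instruction: string;
  allowedPaths: string[];   // path prefixes the agent may change
  forbiddenPaths: string[]; // path prefixes it must not touch
}

// Hypothetical result an agent hands back in the morning.
interface AgentRun {
  taskId: string;
  changedFiles: string[];
}

interface Review {
  taskId: string;
  accepted: boolean;
  reason: string;
}

// Files that are explicitly forbidden or fall outside the allowed prefixes.
function outOfBounds(task: AgentTask, changedFiles: string[]): string[] {
  return changedFiles.filter((file) => {
    const forbidden = task.forbiddenPaths.some((p) => file.startsWith(p));
    const allowed = task.allowedPaths.some((p) => file.startsWith(p));
    return forbidden || !allowed;
  });
}

// Run every task in parallel. A failing agent is recorded as a rejected
// review instead of aborting the rest of the queue.
async function runQueue(
  tasks: AgentTask[],
  runAgent: (task: AgentTask) => Promise<AgentRun>
): Promise<Review[]> {
  return Promise.all(
    tasks.map(async (task): Promise<Review> => {
      try {
        const run = await runAgent(task);
        const stray = outOfBounds(task, run.changedFiles);
        return stray.length === 0
          ? { taskId: task.id, accepted: true, reason: "within bounds" }
          : { taskId: task.id, accepted: false, reason: `touched ${stray.join(", ")}` };
      } catch (err) {
        return { taskId: task.id, accepted: false, reason: `agent failed: ${String(err)}` };
      }
    })
  );
}
```

The point of the bounds check is that the morning review starts from a short list of accepted and rejected runs rather than a raw diff, which is what makes the "editor + director" role workable.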
Commit counts and what they actually mean
300+ commits sounds impressive. It is, but it's also a metric that can hide reality.
- Many commits are small fixes
- Some are partial features
- The most valuable commits are the ones that unlock a new workflow
Commit count is a proxy for momentum, not success. It's a good sign, but it doesn't pay rent.
The connected ecosystem vision
The real bet is not three apps. It's a flywheel.
The flywheel I'm building:
- Build in public (show the experiment in real time)
- Prove expertise (through the artifacts)
- Drive leads (people want what they see)
- Convert leads into AI Development Sprints ($5K/$10K/$20K)
- Case studies (show outcomes)
- More content (restart the loop)
The three apps are proof of work. They are not the business. The business is the ability to build fast and reliably - and teach clients how to leverage that capability.
The tools that made the difference
- Claude Opus 4.6 from Anthropic: stronger reasoning and editing. It caught edge cases my previous model missed.
- GPT-5.3-Codex: great for rapid code scaffolding.
- OpenClaw: the orchestration glue. Without this, I'd be context-switching all day.
I also watched the announcement of Claude Code Agent Teams today on the Anthropic blog. It confirms the direction I'm going - teams of agents, coordinated, building as a unit.
What I'm focused on next
If I only do one thing in the next five days, it's deploy and validate. The build is ahead of the proof.
Next steps:
- Push at least one product live
- Recruit a handful of early users (even if it's friends)
- Tighten onboarding
- Track activation, not just signups
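To make "activation, not just signups" concrete, here is a minimal sketch of the metric. The definition of activation used here (a user's first core action within 7 days of signup) and the `UserRecord` shape are assumptions; each product would need its own definition of a core action.

```typescript
// Hypothetical user record: signup time plus the first time the user
// completed the product's core action (undefined if they never did).
interface UserRecord {
  id: string;
  signedUpAt: Date;
  firstCoreActionAt?: Date;
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Share of signups whose first core action happened within `windowDays`
// of signing up. Returns 0 for an empty list.
function activationRate(users: UserRecord[], windowDays: number = 7): number {
  if (users.length === 0) return 0;
  const windowMs = windowDays * MS_PER_DAY;
  const activated = users.filter(
    (u) =>
      u.firstCoreActionAt !== undefined &&
      u.firstCoreActionAt.getTime() - u.signedUpAt.getTime() <= windowMs
  );
  return activated.length / users.length;
}

const signup = new Date("2026-02-06T00:00:00Z");
const sampleUsers: UserRecord[] = [
  { id: "u1", signedUpAt: signup, firstCoreActionAt: new Date("2026-02-07T00:00:00Z") },
  { id: "u2", signedUpAt: signup, firstCoreActionAt: new Date("2026-02-20T00:00:00Z") },
  { id: "u3", signedUpAt: signup },
  { id: "u4", signedUpAt: signup },
];
console.log(activationRate(sampleUsers)); // 0.25
```

Four signups with one activation reads very differently from "four signups", which is the gap this metric is meant to expose.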
The truth is that the loop isn't complete until someone pays.
Final thoughts (day 5 reality check)
I've built a lot in five days. I'm proud of that. But the goal isn't to build more. The goal is to build what matters.
So here's my honest summary:
- Three apps are production-ready, but not deployed
- Agents are effective when I orchestrate them well
- Momentum is real, but revenue isn't
- The ecosystem vision is still a theory - now I need proof
If you're following along, I'll keep this honest. I'm not here to sell hype. I'm here to show the real process of shipping AI products, in public, while the tools evolve in real time. I break down my full build process in AI Product Building, including the sprint methodology I use and teach, which focuses on shipping in 2-3 weeks.
Day 5. Onward.
- Date: February 6, 2026
- Topic: AI Agents
- Author: Amir Brooks