I Accidentally Rebuilt OpenHands From Scratch -- Here's What I Learned
The Problem I Was Trying to Solve
Building a SaaS where users describe features in plain English and AI agents write code requires solving three challenges:
- User isolation — User A’s code can’t leak to User B
- Persistence — Users expect work to persist across sessions
- Security — Untrusted code can’t run dangerous commands on production servers
No existing solution combined all three: coding tools, isolated execution, and session persistence.
What I Built
10 Coding Tools (Inspired by Gemini CLI)
from omniagents.tools.write_file_tool import WriteFileTool
from omniagents.tools.read_file_tool import ReadFileTool
from omniagents.tools.run_shell_command_tool import RunShellCommandTool
from omniagents.tools.glob_tool import GlobTool
from omniagents.tools.search_file_content_tool import SearchFileContentTool
from omniagents.tools.replace_tool import ReplaceTool
# + 4 more
Framework-agnostic tools that don’t depend on LangChain or smolagents.
3 Execution Backends
from omniagents.backends.local_backend import LocalBackend # Your machine
from omniagents.backends.docker_backend import DockerBackend # Isolated container
from omniagents.backends.e2b_backend import E2BBackend # Cloud sandbox
Same tools, different environments. Swap backends without changing agent logic.
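To make that concrete, here is a minimal sketch of the swap: the tool code stays identical and only the backend constructor changes (the constructor arguments mirror the ones used later in this post).
from omniagents.backends.local_backend import LocalBackend
from omniagents.backends.docker_backend import DockerBackend
from omniagents.backends.state_manager import NoOpStateManager
from omniagents.tools.run_shell_command_tool import RunShellCommandTool

# Pick an environment; the tool code below does not change.
backend = LocalBackend(project_id="demo", state_manager=NoOpStateManager())
# backend = DockerBackend(project_id="demo", state_manager=NoOpStateManager())

backend.start()
shell = RunShellCommandTool(backend=backend)
result = shell.execute(command="echo hello")
backend.shutdown()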
3 State Managers
from omniagents.backends.state_manager import (
    NoOpStateManager,  # No persistence
    GitStateManager,   # Save to GitHub branches
    GCSStateManager,   # Save to Google Cloud Storage
)
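Whichever manager you choose, it hooks into the backend lifecycle: start() restores any previously saved state and shutdown() persists the workspace again. A minimal sketch (the GitStateManager constructor arguments are omitted here and may differ; check the repo for the actual signature):
from omniagents.backends.local_backend import LocalBackend
from omniagents.backends.state_manager import GitStateManager

backend = LocalBackend(
    project_id="demo",
    state_manager=GitStateManager(),  # hypothetical: real constructor args may differ
)
backend.start()     # restores the last saved state, if any
# ... the agent works inside the project workspace here ...
backend.shutdown()  # saves the workspace via the state manager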
The Multi-Tenant Pattern
Core insight: One backend per user, identified by project_id.
def get_agent_for_user(user_id: str):
    backend = DockerBackend(
        project_id=f"user-{user_id}",     # Unique per user
        state_manager=GCSStateManager(),  # Persists to cloud
    )
    backend.start()  # Loads previous state if it exists
    return LangChainAgent(
        backend=backend,
        model=ChatOpenAI(model="gpt-4"),
        preset=PythonUVPreset(),
    )
# Alice gets her own container + persistent storage
alice = get_agent_for_user("alice")
alice.run("Create a Flask app")
alice.backend.shutdown() # Saves to GCS
# Bob is completely isolated
bob = get_agent_for_user("bob")
bob.run("Build a CLI tool")
bob.backend.shutdown()
# Alice returns tomorrow -- her Flask app is still there
alice = get_agent_for_user("alice")
alice.run("Add authentication")
Framework Agnostic by Design
Same tools work with any framework:
agent = LangChainAgent(backend=backend, model=model, preset=preset)
agent = PydanticAIAgent(backend=backend, model=model, preset=preset)
agent = SmolagentsAgent(backend=backend, model=model, preset=preset)
Each tool has conversion methods (to_langchain_tool(), to_pydantic_ai_tool(), to_smolagents_tool()).
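For example, assuming to_langchain_tool() returns a standard LangChain tool object, the same tool instances can also be handed to an off-the-shelf LangGraph agent instead of the bundled LangChainAgent (a sketch, not the library's documented usage):
from langchain_openai import ChatOpenAI
from langgraph.prebuilt import create_react_agent

from omniagents.backends.local_backend import LocalBackend
from omniagents.backends.state_manager import NoOpStateManager
from omniagents.tools.read_file_tool import ReadFileTool
from omniagents.tools.write_file_tool import WriteFileTool

backend = LocalBackend(project_id="demo", state_manager=NoOpStateManager())
backend.start()

# Convert the framework-agnostic tools into LangChain tools.
lc_tools = [
    ReadFileTool(backend=backend).to_langchain_tool(),
    WriteFileTool(backend=backend).to_langchain_tool(),
]

agent = create_react_agent(ChatOpenAI(model="gpt-4o-mini"), lc_tools)
agent.invoke({"messages": [("user", "Create hello.py that prints 'hi'")]})
backend.shutdown()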
The Architecture
+------------------------------------------+
| Your LLM Framework |
| (LangChain, Pydantic-AI, smolagents) |
+--------------------+---------------------+
|
+--------------------v---------------------+
| 10 Core Tools |
| (framework-agnostic, just functions) |
+--------------------+---------------------+
|
+--------------------v---------------------+
| Execution Backend |
| (Local, Docker, or E2B) |
+--------------------+---------------------+
|
+--------------------v---------------------+
| State Manager |
| (Git, GCS, or nothing) |
+------------------------------------------+
Each layer is swappable.
What I Learned
You Can Do a Lot with a Few Tools
All coding agents use roughly the same core tools (read file, write file, run command). What separates the good ones is attention to detail in how those tools are implemented.
Best agents also implement:
- Empowering tools (browsing, search, etc.)
- Task Tracker Tool (Plan Mode)
- Think Tool (for complex reasoning; a minimal sketch follows this list)
- Delegate Tool (sub-agents for complex tasks)
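A Think tool, for instance, is nearly trivial: it just gives the model a scratchpad action with no side effects. A minimal framework-agnostic sketch of the idea (not Omniagents code):
from dataclasses import dataclass, field

@dataclass
class ThinkTool:
    """Let the model reason out loud without touching the workspace."""
    name: str = "think"
    description: str = "Record a thought or plan. Has no side effects."
    thoughts: list[str] = field(default_factory=list)

    def execute(self, thought: str) -> str:
        self.thoughts.append(thought)  # keep a trace for later debugging
        return "Thought recorded."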
Agent Frameworks Help You Start, Not Finish
Frameworks (smolagents, LangChain, Pydantic-AI) are great for prototyping, but you hit walls as soon as you need real customization. A basic agent loop is only ~50 lines of code, and once you own it, customization is trivial.
Recommendation: Use a framework to validate your idea, then rewrite the core loop yourself for control.
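For a sense of scale, here is roughly what that core loop looks like with the OpenAI client: send the tool schemas, execute whatever the model calls, feed the results back, and stop when there are no more tool calls. The tools dict and tool_schemas list are things you build yourself from your tool objects; they are assumptions here, not an Omniagents API.
import json
from openai import OpenAI

client = OpenAI()

def run_agent(task: str, tools: dict, tool_schemas: list, max_steps: int = 20):
    """Minimal tool-calling loop: tools maps name -> tool object with .execute()."""
    messages = [{"role": "user", "content": task}]
    for _ in range(max_steps):
        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=messages,
            tools=tool_schemas,  # OpenAI function-calling schemas for each tool
        )
        msg = response.choices[0].message
        messages.append(msg)
        if not msg.tool_calls:
            return msg.content  # the model is done
        for call in msg.tool_calls:
            args = json.loads(call.function.arguments)
            result = tools[call.function.name].execute(**args)
            messages.append({
                "role": "tool",
                "tool_call_id": call.id,
                "content": str(result),
            })
    return "Stopped: step limit reached."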
Where the Costs Come From
LLM costs: Proprietary models (GPT-4, Claude) add up fast. Open-source models served through HuggingFace inference providers (Groq, Cerebras) typically come with lower prices and more generous rate limits.
Compute options:
- E2B — Handles everything but expensive, Python-only
- Local Docker on EC2 — Predictable pricing, full control, security concerns with untrusted code
- Fly.io or Modal — Per-request scaling, mid-range cost
For side projects: start with local Docker. For production: explore Fly.io.
State Persistence Is Underrated
Real products need persistence for:
- Resuming work
- Debugging previous state
- Billing tied to artifacts
Git-based storage works well for small projects (free, version-controlled); GCS is a better fit for larger files.
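The git approach can be as simple as committing the workspace to a per-project branch on shutdown and checking it out again on start. A conceptual sketch (not the actual GitStateManager implementation):
import subprocess

def save_workspace(workdir: str, branch: str) -> None:
    """Commit everything in the workspace and push it to its own branch."""
    run = lambda *cmd: subprocess.run(cmd, cwd=workdir, check=True)
    run("git", "checkout", "-B", branch)
    run("git", "add", "-A")
    run("git", "commit", "--allow-empty", "-m", "checkpoint")  # always record a checkpoint
    run("git", "push", "-u", "origin", branch)

def restore_workspace(workdir: str, branch: str) -> None:
    """Fetch and check out the project's branch."""
    run = lambda *cmd: subprocess.run(cmd, cwd=workdir, check=True)
    run("git", "fetch", "origin", branch)
    run("git", "checkout", branch)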
OpenHands Exists (And That’s OK)
OpenHands is an application (full product to deploy). Omniagents is a library (primitives to compose). Different goals.
Try It
uv add git+https://github.com/charles-azam/omniagents.git
from omniagents.backends.local_backend import LocalBackend
from omniagents.backends.state_manager import NoOpStateManager
from omniagents.tools.write_file_tool import WriteFileTool
from omniagents.tools.run_shell_command_tool import RunShellCommandTool
backend = LocalBackend(project_id="demo", state_manager=NoOpStateManager())
backend.start()
write = WriteFileTool(backend=backend)
write.execute(absolute_path="hello.py", content="print('Hello from Omniagents!')")
shell = RunShellCommandTool(backend=backend)
result = shell.execute(command="python hello.py")
print(result.content) # "Hello from Omniagents!"
backend.shutdown()
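To make the demo survive restarts, swap in the Docker backend and a persistent state manager; the tool calls stay exactly the same (this assumes GCS credentials are already configured for GCSStateManager).
from omniagents.backends.docker_backend import DockerBackend
from omniagents.backends.state_manager import GCSStateManager

# Reusing the same project_id later means start() restores the saved workspace.
backend = DockerBackend(project_id="demo", state_manager=GCSStateManager())
backend.start()
# ... the same WriteFileTool / RunShellCommandTool calls as above ...
backend.shutdown()  # persists the workspace to Google Cloud Storage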
What’s Next?
Plans include:
- Fly.io backend
- MCP server interface for Claude Desktop integration