Agent OS supports per-agent limits and server-level settings so you can cap runtime, tokens, and turns. This page summarizes the main limits; exact env or config keys may vary by build, so check the runtime and config code.
When creating or updating an agent, you can set limits in the agent config (e.g. via API or dashboard):
| Limit | Description |
|---|---|
| max_turns | Maximum tool-call turns per run. After this, the run stops even if the model would continue. |
| max_tokens | Approximate token cap for the model response (or total context). |
| max_runtime | Maximum duration (e.g. in seconds or ms) for a single run. |

The runtime enforces these so a single agent run cannot run indefinitely or consume unbounded context.
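
As an illustration of how limits like those in the table might be checked between turns, here is a minimal sketch. It is not the Agent OS implementation: the `AgentLimits`, `RunState`, and `checkLimits` names are hypothetical, `max_runtime` is assumed to be in milliseconds, and `max_tokens` is assumed to count total tokens used in the run. Check the runtime code for the actual semantics.

```typescript
// Hypothetical shapes for illustration only. Field names follow the table
// above; units and exact meaning are assumptions, not the Agent OS schema.
interface AgentLimits {
  max_turns?: number;   // maximum tool-call turns per run
  max_tokens?: number;  // assumed: total tokens used during the run
  max_runtime?: number; // assumed: milliseconds
}

interface RunState {
  turns: number;       // tool-call turns completed so far
  tokensUsed: number;  // tokens consumed so far
  startedAtMs: number; // timestamp when the run started
}

type StopReason = "max_turns" | "max_tokens" | "max_runtime" | null;

// Returns the first limit that has been reached, or null if the run may continue.
// An unset limit is treated as "no cap".
function checkLimits(limits: AgentLimits, state: RunState, nowMs: number): StopReason {
  if (limits.max_turns !== undefined && state.turns >= limits.max_turns) {
    return "max_turns";
  }
  if (limits.max_tokens !== undefined && state.tokensUsed >= limits.max_tokens) {
    return "max_tokens";
  }
  if (limits.max_runtime !== undefined && nowMs - state.startedAtMs >= limits.max_runtime) {
    return "max_runtime";
  }
  return null;
}

// Example: a run that has used all of its allowed turns.
const exampleLimits: AgentLimits = { max_turns: 10, max_tokens: 4096, max_runtime: 60_000 };
const exampleState: RunState = { turns: 10, tokensUsed: 1200, startedAtMs: 0 };
const stopReason = checkLimits(exampleLimits, exampleState, 5_000);
```

A loop built on this sketch would call `checkLimits` before each model or tool step and end the run as soon as it returns a non-null reason.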
~/.agent-os/workspaces/{agent_id}/). No access outside that sandbox.

Agent OS does not document a built-in global rate limit (e.g. per-IP). If you need rate limiting, put the server behind a reverse proxy (e.g. nginx, Caddy) and configure limits there, or add rate limiting in the server code and document the env vars (e.g. RATE_LIMIT_MAX, RATE_LIMIT_WINDOW_MS).
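
If you add rate limiting in the server code, a fixed-window counter keyed by client (for example, by IP) is one simple option. The sketch below is an assumption-laden example, not part of Agent OS: `loadRateLimitConfig`, `FixedWindowRateLimiter`, and the default values are hypothetical, and the env var names are the example names mentioned above. The environment is passed in as a plain object so the sketch does not depend on any particular server framework.

```typescript
interface RateLimitConfig {
  max: number;      // requests allowed per window
  windowMs: number; // window length in milliseconds
}

// Reads the example env vars; the defaults (60 requests per 60 s) are assumptions.
function loadRateLimitConfig(env: Record<string, string | undefined>): RateLimitConfig {
  const max = Number(env["RATE_LIMIT_MAX"] ?? "60");
  const windowMs = Number(env["RATE_LIMIT_WINDOW_MS"] ?? "60000");
  if (!Number.isFinite(max) || max <= 0 || !Number.isFinite(windowMs) || windowMs <= 0) {
    throw new Error("RATE_LIMIT_MAX and RATE_LIMIT_WINDOW_MS must be positive numbers");
  }
  return { max, windowMs };
}

// Fixed-window limiter: each key gets at most `max` requests per window.
class FixedWindowRateLimiter {
  private readonly config: RateLimitConfig;
  private readonly windows = new Map<string, { start: number; count: number }>();

  constructor(config: RateLimitConfig) {
    this.config = config;
  }

  // Returns true if the request is allowed, false if the key is over its limit.
  allow(key: string, nowMs: number): boolean {
    const current = this.windows.get(key);
    if (current === undefined || nowMs - current.start >= this.config.windowMs) {
      // No window yet, or the previous one has expired: start a new window.
      this.windows.set(key, { start: nowMs, count: 1 });
      return true;
    }
    if (current.count >= this.config.max) {
      return false;
    }
    current.count += 1;
    return true;
  }
}

// Example: 2 requests per second per client.
const limiter = new FixedWindowRateLimiter(
  loadRateLimitConfig({ RATE_LIMIT_MAX: "2", RATE_LIMIT_WINDOW_MS: "1000" })
);
const first = limiter.allow("203.0.113.7", 0);      // allowed
const second = limiter.allow("203.0.113.7", 100);   // allowed
const third = limiter.allow("203.0.113.7", 200);    // rejected, over the limit
const afterReset = limiter.allow("203.0.113.7", 1000); // allowed, new window
```

This sketch keeps one map entry per key and never evicts old ones, so a real deployment would need cleanup of expired windows. A reverse proxy remains the simpler choice when you do not want to change server code.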
For the latest limits and env vars, see the runtime and config code in the repo (e.g. src/core/runtime.ts, src/core/config.ts, agent limits in types/agent.ts).