Deterministic AI Orchestration Layer
LLMs are stochastic engines. Letting them trigger system execution directly is a dangerous anti-pattern. This architecture strictly separates Decision from Execution: the orchestrator sits between the two as a fault-tolerant mediator that traps hallucinated or malformed tool calls and enforces telemetry.
Unbounded AI agents can enter infinite retry loops when their output fails schema validation, rapidly exhausting token budgets and flooding local APIs with requests. This layer rejects every schema-invalid output at the boundary and returns a deterministic -32000... JSON-RPC error instead of executing anything.
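Blocking execution is offloaded to a worker thread with asyncio.to_thread and awaited under a strict 5.0s timeout, so blocking tool code never runs on the main event loop and the orchestrator stops waiting after 5 seconds. The timeout only abandons the wait: Python cannot forcibly stop a running thread, so a timed-out routine may keep running in the background until it returns on its own.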
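As a rough illustration of the timeout pattern described above, here is a minimal sketch. The names `blocking_tool`, `run_tool`, and `TOOL_TIMEOUT_S` are hypothetical and not taken from the project; only `asyncio.to_thread` and the 5.0s limit come from the text, and pairing them through `asyncio.wait_for` is an assumption about how the limit is applied.

```python
import asyncio
import time

TOOL_TIMEOUT_S = 5.0  # the 5.0s wait limit mentioned in the text


def blocking_tool(delay_s: float) -> str:
    """Hypothetical stand-in for a blocking tool call (disk, subprocess, HTTP)."""
    time.sleep(delay_s)
    return "done"


async def run_tool(delay_s: float, timeout_s: float = TOOL_TIMEOUT_S) -> dict:
    """Run the blocking tool in a worker thread and stop waiting after timeout_s."""
    try:
        result = await asyncio.wait_for(
            asyncio.to_thread(blocking_tool, delay_s), timeout=timeout_s
        )
        return {"ok": True, "result": result}
    except asyncio.TimeoutError:
        # The await is abandoned here, but the worker thread itself is not
        # killed; it keeps running until blocking_tool returns.
        return {"ok": False, "error": "timeout"}


if __name__ == "__main__":
    print(asyncio.run(run_tool(0.01)))
    print(asyncio.run(run_tool(0.3, timeout_s=0.05)))
```

Because the thread cannot be cancelled, routines with side effects probably need their own internal deadline or cancellation flag in addition to this outer limit.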
The Guardrails Model
1. Execution Separation
The LLM generates only the intent. The orchestrator intercepts that intent, validates it against a strict JSON schema, and maps it to a sandboxed routine. Model output is never executed directly.
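The intercept, validate, and map steps might look roughly like the sketch below. Everything in it is illustrative: the `TOOLS` registry, `read_file_stub`, the intent shape (`tool`, `params`, `id`), and the hand-rolled type check are assumptions standing in for a real JSON Schema validator and real sandboxed routines. The error code uses -32000, the start of the JSON-RPC server-error range; the exact codes the orchestrator emits are not spelled out in the text.

```python
import json
from typing import Any, Callable

SERVER_ERROR = -32000  # assumed code; the text only gives "-32000..."


def read_file_stub(path: str) -> str:
    """Hypothetical sandboxed routine."""
    return f"<contents of {path}>"


# Hypothetical registry: tool name -> (required params and their types, routine)
TOOLS: dict[str, tuple[dict[str, type], Callable[..., Any]]] = {
    "read_file": ({"path": str}, read_file_stub),
}


def rpc_error(message: str, req_id: Any = None) -> dict:
    """Build a JSON-RPC 2.0 error response."""
    return {
        "jsonrpc": "2.0",
        "error": {"code": SERVER_ERROR, "message": message},
        "id": req_id,
    }


def dispatch(raw: str) -> dict:
    """Validate raw model output and call a routine only if it passes."""
    try:
        intent = json.loads(raw)
    except json.JSONDecodeError:
        return rpc_error("intent is not valid JSON")
    if not isinstance(intent, dict):
        return rpc_error("intent must be a JSON object")

    req_id = intent.get("id")
    tool = intent.get("tool")
    params = intent.get("params")

    if not isinstance(tool, str) or tool not in TOOLS:
        return rpc_error("unknown tool", req_id)
    schema, routine = TOOLS[tool]
    # Reject missing params and unexpected extras alike.
    if not isinstance(params, dict) or set(params) != set(schema):
        return rpc_error("params do not match schema", req_id)
    for name, expected in schema.items():
        if not isinstance(params[name], expected):
            return rpc_error(f"param {name!r} has wrong type", req_id)

    return {"jsonrpc": "2.0", "result": routine(**params), "id": req_id}
```

The point of the design is that the model can only select from a fixed registry; anything it invents falls through to a deterministic error.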
2. Fault Tolerance
If execution fails or times out, the orchestrator traps the exception and puts the agent pipeline into a RE-EVALUATE_INTENT fallback state instead of letting the error propagate.
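One way to picture the fallback is a wrapper that converts every failure into a state record. This is a sketch under assumptions: `guarded_execute`, the `"OK"` state, and the shape of the returned dict are hypothetical; only the RE-EVALUATE_INTENT name and the timeout behaviour come from the text.

```python
import asyncio
from typing import Any, Callable

RE_EVALUATE_INTENT = "RE-EVALUATE_INTENT"  # fallback state named in the text


async def guarded_execute(routine: Callable[[], Any], timeout_s: float = 5.0) -> dict:
    """Run routine in a worker thread; map any failure to the fallback state."""
    try:
        result = await asyncio.wait_for(asyncio.to_thread(routine), timeout=timeout_s)
    except asyncio.TimeoutError:
        # Handled first: on Python 3.11+ TimeoutError is also an Exception subclass.
        return {"state": RE_EVALUATE_INTENT, "reason": "timeout"}
    except Exception as exc:
        # Any error raised inside the routine is trapped, not propagated.
        return {"state": RE_EVALUATE_INTENT, "reason": f"{type(exc).__name__}: {exc}"}
    return {"state": "OK", "result": result}  # "OK" is a hypothetical success state


if __name__ == "__main__":
    print(asyncio.run(guarded_execute(lambda: 42)))
    print(asyncio.run(guarded_execute(lambda: 1 / 0)))
```

Returning the reason alongside the state gives the agent something concrete to act on when it re-plans.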
3. Concurrency Limits
A global Semaphore caps the number of tool executions running at once. If 50 agents spawn simultaneously, calls beyond the cap wait for a free slot instead of hitting the backend all at once.
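The throttling effect can be demonstrated with a small simulation. The cap of 4, the `run_agents` helper, and the 10 ms fake tool call are all assumptions chosen for illustration; the text does not state the actual limit. The semaphore is created inside the running coroutine here only to keep the example self-contained.

```python
import asyncio
import time

MAX_CONCURRENT_TOOLS = 4  # hypothetical cap; the real value is not given


async def run_agents(n_agents: int, limit: int = MAX_CONCURRENT_TOOLS) -> int:
    """Run n_agents fake tool calls and return the peak number running at once."""
    semaphore = asyncio.Semaphore(limit)
    active = 0
    peak = 0

    async def agent() -> None:
        nonlocal active, peak
        async with semaphore:  # waits here while all slots are taken
            active += 1
            peak = max(peak, active)
            try:
                # Stand-in for a blocking tool execution.
                await asyncio.to_thread(time.sleep, 0.01)
            finally:
                active -= 1

    await asyncio.gather(*(agent() for _ in range(n_agents)))
    return peak


if __name__ == "__main__":
    print(asyncio.run(run_agents(50)))
```

With 50 agents and a cap of 4, the peak concurrency never exceeds 4; the remaining agents simply queue on the semaphore.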