Every engineer who builds a cyclic multi-agent system eventually encounters the exact same nightmare: The Infinite Revision Loop.
You build a Solver agent to draft a solution and a Critic agent to review it. The Solver outputs a draft. The Critic spots an error, gives it a score of 7/10, and requests a revision. The Solver apologizes, rewrites the draft, and accidentally makes the exact same mistake. The Critic gives it a 7/10 again.
Left unchecked, this cycle will run until your cloud provider cuts off your API key, burning thousands of tokens and leaving the user staring at a frozen UI.
In standard software, loops are deterministic. In LLM-driven architecture, loops are stochastic. To make ByteTect's OMAS (Organizational Multi-Agent System) production-ready, we had to engineer strict, deterministic circuit breakers into our LangGraph routing logic.
Here is exactly how we tame the graph in router.py.
1. The Entry Point: Conditional Edges
In LangGraph, state transitions are handled by conditional edges. After every major node execution, control is handed back to our central routing function, should_continue.
Before we even look at the LLM's requested next_actor, we run a gauntlet of deterministic checks:
```python
def should_continue(state: AgentState, *, max_loops: int = 5) -> str:
    # Deterministic guard rails run BEFORE we consult the LLM's routing choice.
    if _has_error(state):
        return "error"
    if _is_stagnating(state):
        return "safety"

    # Hard cap: no matter what the agents want, the cycle ends here.
    loop_count = _get_loop_count(state)
    if loop_count >= max_loops:
        return "end"

    last_action = state.get("next_actor")
    # ... standard routing logic (search, analyze, etc.)
```

The LLM is completely blind to these checks. It doesn't matter how badly the Critic wants to request another revision; if the system hits max_loops, the graph violently halts the cycle and forces the state to "end" (triggering our polisher_node).
2. The Circuit Breaker: Detecting Stagnation
A hard loop cap of 5 is a good safety net, but it is inefficient. What if the agents get stuck arguing on iteration 2? Waiting for 3 more useless iterations burns time and money.
To solve this, we built a stagnation detector. Since our Critic and Visual Critic nodes are forced to output a strict JSON payload containing a numerical "score", we append each payload to the graph's critique_history state array.
Our router inspects this array in real time:
```python
def _is_stagnating(state: AgentState, window: int = 3) -> bool:
    history = state.get("critique_history") or []
    scores = []
    for item in history:
        score = item.get("score")
        if isinstance(score, (int, float)):
            scores.append(float(score))
    if len(scores) < window:
        return False
    recent = scores[-window:]
    # If the last 3 scores are identical, the agents are stuck in a stalemate.
    return all(score == recent[0] for score in recent)
```

If the Critic gives a draft a 7, the Solver revises it, and the Critic gives it a 7 again... and again... the _is_stagnating function returns True.
3. Graceful Degradation: The Safety Node
When _is_stagnating trips, we don't just throw a 500 Internal Server Error. In a collaborative multi-agent system, the ultimate fallback is the human in the loop.
The router forcefully diverts the graph to the safety_node:
```python
from langchain_core.messages import AIMessage

async def _safety_node(state):
    message = (
        "The agents are struggling to reach a consensus. "
        "Would you like to step in?"
    )
    return {"messages": [AIMessage(content=message)], "current_draft": message}
```

This updates the state and streams directly to the frontend. Through our WebSocket architecture, the user's UI instantly switches from a "Generating..." state to a "Waiting for User Input" state. The human can read the critique_history on the dashboard, see exactly what the agents are arguing about, manually inject a hint, and resume the graph.
Engineering Over Prompting
The AI industry is obsessed with trying to solve orchestration problems by writing "better prompts" or waiting for "smarter models" like GPT-5.
We fundamentally disagree. You cannot prompt your way out of a stochastic loop. Production-grade Multi-Agent Systems require traditional software engineering: strict state management, observability, and deterministic circuit breakers.