It is late at night, and a developer is staring at a terminal screen. The CI/CD pipeline is glowing green, signaling a successful build, yet the moment the code is actually run, a cascade of errors floods the console. The AI agent has declared the task finished, claiming victory in the chat log, but the reality is a half-baked implementation that never solved the problem. This gap between an agent's perceived success and the actual state of the codebase is the primary friction point in the current era of autonomous coding.
The Architecture of /goals and the Haiku Evaluator
Anthropic is addressing this reliability gap by introducing the `/goals` feature into Claude Code, its terminal-based coding agent. To understand why this matters, one must first look at the standard agentic loop. Typically, a coding agent operates in a cycle: it reads a file, executes a command, modifies the code, and then decides if the task is complete. The problem is that the same model performing the work is also the one judging the work, leading to a confirmation bias where the agent convinces itself that a task is done simply because it has run out of ideas or misinterpreted the output.
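A rough sketch of that self-judging loop makes the flaw concrete; the `model` and `workspace` objects here are hypothetical stand-ins rather than any real agent's internals:

```python
# Hypothetical single-model agent loop: the same model that edits the code
# also decides whether the task is finished.
def naive_agent_loop(task: str, model, workspace) -> str:
    history = [f"Task: {task}"]
    while True:
        # The model proposes the next edit or command based on its own history.
        action = model.next_action(history)
        observation = workspace.execute(action)   # apply a diff, run a command, etc.
        history.append(f"{action} -> {observation}")

        # The flaw: the worker grades its own homework. Nothing external
        # checks whether the code compiles or the tests actually pass.
        if model.believes_done(history):
            return "done"   # often declared long before the goal is truly met
```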
The `/goals` feature adds a critical second layer to this loop. Instead of relying on the agent's intuition, the user defines an explicit, measurable completion condition. For example, a developer can now issue a command such as:
`/goal all tests in test/auth pass, and the lint step is clean`
Once this goal is set, the agent no longer has the final say on when to stop. Instead, after every single step in the loop, the system triggers a separate evaluation process. Anthropic utilizes Haiku, a smaller and faster model, to serve as the dedicated evaluator. The Haiku model is stripped of the responsibility of writing code; its sole purpose is to act as a judge, comparing the current state of the environment against the user's defined goal. If the evaluator determines the conditions are not yet met, the agent is forced to continue working. Only when the evaluator confirms the goal is achieved does the system log the success and release the goal.
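Anthropic has not published the internals of this loop, but the worker-versus-judge split it describes can be sketched in a few lines of Python; `worker_model`, `judge_model`, and the `workspace` helpers are illustrative names, not real APIs:

```python
# Illustrative worker/evaluator split: a strong model does the work, while a
# smaller, cheaper model only judges the result against the user's stated goal.
def goal_driven_loop(goal: str, worker_model, judge_model, workspace, max_steps: int = 50):
    for step in range(max_steps):
        action = worker_model.next_action(goal, workspace.history)
        workspace.execute(action)

        # The judge never writes code. It only compares observable state
        # (test output, lint results, diff stats) against the stated goal.
        state = workspace.snapshot()   # e.g. "pytest: 2 failed", "eslint: clean"
        verdict = judge_model.evaluate(goal=goal, state=state)

        if verdict.met:
            return {"status": "goal_met", "steps": step + 1}
        # Otherwise the worker is forced to continue, with the judge's
        # feedback appended to its context.
        workspace.history.append(verdict.feedback)
    return {"status": "gave_up", "steps": max_steps}
```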
From Manual Orchestration to Native Evaluation
This shift represents a fundamental change in how agentic workflows are constructed. Until now, the industry standard for preventing agent overconfidence was to build external wrappers. For instance, OpenAI generally allows the model to decide its own termination point, leaving it to the developer to implement an external verification layer if higher reliability is required. Frameworks like LangGraph or the Google Agent Development Kit (ADK) provide the tools to build these systems, but they require significant manual effort. In LangGraph, a developer must explicitly define a critic node and design the conditional edges that route the agent back to the worker node based on the critic's feedback.
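In LangGraph, that wiring looks roughly like the sketch below. The node bodies are placeholder stubs; only the graph construction reflects the library's public API, and even that is a minimal example rather than a production setup:

```python
# Minimal LangGraph critic pattern: the worker edits, a separate critic node
# checks the result, and a conditional edge routes back until the check passes.
from typing import TypedDict
from langgraph.graph import StateGraph, END

class AgentState(TypedDict):
    task: str
    attempts: int
    passed: bool

def worker(state: AgentState) -> AgentState:
    # Placeholder: a real node would call the coding model and apply its edits.
    return {**state, "attempts": state["attempts"] + 1}

def critic(state: AgentState) -> AgentState:
    # Placeholder: a real node would run the test suite or linter here.
    return {**state, "passed": state["attempts"] >= 3}

def route(state: AgentState) -> str:
    # Exit when the critic is satisfied or after too many attempts.
    return "finish" if state["passed"] or state["attempts"] >= 10 else "retry"

builder = StateGraph(AgentState)
builder.add_node("worker", worker)
builder.add_node("critic", critic)
builder.set_entry_point("worker")
builder.add_edge("worker", "critic")
builder.add_conditional_edges("critic", route, {"retry": "worker", "finish": END})
graph = builder.compile()

result = graph.invoke({"task": "fix auth tests", "attempts": 0, "passed": False})
```

Every piece of that routing logic is the developer's responsibility to design, test, and maintain.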
Google ADK offers a similar path with its LoopAgent, which can perform iterative tasks, but the logic governing the loop's exit remains the responsibility of the developer. The friction here is high; the developer must architect the observation logic, handle the state management, and ensure the evaluator doesn't enter an infinite loop. Claude Code bypasses this infrastructure burden by baking the evaluator directly into the product. By separating the worker and the judge at the native level, Anthropic has productized the critic pattern, removing the need for developers to build their own custom orchestration layers just to ensure a test actually passes.
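The bookkeeping that falls on the developer tends to look like this framework-agnostic guard, illustrative rather than tied to any particular ADK or LangGraph API:

```python
# The termination logic a developer typically hand-rolls when wiring a
# worker/critic loop themselves: an iteration cap plus a stall detector.
def should_stop(attempts: int, max_attempts: int, recent_states: list[str]) -> bool:
    if attempts >= max_attempts:
        return True          # hard cap against infinite loops
    if len(recent_states) >= 3 and len(set(recent_states[-3:])) == 1:
        return True          # no observable progress in the last three rounds
    return False
```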
This separation is particularly potent for deterministic tasks. In software engineering, many tasks have a binary outcome: a migration either works or it doesn't, a test suite either passes or it fails, and a linting error is either present or absent. By encouraging users to specify measurable states—such as `npm test exits 0` or a specific count of modified files—Anthropic is aligning Claude Code with the trajectory of advanced reasoning systems like Devin or SWE-agent. These systems recognize that the path to autonomy is not through a single, massive model that does everything, but through a system of checks and balances where the verification is decoupled from the execution.
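Such binary conditions reduce to exit codes and counts that can be checked without any model judgment at all. A minimal sketch, assuming conventional `npm test` and `npm run lint` scripts in the project:

```python
# Deterministic goal checks: each condition reduces to an exit code or a count,
# so the verdict is binary and leaves no room for the agent to argue.
import subprocess

def command_exits_zero(cmd: list[str]) -> bool:
    return subprocess.run(cmd, capture_output=True).returncode == 0

def modified_file_count() -> int:
    out = subprocess.run(["git", "diff", "--name-only"], capture_output=True, text=True)
    return len([line for line in out.stdout.splitlines() if line.strip()])

goal_met = (
    command_exits_zero(["npm", "test"])              # "npm test exits 0"
    and command_exits_zero(["npm", "run", "lint"])   # "the lint step is clean"
)
```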
For the enterprise, this provides a level of observability that was previously missing. Instead of guessing why an agent stopped or auditing a long chain of thought to find where the agent hallucinated its success, teams can now track the agent's progress against a hard requirement. The audit trail becomes a series of attempts to satisfy a specific goal, making the agent's behavior predictable and verifiable.
The competitive frontier for AI agents is shifting away from raw model intelligence and toward the precision of verification.