The AI Harness Enabling 6 Hours of Fully Autonomous Coding

Every developer using AI agents has experienced the same frustrating loop. You assign a complex feature to an agent, step away for an hour, and return to find a green checkmark on the tests but a product that completely misses the original requirement. The agent has entered a hallucination cycle, where it writes a bug, writes a test that ignores that bug, and then declares victory. When the session eventually times out or the context window overflows, the entire mental model of the project vanishes, forcing the developer to restart the explanation from scratch.

The Architecture of Autonomous Execution

Tenet addresses these failures by moving away from simple chat interfaces and implementing a rigorous agent harness designed for long-term stability. The process begins not with a prompt, but with a structured interview to solidify requirements. This leads into a technical research phase, the creation of architecture diagrams, and the development of UI mockups. Once the foundation is set, Tenet freezes the implementation specifications, test harnesses, and scenarios into immutable documents. The actual work is then decomposed into a Directed Acyclic Graph (DAG), which allows the system to identify independent tasks and execute them in parallel without creating circular dependencies.
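The DAG step can be illustrated with a minimal sketch. It is not Tenet's actual scheduler: the task names and the `parallel_waves` helper are hypothetical, and the sketch relies only on Python's standard `graphlib` module. It groups tasks into waves, where every task in a wave has all of its dependencies satisfied by earlier waves and can therefore run in parallel with its wave-mates.

```python
from graphlib import CycleError, TopologicalSorter


def parallel_waves(deps):
    """Group tasks into waves; each wave depends only on earlier waves."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises CycleError if the graph contains a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # every task that is unblocked right now
        waves.append(ready)
        sorter.done(*ready)  # mark the wave finished so dependents unblock
    return waves


# Hypothetical task graph: each key maps to the set of tasks it depends on.
tasks = {
    "schema": set(),
    "api": {"schema"},
    "ui": {"schema"},
    "e2e": {"api", "ui"},
}
waves = parallel_waves(tasks)

# A circular dependency is rejected before any work is scheduled.
try:
    parallel_waves({"a": {"b"}, "b": {"a"}})
    cycle_rejected = False
except CycleError:
    cycle_rejected = True
```

Here `api` and `ui` land in the same wave because both depend only on `schema`, while `e2e` waits for both of them.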

To ensure the quality of these autonomous outputs, Tenet employs a triple-critic pipeline. The first layer is the code critic, which verifies whether the implementation aligns with the original specifications. The second is the test critic, which evaluates whether the existing tests are actually sufficient to validate the specific task at hand. The final layer is an end-to-end (e2e) evaluation powered by Playwright, which simulates actual user behavior in a web browser to confirm the feature works in a real-world environment. Crucially, these evaluations occur in a fresh context, meaning the critic does not share the same memory or biases as the agent that wrote the code.

Persistence is handled through a dedicated directory structure that transforms the AI's volatile memory into a permanent engineering record. All decision logs, state changes, and artifacts are stored in the `.tenet/` directory, which includes the following components:

`.tenet/interview`

`.tenet/spec`

`.tenet/harness`

`.tenet/visuals`

`.tenet/knowledge`

`.tenet/journal`

`.tenet/steer`

`.tenet/status`

SQLite state (stored under `.tenet/`)

On the technical side, Tenet is built to be model-agnostic, with adapters for Claude Code, OpenCode, and Codex. It integrates with Model Context Protocol (MCP) servers and operates via a Command Line Interface (CLI). To prevent data loss during long runs, the system uses SQLite with Write-Ahead Logging (WAL) for persistent state management and includes an orphan job recovery feature to resume interrupted tasks. In practical testing, this framework has enabled the system to run autonomously for over six hours, producing production-ready results without human intervention.
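The combination of WAL-backed state and orphan recovery might look like the sketch below, written with Python's standard `sqlite3` module. The `jobs` table, its columns, the heartbeat mechanism, and the five-minute staleness threshold are all assumptions made for illustration; the source does not describe Tenet's schema or how it detects orphaned jobs.

```python
import os
import sqlite3
import tempfile
import time


def open_state(path):
    """Open the state database in WAL mode and ensure the jobs table exists."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode=WAL")  # WAL requires an on-disk database
    conn.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        "id TEXT PRIMARY KEY, status TEXT NOT NULL, heartbeat REAL NOT NULL)"
    )
    conn.commit()
    return conn


def recover_orphans(conn, now, stale_after=300.0):
    """Requeue jobs still marked running whose last heartbeat is too old."""
    cur = conn.execute(
        "UPDATE jobs SET status = 'pending' "
        "WHERE status = 'running' AND heartbeat < ?",
        (now - stale_after,),
    )
    conn.commit()
    return cur.rowcount  # number of jobs that were requeued


with tempfile.TemporaryDirectory() as tmp:
    conn = open_state(os.path.join(tmp, "state.db"))
    mode = conn.execute("PRAGMA journal_mode").fetchone()[0]
    now = time.time()
    conn.executemany(
        "INSERT INTO jobs VALUES (?, ?, ?)",
        [
            ("build-api", "running", now - 900),  # worker died 15 minutes ago
            ("build-ui", "running", now - 5),     # worker is still alive
            ("write-docs", "done", now - 900),    # finished, must be left alone
        ],
    )
    conn.commit()
    requeued = recover_orphans(conn, now)
    statuses = dict(conn.execute("SELECT id, status FROM jobs"))
    conn.close()
```

Only `build-api` is reset to `pending` in this run: it is the one job marked as running whose heartbeat is older than the threshold.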

From Session-Based Chat to State-Based Engineering

The fundamental shift Tenet introduces is the transition from session-based interaction to state-based engineering. Most current AI coding tools operate as a series of chat sessions. This creates a dangerous feedback loop known as self-approval bias, where the agent becomes the sole judge of its own work. Because the agent remembers what it intended to do, it often overlooks the gap between that intention and the actual code it produced. By decoupling the author from the critic and enforcing a fresh context for every validation step, Tenet forces the AI to prove its work against objective specifications rather than its own internal narrative.

This architectural change also transforms how developers steer the AI. In a traditional session, changing the direction of a project requires a massive re-explanation or a complete reset of the chat history. Tenet replaces this with a steer message system. When a developer provides a correction, the system saves it as a persistent instruction in the `.tenet/steer` directory. The agent then automatically picks up these modifications at the relevant step in the DAG. This allows for asynchronous control, where a human can nudge the project's direction without breaking the autonomous loop.
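A file-based steer queue along these lines can be sketched in a few lines of Python. The JSON record layout, the numbered file names, and the `save_steer` and `pending_steers` helpers are hypothetical; the source states only that corrections are persisted under `.tenet/steer` and picked up at the relevant DAG step.

```python
import json
import tempfile
from pathlib import Path


def save_steer(steer_dir: Path, target_step: str, text: str) -> Path:
    """Persist a correction as a numbered JSON file addressed to one DAG step."""
    steer_dir.mkdir(parents=True, exist_ok=True)
    seq = len(list(steer_dir.glob("*.json"))) + 1
    path = steer_dir / f"{seq:04d}.json"
    path.write_text(json.dumps({"step": target_step, "text": text, "applied": False}))
    return path


def pending_steers(steer_dir: Path, step: str) -> list[str]:
    """Return unapplied messages for a step, oldest first, and mark them applied."""
    messages = []
    for path in sorted(steer_dir.glob("*.json")):
        record = json.loads(path.read_text())
        if record["step"] == step and not record["applied"]:
            messages.append(record["text"])
            record["applied"] = True
            path.write_text(json.dumps(record))  # persist so the message is not replayed
    return messages


with tempfile.TemporaryDirectory() as tmp:
    steer_dir = Path(tmp) / ".tenet" / "steer"
    save_steer(steer_dir, "build-ui", "Use a table instead of cards.")
    save_steer(steer_dir, "build-api", "Paginate the list endpoint.")
    first_read = pending_steers(steer_dir, "build-ui")
    second_read = pending_steers(steer_dir, "build-ui")
    api_read = pending_steers(steer_dir, "build-api")
```

Because the messages live on disk rather than in a chat history, the developer can write one at any time and the agent reads it whenever it reaches the matching step. The second read for `build-ui` returns nothing because the message has already been marked as applied.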

Ultimately, Tenet treats the AI agent not as a chatbot, but as a managed external contractor. It implements the same rigor required for human outsourcing: detailed documentation, strict verification milestones, and a clear hand-off structure. The records accumulated in the `.tenet/` folder are not mere logs of a conversation, but a growing knowledge base that serves as the project's single source of truth. This shifts the unit of AI coding from the generation of a single function or file to the management of the entire project lifecycle.

The success of autonomous AI coding depends less on the raw intelligence of the underlying model and more on the precision of the harness that governs its execution.
