Every developer using Claude Code has encountered the jarring experience of the cold start. You spend hours refining a complex feature, navigating a deep directory structure, and establishing a specific mental model with the AI, only to close the terminal and lose that momentum. When you return to the project, the AI has no memory of the previous session. You are forced to re-upload files, re-explain the current objective, and waste precious tokens just to bring the model back up to speed. This gap in continuity transforms a fluid coding experience into a series of fragmented restarts.
The Architecture of Local Project Memory
Recall addresses this continuity gap by implementing a local project memory system that operates entirely on the user's machine. Unlike traditional context management that relies on external databases or expensive LLM-based summarization, Recall captures session logs and compresses them into a restartable format stored directly within the project root. The system operates without requiring API keys and ensures that no data is transmitted to external models during the summarization process, effectively eliminating token costs for memory maintenance.
State management is handled through a dedicated `.recall/` directory. Within this folder, the tool maintains two primary files that serve distinct purposes. The first is `summary.md`, which contains a compressed version of the session's narrative. The second is `context.md`, which stores deterministic facts extracted from the session and the version control system. This includes the user's initial goal, a list of modified files, the specific commands executed, the exact point where the session was interrupted, and the output of `git diff --stat`.
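To make the split between narrative and facts concrete, here is a minimal sketch of how the deterministic facts could be gathered and written to `.recall/context.md`. The function names and the layout of the rendered file are assumptions made for illustration; only the file path, the recorded fields, and the `git diff --stat` call come from the description above.

```python
import subprocess
from pathlib import Path


def git_diff_stat(project_root):
    """Return the output of `git diff --stat`, or "" if git fails or is missing."""
    try:
        result = subprocess.run(
            ["git", "diff", "--stat"],
            cwd=project_root, capture_output=True, text=True, check=False,
        )
    except OSError:
        return ""
    return result.stdout.strip() if result.returncode == 0 else ""


def render_context(goal, modified_files, commands, stopped_at, diff_stat):
    """Render the recorded facts as text (hypothetical layout, not Recall's own)."""
    lines = ["Goal: " + goal, "", "Modified files:"]
    lines += ["- " + path for path in modified_files]
    lines += ["", "Commands run:"]
    lines += ["- " + cmd for cmd in commands]
    lines += ["", "Stopped at: " + stopped_at, "", "git diff --stat:", diff_stat]
    return "\n".join(lines) + "\n"


def write_context(project_root, text):
    """Write the rendered facts to <project_root>/.recall/context.md."""
    recall_dir = Path(project_root) / ".recall"
    recall_dir.mkdir(parents=True, exist_ok=True)
    target = recall_dir / "context.md"
    target.write_text(text, encoding="utf-8")
    return target
```

Nothing in this sketch is generated by a model: every field is copied from the session or from git, which is what makes the file deterministic.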
It is important to distinguish Recall from `CLAUDE.md`. While `CLAUDE.md` acts as a set of permanent instructions or a style guide defining how the AI should behave and work within a project, Recall functions as a chronological ledger. If `CLAUDE.md` is the rulebook, Recall is the diary, recording exactly what happened and where the developer left off.
The Shift to Classical NLP for Zero-Cost Context
The technical choice that makes Recall efficient is its rejection of LLM-based summarization in favor of classical natural language processing. The summarization engine, located in `scripts/summarizer.py`, combines TF-IDF (Term Frequency-Inverse Document Frequency) weighting with the TextRank algorithm. Rather than generating new text, Recall performs extractive summarization: TextRank treats each sentence as a node in a graph, links sentences by their similarity, and selects the most central, representative ones. Because every sentence in the summary is copied verbatim from the session, the summary stays grounded in the words actually used and avoids the hallucinations that can occur when an LLM summarizes its own logs.
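The following is a minimal pure-Python sketch of extractive summarization with TF-IDF and TextRank. It is not the code in `scripts/summarizer.py`; the tokenizer, the smoothed IDF formula, the damping factor, and the iteration count are all assumptions chosen to keep the example short.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase the sentence and split it into word-like tokens."""
    return re.findall(r"[a-z0-9_]+", sentence.lower())


def tfidf_vectors(sentences):
    """Build one sparse TF-IDF vector (a dict) per sentence."""
    docs = [Counter(tokenize(s)) for s in sentences]
    n = len(docs)
    doc_freq = Counter(term for doc in docs for term in doc)
    # Smoothed IDF (an assumption), always positive.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in doc_freq.items()}
    return [{t: tf * idf[t] for t, tf in doc.items()} for doc in docs]


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    if not a or not b:
        return 0.0
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b)


def textrank(sentences, top_k=3, damping=0.85, iterations=50):
    """Return the top_k most central sentences, in their original order."""
    n = len(sentences)
    if n == 0:
        return []
    vectors = tfidf_vectors(sentences)
    # Similarity graph: sentences are nodes, cosine similarity is the edge weight.
    sim = [[cosine(vectors[i], vectors[j]) if i != j else 0.0
            for j in range(n)] for i in range(n)]
    out_weight = [sum(row) for row in sim]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) / n + damping * sum(
                sim[j][i] / out_weight[j] * scores[j]
                for j in range(n) if out_weight[j] > 0
            )
            for i in range(n)
        ]
    # Ties are broken by position so the selection is deterministic.
    top = sorted(range(n), key=lambda i: (-scores[i], i))[:top_k]
    return [sentences[i] for i in sorted(top)]
```

A sentence that shares no vocabulary with the rest of the session has no edges in the graph, so it keeps only the baseline score and is the first to be dropped.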
To ensure maximum compatibility, the implementation requires only the Python standard library, so basic functionality needs no `pip install` step. `numpy` is an optional accelerator: if it is detected in the environment, Recall uses it to vectorize the mathematical operations, which speeds up processing of very large session logs. In environments without `numpy`, the tool falls back to a pure-Python implementation that produces identical results, ensuring consistency across different developer setups.
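The optional-dependency pattern described here can be sketched as follows. The function names are hypothetical and the example covers only a cosine-similarity matrix, not Recall's full pipeline; it assumes dense, equal-length rows.

```python
import math

try:
    import numpy as np  # optional accelerator
except ImportError:
    np = None


def cosine_matrix_pure(rows):
    """Pairwise cosine similarity using only the standard library."""
    norms = [math.sqrt(sum(x * x for x in row)) for row in rows]
    n = len(rows)
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if norms[i] and norms[j]:
                dot = sum(a * b for a, b in zip(rows[i], rows[j]))
                out[i][j] = dot / (norms[i] * norms[j])
    return out


def cosine_matrix_numpy(rows):
    """The same computation, vectorized with numpy."""
    if not rows:
        return []
    m = np.asarray(rows, dtype=float)
    norms = np.linalg.norm(m, axis=1)
    safe = np.where(norms == 0, 1.0, norms)  # avoid dividing by zero
    unit = m / safe[:, None]
    return (unit @ unit.T).tolist()


def cosine_matrix(rows):
    """Use numpy when it is installed, otherwise fall back to pure Python."""
    if np is not None:
        return cosine_matrix_numpy(rows)
    return cosine_matrix_pure(rows)
```

The two paths can differ in the last few floating-point digits, so a parity check is more robust when it compares which sentences are selected rather than the raw scores.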
Quality assurance is integrated directly into the development pipeline via `benchmarks/bench.py`. This benchmark harness measures latency and throughput and scores the selected key sentences against lead, tail, and random baselines. A quality gate, enabled with the `--check` flag, verifies that the `numpy` path and the pure-Python path select exactly the same sentences. The CI pipeline in `.github/workflows/` tests across Python versions 3.9 through 3.13. On the security side, the codebase is audited using Bandit and CodeQL, and the redaction of sensitive information is verified in `tests/test_redact.py`.
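For readers unfamiliar with the baselines named here, the sketch below shows what lead, tail, and random selection look like. The scoring function is a stand-in: the metric used by `benchmarks/bench.py` is not described in this article, so `term_coverage` is a hypothetical measure chosen only to make the comparison runnable.

```python
import random
import re


def lead_baseline(sentences, k):
    """Pick the first k sentences."""
    return sentences[:k]


def tail_baseline(sentences, k):
    """Pick the last k sentences."""
    return sentences[-k:] if k > 0 else []


def random_baseline(sentences, k, seed=0):
    """Pick k sentences at random, reproducibly, kept in original order."""
    rng = random.Random(seed)
    picked = sorted(rng.sample(range(len(sentences)), min(k, len(sentences))))
    return [sentences[i] for i in picked]


def term_coverage(selected, sentences):
    """Hypothetical score: share of the session's distinct terms in the selection."""
    def terms(lines):
        return {t for line in lines for t in re.findall(r"[a-z0-9_]+", line.lower())}

    all_terms = terms(sentences)
    if not all_terms:
        return 0.0
    return len(terms(selected) & all_terms) / len(all_terms)
```

A summarizer is only doing useful work if its selection scores above all three baselines on the same input and for the same `k`.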
For the practitioner, the result is that context preservation costs no tokens and sends no session data off the machine. The workflow is simple: after a session, the developer runs the `/recall:save` command to write the current state to the local files. When a new session begins, the AI can read these files to regain the project's state. Control is maintained through simple file-based triggers. Users can override default settings by creating `.recall/config.json` under the project root, or pause logging entirely by creating a `.recall/.capture-paused` file.
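The file-based triggers can be illustrated with a short sketch. The two file paths come from the description above; the default keys in `DEFAULTS` and both function names are invented for the example and should not be read as Recall's actual configuration schema.

```python
import json
from pathlib import Path

# Hypothetical defaults; the real setting names are not given in this article.
DEFAULTS = {"max_sentences": 8, "capture_commands": True}


def load_config(project_root):
    """Merge .recall/config.json, if present, over the defaults."""
    config_path = Path(project_root) / ".recall" / "config.json"
    settings = dict(DEFAULTS)
    if config_path.is_file():
        settings.update(json.loads(config_path.read_text(encoding="utf-8")))
    return settings


def capture_paused(project_root):
    """Logging is paused while the sentinel file exists; its contents are ignored."""
    return (Path(project_root) / ".recall" / ".capture-paused").exists()
```

A sentinel file has the advantage that pausing and resuming need no tooling at all: creating or deleting an empty file is enough.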
Beyond individual use, Recall introduces a strategic choice in memory management. By default, the `.recall/` folder is listed in `.gitignore`, keeping the session memory private to the local developer. However, by removing that entry and committing the folder to a shared repository, a team can turn individual session logs into a shared team memory, allowing collaborators to see exactly how a feature was developed and where the previous developer stopped.
This move toward local-first, non-LLM context management suggests a future where AI tools are not just powerful engines, but sustainable systems that respect both the developer's budget and the privacy of their local environment.




