The developer community on GitHub and Hugging Face has spent the last few days reacting to a sudden surge in activity from the US-based startup Poolside. For months, the industry standard for high-performance coding assistance has relied on massive cloud-based APIs that require constant connectivity and a willingness to send proprietary code to external servers. However, a shift is occurring as developers discover a new class of high-performance coding agents capable of running directly on local hardware, removing the middleman of the cloud API.
The Architecture of Local Agency
Poolside has introduced two distinct large language models under the Laguna banner, each designed for a different deployment scale. The first, Laguna M.1, is a Mixture-of-Experts (MoE) model with 225 billion total parameters that stays efficient by activating only 23 billion parameters for any given computation. The second, and perhaps more disruptive for the individual developer, is Laguna XS.2: a 33-billion-parameter MoE model with 3 billion active parameters, released under the Apache 2.0 license and therefore fully open source.
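For readers unfamiliar with the MoE trade-off, here is a minimal, purely illustrative sketch of top-k expert routing in PyTorch. The expert count, dimensions, and routing details are invented for clarity and are not Laguna's actual configuration; the point is simply why only a small slice of a model's total parameters runs for any given token.

```python
# Toy top-k Mixture-of-Experts layer (illustrative only -- sizes and routing
# are made up, not Laguna's real architecture).
import torch
import torch.nn as nn

class ToyMoELayer(nn.Module):
    def __init__(self, d_model=64, d_ff=256, n_experts=16, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts)   # picks experts per token
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):                             # x: (tokens, d_model)
        scores = self.router(x)                       # (tokens, n_experts)
        weights, idx = scores.softmax(-1).topk(self.top_k, dim=-1)
        out = torch.zeros_like(x)
        # Only the experts chosen by the router execute, so the "active"
        # parameters per token are a small fraction of the layer's total.
        for slot in range(self.top_k):
            for e in range(len(self.experts)):
                mask = idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * self.experts[e](x[mask])
        return out

layer = ToyMoELayer()
print(layer(torch.randn(8, 64)).shape)                # torch.Size([8, 64])
```

The same principle, scaled up, is how a 225-billion-parameter model can do its per-token computation with roughly a tenth of its weights.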
These models are not designed as simple chat interfaces but are optimized for agentic workflows in which the AI autonomously writes code and interacts with external tools. To support this, Poolside released pool, a dedicated tool for driving coding agents, and shimmer, a web-based coding environment optimized for mobile devices. Together, these let developers keep agent-based development cycles running no matter where their hardware sits. While Laguna XS.2 is available for local installation, Laguna M.1 is currently offered through a temporary free API on deployment platforms including OpenRouter, Ollama, and Baseten.
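For those who want to try Laguna M.1 without local hardware, a minimal sketch against OpenRouter's OpenAI-compatible chat endpoint looks like the following. The model slug "poolside/laguna-m1" is an assumption for illustration; check OpenRouter's model listing for the real identifier and supply your own API key.

```python
# Sketch: query Laguna M.1 via OpenRouter's OpenAI-compatible HTTP API.
# The model slug below is hypothetical -- verify it on OpenRouter first.
import os
import requests

resp = requests.post(
    "https://openrouter.ai/api/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}"},
    json={
        "model": "poolside/laguna-m1",   # hypothetical identifier
        "messages": [
            {"role": "user", "content": "Write a Python function that reverses a linked list."}
        ],
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```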
Breaking the Dependency on Pre-trained Baselines
Many contemporary AI labs in the United States have followed a predictable path to efficiency: take Alibaba's Qwen series and fine-tune it for specific tasks. Poolside explicitly rejected this shortcut. Instead, the team pursued a ground-up training trajectory, building the models from scratch on a dataset of roughly 30 trillion tokens. To accelerate this process, they adopted the Muon optimization technique, which they report increased training speed by approximately 15 percent over current industry standards.
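Muon itself is documented publicly outside of Poolside. Its central trick is a Newton-Schulz iteration that approximately orthogonalizes the momentum update for 2D weight matrices before it is applied; the sketch below follows the open reference implementation's coefficients, and the single-matrix "step" function is a simplification. Poolside's internal variant has not been published.

```python
# Sketch of the core idea behind Muon: orthogonalize the momentum update of a
# 2D weight matrix with a Newton-Schulz iteration before applying it.
# Coefficients follow the public reference implementation; this is not
# Poolside's exact training code.
import torch

def newton_schulz_orthogonalize(G: torch.Tensor, steps: int = 5) -> torch.Tensor:
    a, b, c = 3.4445, -4.7750, 2.0315        # quintic iteration coefficients
    X = G / (G.norm() + 1e-7)                # normalize so the iteration converges
    transposed = X.shape[0] > X.shape[1]
    if transposed:
        X = X.T
    for _ in range(steps):
        A = X @ X.T
        X = a * X + (b * A + c * A @ A) @ X
    return X.T if transposed else X

def muon_step(weight, grad, momentum, lr=0.02, beta=0.95):
    """One Muon-style update for a single weight matrix (simplified sketch)."""
    momentum.mul_(beta).add_(grad)           # standard momentum accumulation
    update = newton_schulz_orthogonalize(momentum)
    weight.add_(update, alpha=-lr)
    return weight, momentum
```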
The intelligence of the Laguna models stems from a scientific approach to data curation. Poolside employed a system called AutoMixer, which used 60 different proxy models to mathematically determine the optimal ratio of code, mathematics, and general web data. To cover the rarest and most difficult coding scenarios, the team devoted 13 percent of the total training data to high-quality synthetic data, creating edge cases that seldom appear in organic public repositories.
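Poolside has not published AutoMixer's internals, but the general idea of proxy-guided data mixing can be illustrated with a toy sketch: score candidate code/math/web ratios with a cheap stand-in for a proxy model's validation loss and keep the best one. Everything below, including the loss function, is hypothetical.

```python
# Hypothetical sketch of proxy-guided data mixing: sweep candidate mixtures
# and keep the ratio a (stand-in) proxy model scores best. Not AutoMixer.
import itertools

def proxy_validation_loss(mix):
    """Stand-in for training a small proxy model on a mixture and measuring
    its validation loss; here it is just a made-up function of the ratios."""
    code, math, web = mix
    return (code - 0.5) ** 2 + (math - 0.2) ** 2 + (web - 0.3) ** 2

candidates = [
    (c / 10, m / 10, 1 - (c + m) / 10)
    for c, m in itertools.product(range(11), repeat=2)
    if c + m <= 10
]
best = min(candidates, key=proxy_validation_loss)
print(f"best mixture -> code: {best[0]:.1f}, math: {best[1]:.1f}, web: {best[2]:.1f}")
```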
This architectural independence transforms the user experience from a cloud-dependent subscription into an on-premise asset. For government agencies or enterprise firms where security is the primary constraint, the ability to host the model weights on internal servers means powerful coding agents can now operate in completely offline environments. The models were further refined through a reinforcement learning phase in which they solved real software engineering problems inside virtual environments and were rewarded for successful execution. This transition from simple text prediction to goal-oriented problem solving is what elevates Laguna from an autocomplete tool to a functional agent.
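Execution-grounded rewards of this kind are straightforward to sketch, even though Poolside's actual environment and reward shaping are not public. The hypothetical function below applies a candidate patch to an isolated copy of a repository, runs the test suite, and returns a reward of 1.0 only when the tests pass.

```python
# Hypothetical execution-based reward: apply a patch in a scratch copy of the
# repo, run the tests, reward 1.0 only on success. Not Poolside's actual setup.
import pathlib
import shutil
import subprocess
import tempfile

def execution_reward(repo_path: str, patch: str) -> float:
    workdir = tempfile.mkdtemp()
    try:
        shutil.copytree(repo_path, workdir, dirs_exist_ok=True)
        (pathlib.Path(workdir) / "candidate.patch").write_text(patch)
        applied = subprocess.run(
            ["git", "apply", "candidate.patch"], cwd=workdir, capture_output=True
        )
        if applied.returncode != 0:
            return 0.0                        # patch does not even apply
        try:
            tests = subprocess.run(
                ["python", "-m", "pytest", "-q"],
                cwd=workdir, capture_output=True, timeout=600,
            )
        except subprocess.TimeoutExpired:
            return 0.0                        # hung or far too slow
        return 1.0 if tests.returncode == 0 else 0.0
    finally:
        shutil.rmtree(workdir, ignore_errors=True)
```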
The battle for AI coding supremacy has moved from the cloud API to local GPU memory.