The current state of artificial intelligence presents a jarring paradox. A large language model can draft a legal brief or write a symphony in seconds, yet a physical robot often struggles with the basic intuition required to pick up a plastic cup without crushing it. This gap exists because while text is abundant and structured, the data required to teach a machine how to navigate the physical world is prohibitively expensive to collect. For years, the robotics industry has been trapped in a cycle of slow, manual data gathering, where every single movement must be recorded by a human operator or learned through grueling trial and error in the real world.

The Billion-Dollar Bet on Synthetic Intuition

General Intuition is attempting to break this bottleneck by treating the digital world as a massive, pre-existing training ground for physical machines. The company recently secured $320 million in funding, bringing its valuation to $2.3 billion. This latest round was led by Khosla Ventures, with significant participation from General Catalyst, Jeff Bezos, and Eric Schmidt. When combined with the $134 million raised during its launch last October, General Intuition's total public funding now stands at $454 million.

The core of the strategy is data from Medal, a popular game-clip sharing platform. Rather than relying on sparse, expensive robotics datasets, General Intuition draws on hundreds of millions of hours of gameplay footage. The company is not simply feeding videos into a neural network, however. The key ingredient is action labels: precise records of which buttons a player pressed and the millisecond at which each input occurred. By pairing the visual output of the game with the exact input that caused it, the AI learns the fundamental relationship between action and reaction in a three-dimensional space.
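To make the idea of action labels concrete, here is a minimal Python sketch of how timestamped button inputs might be paired with video frames. It is an illustration only: the `InputEvent` record, the `label_frames` function, and the button names are hypothetical, and nothing here reflects the actual format of Medal's data or General Intuition's pipeline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InputEvent:
    """One hypothetical action-label record."""
    t_ms: int      # time of the input, in milliseconds from clip start
    button: str    # illustrative button name, e.g. "jump"
    pressed: bool  # True for a press, False for a release


def label_frames(frame_times_ms, events):
    """Pair each frame timestamp with the set of buttons held at that moment."""
    events = sorted(events, key=lambda e: e.t_ms)
    held = set()
    labels = []
    i = 0
    for t in sorted(frame_times_ms):
        # Apply every input that happened at or before this frame.
        while i < len(events) and events[i].t_ms <= t:
            if events[i].pressed:
                held.add(events[i].button)
            else:
                held.discard(events[i].button)
            i += 1
        labels.append((t, frozenset(held)))
    return labels


if __name__ == "__main__":
    events = [
        InputEvent(10, "forward", True),
        InputEvent(40, "jump", True),
        InputEvent(90, "jump", False),
    ]
    for t, held in label_frames([0, 33, 66, 100], events):
        print(t, sorted(held))
```

Each frame ends up tagged with the inputs active when it was rendered, which is the kind of (observation, action) pair a model could learn from. A real pipeline would also have to deal with clock drift, analog inputs, and input latency, none of which this sketch attempts.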

Beyond Observation to Causal Reasoning

Most contemporary attempts to teach robots via video rely on visual inference, where the AI watches a human perform a task and tries to mimic the movement. General Intuition argues that this approach is fundamentally flawed because it captures the result but ignores the cause. Watching a video of a door opening does not tell a robot how much torque to apply to the handle or the exact timing of the pull. By integrating action labels, General Intuition moves from correlation to causation. The model does not just see a character jump; it understands that the specific input of a button press triggered the upward trajectory, allowing it to develop a form of spatial and temporal intuition.

Scaling this level of reasoning requires immense computational power. To handle the massive throughput of game data and action labels, General Intuition has partnered with CoreWeave, a specialized GPU cloud provider. The majority of the recent funding is earmarked for this infrastructure to fuel the pre-training of the company's next-generation models. The goal is to create a world model that serves as a virtual gym, where an AI agent can experience millions of iterations of physics, shadows, and collisions without ever risking a piece of hardware. This approach culminated in a striking proof of concept: a model trained primarily on virtual game data was deployed on a quadruped robot. While traditional models might require weeks of real-world training to adapt to a new environment, this model required only 8 minutes of real-world data for fine-tuning before it could operate effectively.
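The "virtual gym" idea can be illustrated with a deliberately tiny Python sketch: learn a transition model from action-labeled experience, then try out action sequences inside that model instead of on hardware. This is a toy under stated assumptions, not General Intuition's method. The function names `fit_world_model` and `imagine_rollout` are hypothetical, and a simple lookup table stands in for what would in practice be a large neural network.

```python
from collections import Counter, defaultdict


def fit_world_model(transitions):
    """Learn the most frequently observed next state for each (state, action) pair."""
    counts = defaultdict(Counter)
    for state, action, next_state in transitions:
        counts[(state, action)][next_state] += 1
    return {key: c.most_common(1)[0][0] for key, c in counts.items()}


def imagine_rollout(model, start_state, actions):
    """Roll a sequence of actions through the learned model; no real environment is touched."""
    state = start_state
    trajectory = [state]
    for action in actions:
        # In this toy, an unseen (state, action) pair leaves the state unchanged.
        state = model.get((state, action), state)
        trajectory.append(state)
    return trajectory


if __name__ == "__main__":
    # Logged experience on a 1D track: (position, action, next position).
    logged = [
        (0, "right", 1),
        (1, "right", 2),
        (1, "right", 2),
        (1, "right", 1),  # one noisy sample where the move failed
        (2, "left", 1),
    ]
    model = fit_world_model(logged)
    print(imagine_rollout(model, 0, ["right", "right", "left"]))
```

The point of the design is that, once the model is fit, rollouts cost only computation, so an agent can rehearse as many action sequences as the hardware budget allows before a short real-world fine-tuning pass.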

General Intuition is not interested in selling the simulation gym itself. Instead, the company is building agentic models (AI brains capable of independent judgment and physical execution), which it plans to make available via an API by the end of this summer. By shifting the burden of learning from the physical world to the far more scalable digital world, the company aims to accelerate the pace at which robotic intelligence can evolve.