A user types the word "disregard" into the search bar of Google AI Overview, expecting a standard dictionary definition. Instead of a linguistic explanation, the AI returns a jarring system message: "Understood. Let me know whenever you have a new prompt or question!" The interaction ends there, leaving the user with a blank space where a definition should be. It is a momentary glitch, a digital hiccup in a system designed to organize the world's information, but it points to a deeper, more systemic instability.
The Spelling Failures of a Generative Giant
These errors are not isolated incidents of software instability. When asked how many Ps are in the word "Google," the AI confidently asserts there are two. When questioned about the number of Rs in the word "poop," it insists there is one. The failures extend to basic orthography as well. The AI renders "journalism" as "journadism" and, while correctly identifying that the surname of a former U.S. president contains one P, it spells the name as "trpum." There is a profound irony in a system that can generate complex application code and solve high-level mathematical problems yet struggles with tasks a kindergartner could master.
This pattern of failure emerges at a critical juncture for the company. Google has spent 29 years building the world's dominant search engine, and it is now aggressively pivoting toward a generative AI-centric model. However, these basic errors suggest a fundamental ceiling in how these models process text. This is not a new struggle for AI Overview. The service previously gained notoriety for suggesting that users eat rocks or apply non-toxic glue to pizza cheese, often because it misinterpreted satirical posts from Reddit or The Onion as factual guidance.
Google has acknowledged that counting characters within a word is a technical limitation that current large language models have yet to overcome. While the company maintains it is working on a fix, the industry consensus is that this is not a bug that can be patched with a simple update. The problem is not a lack of data or a failure in training, but rather a consequence of the very architecture that makes these models powerful.
The Tokenization Gap and the Illusion of Reading
To understand why an AI cannot count the letters in its own name, one must understand that AI does not read text the way humans do. A human reads a word by scanning individual letters and combining them into a recognizable shape. In contrast, the Transformer models powering Google AI process text through a method called tokenization. Instead of seeing letters, the AI breaks text into chunks known as tokens, which are then converted into numerical values.
When the AI encounters the word "the," it does not see T, H, and E. It sees a single unique numerical code that represents the entire token "the." The individual characters are effectively erased during the encoding process, replaced by a mathematical representation of the word's meaning and its relationship to other words in a high-dimensional space. Consequently, asking an AI to count letters is like asking a person to describe the chemical composition of a brick while they are only allowed to see the brick as a single, solid object. The AI knows what the brick is and where it fits in the wall, but it has no inherent visibility into the grains of sand that compose it.
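The effect can be sketched in a few lines of Python. This is a toy illustration only: the four-entry vocabulary, the ID numbers, and the function names are invented for the example and do not reflect the tokenizer of any real Google model. It shows that once text becomes a list of token IDs, the letters are no longer present in what the model receives.

```python
# Toy illustration only: a hypothetical vocabulary, not any real model's tokenizer.
TOY_VOCAB = {"the": 0, "goo": 1, "gle": 2, "brick": 3}


def encode(text, vocab):
    """Greedy longest-match encoding of text into a list of token IDs."""
    lowered = text.lower()
    ids = []
    i = 0
    while i < len(lowered):
        match = None
        # Try longer vocabulary entries first so the biggest chunk wins.
        for piece in sorted(vocab, key=len, reverse=True):
            if lowered.startswith(piece, i):
                match = piece
                break
        if match is None:
            raise ValueError(f"no token covers {lowered[i]!r}")
        ids.append(vocab[match])
        i += len(match)
    return ids


def count_letter_from_text(text, letter):
    """Counting letters is trivial when the raw characters are available."""
    return text.lower().count(letter.lower())


google_ids = encode("Google", TOY_VOCAB)  # two chunks: "goo" + "gle"
the_ids = encode("the", TOY_VOCAB)        # one chunk: "the"

print(google_ids, the_ids)
print(count_letter_from_text("Google", "g"))
```

The ID list for "Google" holds two integers and nothing else. Any answer about the letters inside those chunks has to come from patterns learned in training, not from inspecting the input.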
Matthew Guzdial, a professor at the University of Alberta, explains that the AI treats words as numerical blocks. It can determine the precise semantic role of a word within a sentence, but it remains blind to the physical components of that word. The AI is not reading; it is calculating patterns. In this numerical environment, the number of letters in a word is not a piece of data the model naturally tracks.
This design choice is a trade-off for efficiency. Sheridan Feucht, a researcher at Northeastern University, notes that creating a perfect tokenizer is virtually impossible because the model requires chunking to process vast amounts of data quickly. To maintain operational speed and handle long contexts, the AI must group information into larger clusters. The inability to spell is the direct result of a system optimized for rapid context acquisition over granular character recognition.
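A rough sketch of that trade-off, under simplifying assumptions: the example splits on whitespace as a stand-in for chunking (real tokenizers use subword pieces, not whole words), and it uses the squared sequence length as a crude proxy for the pairwise comparisons a standard Transformer attention layer makes. The sentence and variable names are chosen for illustration.

```python
# Simplified comparison: character-level units versus whitespace chunks.
# Whitespace splitting is an assumption standing in for real subword tokenization.
sentence = "Google is pivoting toward a generative AI-centric model"

char_units = list(sentence)    # one unit per character, spaces included
word_units = sentence.split()  # one unit per whitespace-separated chunk

# Standard self-attention compares every position with every other position,
# so the squared length is a rough proxy for the work per layer.
char_pairs = len(char_units) ** 2
word_pairs = len(word_units) ** 2

print(len(char_units), len(word_units))
print(char_pairs, word_pairs)
```

Fewer, larger units mean far fewer comparisons for the same sentence, which is the efficiency the chunking buys. The cost is that the characters inside each unit are no longer separately visible.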
Because the AI operates on probabilistic patterns rather than literal truth, it does not realize it is guessing when it tells a user that Google has two Ps. It is simply generating the most likely sequence of tokens based on its training, even if that sequence is factually incorrect. This structural blind spot confirms that while AI can simulate intelligence and logic, it lacks a fundamental grasp of the physical reality of the text it generates. The necessity for human cross-verification remains absolute because the AI is not verifying facts, but calculating probabilities.
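The following minimal sketch illustrates the idea of choosing the most likely continuation. The probabilities are invented for the example and are not measurements from any real model, and production systems typically sample from the distribution rather than always taking the top entry.

```python
# Hypothetical next-token probabilities for a prompt asking how many Ps
# are in "Google". These numbers are made up for illustration.
next_token_probs = {"two": 0.46, "one": 0.31, "zero": 0.18, "three": 0.05}


def greedy_pick(probs):
    """Return the token with the highest probability."""
    return max(probs, key=probs.get)


answer = greedy_pick(next_token_probs)

# What a direct check of the characters would report.
number_words = {0: "zero", 1: "one", 2: "two", 3: "three"}
actual = number_words["Google".lower().count("p")]

print(answer, actual)
```

Nothing in the selection step consults the spelling of the word. The likeliest token wins whether or not it matches what a character-by-character count would produce.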
This gap between semantic understanding and literal accuracy defines the current era of generative AI, where the most sophisticated tools in history can still be defeated by a simple spelling bee.




