Every election cycle, voters are greeted by a deluge of glossy pamphlets and high-decibel promises. For the average citizen, these documents are often a wall of abstract rhetoric: vague commitments to improve the local economy or enhance public welfare, with little in the way of concrete execution plans. In the rush of a campaign, granular but critical issues like the expansion of childcare centers or the repair of neighborhood roads are frequently buried under grand narratives. This information asymmetry leaves voters unable to compare candidates or hold them accountable, because the sheer volume of text makes manual analysis impractical.

The AI Pipeline for Massive Political Datasets

To bridge this gap, a research team from Korea University's Department of Political Science and International Relations developed Show GN, an AI-driven analysis service designed to decode the complexities of political promises. The scale of the project is immense, encompassing 550,000 pledges made by candidates in every local election held since 2006. Processing this volume of data required more than just a simple search tool; it required a sophisticated automated pipeline capable of turning fragmented, unstructured documents into a structured dataset.

The process begins with Optical Character Recognition (OCR) models that extract digital text from the images of official election brochures. Once the text is digitized, the system isolates individual pledges and employs embedding-based agenda analysis to categorize the content. This transformation converts physical paper documents into a machine-readable format that allows for quantitative analysis.
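The steps after OCR can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the team's implementation: the agenda names and seed words are hypothetical, pledges are assumed to sit one per line in the OCR output, and a bag-of-words vector stands in for a real embedding model.

```python
import math
import re
from collections import Counter

# Hypothetical agenda categories with seed descriptions (not an actual codebook).
AGENDA_SEEDS = {
    "childcare": "childcare daycare center children parents nursery",
    "roads": "road repair pavement traffic street bridge",
    "economy": "jobs economy business investment market startup",
}

def embed(text: str) -> Counter:
    """Toy embedding: word counts. A real system would call an embedding model here."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def split_pledges(ocr_text: str) -> list[str]:
    """Isolate pledges, assuming one pledge per non-empty line of OCR output."""
    lines = (line.strip(" -•\t") for line in ocr_text.splitlines())
    return [line for line in lines if line]

def classify(pledge: str) -> tuple[str, float]:
    """Assign the pledge to the agenda whose seed text is most similar."""
    vec = embed(pledge)
    scores = {name: cosine(vec, embed(seed)) for name, seed in AGENDA_SEEDS.items()}
    best = max(scores, key=scores.get)
    return (best, scores[best]) if scores[best] > 0 else ("unclassified", 0.0)

if __name__ == "__main__":
    brochure = "- Build two new daycare centers for working parents\n• Repair the road near the market\n"
    for pledge in split_pledges(brochure):
        print(pledge, "->", classify(pledge))
```

Swapping `embed` for a call to a sentence-embedding model would leave the rest of the sketch unchanged, which is the main reason to keep the similarity step separate from the vectorization step.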

To keep the analysis academically rigorous and internationally comparable, the team implemented a dual classification system. They used the Comparative Manifestos Project (CMP) scheme, a widely used standard in comparative political science, alongside a specialized KMP classification system tailored to the administrative structures of South Korea. By combining the two frameworks, Show GN can categorize each pledge through both a global lens and a localized context, so that distinctions specific to Korean local politics are not flattened by a cross-national coding scheme.
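One way to picture dual classification is a record that carries a label from each scheme, so the same pledges can be tallied either way. The sketch below is an assumption about the data shape, not the project's schema: the class name, field names, and label values are all placeholders rather than real CMP or KMP codes.

```python
from collections import Counter
from dataclasses import dataclass

@dataclass(frozen=True)
class CodedPledge:
    text: str
    cmp_label: str  # label under the cross-national scheme (placeholder value)
    kmp_label: str  # label under the Korea-specific scheme (placeholder value)

def tally(pledges: list[CodedPledge], scheme: str) -> Counter:
    """Count pledges per label under one scheme, either 'cmp' or 'kmp'."""
    if scheme not in ("cmp", "kmp"):
        raise ValueError("scheme must be 'cmp' or 'kmp'")
    return Counter(getattr(p, f"{scheme}_label") for p in pledges)

if __name__ == "__main__":
    pledges = [
        CodedPledge("Open more public daycare centers", "welfare", "childcare"),
        CodedPledge("Expand home visits for seniors", "welfare", "senior_care"),
    ]
    print(tally(pledges, "cmp"))  # both pledges fall under one broad label
    print(tally(pledges, "kmp"))  # the local scheme keeps them apart
```

The example shows why two schemes can be useful together: pledges that collapse into one broad cross-national category may still be separated by the finer local one.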

From Text Extraction to Strategic Intelligence

While the extraction of 550,000 pledges is a technical feat, the true innovation of Show GN lies in how it interprets that data. The research team did not rely on a single model; instead, they combined the strengths of GPT and Gemini Pro, applying fine-tuning to optimize the models for political discourse. This hybrid approach allowed the team to accelerate the development cycle, moving from raw data collection to a fully functional service almost immediately after the National Election Commission released its data.

The system introduces a specialized AI persona known as Zhuge Gong-yak, named after the legendary strategist Zhuge Liang. This persona does not merely summarize text; it performs comparative analysis between candidates and identifies the strategic characteristics of their platforms. By analyzing the delta between different candidates' promises, the AI can highlight who is focusing on infrastructure versus social welfare, or who is offering specific metrics versus vague aspirations.
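The kind of comparison described here can be approximated with simple arithmetic once pledges are categorized. The following is a minimal sketch under assumptions: the function names and category labels are hypothetical, and the presence of a digit is used as a crude proxy for a pledge that states a specific metric. It does not reproduce how the AI persona actually reasons.

```python
import re
from collections import Counter

Pledge = tuple[str, str]  # (category, pledge text)

def category_shares(pledges: list[Pledge]) -> dict[str, float]:
    """Share of a candidate's pledges that fall in each category."""
    counts = Counter(category for category, _ in pledges)
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()} if total else {}

def share_delta(a: list[Pledge], b: list[Pledge]) -> dict[str, float]:
    """Per-category difference in emphasis: positive means candidate A stresses it more."""
    shares_a, shares_b = category_shares(a), category_shares(b)
    categories = sorted(set(shares_a) | set(shares_b))
    return {c: shares_a.get(c, 0.0) - shares_b.get(c, 0.0) for c in categories}

def specificity(pledges: list[Pledge]) -> float:
    """Fraction of pledges containing a digit, a rough signal of a measurable target."""
    if not pledges:
        return 0.0
    return sum(bool(re.search(r"\d", text)) for _, text in pledges) / len(pledges)

if __name__ == "__main__":
    candidate_a = [
        ("infrastructure", "Repave 12 km of roads"),
        ("infrastructure", "Build a new bridge"),
        ("welfare", "Open 3 daycare centers"),
        ("welfare", "Support seniors"),
    ]
    candidate_b = [
        ("welfare", "Improve welfare"),
        ("welfare", "Expand care"),
        ("welfare", "Help families"),
        ("infrastructure", "Better roads"),
    ]
    print(share_delta(candidate_a, candidate_b))
    print(specificity(candidate_a), specificity(candidate_b))
```

In this toy data, candidate A leans toward infrastructure and attaches numbers to half of the pledges, while candidate B leans toward welfare and gives no numbers at all.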

This shift changes how political accountability works. In previous elections, the tension was between the politician's rhetoric and the voter's memory. Now that tension moves toward data integrity. By converting unstructured text into structured metrics, Show GN lets citizens judge the feasibility of a promise by the consistency of the data rather than the charisma of the speaker.

The project is now expanding its scope beyond the campaign trail. The research team is developing a tracking mechanism to monitor whether these pledges actually translate into the enactment of local ordinances and legislation. By linking campaign promises to legislative outcomes, the system will provide a data-driven audit of political fulfillment.
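The article does not describe how pledges will be matched to ordinances, so the sketch below is purely illustrative. It assumes a simple word-overlap (Jaccard) score between a pledge and an ordinance title, with a hypothetical threshold of 0.3. A production system would likely need semantic matching and human review.

```python
import re
from typing import Optional

def tokens(text: str) -> set[str]:
    """Lowercased word set of a text."""
    return set(re.findall(r"[a-z]+", text.lower()))

def jaccard(a: str, b: str) -> float:
    """Word-overlap similarity: shared words divided by all distinct words."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0

def link_pledges(pledges: list[str], ordinances: list[str],
                 threshold: float = 0.3) -> dict[str, Optional[str]]:
    """Map each pledge to its best-matching ordinance title, or None if no match is close enough."""
    links: dict[str, Optional[str]] = {}
    for pledge in pledges:
        best = max(ordinances, key=lambda title: jaccard(pledge, title), default=None)
        matched = best is not None and jaccard(pledge, best) >= threshold
        links[pledge] = best if matched else None
    return links

if __name__ == "__main__":
    pledges = ["Expand public childcare centers", "Build a new stadium"]
    ordinances = [
        "Ordinance on public childcare centers support",
        "Ordinance on road maintenance",
    ]
    for pledge, ordinance in link_pledges(pledges, ordinances).items():
        print(pledge, "->", ordinance)
```

Pledges that map to `None` are the interesting output of such an audit, since they are the promises with no visible legislative follow-through in the data.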

Political promises are no longer static words on a page but dynamic data points that can be tracked, compared, and verified.