Related Research

Web Edition (EN) | Updated March 12, 2026

Context

The Grounding Page Standard defines an architecture for entity-centric web pages optimized for AI retrieval. It was developed from practical experience with AI systems and structured data, not from a single academic study.

However, a growing body of research is beginning to investigate similar questions: How should web content be structured for AI systems? What role does structured data play in Retrieval-Augmented Generation? And do entity-centric page designs improve answer quality?

This page collects academic and independent research that supports similar architectural principles. Each entry includes a summary of the study, its key findings, and an assessment of how it relates to the Grounding Page Standard.

Important note: None of these studies directly evaluate Grounding Pages. The connections described here reflect architectural alignment, not empirical validation of this standard. The wording on this page uses terms like "aligns with", "supports similar principles", and "mirrors" to make this distinction clear.

Structured Linked Data as a Memory Layer for Agent-Orchestrated Retrieval

Authors: Andrea Volpini, Elie Raad, Beatrice Gamba, David Riccitelli
Published: 2026
Source: arXiv:2603.10700

Summary

This paper investigates whether structured linked data can improve retrieval accuracy in RAG systems. The authors tested different document representations across four domains (editorial, legal, travel, e-commerce) using Vertex AI Vector Search 2.0 and Google's Agent Development Kit. Seven experimental conditions were evaluated, comparing plain HTML, HTML with JSON-LD markup, and enhanced entity pages.

Key Findings

The study found that adding JSON-LD markup to existing HTML pages produced only modest improvements in retrieval accuracy. The significant gains came from a different approach: restructuring the content itself into dedicated entity pages where facts, properties and relationships are directly visible and navigable in the page content.

These "enhanced entity pages" achieved a +29.6% accuracy improvement for standard RAG and +29.8% for the full agentic pipeline compared to plain HTML. The best-performing variant (Enhanced+) reached 4.85 out of 5.0 in accuracy and 4.55 out of 5.0 in completeness. The study follows the principle of one page per entity.

Relevance for the Grounding Page Standard

This paper provides the closest architectural parallel to Grounding Pages found in academic literature so far. Several design decisions align directly:

  • Visible content as primary source: The study's results suggest that AI retrieval systems extract information primarily from visible page content, not from metadata or markup alone. The Grounding Page Standard follows the same principle: the visible text is the authoritative source, with JSON-LD mirroring it as a secondary layer.
  • One page per entity: The enhanced entity pages in the study follow the same structural unit as Grounding Pages: each page describes exactly one entity with its facts, properties and relationships.
  • Structured navigation: The study's best-performing variant includes breadcrumbs and navigable relationships between entities. The Grounding Page Standard requires a similar structure through its entity ontology and cross-referencing rules.
  • JSON-LD as complement, not substitute: The finding that JSON-LD alone produces only marginal improvements supports the Grounding Page Standard's position that structured data must mirror visible content, not replace it.

What this does not mean: The study does not evaluate Grounding Pages specifically. The "enhanced entity pages" were built for a different context (Linked Data Platform) and tested in a controlled environment. The architectural alignment is notable, but direct transferability should not be assumed without further research.
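
The "JSON-LD as complement, not substitute" principle can be illustrated with a small consistency check. The following is a minimal sketch, not part of the Grounding Page Standard or of the cited study: the function name `unmirrored_properties`, the sample page text, and the sample markup are all hypothetical. It assumes a flat JSON-LD object with string values and uses plain substring matching, which is far cruder than a real validator would need to be.

```python
import json


def unmirrored_properties(visible_text: str, jsonld: dict) -> list[str]:
    """Return JSON-LD property names whose string values are absent from the visible text."""
    # Normalize case and whitespace so line breaks in the page do not cause misses.
    haystack = " ".join(visible_text.lower().split())
    missing = []
    for key, value in jsonld.items():
        if key.startswith("@"):
            continue  # skip JSON-LD keywords such as @context and @type
        values = value if isinstance(value, list) else [value]
        for item in values:
            if isinstance(item, str) and " ".join(item.lower().split()) not in haystack:
                missing.append(key)
                break
    return missing


# Hypothetical entity page: visible text plus the JSON-LD that should mirror it.
visible_text = "Example Corp is a software company founded in 2010 in Vienna."
jsonld = json.loads("""{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Example Corp",
  "foundingDate": "2010",
  "foundingLocation": "Vienna",
  "slogan": "The best software on earth"
}""")

print(unmirrored_properties(visible_text, jsonld))  # → ['slogan']
```

In this sketch the `slogan` value is flagged because it appears only in the markup. Under the mirroring principle described above, such a value would either be added to the visible text or removed from the JSON-LD.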

Generative Engine Optimization: How to Dominate AI Search

Authors: Mahe Chen, Xiaoxuan Wang, Kaiwen Chen, Nick Koudas
Published: September 2025
Source: arXiv:2509.08919

Summary

This paper examines how generative AI search engines (ChatGPT, Perplexity, Gemini) differ from traditional search in sourcing information. It introduces "Generative Engine Optimization" (GEO) as a framework for understanding visibility in AI-generated answers.

Key Findings

The study identifies a strong bias in AI search systems toward earned media (third-party, authoritative sources) over brand-owned content. It also reveals that AI engines differ significantly from each other in domain diversity, freshness, cross-language stability, and sensitivity to phrasing. This variation makes engine-specific optimization impractical and points toward a single, well-structured source of truth as a more sustainable strategy.

The research recommends engineering content for machine scannability and building structured technical foundations rather than relying on traditional marketing content.

Relevance for the Grounding Page Standard

The GEO research supports two foundational assumptions of the Grounding Page Standard:

  • Structured content over marketing language: The study's recommendation to engineer content for machine scannability, rather than rely on traditional marketing content, aligns with the Grounding Page Standard's core rule: no adjectives, no marketing claims, one fact per sentence.
  • One source of truth across engines: The significant variation across AI engines supports the decision to create a single authoritative page per entity rather than attempting platform-specific optimization. A Grounding Page serves as this centralized definition.

What this does not mean: The GEO study does not evaluate entity pages or structured data formats. Its focus is on source selection behavior in AI engines. The connection to Grounding Pages is at the strategic level (how to position content for AI), not at the implementation level.
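
The "no marketing claims" part of the core rule mentioned above lends itself to a rough automated check. The sketch below is an assumption-laden illustration, not a tool defined by the standard or by the GEO study: the word list `MARKETING_TERMS` and the function `flag_marketing_sentences` are hypothetical, and a fixed word list cannot detect adjectives in general or verify that a sentence carries exactly one fact.

```python
import re

# Hypothetical word list for illustration only; no official list is implied.
MARKETING_TERMS = {"leading", "best", "innovative", "world-class", "revolutionary"}


def flag_marketing_sentences(text: str) -> list[tuple[str, list[str]]]:
    """Return (sentence, matched terms) for each sentence containing a listed term."""
    # Naive sentence split on ., ! or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    flagged = []
    for sentence in sentences:
        # Extract lowercase words, keeping hyphenated terms such as "world-class" intact.
        words = re.findall(r"[a-z]+(?:-[a-z]+)*", sentence.lower())
        hits = sorted(set(words) & MARKETING_TERMS)
        if hits:
            flagged.append((sentence, hits))
    return flagged


sample = (
    "Example Corp was founded in 2010. "
    "Example Corp is the leading provider of innovative tools."
)
for sentence, hits in flag_marketing_sentences(sample):
    print(hits, "->", sentence)
```

With the sample text, only the second sentence is flagged, for the terms "innovative" and "leading". A check like this can support an editorial review, but it does not replace one.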

The Importance of About Pages for Digital Identity

Author: Wai Kay (waikay.io)
Published: 2025
Source: waikay.io

Summary

This independent analysis examined more than 17,000 URLs across domains and entity types. It investigated which page types AI systems tend to reference when interpreting brands and entities.

Key Findings

The analysis found patterns suggesting that AI systems frequently draw on clear identity pages — particularly About pages — when constructing brand interpretations. Pages with a clear factual structure and consistent self-description appeared to serve as anchor points for entity understanding.

Relevance for the Grounding Page Standard

This observation supports the conceptual premise of Grounding Pages: that organizations benefit from maintaining a dedicated, clearly structured page that defines what an entity is. The Grounding Page Standard can be understood as a structured evolution of the classical About page — formalized for machine readability and optimized for AI retrieval.

What this does not mean: This is an independent practitioner analysis, not a peer-reviewed study. The observed patterns are correlational. The analysis does not test the Grounding Page format specifically, and the causal relationship between page structure and AI interpretation requires further investigation.

How to Read This Page

This page is maintained as a living document. As new research on AI retrieval, entity representation, and structured data emerges, relevant studies will be added.

The selection criteria for inclusion are:

  • The research addresses a question relevant to entity-centric web content and AI retrieval.
  • The findings support, challenge, or add nuance to the architectural principles behind the Grounding Page Standard.
  • The source is either peer-reviewed, published on a recognized preprint server, or based on a substantial independent dataset.

If you are aware of research that should be considered for this page, please get in touch.
