Schema markup helps AI engines cite your content — but not in the way most SEO advice suggests. AI models do not semantically parse JSON-LD; they treat it as text during retrieval. The real mechanism is indirect: schema enriches search engine indexes (particularly Bing's), which then feed AI response generation. The largest independent study (Growth Marshal, n=730 citations) found that attribute-rich schema earns a 61.7% citation rate — but generic, minimally populated schema actually underperforms having no schema at all (41.6% vs 59.8%). At Whitehat SEO, we implement evidence-based schema strategies that support AI citation without the myths.
AI engines process schema markup indirectly — through search engine index enrichment — not by parsing JSON-LD semantically. A February 2026 experiment by Mark Williams-Cook placed address data exclusively inside invalid JSON-LD. Both ChatGPT and Perplexity extracted the address, confirming that LLMs simply ingest all HTML text, including <script> blocks, without validating structure.
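A minimal sketch of why invalid JSON-LD can still surface in answers, assuming a system that ingests raw HTML text. The page, address, and helper names below are hypothetical, and this is an illustration of the idea, not how any particular AI crawler is implemented.

```python
import json
import re

# Hypothetical page: the address exists only inside a JSON-LD block that is
# deliberately invalid (trailing comma), loosely mirroring the experiment's setup.
HTML = """
<html><head>
<script type="application/ld+json">
{"@type": "PostalAddress", "streetAddress": "1 Example Street", "addressCountry": "GB",}
</script>
</head><body><p>Contact us for details.</p></body></html>
"""


def jsonld_payloads(html: str) -> list[str]:
    """Return the raw text of every JSON-LD script block."""
    pattern = r'<script[^>]*type="application/ld\+json"[^>]*>(.*?)</script>'
    return re.findall(pattern, html, flags=re.DOTALL | re.IGNORECASE)


def is_valid_json(payload: str) -> bool:
    """True only if a strict JSON parser accepts the payload."""
    try:
        json.loads(payload)
        return True
    except json.JSONDecodeError:
        return False


payload = jsonld_payloads(HTML)[0]
parser_accepts = is_valid_json(payload)              # a strict parser rejects the block
text_contains_address = "1 Example Street" in HTML   # a plain text pass still sees it
```

A structured-data parser would discard this block, while anything reading the page as text still has the address available.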
The mechanism that actually matters is: schema → search engine index enrichment → AI grounding. Your schema feeds Google's Knowledge Graph and Bing's entity index. When ChatGPT, Copilot, or Google AI Overviews generate responses, they draw from these enriched indexes. This indirect pathway explains why schema matters despite AI models not parsing it as structured data.
Microsoft / Bing / Copilot
The only major platform to officially confirm schema helps its LLMs. Fabrice Canel (Principal Product Manager, Bing) stated at SMX Munich 2025 that "schema markup helps Microsoft's LLMs understand your content." Since ChatGPT and Copilot both use Bing's index, this is directly relevant.
Google AI Overviews
Uses structured data "to understand the content of the page" but has explicitly stated: "There's no special schema.org structured data that you need to add" for AI features. Schema feeds the Knowledge Graph, which informs AI response selection — an indirect but real benefit.
ChatGPT / Perplexity
No public statement on schema from either platform. OpenAI's crawlers do not execute JavaScript. A searchVIU experiment (December 2025) found product pricing present only in JSON-LD was not extracted by any tested AI system. Perplexity's bot found only 12.5% of test data points.
The difference between schema that helps and schema that hurts comes down to attribute richness — not schema type. Growth Marshal's peer-reviewed study (February 2026, n=730 citations, DOI-published) found that only schema with every relevant attribute populated earns a citation advantage.
| Schema condition | Citation rate | What it covers |
|---|---|---|
| Attribute-rich schema | 61.7% | Populated pricing, ratings, specs |
| No schema at all | 59.8% | Pages without any structured data |
| Generic schema | 41.6% | Basic Article, Organization, BreadcrumbList |
Source: Growth Marshal, February 2026 (n=730 AI citations, 1,000+ pages, DOI-published)
The advantage is most pronounced for lower-authority domains (DR ≤ 60): attribute-rich schema achieves a 54.2% citation rate versus 31.8% for generic, a 22.4-percentage-point lift. Among high-authority domains (DR > 75), the gap narrows considerably because domain authority dominates. The practical implication: if your site doesn't already have strong domain authority, well-implemented schema offers a genuine competitive advantage. If your DR is already high, attribute-rich schema adds less, and generic schema may still work against you.
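The percentage-point gaps in this section can be reproduced from the reported citation rates. The figures are those quoted above; the variable and function names are only illustrative.

```python
# Citation rates as reported in the article (percent).
overall = {"attribute_rich": 61.7, "no_schema": 59.8, "generic": 41.6}
low_authority = {"attribute_rich": 54.2, "generic": 31.8}  # domains with DR <= 60


def point_gap(a: float, b: float) -> float:
    """Difference in percentage points, rounded to one decimal place."""
    return round(a - b, 1)


rich_vs_none = point_gap(overall["attribute_rich"], overall["no_schema"])
generic_vs_none = point_gap(overall["generic"], overall["no_schema"])
low_dr_gap = point_gap(low_authority["attribute_rich"], low_authority["generic"])
```

Attribute-rich schema sits only about two points above no schema overall, while generic schema sits roughly eighteen points below it. The larger gap appears within lower-authority domains.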
Industry Claims vs Evidence
"FAQPage schema gives 2.7× more citations" — This originates from Relixir, a vendor testing 50 domains using their own platform. SE Ranking's larger independent study found FAQ content blocks yielded roughly 11% more citations (4.9 vs 4.4) — not 170%. It's the FAQ content structure that matters, not the schema markup itself.
"@graph improves entity recognition by 300%" — This traces to a data.world study about Knowledge Graphs improving LLM accuracy on enterprise SQL queries. It has nothing to do with JSON-LD on web pages. No peer-reviewed research supports this claim for web schema.
| Schema Type | Evidence Level | AI Citation Impact | Recommendation |
|---|---|---|---|
| Product/Review (attribute-rich) | Strong | 61.7% citation rate (Growth Marshal) | Implement with all attributes populated |
| Article (with author, dates, wordCount) | Moderate | Supports E-E-A-T and freshness signals in index | Include dateModified, author with credentials |
| AggregateRating | Moderate | 89% of AI Mode products have 4.1–5 star ratings (SE Ranking) | Include with genuine review data only |
| Organization / Person | Correlational | Most common on cited pages (SALT.agency, 107K URLs) — but correlation, not causation | Implement with sameAs to Companies House, Wikidata |
| FAQPage | Overstated | ~11% lift from FAQ content structure (SE Ranking), not 170% | Focus on visible FAQ content; add schema as secondary signal |
| HowTo / SpeakableSpecification | Deprecated / No evidence | Google deprecated HowTo rich results; Speakable is beta-only, US English | Remove if present; focus effort elsewhere |
Use JSON-LD format, populate every relevant attribute, connect entities with consistent @id references, and ensure content-markup parity across every page. These four principles are supported by both Google and Bing documentation and reflect how AI systems actually consume indexed data.
JSON-LD Format
Google recommends JSON-LD when your setup allows it. Both head and body placement work — Google processes both identically. Note: SearchPilot A/B tests showed no traffic impact when switching from Microdata to JSON-LD, so the preference is practical, not algorithmic.
Attribute Richness
The single most important factor. Empty or minimal-field schema actively hurts citation rates. Populate pricing, ratings, specs, author credentials, dates, word counts — every property relevant to the schema type. If you cannot fill the attributes, do not add the schema.
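One way to apply the "fill it or skip it" rule is a coverage check before markup is emitted. This is a sketch under stated assumptions: the `RECOMMENDED` lists and the 0.8 threshold are hypothetical choices for illustration, not requirements from schema.org, Google, or the Growth Marshal study.

```python
# Hypothetical "relevant attribute" lists per type; adjust to your own content model.
RECOMMENDED = {
    "Product": ["name", "description", "brand", "offers", "aggregateRating", "sku"],
    "Article": ["headline", "author", "datePublished", "dateModified",
                "wordCount", "publisher"],
}


def is_populated(value) -> bool:
    """Treat None, empty strings, and empty containers as unpopulated."""
    if value is None:
        return False
    if isinstance(value, (str, list, dict)):
        return len(value) > 0
    return True


def attribute_coverage(node: dict) -> float:
    """Share of recommended attributes that are populated (0.0 to 1.0).

    Assumes @type is a single string; unknown types score 0.0.
    """
    expected = RECOMMENDED.get(node.get("@type"), [])
    if not expected:
        return 0.0
    filled = sum(1 for key in expected if is_populated(node.get(key)))
    return filled / len(expected)


def should_emit(node: dict, threshold: float = 0.8) -> bool:
    """Emit markup only when coverage meets the (arbitrary) threshold."""
    return attribute_coverage(node) >= threshold


generic = {"@type": "Article", "headline": "Schema for AI"}
rich = {
    "@type": "Article",
    "headline": "Schema for AI",
    "author": {"@type": "Person", "name": "Jane Example"},
    "datePublished": "2026-02-01",
    "dateModified": "2026-02-10",
    "wordCount": 2400,
    "publisher": {"@type": "Organization", "name": "Example Ltd"},
}
```

Here `should_emit(generic)` is false and `should_emit(rich)` is true, so a template using this gate would omit the sparse block rather than publish it.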
Entity Linking via @id
Use consistent @id references across your entire site (canonical_URL + #fragment). Link Author → Organisation → Articles → Topics. Add sameAs properties to Wikipedia, Wikidata, Companies House, and LinkedIn for disambiguation.
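The @id pattern above (canonical URL plus fragment) might look like the following when generated in code. The domain, names, and Wikidata URL are placeholders, and `dangling_references` is a hypothetical helper for catching references that point at nothing.

```python
import json

BASE = "https://www.example.co.uk"  # hypothetical canonical origin


def entity_id(path: str, fragment: str) -> str:
    """Build a stable @id as canonical URL + #fragment."""
    return f"{BASE}{path}#{fragment}"


ORG_ID = entity_id("/", "organization")
AUTHOR_ID = entity_id("/team/jane-example/", "person")

graph = {
    "@context": "https://schema.org",
    "@graph": [
        {"@type": "Organization", "@id": ORG_ID, "name": "Example Ltd",
         # Placeholder URL; replace with the entity's real Wikidata page.
         "sameAs": ["https://www.wikidata.org/wiki/PLACEHOLDER"]},
        {"@type": "Person", "@id": AUTHOR_ID, "name": "Jane Example",
         "worksFor": {"@id": ORG_ID}},
        {"@type": "Article", "@id": entity_id("/blog/schema-for-ai/", "article"),
         "headline": "How schema supports AI citation",
         "author": {"@id": AUTHOR_ID}, "publisher": {"@id": ORG_ID},
         "about": {"@type": "Thing", "name": "Schema markup"}},
    ],
}


def dangling_references(doc: dict) -> set[str]:
    """Return @id references that point at no node defined in the graph."""
    defined = {node["@id"] for node in doc["@graph"] if "@id" in node}
    referenced = set()
    for node in doc["@graph"]:
        for value in node.values():
            # A pure reference is a dict whose only key is @id.
            if isinstance(value, dict) and set(value) == {"@id"}:
                referenced.add(value["@id"])
    return referenced - defined


jsonld = json.dumps(graph, indent=2)
```

Defining each entity once and referring to it by @id everywhere else keeps Author, Organisation, and Article nodes consistent across templates.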
Content-Markup Parity
Google's structured data policies require: "Don't mark up content that is not visible to readers of the page." FAQ schema needs visible FAQ content. Bing warns that "putting spam data in the markup can hamper your presence." Violations can trigger manual actions.
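A rough parity check for FAQ markup can compare each marked-up question against the page's text. This sketch assumes one JSON-LD object per script block and treats any text outside script and style tags as visible; it does not detect content hidden with CSS. The class and function names are illustrative.

```python
import json
from html.parser import HTMLParser


class PageSplitter(HTMLParser):
    """Separate page text from JSON-LD payloads in one pass."""

    def __init__(self):
        super().__init__()
        self.visible, self.jsonld = [], []
        self._mode = None    # "jsonld", "skip", or None for ordinary text
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            script_type = (dict(attrs).get("type") or "").lower()
            self._mode = "jsonld" if script_type == "application/ld+json" else "skip"
            self._buffer = []
        elif tag == "style":
            self._mode = "skip"

    def handle_endtag(self, tag):
        if tag == "script" and self._mode == "jsonld":
            self.jsonld.append("".join(self._buffer))
        if tag in ("script", "style"):
            self._mode = None

    def handle_data(self, data):
        if self._mode == "jsonld":
            self._buffer.append(data)
        elif self._mode is None:
            self.visible.append(data)


def normalise(text: str) -> str:
    """Lowercase and collapse whitespace for a forgiving comparison."""
    return " ".join(text.lower().split())


def missing_faq_questions(html: str) -> list[str]:
    """Return FAQPage questions that do not appear in the page text."""
    splitter = PageSplitter()
    splitter.feed(html)
    splitter.close()
    visible = normalise(" ".join(splitter.visible))
    missing = []
    for payload in splitter.jsonld:
        data = json.loads(payload)
        if data.get("@type") != "FAQPage":
            continue
        for item in data.get("mainEntity", []):
            question = item.get("name", "")
            if normalise(question) not in visible:
                missing.append(question)
    return missing


PAGE = """
<html><head><script type="application/ld+json">
{"@context": "https://schema.org", "@type": "FAQPage", "mainEntity": [
  {"@type": "Question", "name": "Does schema help AI citations?",
   "acceptedAnswer": {"@type": "Answer", "text": "Indirectly."}},
  {"@type": "Question", "name": "Is schema a ranking factor?",
   "acceptedAnswer": {"@type": "Answer", "text": "Not directly."}}
]}
</script></head>
<body><h2>Does schema help AI citations?</h2><p>Indirectly.</p></body></html>
"""
```

Calling `missing_faq_questions(PAGE)` flags the second question, which is marked up but never shown to readers.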
2025–2026 Schema Deprecations
In June 2025, Google deprecated seven structured data types: CourseInfo, ClaimReview, EstimatedSalary, LearningVideo, SpecialAnnouncement, VehicleListing, and Book Actions. PracticeProblem was deprecated from January 2026. FAQ rich results remain restricted to authoritative government and health websites. HowTo rich results have been deprecated for both desktop and mobile. Core types — Product, Organization, Article, Review, BreadcrumbList — remain fully supported and relevant for AI.
The most common schema mistake is adding markup without populating its attributes: generic schema carries an 18-percentage-point citation penalty compared with having no schema (41.6% vs 59.8%). It is the first of five implementation errors that consistently undermine AI visibility.
1. Generic / Empty Schema
Schema with only required fields, broad types instead of specific subtypes, and no sameAs, about, or knowsAbout properties. Signals template-generated content to AI systems. Fix: Use the most specific type available and populate every relevant attribute.
2. Schema Without Visible Content
Adding FAQPage schema without an actual FAQ on the page. Triggers Google's "Spammy Structured Markup" manual action, removing rich result eligibility. Recovery requires a reconsideration request and weeks of delay. Fix: Every schema claim must match visible page content.
3. Duplicate / Conflicting Blocks
Multiple CMS plugins generating overlapping schema. In HubSpot, template-level, module-level, and global schema can all output simultaneously. Fix: Audit rendered HTML (not just source), choose one implementation method, and disable conflicts.
4. Deprecated Schema Still In Place
HowTo, ClaimReview, and FAQ (on non-authoritative sites) schema is simply ignored by Google. Leaving it in place creates a false sense of optimisation and signals unmaintained code. Fix: Audit and remove deprecated types; redirect effort to supported schema.
5. Missing sameAs Entity Links
SALT.agency found fewer than 4% of schema-present pages link to Wikidata via sameAs. Without external entity references, AI engines cannot confidently disambiguate your organisation from others with similar names. Fix: Add sameAs to Companies House, Wikipedia, Wikidata, and LinkedIn at minimum.
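Several of the mistakes above can be caught with a simple audit of rendered HTML. The sketch below is a heuristic under stated assumptions: `DEPRECATED_TYPES` is an illustrative subset, a repeated top-level type is only a hint of plugin overlap (some pages legitimately repeat a type), and the sample markup and registry URL are hypothetical.

```python
import json
import re
from collections import Counter

# Illustrative subset of types the article describes as deprecated or ignored.
DEPRECATED_TYPES = {"HowTo", "ClaimReview", "SpecialAnnouncement"}

JSONLD_RE = re.compile(
    r'<script[^>]*type=["\']application/ld\+json["\'][^>]*>(.*?)</script>',
    re.DOTALL | re.IGNORECASE,
)


def extract_nodes(rendered_html: str) -> list[dict]:
    """Parse every JSON-LD block and flatten @graph arrays into one node list."""
    nodes = []
    for payload in JSONLD_RE.findall(rendered_html):
        try:
            data = json.loads(payload)
        except json.JSONDecodeError:
            continue  # skipped here; a fuller audit should report invalid blocks
        items = data if isinstance(data, list) else [data]
        for item in items:
            nodes.extend(item.get("@graph", [item]))
    return nodes


def audit(rendered_html: str) -> dict:
    """Report repeated top-level types, deprecated types, and missing sameAs."""
    nodes = extract_nodes(rendered_html)
    type_counts = Counter(
        n.get("@type") for n in nodes if isinstance(n.get("@type"), str)
    )
    return {
        "duplicate_types": sorted(t for t, c in type_counts.items() if c > 1),
        "deprecated_types": sorted(set(type_counts) & DEPRECATED_TYPES),
        "missing_same_as": sorted(
            n.get("name", "(unnamed)") for n in nodes
            if n.get("@type") in ("Organization", "LocalBusiness")
            and not n.get("sameAs")
        ),
    }


RENDERED = """
<script type="application/ld+json">{"@type": "Organization", "name": "Example Ltd"}</script>
<script type="application/ld+json">{"@graph": [
  {"@type": "Organization", "name": "Example Ltd",
   "sameAs": ["https://example.org/registry/123"]},
  {"@type": "HowTo", "name": "Legacy guide"}
]}</script>
"""

report = audit(RENDERED)
```

For this sample, the report lists Organization as duplicated, HowTo as deprecated, and one Organization block without sameAs. Run it against the rendered DOM rather than the template source, since plugin output often appears only after rendering.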
UK B2B companies face specific schema requirements around address formatting, currency, regulatory identifiers, and entity type selection that differ from US-centric best practices. Getting these details right strengthens your entity signals for AI engines that need to disambiguate UK businesses.
Address & Currency
Use addressCountry: "GB" (ISO 3166-1 alpha-2) — never "UK" or "United Kingdom". For addressRegion, use county or city (the UK lacks US-style states). For London, use "addressRegion": "London".
For pricing, use ISO 4217 code "GBP" with a full stop as decimal separator. Never include the £ symbol in the price value. For bespoke B2B pricing, use priceRange: "££" rather than fabricating figures.
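The address and currency rules above are easy to enforce with a small pre-publish check. This is a sketch only: the function names, messages, and sample values (including the postcode) are hypothetical, and the price pattern assumes at most two decimal places.

```python
import re


def uk_address_issues(address: dict) -> list[str]:
    """Flag a PostalAddress whose country is not the ISO 3166-1 alpha-2 code."""
    issues = []
    if address.get("addressCountry") != "GB":
        issues.append('addressCountry should be "GB"')
    return issues


def offer_issues(offer: dict) -> list[str]:
    """Flag Offer values that break the currency and price conventions."""
    issues = []
    if offer.get("priceCurrency") != "GBP":
        issues.append('priceCurrency should be "GBP"')
    price = str(offer.get("price", ""))
    # Digits with an optional full-stop decimal part; no symbols or commas.
    if not re.fullmatch(r"\d+(\.\d{1,2})?", price):
        issues.append("price should be a plain number such as 1500.00")
    return issues


address = {
    "@type": "PostalAddress",
    "streetAddress": "1 Example Street",
    "addressLocality": "London",
    "addressRegion": "London",
    "postalCode": "EC1A 1BB",  # placeholder postcode
    "addressCountry": "GB",
}
bad_address = {**address, "addressCountry": "UK"}
good_offer = {"@type": "Offer", "price": "1500.00", "priceCurrency": "GBP"}
bad_offer = {"@type": "Offer", "price": "£1,500", "priceCurrency": "GBP"}
```

`offer_issues(bad_offer)` reports the price format problem, while the well-formed address and offer return empty lists.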
Entity Types & sameAs
Note: ProfessionalService is deprecated (Schema.org GitHub Issues #801, #1283). Use Organization on the homepage with LocalBusiness on location pages. For regulated sectors: LegalService (law firms, include SRA number), FinancialService (FCA-regulated), MedicalBusiness (healthcare).
Essential UK sameAs: Companies House URL (highest-trust UK government signal), LinkedIn, Trustpilot UK (DA 91), and sector-specific registries (FCA Register, Law Society Find a Solicitor). For multi-location businesses, use parent–child structure with subOrganization / parentOrganization linking.
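For a multi-location business, the parent and child linking described above could be generated as follows. The domain, company number, LinkedIn slug, and location names are placeholders, and `location_node` is a hypothetical helper.

```python
import json

BASE = "https://www.example.co.uk"  # hypothetical domain
PARENT_ID = f"{BASE}/#organization"


def location_node(slug: str, name: str, locality: str) -> dict:
    """Build a LocalBusiness node that points back to the parent organisation."""
    return {
        "@type": "LocalBusiness",
        "@id": f"{BASE}/locations/{slug}/#localbusiness",
        "name": name,
        "address": {
            "@type": "PostalAddress",
            "addressLocality": locality,
            "addressCountry": "GB",
        },
        "parentOrganization": {"@id": PARENT_ID},
    }


locations = [
    location_node("london", "Example Ltd London", "London"),
    location_node("manchester", "Example Ltd Manchester", "Manchester"),
]

parent = {
    "@type": "Organization",
    "@id": PARENT_ID,
    "name": "Example Ltd",
    "sameAs": [
        # Placeholder company number and profile slug.
        "https://find-and-update.company-information.service.gov.uk/company/00000000",
        "https://www.linkedin.com/company/example-ltd",
    ],
    # Each child is referenced by @id so the link holds in both directions.
    "subOrganization": [{"@id": loc["@id"]} for loc in locations],
}

document = {"@context": "https://schema.org", "@graph": [parent, *locations]}
markup = json.dumps(document, indent=2)
```

In practice the Organization node would sit on the homepage and each LocalBusiness node on its own location page; they are combined into one graph here only to show the references resolving.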
The DMCCA (Digital Markets, Competition and Consumers Act 2024) has no specific schema requirements — no CMA guidance mentions structured data as a compliance mechanism. However, schema implementation supports DMCCA transparency principles: correct Offer schema with upfront pricing aligns with anti-drip-pricing rules, and genuine AggregateRating schema supports the fake reviews ban. For deeper context on UK regulatory compliance in marketing, see our responsible AI marketing guide.
Bing's AI Performance Dashboard (launched February 2026) is the first official AI citation reporting tool from any major platform, and early data suggests that most AI use of your content goes uncited. Otterly.AI's early analysis found one site was used by AI 44,469 times while being visibly cited only 169 times, meaning roughly 99.6% of that use was invisible.
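The 99.6% figure follows directly from the two counts in the Otterly.AI example. A quick check, with illustrative variable names:

```python
ai_uses = 44_469          # times the site's content was used by AI
visible_citations = 169   # times it was visibly cited

visible_share = visible_citations / ai_uses * 100   # percent of uses that were cited
invisible_share = 100 - visible_share               # percent that went uncited

rounded_visible = round(visible_share, 2)
rounded_invisible = round(invisible_share, 1)
```

This is a single-site observation, so the ratio should be read as an example rather than a benchmark.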
Bing AI Performance
Free. Total citations, cited pages/day, grounding queries. First official platform tool.
Best for: Copilot and ChatGPT visibility
Google Search Console
Free. Structured data errors and rich result performance — but cannot isolate AI traffic specifically. AI clicks count under "Web" search type.
Best for: Schema validation and errors
SearchPilot A/B Tests
Gold standard methodology: split pages 50/50, wait 4–8 weeks. Their FAQ schema test showed 9% organic uplift. Microdata→JSON-LD showed no impact.
Best for: Isolating schema causation
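The 50/50 page split can be approximated with a deterministic hash of each URL. This is a simplified stand-in, not SearchPilot's bucketing method: a real SEO split test would normally also balance the groups on historical traffic. The salt, URLs, and function name are hypothetical.

```python
import hashlib


def assign_bucket(url: str, salt: str = "schema-test-1") -> str:
    """Assign a URL to 'variant' or 'control' deterministically (about 50/50)."""
    digest = hashlib.sha256(f"{salt}:{url}".encode("utf-8")).hexdigest()
    return "variant" if int(digest, 16) % 2 == 0 else "control"


# Hypothetical set of similar template pages to split.
pages = [f"https://www.example.co.uk/products/item-{i}/" for i in range(200)]
buckets = {url: assign_bucket(url) for url in pages}
variant_count = sum(1 for bucket in buckets.values() if bucket == "variant")
```

Only the variant pages receive the schema change; the control pages stay as they are for the 4 to 8 week measurement window. Changing the salt produces a fresh split for the next test.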
Before/after results are mixed. Previsible (January 2026) implemented schema across five websites: two doubled their AI traffic, one showed Google Search Console improvements, and two showed no change at all. Schema App reported a 19.72% increase in AI Overview visibility after adding entity linking — though as a vendor, their results carry potential bias. The honest conclusion from SALT.agency's 107,352-URL analysis is that schema distribution in AI citations "aligned closely with schemas you'd expect" — suggesting schema appears on cited pages because well-maintained sites have both good content and schema. For a comprehensive approach to tracking AI visibility metrics, see our AEO measurement and KPIs guide.
Schema enters LLM training data through a documented pipeline: Web Data Commons extracts structured data from Common Crawl, converting it into linguistic datasets that become part of training corpora. However, as Patrick Stox (Search Engine Journal) notes, training data is not cited in LLM output — the influence is on general world knowledge, not retrievable citations.
The practical advanced strategy is entity-first SEO: designate canonical "entity home" pages for each key entity (your company, your CEO, your core services), use consistent @id references site-wide, and connect via sameAs to authoritative knowledge bases. Schema App's entity linking case study showed a 46% increase in impressions and 42% increase in clicks for non-branded queries after adding spatialCoverage, audience, and sameAs properties. Google's Knowledge Graph contains 500+ billion facts about 5+ billion entities — Wikipedia and Wikidata are primary sources, with schema.org as a significant secondary contributor.
For implementation, our AEO audit guide covers how to assess your current schema readiness, and our ChatGPT optimisation guide details the technical foundations including robots.txt configuration for AI crawlers.
A proper schema validation workflow runs three stages: pre-deploy testing, comprehensive validation, and ongoing sitewide monitoring. Use Google Rich Results Test before publishing, Schema Markup Validator (validator.schema.org) for comprehensive checks, and Google Search Console Enhancements for ongoing error tracking. For sitewide crawl audits, Sitebulb (~£119/year) and Screaming Frog (~£159/year) are the UK industry standards. No dedicated AI-schema validator exists as of February 2026.
| Scope | Estimated Cost (GBP) | Includes |
|---|---|---|
| Basic one-off | £500–£1,500 | Organization + Article schema across 5–10 page types |
| Mid-range | £1,500–£5,000 | Full audit, 10–20 templates, custom types, entity linking |
| Enterprise | £5,000–£15,000+ | Knowledge graph build, entity linking, 100+ templates, automation |
| Ongoing maintenance | £300–£1,000/month | Monitoring, error fixes, new page types, deprecation updates |
Schema implementation is typically included in UK SEO retainers above £1,500 per month. UK agency rates for technical SEO specialists range from £100 to £250 per hour. For HubSpot sites specifically, template-level implementation via HubL head-injection tags is the most scalable approach, automatically populating Article schema from blog post metadata.
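The template-level idea, mapping post metadata to Article schema once and reusing it for every post, can be sketched outside HubSpot. The example below is Python rather than HubL, and the metadata field names (`title`, `published`, `author_title`, and so on) are hypothetical, not HubSpot's variable names.

```python
import json


def article_jsonld(post: dict, org_id: str) -> str:
    """Render an Article JSON-LD script tag from blog post metadata."""
    node = {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": post["title"],
        "datePublished": post["published"],
        "dateModified": post["updated"],
        "wordCount": len(post["body"].split()),
        "author": {
            "@type": "Person",
            "name": post["author_name"],
            "jobTitle": post["author_title"],
        },
        "publisher": {"@id": org_id},
        "mainEntityOfPage": post["url"],
    }
    # Escape "</" so post content cannot close the script element early.
    payload = json.dumps(node, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


post = {
    "title": "Schema markup for AI citation",
    "published": "2026-02-01",
    "updated": "2026-02-10",
    "body": "Schema helps AI engines indirectly",
    "author_name": "Jane Example",
    "author_title": "Head of SEO",
    "url": "https://www.example.co.uk/blog/schema-for-ai/",
}
tag = article_jsonld(post, "https://www.example.co.uk/#organization")
```

Because every field comes from the post record, the markup stays in step with the visible content and dateModified updates whenever the post does.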
Does schema markup directly help with ChatGPT citations?
Not directly. ChatGPT does not semantically parse JSON-LD — it treats schema as text during retrieval. The benefit is indirect: schema enriches Bing's index, and since ChatGPT uses Bing for its search-grounded responses, well-implemented schema improves how your content is understood and surfaced by the underlying search engine. Microsoft is the only major platform to officially confirm this benefit.
Is generic schema better than no schema for AI visibility?
No — generic schema actually performs worse. Growth Marshal's peer-reviewed study found generic schema (basic Article, Organization, BreadcrumbList with minimal attributes) achieved only a 41.6% citation rate, compared with 59.8% for pages with no schema at all. Only attribute-rich schema with fully populated fields (pricing, ratings, specs, author details) outperforms having no schema, at 61.7%.
What is the best schema type for a UK B2B service company?
Use Organization schema on your homepage (note that ProfessionalService has been deprecated by Schema.org). Add LocalBusiness on location-specific pages. Include your Companies House URL in sameAs — this is the highest-trust UK government entity signal. For regulated sectors, use the specific subtype: LegalService for law firms (include SRA number), FinancialService for FCA-regulated firms, or MedicalBusiness for healthcare providers.
How much does schema implementation cost for a UK B2B website?
A basic one-off implementation (Organization + Article across 5–10 page types) typically costs £500–£1,500. Mid-range projects with full audits, custom types, and entity linking run £1,500–£5,000. Enterprise-level knowledge graph builds with 100+ templates cost £5,000–£15,000+. Ongoing maintenance runs £300–£1,000 per month. Schema is usually included in UK SEO retainers above £1,500 per month.
Should I use JSON-LD or Microdata for schema markup?
JSON-LD. Google explicitly recommends JSON-LD when your setup allows it, and it works in both the head and body of your page. Importantly, SearchPilot's A/B tests showed no detectable organic traffic impact when switching from Microdata to JSON-LD — so the preference is about maintainability and developer experience, not algorithmic advantage. JSON-LD is easier to manage because it sits in a separate script block rather than being woven into your HTML.
How can I measure whether schema markup is improving my AI visibility?
Start with Bing's new AI Performance Dashboard (launched February 2026) — it shows total AI citations, cited pages per day, and grounding queries for free. Supplement with Google Search Console for structured data errors and AI citation tracking tools like Otterly.AI (from £23/month) or Profound (from $100/month). To truly isolate schema's causal impact, you would need A/B testing using SearchPilot's methodology: split similar pages 50/50 and wait 4–8 weeks.
Free Technical SEO Audit
Generic schema hurts more than it helps. Our technical SEO audit evaluates your schema implementation against the evidence — attribute richness, entity linking, content-markup parity, and AI crawler accessibility — and identifies the specific changes that will improve your AI visibility.
Request Your Free Technical Audit →
Sources & methodology: This article prioritises primary sources and controlled experiments over vendor marketing. Key sources: Growth Marshal (n=730 citations, DOI-published, Feb 2026), SALT.agency (107,352 URLs), SE Ranking (129K domains, Nov 2025), SearchPilot A/B tests, Microsoft/Bing (Fabrice Canel, SMX Munich 2025), Google Structured Data Documentation (updated Dec 2025), Mark Williams-Cook invalid schema experiment (Feb 2026), searchVIU cross-platform test (Dec 2025), Schema.org GitHub Issues, Previsible (5 websites, Jan 2026), Schema App, Otterly.AI, Bing AI Performance Dashboard documentation. Statistics marked with ⚠️ in the underlying research were omitted or contextualised. Last updated: February 2026.