How Are the AI Data Wars Affecting UK Business Growth in 2026?
Last Updated: 7 February 2026
The AI data wars are a global battle over who controls the data that trains and powers artificial intelligence, and they are directly affecting UK business growth. Over $1 billion in data licensing deals were signed between 2023 and 2025, while 79% of top news sites now block AI training crawlers. For UK mid-market B2B companies, this means your answer engine optimisation strategy and first-party data have become your most defensible competitive advantages.
The landscape has transformed dramatically since the early days of generative AI. Tech giants are spending $320 billion on AI infrastructure in 2025 alone, while publishers, platforms, and governments race to control how data flows into AI systems. For B2B marketing directors and heads of demand generation at UK companies turning over £5 million to £100 million, these shifts are not abstract. They determine whether your content appears in AI-generated answers, whether your CRM data gives you an edge, and whether your business is visible in the new search landscape.

This guide from Whitehat SEO, a London-based AI consultancy and HubSpot Diamond Partner, breaks down what the AI data wars mean for your business strategy, your marketing visibility, and your data governance obligations in 2026.
The Billion-Dollar Data Licensing Landscape
The AI data wars are being fought primarily through licensing deals. OpenAI has signed over 18 publisher partnerships, including a $250 million agreement with News Corp and an estimated $70 million annual deal with Reddit. Google secured Reddit data for $60 million per year. Meta signed multi-year agreements with Reuters, CNN, and News Corp. Disney invested $1 billion in OpenAI alongside a content licensing arrangement in December 2025.
The structure of these deals has evolved rapidly. Early agreements in 2023 were flat-rate fees for training data access. By late 2024, the model shifted towards usage-based retrieval-augmented generation (RAG) deals. The word "training" is absent from five of the six most recent OpenAI contracts, reflecting a move from one-off training fees to ongoing payments for real-time content access and display rights.
New infrastructure is emerging to manage this. Really Simple Licensing (RSL) 1.0 launched in December 2025, backed by Reddit, Yahoo, Quora, Medium, Cloudflare, and Akamai. Microsoft launched a Publisher Content Marketplace. ProRata.ai offers a 50% revenue-share model for publishers. These systems will determine which content AI systems can legally access and surface in their answers.
For UK B2B companies, the strategic implication is clear: the era of free data is ending. Businesses that treat their proprietary data, particularly customer data held in CRM systems like HubSpot, as a strategic asset will have an advantage that cannot be licensed away.
How Platform Restrictions Are Reshaping Data Access
While some publishers are licensing data, others are locking it down entirely. Approximately 5.8 million websites now block AI crawlers such as GPTBot and ClaudeBot, up roughly 70% from July 2025, according to Press Gazette. A BuzzStream and Press Gazette study found that 79% of top US and UK news sites block at least one AI training bot.
The numbers reveal a broken bargain. Cloudflare data from June 2025 shows the crawl-to-referral ratio for Google is 14:1, meaning Google sends back roughly one visit for every 14 pages it crawls. For OpenAI, that ratio is 1,700:1. For Anthropic, it reaches 73,000:1. AI companies are consuming vast amounts of content while sending almost no traffic back to publishers.
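To make those ratios concrete, the short sketch below converts each crawl-to-referral ratio into the referral visits a site might expect for a given number of crawled pages. The ratios are the Cloudflare figures quoted above; the crawl volume of 100,000 pages is a hypothetical number chosen purely for illustration.

```python
# Crawl-to-referral ratios quoted above (Cloudflare, June 2025):
# pages crawled for every one referral visit sent back.
RATIOS = {"Google": 14, "OpenAI": 1_700, "Anthropic": 73_000}


def expected_referrals(pages_crawled: int, ratio: int) -> float:
    """Return the referral visits implied by a crawl-to-referral ratio."""
    return pages_crawled / ratio


if __name__ == "__main__":
    pages = 100_000  # hypothetical crawl volume, for illustration only
    for crawler, ratio in RATIOS.items():
        visits = expected_referrals(pages, ratio)
        print(f"{crawler}: ~{visits:,.1f} visits per {pages:,} pages crawled")
```

On these assumptions, the same crawl volume that returns thousands of visits from Google returns only a handful from the AI crawlers.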
Platform-specific restrictions have been equally aggressive. Twitter/X has increased API pricing by 9,900% since 2022, with enterprise access now costing over $42,000 per month. Reddit introduced API pricing of $0.24 per 1,000 calls, triggering the largest protest in the platform's history, with approximately 8,500 subreddits going dark. Reddit subsequently sued Anthropic in June 2025, alleging that Anthropic's crawlers scraped the site over 100,000 times after being denied access.
However, blocking AI crawlers carries its own risks. A Rutgers and Wharton study from December 2025 found that publishers blocking AI crawlers lost 23% of their total traffic and 14% of human traffic. Whitehat SEO advises UK B2B companies to take a strategic approach: allow AI crawlers to index your content so you can be cited in AI answers, while protecting your proprietary datasets separately. Understanding how SEO and AI search interact is essential to making this decision correctly.
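One way to check that a robots.txt file reflects this kind of split policy is to test it with Python's standard `urllib.robotparser` module. The sketch below is illustrative only: the robots.txt content, the `/client-portal/` path, and the `example.com` URLs are all assumptions, not a recommended configuration for any particular site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: published content is open to AI crawlers,
# while an assumed private area (/client-portal/) is disallowed.
SAMPLE_ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /client-portal/

User-agent: ClaudeBot
Disallow: /client-portal/

User-agent: *
Disallow: /client-portal/
"""


def crawler_access(
    robots_txt: str, agents: list[str], urls: list[str]
) -> dict[tuple[str, str], bool]:
    """Report, for each (agent, url) pair, whether robots.txt permits the fetch."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {
        (agent, url): parser.can_fetch(agent, url)
        for agent in agents
        for url in urls
    }


if __name__ == "__main__":
    report = crawler_access(
        SAMPLE_ROBOTS_TXT,
        agents=["GPTBot", "ClaudeBot"],
        urls=[
            "https://example.com/blog/aeo-guide",        # hypothetical public page
            "https://example.com/client-portal/accounts",  # hypothetical private page
        ],
    )
    for (agent, url), allowed in report.items():
        print(f"{agent:10} {'ALLOW' if allowed else 'BLOCK'}  {url}")
```

Note that robots.txt is an advisory convention that crawlers may choose to ignore, so genuinely proprietary data still needs authentication and access controls rather than a disallow rule alone.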
Why First-Party CRM Data Is Your Most Defensible Asset
As Big Tech companies build data moats from billions of user interactions, mid-market businesses have a comparable advantage in microcosm: their first-party CRM data. Google processes trillions of search queries. Amazon holds purchase history from hundreds of millions of customers. Meta reaches 3.5 billion daily active users. Reddit, now the most-cited source across AI models according to Profound AI research, has over 16 billion comments in its archive.
UK mid-market companies cannot compete at that scale. But they can build a moat from something equally valuable in their context: deep customer intelligence that competitors and AI systems cannot access. McKinsey research shows companies using first-party data strategies achieve five to eight times higher ROI compared with generic campaigns. Forrester reports double the conversion rates and 30% lower customer acquisition costs.
As Varun Krishna, CEO of Rocket Companies, puts it: "The companies with the most data will win, and no industry is safe from the disruption. As commoditisation accelerates, access to scaled proprietary data is what separates industry leaders from the rest."
HubSpot's Breeze AI suite, expanded at INBOUND 2025 with over 200 updates, positions CRM data at the centre of AI-powered marketing. Breeze Agents can resolve over 50% of support tickets autonomously, while Breeze Intelligence provides buyer intent scoring and data enrichment drawn from your own CRM, not third-party data that competitors can also buy. Crucially, HubSpot prohibits third-party AI providers from using customer data for model training. Your CRM data stays yours.
Katie King, CEO of AI in Business, captured the challenge well in October 2025: "Too often, AI adoption looks more like filling a shopping basket with tools, rather than building a strategy. The companies seeing real value are the ones that are linking data, applications, and intelligence into one system." For Whitehat SEO's clients, that system is HubSpot.
The Legal and Regulatory Landscape for UK Businesses
The legal framework around AI and data remains unsettled, with 51 copyright lawsuits pending as of October 2025. Two landmark decisions illustrate the conflicting signals. In June 2025, a US judge ruled in Bartz v. Anthropic that using copyrighted books to train large language models is "transformative, spectacularly so" and therefore fair use, but downloading pirated copies was not. Anthropic settled for $1.5 billion, the largest publicly reported copyright recovery in US history, at approximately $3,000 per work.
In the UK, the Getty Images v. Stability AI judgment in November 2025 became the first major UK AI copyright ruling. The High Court held that AI model weights are not "copies" of images under UK copyright law, though limited trademark infringement was found for outputs containing Getty watermarks.
For regulation, the EU AI Act entered into force in August 2024, with general-purpose AI model obligations applicable from August 2025. The bulk of remaining obligations, including high-risk AI systems and full transparency rules, take effect in August 2026. Penalties reach up to €35 million or 7% of worldwide turnover. The Act applies extraterritorially, meaning UK businesses serving EU customers must comply.
The UK itself has no dedicated AI legislation in force as of early 2026, relying instead on a principles-based approach with existing regulators (ICO, CMA, FCA, Ofcom) applying safety, transparency, and accountability principles. The ICO is developing a statutory code on AI and automated decision-making, and has stated that legitimate interest is likely the only lawful basis for web scraping to train AI under UK GDPR. The Italian DPA fined OpenAI €15 million in December 2024 for processing personal data without a lawful basis, a warning shot for any business deploying AI without proper data governance.
Sir Nigel Shadbolt, Executive Chair of the Open Data Institute, summarised the UK position in July 2024: "If the UK is to benefit from the extraordinary opportunities presented by AI, the government must look beyond the hype and attend to the fundamentals of a robust data ecosystem built on sound governance and ethical foundations. The feedstock of high-quality AI is high-quality data."
How AI Search Is Changing B2B Visibility
The AI data wars have a direct consequence for B2B marketing: your visibility in AI-generated search results. Seer Interactive research from November 2025, covering 3,119 queries across 42 organisations, found that organic click-through rates dropped 61% on queries where Google displays AI Overviews, falling from 1.76% to 0.61%.
AI Overviews now appear in 13% to 19% of all Google searches, a share that rose 102% in the first quarter of 2025 alone. Bain & Company reports that 60% of searches now end without a click, up from roughly 25% five years ago. A 2025 Onely study found that 73% of B2B websites experienced significant traffic loss between 2024 and 2025. Even HubSpot itself reported a 70% to 80% decline in organic traffic.
The opportunity lies in Answer Engine Optimisation (AEO). Brands cited in AI Overviews earn 35% more organic clicks and 91% more paid clicks compared with brands that are not cited, according to Seer Interactive. Ahrefs research from August 2025 reveals that only 12% of URLs cited by ChatGPT, Perplexity, and Copilot also rank in Google's top 10, meaning AI search engines use fundamentally different citation criteria than traditional search.
Whitehat SEO's complete AEO guide for B2B marketers details how to structure content for AI citation. The core principles include providing direct answers within the first 40 to 60 words of any page, structuring each section to be independently comprehensible, and ensuring your brand name is embedded in extractable passages so that when AI systems quote your content, your company travels with the citation.
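Two of these principles, the direct answer in the opening words and the brand name in the extractable passage, can be roughly checked in code. The sketch below is a simple heuristic, not a tool described in the guide: the function name `audit_opening`, the 60-word window default, the sentence-splitting rule, and the sample brand "Acme Analytics" are all assumptions made for illustration.

```python
import re


def audit_opening(page_text: str, brand: str, max_words: int = 60) -> dict[str, object]:
    """Heuristic check of a page's opening against two AEO principles.

    - Does the first sentence (the direct answer) fit within max_words?
    - Does the brand name appear within the first max_words words?
    """
    words = page_text.split()
    opening = " ".join(words[:max_words])
    # Naive sentence split: break at whitespace following ., ! or ?
    first_sentence = re.split(r"(?<=[.!?])\s+", page_text.strip(), maxsplit=1)[0]
    sentence_length = len(first_sentence.split())
    return {
        "first_sentence_words": sentence_length,
        "answer_fits_window": sentence_length <= max_words,
        "brand_in_opening": brand.lower() in opening.lower(),
    }


if __name__ == "__main__":
    sample = (
        "Acme Analytics defines answer engine optimisation as structuring "
        "content so AI systems can quote it. More detail follows."
    )  # hypothetical page copy
    print(audit_opening(sample, brand="Acme Analytics"))
```

A check like this only flags structure; whether the opening actually answers the reader's question still needs editorial judgement.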
The "Invisible Middle" Problem for Mid-Market Companies
The AI data wars are creating a structural visibility gap that disproportionately affects mid-market businesses. Large publishers are signing licensing deals that guarantee their content appears in AI-generated answers. Small companies were never visible in these channels anyway. Mid-market B2B companies, the ones producing genuinely valuable thought leadership and industry content, are caught in between with no licensing deals and declining organic traffic.
Analysis by Will Scott in October 2025 confirmed that top-tier licensed publishers appear frequently in ChatGPT citations, directly correlating with OpenAI's licensing agreements. Mid-tier publishers without deals are becoming invisible in AI-mediated discovery. Profound Research's analysis of 680 million citations across ChatGPT, Google AI Overviews, and Perplexity found that .uk domains represent only 2.16% of all ChatGPT citations, a massive underrepresentation given the UK's share of English-language business content.
Matt Clifford CBE, Chair of ARIA and author of the UK AI Opportunities Action Plan, framed the choice starkly: "You can't opt out of AI. AI is going to change everything. So the real question is: are you going to be an AI taker or an AI maker?"
For Whitehat SEO's clients, the path to becoming an AI maker involves three actions. First, optimise your published content for AI citation using AEO best practices. Second, build and activate your first-party data through HubSpot CRM so you have intelligence competitors cannot replicate. Third, invest in AI consultancy to develop a coherent data strategy rather than adopting tools in isolation.
A Practical Data Governance Checklist for UK B2B Companies
With 51% of UK companies unsure whether their AI-generated data complies with regulations, according to IT Brief UK, and the ICO developing statutory codes that will directly affect how businesses deploy AI, data governance is no longer optional. Whitehat SEO recommends the following actions based on current ICO guidance, the Data (Use and Access) Act 2025, and UK GDPR requirements:
- Conduct Data Protection Impact Assessments before deploying any AI system that processes personal data
- Identify your lawful basis for AI processing under UK GDPR, noting that legitimate interest requires a documented balancing test
- Maintain transparency by informing data subjects how AI uses their data, and keep a human in the loop: 84% of UK businesses using AI report human oversight, according to DSIT
- Review your robots.txt to establish a clear, intentional position on AI crawler access to your published content
- Audit third-party AI tool contracts to ensure processors are not using your data for model training without consent
- Consider Really Simple Licensing (RSL) for controlling how AI systems access your published content
- Document AI governance policies, as the ICO requires demonstrable accountability, with UK GDPR fines reaching up to £17.5 million or 4% of global annual turnover
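For a sense of scale on the final checklist item, the sketch below works out the maximum UK GDPR fine for a given turnover. It assumes the statutory ceiling is the higher of the two figures (£17.5 million or 4% of global annual turnover); the turnover values are hypothetical examples, and actual penalties are set by the ICO case by case.

```python
UK_GDPR_FIXED_CAP_GBP = 17_500_000  # fixed ceiling quoted in the checklist
UK_GDPR_TURNOVER_PCT = 4            # percentage of global annual turnover


def max_uk_gdpr_fine(global_turnover_gbp: float) -> float:
    """Return the fine ceiling: the higher of the fixed cap or 4% of turnover."""
    turnover_based = global_turnover_gbp * UK_GDPR_TURNOVER_PCT / 100
    return max(UK_GDPR_FIXED_CAP_GBP, turnover_based)


if __name__ == "__main__":
    # Hypothetical turnovers, for illustration only.
    for turnover in (5_000_000, 100_000_000, 1_000_000_000):
        ceiling = max_uk_gdpr_fine(turnover)
        print(f"Turnover £{turnover:,} -> ceiling £{ceiling:,.0f}")
```

Because 4% of £100 million is only £4 million, the fixed £17.5 million figure is the operative ceiling across the whole £5 million to £100 million turnover range, which means the maximum exposure can exceed a smaller company's entire annual revenue.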
Whitehat SEO's AI marketing consultants help UK B2B companies navigate these requirements while building data strategies that turn compliance into competitive advantage.
What UK Businesses Should Do Next
The AI data wars will not slow down. UK private AI investment reached $4.52 billion in 2024, making it the fourth-largest AI investor globally. The UK government has committed £150 million to AI programmes, with AI Growth Zones projected to generate £28.2 billion in investment. Gartner predicts that 25% of organic search traffic will shift to AI chatbots by 2026. Every month a business delays its response, the visibility gap widens.
The businesses that will thrive are those that recognise their CRM data as their AI moat, optimise their content for AI citation rather than just traditional search rankings, and build data governance that turns regulatory requirements into strategic advantage. As OpenAI told the UK Parliament in January 2024: "It would be impossible to train today's leading AI models without using copyrighted materials." The question for UK businesses is not whether AI will use your data, but whether you will be strategic about how it does.
Whitehat SEO helps UK mid-market B2B companies navigate the AI data wars through Answer Engine Optimisation, AI consultancy, and HubSpot-powered data strategies. Book a discovery call to find out where your business stands.
Frequently Asked Questions
What are the AI data wars?
The AI data wars describe the global competition between technology companies, publishers, and governments to control the data used to train and power AI systems. This includes billion-dollar licensing deals between AI companies and content publishers, platform restrictions on data access, and regulatory frameworks governing how data can be collected, stored, and used for AI purposes.
How do the AI data wars affect UK B2B companies?
UK B2B companies face declining organic search traffic as AI Overviews reduce click-through rates by up to 61%. Mid-market businesses risk becoming invisible in AI-mediated discovery without licensing deals. Companies must optimise content for AI citation, protect first-party data, and comply with evolving UK GDPR and EU AI Act requirements that apply extraterritorially.
Should I block AI crawlers from my website?
Blocking AI crawlers is generally not recommended for B2B companies seeking visibility. Research shows publishers that block AI crawlers lose 23% of total traffic. Allowing crawlers enables your content to be cited in AI-generated answers. Whitehat SEO recommends allowing crawlers for published marketing content while separately protecting proprietary customer data in your CRM.
What is Answer Engine Optimisation (AEO)?
Answer Engine Optimisation is the practice of structuring content so AI systems like ChatGPT, Google AI Overviews, and Perplexity cite and recommend it in their responses. AEO differs from traditional SEO because only 12% of AI-cited URLs overlap with Google's top 10 organic rankings. Brands that adopted AEO frameworks saw up to 40% higher visibility in AI search results.
How does HubSpot CRM help in the AI data wars?
HubSpot CRM provides a protected first-party data asset that competitors and AI systems cannot access or replicate. With Breeze AI agents for autonomous customer service, buyer intent scoring, and data enrichment, HubSpot turns your proprietary customer intelligence into an actionable competitive advantage. HubSpot also prohibits third-party AI providers from using your customer data for model training.
What AI regulations apply to UK businesses in 2026?
UK businesses must comply with UK GDPR for any AI processing of personal data, including conducting Data Protection Impact Assessments. The EU AI Act applies extraterritorially to UK companies serving EU customers, with high-risk AI obligations taking full effect in August 2026. The ICO is developing statutory codes on AI and automated decision-making expected to provide additional UK-specific requirements.
What is the "invisible middle" in AI search?
The "invisible middle" describes mid-market companies caught between large publishers, whose AI licensing deals guarantee visibility, and small companies that were never visible. Domains ending in .uk represent only 2.16% of ChatGPT citations. Mid-market B2B businesses must proactively optimise for AI citation through AEO and first-party data strategies to avoid disappearing from AI-mediated discovery.
References and Sources
- Press Gazette: Eight in Ten News Websites Block AI Training Bots (January 2026)
- BuzzStream: Which News Sites Block AI Crawlers (December 2025)
- Seer Interactive: AIO Impact on Google CTR September 2025 Update
- McKinsey: The Value of Getting Personalisation Right
- DSIT: AI Adoption Research (January 2026)
- Ahrefs: Only 12% of AI Cited URLs Rank in Google's Top 10 (August 2025)
- Bain & Company: Zero-Click Search Redefines Marketing (February 2025)
- IT Brief UK: ODI Reveals Critical AI Data Issues (July 2024)
