Ptechhub

Why the UK must lead on data to unlock AI’s full potential | Computer Weekly

By Computer Weekly
February 11, 2025


The UK government holds some of the world’s most valuable datasets, including official statistics, cultural heritage records and NHS health data. These datasets have powered scientific breakthroughs, business innovation, and improvements in public services.

With the publication of the much-anticipated AI Opportunities Action Plan, the transformative potential of government data for AI has never been more apparent. However, recent research by the Open Data Institute (ODI) reveals critical shortcomings in how government datasets are prepared and published for AI. 

Government data and AI’s reliability challenge

Foundation models (FMs), such as ChatGPT and Gemini, are increasingly used to provide information on public policies and services. Yet, the ODI’s research highlights that while these models scrape government data repositories, they often fail to deliver accurate outputs based on them. Instead, models draw on secondary or unreliable sources, such as social media posts or opinion articles, or simply fabricate answers.

The consequences are significant. Citizens using AI tools to understand benefit entitlements, for example, may receive misleading or incomplete advice, undermining public trust in both AI and government services. This is particularly concerning given the UK government’s emphasis on improving public service delivery through AI innovation.

Data deficits in the AI ecosystem

The AI Opportunities Action Plan, authored by Matt Clifford, rightly emphasises the role of the National Data Library (NDL) as a means to unlock government data for AI innovators. Yet, the current state of government datasets presents significant barriers to achieving this vision.

ODI analysis of CommonCrawl, a key dataset repository for AI models, found that it scraped 13,556 pages from data.gov.uk as of April 2024. However, these pages rarely contributed to accurate model outputs. Across 195 test queries, models correctly referenced data.gov.uk statistics in only five cases.

This issue arises because government data is often not published in AI-ready formats. While technologies such as DCAT are used to make datasets discoverable, scraping infrastructure like CommonCrawl does not fully support these technologies. As a result, AI models rely on less authoritative sources, perpetuating misinformation. The ODI’s findings suggest that the UK’s ambition to lead in AI innovation could falter unless this disconnect is addressed.
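To make the discoverability point concrete, here is a minimal sketch of what a DCAT-aware crawler could do with a catalogue record that a plain HTML scraper would likely pass over. The record is hypothetical: the title and the example.org URLs are invented for illustration, and the JSON-LD is simplified (plain strings stand in for values that a full DCAT record would express as IRIs). Only the `dcat:` and `dct:` term names come from the DCAT vocabulary.

```python
import json

# Hypothetical DCAT record in JSON-LD. The title and URLs are invented;
# a real record would typically give download URLs as IRIs ({"@id": ...}).
DCAT_RECORD = json.loads("""
{
  "@context": {
    "dcat": "http://www.w3.org/ns/dcat#",
    "dct": "http://purl.org/dc/terms/"
  },
  "@type": "dcat:Dataset",
  "dct:title": "Example benefit claimant statistics",
  "dcat:distribution": [
    {"@type": "dcat:Distribution",
     "dcat:downloadURL": "https://example.org/claimants-2024.csv",
     "dcat:mediaType": "text/csv"},
    {"@type": "dcat:Distribution",
     "dcat:downloadURL": "https://example.org/claimants-2024.ods",
     "dcat:mediaType": "application/vnd.oasis.opendocument.spreadsheet"}
  ]
}
""")


def download_urls(record, media_type=None):
    """Return the download URLs listed in a DCAT dataset record.

    If media_type is given, only distributions with that media type are kept.
    """
    urls = []
    for dist in record.get("dcat:distribution", []):
        if media_type is None or dist.get("dcat:mediaType") == media_type:
            urls.append(dist["dcat:downloadURL"])
    return urls


# A crawler that understands DCAT can go straight to the CSV file.
print(download_urls(DCAT_RECORD, media_type="text/csv"))
```

The design point is that the link to the underlying statistics lives in structured metadata rather than in the page text, so a crawler that only collects page text may index the landing page without ever reaching the data.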

Evidence from ODI experiments

The ODI conducted two experiments to examine how government data supports AI models and, in turn, how AI models are enabled to support residents of the UK.

The first experiment analysed how important UK government websites are for AI. Researchers conducted an ablation study utilising a ‘machine unlearning technique’ to remove gov.uk websites from a selection of FMs’ training data.

The results revealed a 42.6% increase in models’ inaccuracy when deprived of gov.uk content, leading to fundamental errors. For example, one test found that models that did not have access to government websites misinformed users about their eligibility for Child Benefit.

In contrast, the second experiment found that government datasets are currently unknown to AI models. This experiment, a study of models’ ability to recall specific statistics from data.gov.uk, found that out of 195 queries, models accurately referenced official government statistics releases just five times.
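As a quick check on the scale of that result, the figures above (five accurate references out of 195 queries) can be turned into a rate. The function name `recall_rate` is ours, used only for illustration; it is not taken from the ODI report.

```python
def recall_rate(correct: int, total: int) -> float:
    """Share of queries answered with an accurate reference (0.0 to 1.0)."""
    if total <= 0:
        raise ValueError("total must be a positive number of queries")
    return correct / total


# Figures quoted in the article: 5 accurate references across 195 queries.
rate = recall_rate(5, 195)
print(f"{rate:.1%}")  # prints 2.6%
```

In other words, models accurately recalled official statistics in fewer than three in every hundred queries.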

The conclusion from these experiments was that while government websites are vital for AI accuracy, government statistics datasets are underutilised despite their enormous value and potential in delivering public services. If we want to realise the potential of AI to deliver benefits such as improving care quality, safety, and cost-effectiveness in the NHS, the government must prioritise improving the quality, accessibility, and usability of its data.

The path forward

The adoption of FAIR principles – ensuring data is findable, accessible, interoperable, and reusable – has long been championed by data.gov.uk and remains a strong foundation. Emerging tools like Croissant, a machine-readable metadata format designed for machine learning, can further enhance discoverability and integration into developers’ workflows. Better dataset descriptions make the datasets themselves more usable for both human and machine users.
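As an illustration of what machine-readable metadata of this kind can look like, the sketch below builds a Croissant-style description for a single CSV file. It is a simplified sketch under stated assumptions: the JSON-LD context is abridged, the output has not been validated against the Croissant specification, and the dataset name, description, URL, and column names are all hypothetical. The helper `croissant_sketch` is our own and is not part of any Croissant library.

```python
import json


def croissant_sketch(name, description, csv_url, columns):
    """Build a simplified, Croissant-style JSON-LD description of one CSV file.

    columns maps each column name to a schema.org data type such as "sc:Text".
    The context below is abridged and the result is not spec-validated.
    """
    fields = [
        {
            "@type": "cr:Field",
            "@id": f"rows/{column}",
            "dataType": data_type,
            # Point each field at the column it is read from.
            "source": {
                "fileObject": {"@id": "data.csv"},
                "extract": {"column": column},
            },
        }
        for column, data_type in columns.items()
    ]
    return {
        "@context": {
            "@vocab": "https://schema.org/",
            "sc": "https://schema.org/",
            "cr": "http://mlcommons.org/croissant/",
            "dct": "http://purl.org/dc/terms/",
            "conformsTo": "dct:conformsTo",
            "recordSet": "cr:recordSet",
            "field": "cr:field",
            "dataType": {"@id": "cr:dataType", "@type": "@vocab"},
            "source": "cr:source",
            "fileObject": "cr:fileObject",
            "extract": "cr:extract",
            "column": "cr:column",
        },
        "@type": "sc:Dataset",
        "conformsTo": "http://mlcommons.org/croissant/1.0",
        "name": name,
        "description": description,
        "distribution": [
            {
                "@type": "cr:FileObject",
                "@id": "data.csv",
                "contentUrl": csv_url,
                "encodingFormat": "text/csv",
            }
        ],
        "recordSet": [{"@type": "cr:RecordSet", "@id": "rows", "field": fields}],
    }


# Hypothetical dataset: every value here is invented for illustration.
metadata = croissant_sketch(
    name="example-claimant-statistics",
    description="Illustrative monthly claimant counts by region.",
    csv_url="https://example.org/claimants-2024.csv",
    columns={"region": "sc:Text", "claimants": "sc:Integer"},
)
print(json.dumps(metadata, indent=2))
```

The useful property is that the description names the file, its format, and the meaning of each column in one place, so a developer's tooling can load the data without a person first reading the landing page.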

The government must incentivise responsible data sharing to ensure equitable access to high-quality data. This could include tax incentives for private-sector data sharing, mandates for publicly funded projects to make their data open where appropriate, or even a levy on AI-generated content to fund trusted information sources. We must use privacy-enhancing technologies such as Solid, which offer individuals direct access and control of their data – for example, their well-being and health data – to ensure access to sensitive data without compromising personal privacy, commercial sensitivity, or national security. This could provide important benefits, such as using machine learning to identify personal risk factors for health conditions, enabling preventative action. Data Trusts can be built on top of Solid to aggregate data. This aggregated data can be collated into datasets with Croissant metadata to prepare it for research use.

Aligning with the Action Plan

The AI Opportunities Action Plan’s emphasis on high-quality data and strong governance aligns with the ODI’s longstanding commitment to socio-technical solutions integrating advanced data infrastructure with public trust. To support the development of interoperable systems, AI-ready datasets, and privacy-enhancing technologies, the ODI is advocating for a ten-year National Data Infrastructure Roadmap. This roadmap would support the Action Plan’s focus on driving AI innovation through investing in long-term data infrastructure.

However, the Action Plan leaves several gaps unaddressed. It does not fully detail how the National Data Library will incorporate user input or engage diverse stakeholders to ensure it delivers public benefit. There is limited detail about formal standards for data quality and provenance, which are critical for ensuring AI-ready datasets. Furthermore, while the Action Plan highlights the need to support AI innovators, it could more explicitly foster data-centric startups specialising in data preparation and governance tools. We hope these gaps are addressed as the government rolls out the recommendations.

International leadership through collaboration

The ODI’s research highlights the global importance of data-centric approaches to AI governance. However, few nations prioritise this focus, which risks undermining the broader adoption of open and shared data practices. Without robust data-centric governance, the foundations of transparent and accountable AI systems could weaken.

The ODI has launched the Global AI Policy Data Observatory to address this. This initiative provides practical resources to support policymakers in developing data-centric AI governance. By offering insights into machine-readable metadata, toolkits for responsible data use, and best practices for transparency, the Observatory aims to strengthen the global evidence base for data-centric AI.

Realising the UK’s AI potential

Access to high-quality government data is essential for realising AI’s potential in public service delivery. By improving data publication practices and investing in long-term infrastructure, the UK can position itself as a global leader in data provision for AI. This leadership will unlock transformative economic and social benefits, aligning with the ambitions of the AI Opportunities Action Plan.

The full report is available to download at ODI Report: The UK Government as a Data Provider for AI.

Elena Simperl is the director of research at the ODI. Neil Majithia is a researcher at the ODI.
