ontologies

2026 - Week 22 - Tragedy?

Thursday and Friday of week 22 saw a break from usual working patterns as ‘brarian Jayne, boss ‘brarian Anya, boss boss ‘brarian Bryn, House of Lords Maya and House of Lords Big Ben popped Young Robert and Michael on their extendable leads for a jaunt to UCL. Joined there by friends of the family Andrew and Dan. And lovely to see everyone it was too.

The occasion marked was the first - to the best of our knowledge - AI-BRIDGES Symposium. And yes, you did read that right, we went to an actual symposium. Like what academics do. Now your regular correspondent has never been a massive fan of conferences, or indeed any event where lots of other people speak and he isn’t able to get a word in edgeways. And we have to be honest, on that front, this one was little different. But there was food for thought along the way.

The premise of the project is to build a community willing to engage with how Wikimedia in particular - and the commons more generally - engage with the infernal AI machines. Which is, one suspects, a topic that vexes us all, there being no easy answers. Even Dan - possibly the most bullish of us when it comes to the possibilities of what one might do with a reasonable computer and a large language model - doesn’t paint a particularly bright picture of our futures. Sounding, on occasion, like John Steinbeck waking up with a particularly hellish hangover. It’s brutal out there.

It’s also a confusing time for people who’ve spent the last couple of decades encouraging the world to adopt open licences so other people can reuse and reshape their work. Watching the machines chomp through everything in sight, it would sound churlish to suddenly change tune to, OH CHRIST, NOT LIKE THAT.

Part of the problem is at the input end, with a lot of people present making note of the strain their poor servers are being placed under by having to satisfy the ravenous appetites of ever more aggressive crawlers. Some people spoke of being forced to move behind content distribution networks to keep their services up and running. If you have money to invest in the markets, CDN suppliers would not be a bad bet. Others, still less fortunate, spoke about having to take down whole collections because they lacked any other choice.

The commons all works fine if some mutual understanding and mutual respect is in play. I graze my sheep, you graze your sheep, we all graze our sheep. Until some bloke turns up with 6,000 sheep and the village green is denuded. Which is where we seem to be right now. From a chat with John, it would appear that legislation.gov.uk is currently experiencing one billion hits a month. Now we’d be the first to say that that website is one of the few bits of proper digital infrastructure the UK has. But even then, we doubt that’s half a billion eyeballs’ worth of popularity.

But then what did we ever expect? For those of us nursed at the teat of the semantic web, most of this feels inevitable. Data would be published, the machines would consume it, the machines would infer new information, agents would be sent out to make deals, and none of us would ever have to risk repetitive strain injury by clicking pixels again. Quite the relief. At least one of your regular correspondents has been making the point, since at least 2012, that one day the only things visiting your website will be robots. But then what?

Over in the world of commerce, clicks are falling off a cliff, followed in short order by advertising revenues. Which tend to keep the whole thing afloat. So more publishers disappear behind more paywalls and people like Google - not that Google is a person, except in the American sense - watch their advertising business disappear. And Google lose access to more content to train their infernal machines. Eventually the tragedy of the commons becomes a tragedy for commerce also. What a mess.

On the far side of the pipes, there’s quite clearly a problem at the output end. We’ve all seen the output. Often it is quite remarkably right. Also, quite often, it is remarkably wrong. For a bunch of Wikipedians who’ve been brought up on citing sources, it’s hard to celebrate the occasionally iffy sausages when the sausage making is hidden behind closed, commercial doors. What a mess.

One comfort blanket in all of this is the potential, not quite fully explored, to take a RAG-like approach and ground the behaviour of the LLMs in more structured data. Which is a possible comfort to the Wikidata folk. And also to us. An area we continue to explore. Though, with no real ‘economy of attribution’ it’s rather difficult to gauge success. What a mess.

Adaptions are being sought. People spoke of making available data dumps in the hope that the LLM harvesters will use those, rather than crawl our poor websites to death. Though, unfortunately, our own ‘automated’ data dumps stopped working some time back. There was also much talk of deploying MCP servers, which have the potential to solve both the input and output problems. The input side because the magical machines might just talk to those rather than attempt to grab everything. The output side because the LLMs will finally be grounded in some degree of truth. Whatever ‘truth’ is. For those not in the know, MCP - the Model Context Protocol - is, apparently, some kind of standard. Though one designed and promoted by a single vendor. At this point, one has to ask whatever happened to the W3C? Stuck in a working group, one supposes.

Sticking with the subject of MCP servers, possibly the most productive part of day one happened when we clocked off and rolled up at the Marquis Cornwallis. Having necked his first pint of cider in under two minutes - Ben is very fond of cider, though we doubt he’s ever tasted any - he flipped open his mobile telephone, connected to his supercomputer back in head office, fired up Claude Code and knocked out an MCP server for the beloved Egg Timer™. Again in a little under two minutes. And having never written one before. That code is not yet deployed, but Young Robert and Michael have plans. Even those two are not too old to learn some new tricks.

So that’s that then. We hope to continue to engage with the AI-BRIDGES folks, we know we’ll continue to chat to Andrew and Dan, we hope to work a little more closely with Ben, and we’ll most probably dip a toe into MCP waters. We also have the Study of Parliament Group’s ‘How AI is being used by Parliaments’ session to look forward to. Our very own Ben taking to the stage. Though obviously, that does not have a URL to point you to. Or not one that we could find. Head desk bang. Nevertheless, if you’re in or around Westminster in the early evening of Wednesday, 10th June, do say hello. Until then, we’ll try to stay hopeful.