ontologies

2026 - Week 7

One step forward? Or one step beyond?

We’ve now spent several months and a good deal of effort attempting to get some useable data into our shiny new Open Data Platform. Initial attempts concentrated on replicating the triplestore we already have - which lives inside the parliamentary network - into a new one on an external Azure subscription. Unfortunately, replicating a triplestore is a darned sight harder than replicating a relational database. So those efforts met with little in the way of success.

Jianhan, god bless him, has been exploring a new approach, going by the working - and possibly optimistic - title of the ‘single triplestore solution’. This does away with the need for an internal triplestore completely, the components that had POSTed there instead being reconfigured to point to the external one. All so simple.
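For the avoidance of doubt, this is a toy sketch of the idea rather than anything from Parliament’s actual codebase - the endpoint URLs, environment variable name and function name are all invented. The point of the single triplestore solution is that the write endpoint becomes configuration: repoint one setting and every component that POSTs RDF starts writing to the external store instead.

```python
import os
import urllib.request

# Hypothetical default - in the 'single triplestore' world this would point
# at the external Azure-hosted store rather than the internal one.
DEFAULT_ENDPOINT = "https://internal.example/triplestore/update"

def build_update_request(sparql_update: str) -> urllib.request.Request:
    """Build (but do not send) a SPARQL UPDATE request against the configured store.

    Swapping the internal triplestore for the external one is then a pure
    configuration change: set TRIPLESTORE_UPDATE_URL and redeploy.
    """
    endpoint = os.environ.get("TRIPLESTORE_UPDATE_URL", DEFAULT_ENDPOINT)
    return urllib.request.Request(
        endpoint,
        data=sparql_update.encode("utf-8"),
        headers={"Content-Type": "application/sparql-update"},
        method="POST",
    )
```

The appeal being that none of the POSTing components need new code - only new configuration.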

Data Analyst Rachel - in collaboration with ‘brarians Anya and Jayne - put her shoulder to the wheel, coming up with a pretty comprehensive test plan. A novel and welcome innovation. In week 7, test plan became test reality, assorted librarians putting Jianhan’s single triplestore solution through its paces. Happily, it was not found wanting.

At this point, our dear reader may well be thinking, “brilliant, we may soon have a parliamentary open data platform. I wonder what I’ll be able to build on it?” Not so fast, comrade. For those unfamiliar with the inner workings of Parliament, large computational projects go through a process that might be best described as Big Design Up Front. A multi-page document called a High Level Design is filled out and sent to Enterprise Architects and Security colleagues. Should they give the thumbs up, you’re free to build the thing. So long as you stick to that design. The problem is High Level Design documents are written in something approaching Enterprise Sanskrit, which none of the assembled team have been trained to read. Which means we’re not entirely sure if we’re still following the agreed design or not.

For that reason, Delivery Manager Lydia has managed to purloin some time from Sajid, a security analyst. He’s been tasked with parsing the High Level Design document, comparing it with our single triplestore solution schematics and working out the difference. If any.

Whilst that happens, the team are on tenterhooks. This is mainly because we’re now on a quarterly planning cycle and are on the hook for having both a populated Open Data Platform and a “pre-beta” Search application - taking its data from said platform - by the end of March. All fingers remain crossed.

Otherwise, work on the replacement for Parliamentary Search appears to be back on track. Developer Jon’s feet are firmly back under the desk - replete, finally, with a working, if slightly noisy, laptop. The rather nasty bug that had been accidentally introduced in our rush to ship features before his last contract expired is now fixed. And not only that, in the process of fixing it, Jon decided a proper refactor of both aliases and query expansion was in order. So that’s also now done. A couple of other requests that emerged in Jon’s absence have also been fixed, meaning search results no longer show duplicate parties and a feature we call “the Claw” - showing ‘what we actually searched for’ - has been added. The Claw being the tool our crack team of librarians use to debug what Search is actually doing, its name being based on the contortions one’s hand must make in order to invoke it. Onwards!

We also now have both staging and production instances of the Search application, neither of which we can point you at quite yet. The staging version because we don’t want yet another application swamped by Larcenous Language Model scraping bots, and the production version because it isn’t yet plugged into the domain it will one day call its home. Delivery Manager Lydia continues to badger on the latter front, so the next time we put pen to paper, there might finally be something to click on. Stay tuned.

Psephologising wildly

Sticking with things we’re on the hook for, we’ve also promised to have a plan in place to replace the application we use to manage general elections before the end of March closes in on us. The plan to date has two parts: the management of core election entities - general elections, constituencies, elections in those constituencies, candidates in those elections and so on - and the management of the workflow in and around a general election.

On the first part, we’ve loaded all pertinent data from our election results website into Data Graphs. This week saw Data Analyst Rachel, Data Scientist Louie and computational midfield journeyman Michael complete the first round of testing. Which means next week, Michael needs to rip down that project, finesse the model a little, tweak the SQL to better match the new model, repopulate and retest.

On the second part, last week saw a workshop of sorts - with Librarian Anna, Data Analyst Rachel, Data Scientist Louie, whiteboard warrior Michael, Librarian Phil and Researchers Carl and Elliot scattering post-it notes hither and indeed thither. The results have been drawn up and can be found, as ever, on GitHub. It feels like we have a better idea of events happening in the real world, what we learn as a result of those things happening, and what we’re able to do, or must do, as a result of what we now know.

Unfortunately, it also turned out that the code Michael wrote last year to manage staged publishing of general election results was in fact the wrong code. Or the right code, but to a not entirely complete specification. We’re able to publish the full results of a particular election once all the winners are confirmed for all the elections. Sounds good, except this becomes problematic should one of the elections be delayed by death or natural disaster. Not unknown. It looks likely another workshop is called for in the not too distant future.
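To illustrate the problem with a toy sketch - names and data invented, not Michael’s actual code - the current rule amounts to an all-or-nothing gate on winner confirmation, which stalls indefinitely if a single poll is delayed:

```python
# Toy model of the publishing gate as currently specified: nothing is
# published until every constituency election has a confirmed winner.
def can_publish_full_results(elections: list[dict]) -> bool:
    """All-or-nothing: one unconfirmed election blocks every confirmed one."""
    return all(e["winner_confirmed"] for e in elections)

elections = [
    {"constituency": "Anytown", "winner_confirmed": True},
    {"constituency": "Otherville", "winner_confirmed": False},  # delayed poll
]
# can_publish_full_results(elections) -> False, however many other winners
# are confirmed - hence the need for a revisited specification.
```

Presumably the follow-up workshop would land on something closer to publishing per-election as winners are confirmed, but that is exactly the specification question still to be settled.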

Recent updates to the election result website include the addition of two new links from constituency pages to a couple of relatively recent Library data dashboards, pertaining to the accessibility of railway stations and bodies responsible for services in a constituency. Oh, and we’ve added Elliot to our humans.txt page. If you’re wondering about Elliot’s strapline, it’s because he comes to us fresh from a PhD in string theory. Which sounds complicated to our ears.

More promises made

It’s often said that it is best to under-promise and over-deliver. At this point in the cycle, it rather feels we’ve failed to abide by this mantra, there being two more things on the promised list before spring arrives on this rainy island. Those two things involve Data Graphs and whether two of our somewhat antiquated data authoring tools can find a happy home there.

As of this week, our first pass data migration from the Procedure Editor application to Data Graphs is considered complete. An absolute shedload of SQL providing proof of work on that front. Librarian Jayne, Data Analyst Rachel and computational helpmate Michael completed yet another round of testing, spotting a gap or two in the model as a result. Meetings are already in calendars to plug those gaps, at which point, that project will also get ripped down and replaced by something slightly better.

The other application we’re hoping to replace with a Data Graphs implementation is OaSIS - or the Odds and Sods Information System. This provides a record creation function for parliamentary material for which no feeds have yet been provided. Should longer term plans come to fruition and all procedural systems propagate change notifications over message queues, OaSIS should get turned off. Which doesn’t help us much in the meantime, OaSIS being yet another system that’s been shown no developer love for the last 10 years or so.

The tricky bit with the OaSIS replacement is not changing it too much from its current implementation, for fear we’ll need to rebuild everything else. For this reason we skipped over our usual white-boarding stage and went directly to attempting to replicate the current model in its new home, as best we could. A mistake in retrospect. Librarians Emily and Jayne have just completed a first review, only to find that our current subclassing structure is, in fact, nonsense. Lots of content types are gaining attributes that make no sense in the real world of Parliament. So it’s back to the drawing board.

In more positive news, Shedcode James now appears to have a reproducible path for converting SQL Server databases - the database of choice in Parliament - to Postgres - the database that Michael can use. Which means we can fairly easily get new dumps of data from Procedure Editor, OaSIS and indeed the Research Briefings application, run them through a quick conversion, blat our current models and rebuild. Which is handy given the amount of rebuilding we need to do.
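The post doesn’t say which tool James has settled on, so treat the following as a sketch of one common route - pgloader - with entirely invented hosts, credentials and database names. pgloader reads a SQL Server source, recreates the schema in Postgres, and copies the data across in one command:

```shell
# Hypothetical connection strings - a sketch of one possible conversion path,
# not necessarily the one Shedcode James actually uses.
# Source: a SQL Server database (Parliament's database of choice).
# Target: a local Postgres database (the database Michael can use).
pgloader mssql://reader:secret@sqlserver.internal/ProcedureEditor \
         pgsql://michael@localhost/procedure_editor
```

The same one-liner pattern would then repeat for the OaSIS and Research Briefings dumps, which is what makes the blat-and-rebuild cycle cheap.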

In other good news, colleagues from the Parliamentary Computational Section have removed any and all statutory instruments from the live OaSIS database. This is a belated tying up of a loose end - at one time, our librarians keyed records for certain SIs that are not laid before Parliament. Having stopped doing this back in 2018 or so, we’ve bitten the bullet and removed the old data.

On orders being standing

If you’ve been waiting with bated breath for our standing orders application to finally get the green light to go live, you’ll be waiting a wee while longer yet. Librarian Emily tells us she’s close to completing her tidying exercise for House of Commons public standing orders from the 18th October 2022. After which, it’s merely a matter of cloning that revision set for the next date on which the standing orders changed, updating accordingly, and cloning again. And so on. And so forth.

To help Emily complete her tidying duties, Shedcode James has added a markdown editor to our authoring application. He’s also made a first attempt to screen scrape the current House of Lords public standing orders, parse them into a shape suitable for ingest, and pipe them into our standing orders database. You won’t find them in the application just yet; the House of Lords set will remain unpublished until Emily has also tidied those. Luckily, there are a lot fewer of them. That said, their numbering does appear to have gone awry at some point.
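The parse-into-shape step might look something like the toy below - the input format and field names are invented for illustration, not lifted from James’s actual scraper. It splits a plain-text run of standing orders into numbered records ready for ingest:

```python
import re

# A toy of the parsing step: split a plain-text run of standing orders into
# numbered records. Assumes each order starts at a line like "12. Some heading",
# which is an invented convention for this sketch.
def parse_standing_orders(text: str) -> list[dict]:
    records = []
    pattern = r"^(\d+)\.\s+(.+?)(?=^\d+\.\s|\Z)"
    for match in re.finditer(pattern, text, re.MULTILINE | re.DOTALL):
        records.append({
            "number": int(match.group(1)),
            "text": match.group(2).strip(),
        })
    return records

sample = "1. Sittings of the House\nThe House shall sit...\n2. Quorum\nThe quorum is..."
```

A parser like this would also be the natural place to spot the numbering having gone awry: any gap or repeat in the extracted numbers is a flag for Emily’s tidying pile.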

Toward a single subject view of the Librar(y/ies)

If you’ve been following along from home, you’ll know that crack librarians Anna, Susannah, Ned and Silver have been hard at work on yet another Data Graphs project. That one being dedicated to our nascent Library Knowledge Base. A knowledge base that we cannot point you to because it contains contact information for the Library’s crack team of researchers.

The important thing to note is it uses the Library’s thesaurus to describe both the subjects of research briefings and the specialisms of researchers. Doing this in such a way as to fully exploit transitivity over the thesaurus’s poly-hierarchical structure. Very clever. In order to do that, it uses a component called Mirage, which calls the API that sits in front of our thesaurus management service looking for changes in both labelling and structure. Until recently, these changes only covered taxonomic concepts being created or edited. There being no means to detect deletions. This problem has now been solved with the introduction of a new policy to soft-delete concepts by moving them across to a newly created concept scheme. Librarian Phil has manned the soft-deletion buttons and Librarian Susannah has tested the results. All is now working as both designed and expected. Phew.
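The soft-delete convention can be sketched in a few lines - scheme URIs, field names and the function below are all invented for illustration, not the real Mirage code. Because a hard delete leaves nothing for a change feed to see, ‘deleted’ concepts are instead moved into a dedicated concept scheme, and a poller treats membership of that scheme as deletion:

```python
# Hypothetical scheme URI standing in for the newly created 'deleted' scheme.
DELETED_SCHEME = "https://example.org/scheme/deleted-concepts"

def classify_change(concept: dict) -> str:
    """Decide what a polled change to a taxonomic concept means.

    Every change still surfaces as a create-or-edit to the polling component;
    membership of the soft-delete scheme is what signals 'treat as deleted'.
    """
    if DELETED_SCHEME in concept.get("inScheme", []):
        return "deleted"
    return "created-or-edited"
```

Which is why moving a concept between schemes is enough: the move itself shows up as an edit, and the destination scheme carries the meaning.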

Publications prodding

Our final Data Graphs project is more exploratory in nature, taking the rough sketch domain model that Anya and Michael scribbled last year and starting to prototype how Parliament’s three research services might better publish ‘metadata’ - yes, we know, we hate that word too - around their publications.

So far, we’ve added a data model and fake instance data for the two Houses, the three research services, publication works, publication expressions, periods of withdrawal and the rough notion of one or more publications superseding one or more others. So there isn’t really much to see. That said, what there is is all available through yet another of our ‘browsable spaces’, this one put together by the expert hands of Shedcode James. Over the next few weeks, we’re hoping to get some actual publication data in there and expand the model to include House of Commons Library sections, together with publication owners, authors, contributors and editors. After that we plan to move on to the more ‘curatorial’ aspects, annotating publications with subjects - again from the Library thesaurus - placing them in collections, and pointing them toward order paper business items. Stay tuned.
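For readers who like their models in code rather than boxes and arrows, a rough sketch of the shape described - class and field names invented, and not the actual Data Graphs schema:

```python
from dataclasses import dataclass, field

# Invented names throughout - a rough sketch of the publication model shape,
# not the actual Data Graphs schema.
@dataclass
class PublicationWork:
    title: str
    research_service: str  # one of the three research services
    # Supersession is many-to-many in the model; a list per side sketches that.
    superseded_by: list["PublicationWork"] = field(default_factory=list)

@dataclass
class PublicationExpression:
    work: PublicationWork
    published_on: str  # ISO date, e.g. "2026-02-13"
    # Periods of withdrawal as (start, end) date pairs.
    withdrawn_periods: list[tuple[str, str]] = field(default_factory=list)
```

The work/expression split being what lets a single publication be revised, withdrawn and republished without losing its identity.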

Spring cleaning

It will not come as a surprise to our dear reader that Librarian Emma has been applying her feather duster to the far recesses - forgive the pun - of our data cupboards. A wee while back, we collaborated with esteemed colleagues at The National Archives to import all Acts of Parliament gaining Royal Assent from 1801 onwards into our Parliamentary Search backend. A collaboration that continues to pay dividends, only a few weeks back providing us with the ability to look up instruments currently before Parliament by their enabling Act without crashing our Procedure Browsable Space™ application every time we lacked an Act record.

The initial import had all the things one might expect, from titles, to long titles, to dates of Royal Assent, to legislation.gov.uk URIs, to Wikidata identifiers, to Regnal year session citations. What they did not have were subject indexings. Now, thanks to Emma’s diligence, all Acts back to 1987 come with subject indexings, making them much more easily findable in Parliamentary Search, both in its current guise and in its future incarnation. Marvellous stuff.

Emma has also been busy polyfilling another crack in our data armoury, whereby ten sessions were found to be lacking prorogation records. All of these have now been tracked down, colleagues in the Computational Section have added records, and Emma has procedurally indexed them as appropriate. Enough to put a smile on any Data Librarian’s face.