ontologies

2026 - Week 15

An inflection point, long overdue

Our regular reader - hi, Keith - knows we’ve spent rather longer than expected attempting to build a new open data platform for procedural parliamentary data. Good work was done along the way, but we hit something of a snag when it came to populating the blasted thing. And an open data platform with no data is not much of a platform. Open or otherwise.

Early efforts concentrated on trying to replicate data from its current home inside the parliamentary network to a new home on the servers of our cloud provider of choice. But that did not go so well - weeknotes passim. A whiteboard session was organised, coffee downed, cigarettes smoked and options explored. At which point, our Jianhan chipped in with, “why don’t we just update the four things that currently POST to the internal triplestore to POST to the external one instead?” More weeknotes passim.

Now, when faced with a technical conundrum, trying the simplest thing first is never the worst idea. And Jianhan’s solution was certainly the simplest one on the table. So, following that meeting Jianhan threw heart and soul into what Young Robert would undoubtedly call a ‘technical spike’. The results of which were tested by Data Analysts and our crack team of librarians and not found wanting. All that remained was the crossing of some bureaucratic Is and the dotting of some administrivia Ts and we’d be good to go. Not to make light of the administrivia. Administrivia being a hurdle we’ve stumbled over on an uncountable number of previous occasions.

But not this time. Jianhan prepared his paperwork and headed into the arena, clearing all remaining hurdles, showing a clean pair of heels, crossing the live-and-in-production finishing line before the crowd had time to draw breath.

All of which means, our new open data platform finally has data in it. About 35 years’ worth to be exact. All of it interlinked and subject indexed for our reader’s delectation. Though not quite yet. There is a SPARQL endpoint in place - I mean, of course there is - but, for now, that’s locked down to people with a subscription key. Something we hope to remedy soon. For some parliamentary definition of soon. An early day plan, if you will.
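For the curious, a request to a key-protected SPARQL endpoint might look something like the sketch below. The endpoint URL, the header name and the key are all placeholders invented for illustration, and the real service may well expect something different. The sketch only builds the request, following the SPARQL 1.1 Protocol's form-encoded POST, and does not send it.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical values: neither the URL nor the header name comes from the real service.
ENDPOINT = "https://example.org/sparql"
KEY_HEADER = "Subscription-Key"


def build_sparql_request(query: str, subscription_key: str) -> Request:
    """Build (but do not send) a SPARQL query request carrying a subscription key."""
    # SPARQL 1.1 Protocol: a query may be POSTed as a form-encoded 'query' parameter.
    body = urlencode({"query": query}).encode("utf-8")
    return Request(
        ENDPOINT,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Accept": "application/sparql-results+json",
            KEY_HEADER: subscription_key,  # assumed mechanism for passing the key
        },
    )


request = build_sparql_request("SELECT * WHERE { ?s ?p ?o } LIMIT 10", "not-a-real-key")
```

Passing the result to `urllib.request.urlopen` would send it, which is only worth doing once the placeholders have been swapped for the real thing.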

This work covered shifting the data store to the new environment. A whole host of other components remain scattered across another couple of environments. So tidying that is our next job. If you’re actually interested in any of this - and we recognise it can be a struggle - our ‘single triplestore solution’ Trello board has at least some of the details of what awaits us over the next few months.

Fantastic work, as ever, Jianhan.

New, old Parliamentary Search

Whilst Jianhan has been plumbing in pipes, Developer Jon has been slaving over his kitchen sink, polishing the pixels that pop out of said pipes. This mostly in response to testing from our crack team of librarians.

Our dear reader, having avidly consumed the previous issue of these notes, may well remember us waxing lyrical over Developer Jon’s integration of our taxonomy with the search query service. Which remains a lovely bit of work, and of course relies almost entirely on our crack team of librarians paddling like maniacs under the waterline to subject index and procedurally index almost everything that comes out of Parliament.

On rare occasions, that can go awry, if

an item of content gets indexed with a concept that is subsequently removed from the taxonomy, or
an item of content gets indexed with the identifier of a non-preferred concept. Or a synonym if you will. Explaining that would require a little too much detail, even by our standards.

Neither of these things should happen and policies and procedures are in place to attempt to ensure they never do. But should is no guarantee of never. That it has in fact happened at some point in the past bubbled to the surface when Young Robert ran a query, one of the results contained a reference to a non-preferred concept identifier, Jon’s code understandably failed to find it, and the page exploded. Not good.

Jon has now adapted his code, so if a concept can’t be found - because it is not a primary concept, or because it is a deceased concept - the page no longer explodes but instead just displays the identifier of the missing concept. This solution accompanied by a new request for Developer Jon to log any non-dereferenceable taxonomic identifiers in some place that is amenable to librarian eyes. ‘A dash of system observability would not be lost,’ points out Young Robert.
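We can't show Jon's actual code, but the general shape of the fix - fall back to the bare identifier and leave a note for the librarians - might be sketched as follows. Everything here is an assumption made for illustration: the function name, the in-memory lookup standing in for the taxonomy, and the made-up identifier and label.

```python
import logging

logger = logging.getLogger("taxonomy")

# Hypothetical stand-in for the taxonomy: preferred concept identifier -> label.
# The identifier and label are invented for this example.
PREFERRED_LABELS = {"12345": "Treaties"}


def concept_label(concept_id: str, labels: dict = PREFERRED_LABELS) -> str:
    """Return the label for a concept, or the identifier itself if it cannot be found."""
    label = labels.get(concept_id)
    if label is None:
        # Removed or non-preferred concept: log it for later tidying
        # rather than letting the page fall over.
        logger.warning("Unresolvable taxonomy concept identifier: %s", concept_id)
        return concept_id
    return label
```

Here `concept_label("12345")` gives back the label, whilst an unknown identifier such as `"99999"` is returned unchanged and logged.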

To help with the tidying task, Librarian Phil has busied himself querying our triplestore for all non-preferred concepts currently used to index parliamentary material. Very handy. Ned has reviewed and resolved many of them, and Susannah has taken up the job of completing the work. Thanks Ned, Phil and Susannah!

Sticking with search result pages, Jon’s also made a change to date faceting. Previously, when you chose to facet on a given year, a list of months became available allowing for further faceting. Oddly - to librarian eyes - having then clicked on a month, the year selector remained available, allowing our user to expand their search to the same month in other years. Odder still, having selected a year, then a month, our user could deselect the year, the results then being scoped to everything that happened in February. Any February. Or at least any February we have content for.

Now, one should never second guess one’s users, and it is perfectly possible there’s someone out there who’s interested in finding parliamentary material in a given month in any combination of years. That said, the rest of the interface is designed to enable drilling down. Drilling outwards being perhaps an affordance too far. For that reason, having selected a month in a year, it is no longer possible to select an additional year. Or years. Apologies to any February fanboys out there. We think it’s for the best.

The other two changes to report are a little less immediately noticeable. The first is a CSS tweak to cope with a fair few rather large images accompanying Library research briefings, which were bursting the banks of the browser. Happily, that is now solved. And secondly, Jon has decorated the search input box with a ‘required’ attribute, meaning it’s no longer possible to arrive at the search screen, skip over the text box, press the little magnifying glass button, and expect the entirety of our Solr search service to be returned to your browser. Given the voracious appetites of the magic sand lad’s web crawlers, this can only be a good thing. Although we’re fairly sure they’ve already gobbled up most of it.

It’s called information technology for a reason

Whilst Jianhan and Jon get their fair share of the glory, technological triumphs only get us so far. Computational technology without information being akin to the proverbial fish with no bicycle. Well, far worse than that, quite clearly. Happily, we have our crack team of librarians, quality information management skills coursing through their veins. Alongside Guinness, perhaps.

Not only do they pedal like fury to keep on top of this week’s parliamentary business, they pour an equal amount of love into the parliamentary business of bygone days. This week, for example, Librarian Emma reports that she’s combed through 30 sessions’ worth of proceedings, and indexed them with the relatively new ‘Privilege amendments’ taxonomic concept where appropriate. 30 sessions! All the way back to 1992. Wow!

Emma has also taken a spreadsheet of bills in search of a lead Member, adding Member taxonomic concepts as appropriate. This time across 25 sessions. Because our tools don’t allow for edits to data of a certain vintage, that spreadsheet has now made its way to colleagues in the Parliamentary Computational Section, who’ll hopefully be appending the results to existing records in the not too distant.

I am a procedural cartographer - to the tune of the Palace Brothers

Sticking with our theme of excellent librarianship, Librarian Jayne has also been busy, once more exercising her procedural map making skills. This week has been less about charting new territories, and more about redrawing old ones, our Constitutional Reform and Governance Act treaty procedure map seeing both Commons motions and Lords motions moved out to their own component procedure maps. Now we come to think of it, when people hear the word ‘refactor’, they tend to think only in terms of code. But quality information management is all about refactoring too.

Our procedural cartographic refactoring efforts do not end there. Quite some time back, Librarian Ayesha and computational ‘expert’ Michael met over a whiteboard to test their thinking on whether a decision step could ever sit directly in front of a summation step. Logic charts were sketched, ones and zeroes applied and a conclusion reached. Yes, they decided, a decision step could sit directly in front of a summation step according to all known logical combinations. Again, the reason why is beyond the scope of even these notes. Catch us in a pub sometime. We’d be happy to explain.

The upshot of that decision meant we could replace one summation step, two decision steps and four routes with one summation step, one decision step and three routes. Not much of a saving, one might think, until one considers that that pattern was scattered far and wide across 50 different procedure maps. Over the course of several weeks - months perhaps - the cartographic couple of Librarian Ayesha and Librarian Jayne have gone back over all 50 maps, simplifying where possible. Which not only makes the maps and data easier to maintain going forward, as Young Robert would inevitably add, but also means, when we finally add procedure parsing code to our Procedure Browsable Space™, there’ll be a significant number of routes it need no longer traverse. Which should speed things up a little. Though perhaps not noticeably to the human eye. If our user doesn’t thank us, perhaps our computer might. It gets tired too, you know.

Filling in the odds and the sods

Explorations into updating our librarian-facing tools also continue. One of those systems being the oddly named Odds and Sods Information System. Or OaSIS as we say. Odds and sods is an odd name, but it does explain itself quite well, being a tool allowing librarians to create records for things that most probably exist in other systems but where no piping has been put in place to replicate said records into Library systems. So a bit of a double-keying nightmare to be honest, but not a double-keying nightmare that’s likely to go away anytime soon.

Like many of our systems, OaSIS was created a decade or so back and has not been touched by the loving grace of a developer since. So it’s old, it’s clunky and it’s approaching its retirement years. To that end, this week Librarian Emily has traversed every screen in the application to make a map of which properties apply to which content types. The next job being to check whether that agrees with what’s actually in the database. And how likely do you think that is?
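For what it's worth, that checking job boils down to comparing two maps of content type to properties and reporting the differences. A minimal sketch, assuming both maps can be expressed as dictionaries of sets; the function name and the shape of the report are our own invention, not anything that exists in OaSIS.

```python
def compare_property_maps(from_screens: dict, from_database: dict) -> dict:
    """Report, per content type, the properties found in one map but not the other."""
    report = {}
    for content_type in sorted(set(from_screens) | set(from_database)):
        screen_props = from_screens.get(content_type, set())
        db_props = from_database.get(content_type, set())
        if screen_props != db_props:
            report[content_type] = {
                "only_on_screens": sorted(screen_props - db_props),
                "only_in_database": sorted(db_props - screen_props),
            }
    return report


# Made-up content types and properties, purely for illustration.
screens = {"Press notice": {"title", "date"}, "Petition": {"title"}}
database = {"Press notice": {"title", "date", "legacy_code"}, "Petition": {"title"}}
differences = compare_property_maps(screens, database)
```

Content types where the two maps agree are left out of the report, so an empty result would mean the screens and the database tell the same story.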

Psephologising wildly

In exciting psephological news, our election results website now comes complete with by-election results for Parliament 55. All 21 of them. This thanks to more sterling work by Librarian Anna, who’s had her head buried in elderly spreadsheets, attempting to make them amenable to machine consumption.

Prior to this release we’ve had general election coverage for parliaments 55 to 59, whereas our by-election coverage was restricted to parliaments 56 to 59. Our regular reader may be under the impression that we’re not entirely unaffected by obsessive compulsive disorder, and our dear reader would not be wrong. Such a dog-legged offering felt intrinsically iffy, our coverage page caveats being stretched to breaking point. Now we can say that if there’s been an election to the House of Commons, as part of a general election or a by-election, at any point since the general election of 2010, it’s in our database and on our website. Satisfying.

Should you be a consumer of our election data, it might be worth mentioning one small snaggle. Invalid vote counts have been available for all other elections we’ve added. But for two of the 21 by-elections that took place during Parliament 55, despite requests to local authorities, no such data was forthcoming. Presumably because it hadn’t been recorded. Or if it had, that bit of paper is now missing. To cope with that, we’ve had to add a new column to the elections table in our psephology database, recording whether the invalid vote count is available or not. Both ERD and data dictionary have been updated accordingly.
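A rough illustration of that change follows, using an in-memory SQLite database as a stand-in for the real thing. The table layout, the column names and the example rows are all assumptions made for the sketch; the ERD and data dictionary remain the places to look for the actual names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Simplified, hypothetical version of an elections table.
conn.execute(
    """
    CREATE TABLE elections (
        id INTEGER PRIMARY KEY,
        constituency TEXT NOT NULL,
        invalid_vote_count INTEGER
    )
    """
)

# The new flag column. Rows that already exist default to 'available'.
conn.execute(
    "ALTER TABLE elections "
    "ADD COLUMN is_invalid_vote_count_available INTEGER NOT NULL DEFAULT 1"
)

# Made-up rows: one election with a count, one without.
conn.executemany(
    "INSERT INTO elections "
    "(constituency, invalid_vote_count, is_invalid_vote_count_available) "
    "VALUES (?, ?, ?)",
    [("Example East", 42, 1), ("Example West", None, 0)],
)


def invalid_votes_display(count, available: bool) -> str:
    """Text a consumer might show for the invalid vote count."""
    return str(count) if available else "not available"


rows = conn.execute(
    "SELECT constituency, invalid_vote_count, is_invalid_vote_count_available "
    "FROM elections ORDER BY id"
).fetchall()
summary = {
    name: invalid_votes_display(count, bool(available))
    for name, count, available in rows
}
```

Consumers of the data would do well to check the flag before doing any arithmetic with the count, since a missing count is not the same thing as a count of zero.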

Going forward - thanks Young Robert - we intend to move backwards, keeping general elections and by-elections in lockstep by Parliament period. Unsurprisingly, efforts are currently focussed on Parliament 54, which we hope to have live sometime in late April or early May. Whichever comes later.

Sticking with the election results website, Librarian Emily has been busy keeping our Maiden speeches in the House of Commons since 1918 Parliamentary Facts and Figures publication up to date. That has also been reimported, meaning Hannah Spencer is now reunited with her maiden speech. Thanks Emily.

Stepping behind the scenes, Librarian Phil has also been on by-election duty, this time over in the Members’ Names Information System. Picking up the baton from Librarian Anna, Phil has taken our naming conventions for by-elections over the finish line: all by-elections now adhering to a consistent and coherent naming structure. Top work Phil and Anna.

Back with the election results website, we’ve also done a wee bit of tidying of the result summary text. This appears at the top of an election page, for example ‘Ulster Unionist Party gain from Democratic Unionist Party’ and ‘Social Democratic & Labour Party hold’. We’d spotted that our approach to the Labour and Co-operative parties was less than consistent, sometimes including Co-operative, sometimes not. Notional election results - upon which we are somewhat dependent - brush aside the very existence of the Co-operative Party. This constraint has led us to a place of consistency whereby a gain - or indeed a loss, or indeed a hold - by Labour and the Co-operatives is chalked up to the Labour Party only.

In testing that, Librarian Ned noticed we were less than consistent with other party names in result summaries. The problem originates with the input data, which only comes with party abbreviations. ‘UUP gain from DUP’, for example. By means of an additional database table and a Ruby rake task that looks up the official party names, we somehow managed to wrangle these into their expanded equivalents. For example, turning ‘UUP gain from DUP’ into the slightly more legible ‘Ulster Unionist Party gain from Democratic Unionist Party’. Unfortunately, quite a lot of that expansion happened before we’d quite nailed down our party names, so that script has now been rerun and, where there’s a party name in a result summary, that party name now agrees with the party name across the rest of the website.
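The real work is done by a Ruby rake task and a database table, neither of which is reproduced here. As a rough sketch of the idea in Python, assuming abbreviations are runs of capital letters and with a small dictionary standing in for the lookup table:

```python
import re

# Stand-in for the lookup table, using the party names quoted in these notes.
PARTY_NAMES = {
    "UUP": "Ulster Unionist Party",
    "DUP": "Democratic Unionist Party",
    "SDLP": "Social Democratic & Labour Party",
}

# Assumption: an abbreviation is a whole word of two or more capital letters.
_ABBREVIATION = re.compile(r"\b[A-Z]{2,}\b")


def expand_summary(summary: str, names: dict = PARTY_NAMES) -> str:
    """Swap known party abbreviations for full names, leaving unknown ones alone."""
    return _ABBREVIATION.sub(lambda match: names.get(match.group(0), match.group(0)), summary)
```

With that in place, `expand_summary("SDLP hold")` returns ‘Social Democratic & Labour Party hold’. Because the names live in one lookup, rerunning the expansion after a party name changes brings every summary back into line.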

Toward a single subject view etc

Our Library Knowledge Base™ remains a little hard to explain given we can’t actually link to it. This because it contains the contact details of our crack team of researchers and is due to also include some data on working patterns. Which we do not wish to paste liberally over the web.

Nevertheless, a quick catch-up on the last two weeks. It’s mostly been Librarian Susannah testing things and declaring herself happy. A handful of research publications that had failed to make their way down the pipes now do, short synopses have been applied to research sections, researchers who are heads of research sections are now flagged as such, the option to start a Teams chat has been removed, control over the ordering of specialists for a given subject has been fully tested, researchers on leaves of absence are now dealt with, and the homepage has finally gained some text. Marvellous.

Facts and indeed figures

All of that leaves us with just a couple of things to report over in the world of Parliamentary Facts and Figures. As our crack team of librarians work their way through the refurbishing and refactoring of our spreadsheet-based publications, they have their eyes firmly fixed on two sets of users: those perusing said spreadsheets with their eyeballs, and those adopting a more computational approach. To aid in the latter case, they try, wherever possible, to include both a column with a label and a column with an identifier for each conceptual thing. This becomes problematic for Members, because our core dataset in the Members’ Names Information System only stretches back as far as the 1980s. Beyond which, it gets patchy at best. How then to provide an identifier for Members pre-dating that cut-off point?

When Librarian Claire first approached the problem, she settled on Rush IDs as a useful addition. Those identifiers expanding our coverage back to 1832. When Librarian Emily took over the maiden speech publication, a change in policy was agreed to prefer Wikidata IDs to Rush IDs. This mainly because one can more easily traverse to additional data from a Wikidata ID than one can from a Rush ID. As of this week, that change is now made.

Not that the Rush database is not helpful here. Indeed, Librarian Phil has made a wee Heroku dataclip listing all the Members in the Rush database with their Rush ID, their Wikidata ID and their MNIS ID, should they have one. Most useful for us, and, we imagine, for other people. Fill your boots.

Always cry at endings

Following several trips to the local public house, we felt we were starting to get to know Young Rachel. Imagine then our surprise when one sad day she announced she’d had quite enough of our jibber-jabber and had handed in her notice. It turns out that she’d met some bloke in a pub - quite on brand then - who’d offered her a new gig involving wearable technology and Alzheimer’s. So she won’t be escaping Young Robert and Michael that easily.

Whoever that bloke might be, we’re confident in saying he’s done well for himself. Who would not wish to employ a humane technologist with an excellent grasp of data, an interest in the processes that create that data and compassion for the poor people subjugated by those processes and the tools that accompany them? We know we would.

It is a sad day when people like Rachel leave Parliament, but leave she did, the Thursday before last seeing one last trip to the Two Chairmen for a quick round of farewell drinks. And when we say quick, we are, of course, lying. Both Anya and Rachel outlasted the crowds to put in a solid 10 hours of pub. Impressive to the end. Au revoir, Rachel!