ontologies

2025 - Week 43

Farewell then Professor Rush (slight return)

Back in week 36, we opened with the sad news that the world had lost Professor Michael Rush. Around that time, Paul got in touch suggesting we write some words on Michael’s passing, to serve as a tribute to the man, an introduction to his database, and an acknowledgement of his generosity toward the political research community at large. Those words are now published on the History of Parliament Trust website, alongside photos of the great man contributed by his sons, Jon and Tony. He remains much missed.

Quality resumed

For a great many years our crack team of librarians have been checking each other’s work. This is not symptomatic of trust issues, rather it’s to ensure that the whole team is indexing accurately, consistently and to the highest possible standards. Indeed, we have an Indexing Quality Team™ dedicated to such activities. Checking that subject indexing terms have been appropriately applied to every written question, oral question and business question is their bread and butter. Or would be, if public sector wages ran to extravagances such as butter.

In the time BC - Before Covid - the work involved poring over paper copies of Hansard and equivalent records in Parliamentary Search to ensure that all the right questions linked to all the right answers and were appropriately indexed. Progress was easy to track by looking at the ‘to do’ and ‘done’ piles of Hansard on the shelves in Indexing Quality Team™ corner. Like a managerial dashboard if you will, but made of paper.

This worked well until Covid came along and forced us all to work in our own home offices / bedrooms (delete as appropriate). With no access to paper copies of Hansard, the general strain of working from home, everything tasting of metal, and technical problems with the contributions view in our Indexing application, the checking of oral and business questions was left on the back burner for a wee while. The backlog of the Indexing Quality Team™ quietly grew whilst eyes were elsewhere.

Fast forward to the start of the 2023-24 session and a full strength Indexing Quality Team™ resumed the task of routinely checking oral questions and business questions. All well and good, but by now a checking backlog had built up stretching back to February 2020, a whopping 133 weeks’ worth of questions. With one week of questions taking roughly eight hours to check, this amounted to over 1,000 hours of work. Poor librarians.

Now if there’s one thing our crack team of librarians don’t like, it’s a backlog. Librarian Steve rallied the troops, created a spreadsheet, and asked for volunteers to bring the backlog to heel. Quality Teammates Anna, Emily, Steve and Ned, aided and abetted by Librarians Jayshree and Jason, started to tackle the backlog, mostly during recesses.

With Steve, Ned, Anna and Emily attempting to keep heads above water on the day to day work, much of the burden fell upon Librarians Jayshree and Jason. A helping hand was also offered by our Jianhan, who somehow managed to fix the indexing application contributions view, which had inexplicably failed for everything in the 2019-21 session. Almost two years of quiet endeavour followed until, on the 18 September 2025, Librarian Jayshree informed Librarian Steve that the backlog was no more, that she had checked the last oral question and all was found good. A task worth doing said Jayshree, especially given some questions had never seen the sharp end of our indexing tool. Applause to all.

Quality expanded

Still on the subject of indexing, prior to 2012 our computational systems provided no separate witness field for committee business. With no other option, our crack team of librarians instead added witnesses - be they people or organisations - to the subject field. Now, witnesses are important to the work of committee inquiries, but, I think we can all agree, they are not the subject of such inquiries. Well, not often. And if there’s one thing we’re sticklers for, it’s semantics.

A dedicated witness field finally arrived in 2012, but that left us with 14 sessions’ worth of committee records with all of our witnesses filed in the wrong field. Or 15 years’ worth if you count time like a normal person. Which is a lot of unpicking. Undaunted, Librarian Martin broke the task down by session and got to work. That was back in July 2024. It is now October 2025 and Martin reports he has finished. Yet another Herculean effort made possible by excellent librarianship. Martin, we salute you.

Not to be outdone, Librarian Emma has also been busy. In the first instance, Emma combed through 16 years of statutory instruments, eyes peeled for those with an SI number but no made date. Simultaneously, Librarian Tim was working on adding missing coming into force dates and coming into force notes to instruments that already had a made date. Both successfully patched those gaps some time back. But Emma cannily noted that having completed the original task of adding made dates, there were now more records missing coming into forceness. A librarian race condition, if you will. Did she rest on her laurels? No, she did not. As of this week, Emma has gone back over that second list, applying either a coming into force date or a coming into force note, as appropriate. Emma and Tim, we also salute you.

Another long slog for poor Jianhan

It goes without saying that all indexing efforts would not be possible if our librarians had not been pouring heart and soul into the expansion and upkeep of our thesaurus for the past forty-odd years. And would also not be possible if that taxonomy did not have a computational home to call its own. For at least the past dozen years that home has been found in an application called Ontology Manager. A confusing name, given it is used to manage a taxonomy and not an ontology, but we’ll let that pass.

Our regular reader will be well aware of our problems with maintaining and updating software. Once it’s live, the project closes and the developers disappear. Or that was the traditional pattern. It turns out that Ontology Manager is two major versions behind what’s currently supported and the only chap with any experience of developing it is now a Vice President or some such. With very little time for tweaking SKOS files.

Nevertheless, the vendors have been most kind. Pixel-based meetings have been held, data dumps transferred and migration scripts have crossed the Atlantic on a number of occasions. On the receiving end of much of this have been poor Jianhan and Librarian Phil. We are delighted to announce that the latest version of the migration script worked perfectly, with all the former semantics transferred to the new semantics. It is not often one is able to report that a vendor has been an absolute pleasure to work with, but we make an exception on this occasion.

Glancing left across the Trello board, it would appear Jianhan has already adapted three applications to work with the new API. Though, it should be stressed, they have not yet met librarian eyes. Only another 11 to go. Onwards!

Psephologising wildly

Back when we first put our elections results website live, there were a handful of page types that proved quite beyond Michael’s fairly rudimentary SQL skills. These being the list of general elections for a political party, and general election party performance pages at United Kingdom, Great Britain, country and English region levels. To sidestep this problem, Michael added three new denormalised tables. In fairness, the three tables have served us well, the resulting pages never failing to not fail. Which is not nothing.

On the downside, the existence of the three additional tables doubled the time it took to load the data. It meant we first had to load all the election results into the tables proper, and then, by a series of loops within loops, load the denormalised data into the denormalised tables. This, in itself, would not be a problem, if it weren’t for elders and betters in the House of Commons Library requesting a faster publishing cycle when the next general election inevitably creeps up on us. Last time out, the loading script took so long to run, Librarians Anna and Emily were sending through data corrections whilst Michael’s poor machine was still churning, popping, banging and smoking. Which was, quite frankly, less than ideal.

Happily, Data Engineer Rachel has now been piped aboard the good ship psephology, and her SQL skills are a cut above Michael’s. A fairly large cut if we’re honest. Rachel has now rewritten the queries that sit behind the page types listed above, so that they no longer use the denormalised tables. Instead they use the database tables proper. Which means, come the next general election, we’ll have three fewer tables to populate and our import scripts will run a heck of a lot faster. The three denormalised tables have now been stripped from the database, the entity relationship diagram and data dictionary being updated accordingly.
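For the curious, the general shape of such a rewrite can be sketched as below. This is not Rachel's actual SQL. The `candidacies` table, its columns and the `party_performance` function are all hypothetical, and an in-memory SQLite database stands in for the real one. The point is only that per-party totals can be aggregated at query time from the tables proper, rather than read from a table populated in advance.

```python
import sqlite3

# Hypothetical, heavily simplified schema. The real database differs.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE candidacies (
        id INTEGER PRIMARY KEY,
        general_election TEXT NOT NULL,
        constituency TEXT NOT NULL,
        party TEXT,                 -- NULL for independents
        votes INTEGER NOT NULL,
        is_winner INTEGER NOT NULL  -- 1 if the candidate won the seat
    );
    """
)
conn.executemany(
    "INSERT INTO candidacies "
    "(general_election, constituency, party, votes, is_winner) "
    "VALUES (?, ?, ?, ?, ?)",
    [
        ("2024", "Alpha", "Party A", 100, 1),
        ("2024", "Alpha", "Party B", 60, 0),
        ("2024", "Beta", "Party A", 40, 0),
        ("2024", "Beta", "Party B", 70, 1),
        ("2024", "Beta", None, 30, 0),
    ],
)


def party_performance(conn, general_election):
    """Aggregate party performance on the fly, with no denormalised table."""
    return conn.execute(
        """
        SELECT party,
               COUNT(*)       AS candidates,
               SUM(is_winner) AS seats,
               SUM(votes)     AS total_votes
        FROM candidacies
        WHERE general_election = ? AND party IS NOT NULL
        GROUP BY party
        ORDER BY total_votes DESC, party
        """,
        (general_election,),
    ).fetchall()
```

Calling `party_performance(conn, "2024")` returns one row per party, with nothing to pre-populate at import time.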

In the course of course correcting Michael, Rachel bumped into a bug the lad had managed to introduce alongside his denormalised tables. It turned out that party vote share figures at country and English region levels had been calculated by dividing by the cumulative votes of all candidates standing for parties in those areas. Thereby ignoring votes gathered by independents and the Speaker of the House of Commons. This bug has now been fixed. Thanks Rachel. Your reward is in the post on the /humans page.
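A toy illustration of the difference the denominator makes, with made-up figures and a hypothetical `vote_shares` function rather than anything lifted from the real codebase. Each row is a party name and a vote count for one area, with `None` standing in for independents and the Speaker.

```python
from collections import defaultdict

# Made-up results for a single area: (party, votes).
# A party of None represents an independent or the Speaker.
area_results = [
    ("Party A", 500),
    ("Party B", 300),
    (None, 200),
]


def vote_shares(rows, include_non_party=True):
    """Return each party's share of the vote in one area.

    With include_non_party=False the denominator only counts votes for
    party candidates, which reproduces the bug described above.
    """
    party_votes = defaultdict(int)
    denominator = 0
    for party, votes in rows:
        if party is not None:
            party_votes[party] += votes
            denominator += votes
        elif include_non_party:
            denominator += votes
    return {party: votes / denominator for party, votes in party_votes.items()}


buggy = vote_shares(area_results, include_non_party=False)
fixed = vote_shares(area_results)
```

With these figures, Party A's share drops from 0.625 under the buggy calculation to 0.5 once the 200 non-party votes are counted in the denominator.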

Waddingtonification and Korrisification of the browsable procedure space

A couple of changes to report in the Procedure Browseable Space™ space. Firstly, in response to feedback from Mr Korris, our work package list is now broken down not only by the ones currently before Parliament and all of ‘em, but also by the nature of the paper the work package focusses on. Which means we now have lists of work packages pertaining to primary legislation, secondary legislation and treaties. All very much more approachable, and, dare we say, useable.

Secondly, in response to our own feedback, Young Robert and Michael took out their tiny pixel chisels and carved out new work package timeline pages. Meaning the business items listed now link to both the source document - where present - and also to a page listing the step or steps actualised. Thereby allowing our dear user to follow their nose to other work packages where the same step or steps have taken place. In this case, all work packages where a question on a motion that the treaty should not be ratified has been put to the House of Lords. Both times then. At least, since 2017 that is. Very nice, or so we think.

Painting in pixels

The last website to leave Robert and Michael’s respray garage was our beloved Egg Timer™. As we reported on our last outing, the results of those efforts were not entirely successful, with some pixels sticking out at rather strange angles. Acting upon advice from Design System Mary and under instruction from Service Owner Jayne, our terrific twosome took a second shot at resolving the problem. Several hours later, we were up to version 0.3.18 of our design system Gem, visual improvements to the old Egg Timer™ hopefully being noticeable. Still a bit Duplo, reports Librarian Anya. Well, you can’t please all of the librarians all of the time.

Less successful was our attempt to apply the Shedcode James design system Gem to our in-development version of Parliamentary Search. Things had been going well until the Bootstrap grid system came into contact with Developer Jon’s grid system, causing all of the latter to turn into a list. Which might actually be what we want, given large parts of the grid are empty for a large number of content types.

Things went from bad to worse when the branch Robert and Michael were working on was accidentally merged into main. “Oh god,” said Michael, “Developer Jon is going to kill us”. “Let’s not panic”, said Robert, “James is down on Wednesday. He knows about computers and suchlike. He’ll know what to do.” And know what to do he did. Thanks for the rescue efforts James. Heroic use of a terminal there.

Bots to Bluesky (and beyond)

Still with pixel painting, our Made ‘n’ Laid and Tweaty Twacking bot accounts now come with their own website, again using the design system Gem to paint it into parliamentary colours. Both views are fully equipped with RSS feeds - because of course they are - but we would not encourage subscription until the website finds a more on-brand domain to call its own.

Not having the time for such trivialities, our Jianhan has now turned off his Made ‘n’ Laid Mastodon posting code and Michael’s efforts have taken over the task. We await the laying of a treaty to check whether those pipes are free from blockages. At which point, our Jianhan will turn off his tweaty twacking code and we’ll hopefully switch to Michael’s.