ontologies

2023 - Week 43

It’s been a while and please rest assured that you, dear reader, are never far from our thoughts. Let’s kick off with the question we know is on everyone’s lips …

Is your triplestore still on fire?

Our regular reader will be all too aware that, last time we placed ink on paper, our beloved triplestore had caught fire. How the flames started, we’re still less than sure. And may well never know. What we do know is our crack team of librarians returned from summer recess, fresh-faced and eager, only to be greeted by a pile of questions several feet high and an indexing application that had slowed down to the point it was taking anywhere up to 45 minutes to subject index a single record. All of which led to the poor souls being forced to work well past their usual cocoa time. Less than ideal.

Since then a couple of things have happened. Unfortunately - at least scientifically speaking - they happened at the same time and attempting to disentangle competing correlations into a single causation is a little too wave-collapse-y for our limited brains. First off, the machine the triplestore was running on was upgraded from a hard drive to a solid-state drive. Second off, some virus scanning software that had recently been installed on that machine was removed. What fixed what remains a matter for conjecture; you may pick your favourite theory.

Whilst performance noticeably improved, the two changes coincided with conference recess. Which meant the stack of jobs to do was considerably shorter than on a normal working day and fully testing the changes proved almost impossible. The true test would come when our Members returned and the questions started to flow again. Last Monday, the Erskine Express once more pulled into Westminster Station, our Members disembarked and the tabling floodgates opened. Reader, all was well. The indexing application did not break under load and our crack team of librarians had all 1937 questions indexed by 2pm. Lovely stuff.

If we’re absolutely honest, the reason you’ve not heard from us in a wee while is all down to new, old search. All the time we used to have for whiteboard sessions, chat, gossip, data modelling, information management and prototyping - or safe-to-fail probes, as Young Robert might say - appears to have disappeared into a vortex of meetings and, indeed, workshops. Facilitated at that. On the positive side, we get to spend quality time with Lydia, Yomi, Graeme, Jon and Raafay, so it’s not all bad.

When we’re not in meetings, work on new, old search is - generally speaking - still going quite well. With a helping hand from our Jianhan, data analyst Raafay has done sterling work to populate the backbone of our shiny, new data dictionary, deploying his data analysing spanner to tell us - for the first time - which content types carry which data attributes. All interesting information. Librarians Anya and Jayne have followed in his wake, populating columns to tell designer Graeme which attributes appear on our new object pages and roughly where. And not only this. Spotting a whole bunch of inconsistencies, oddities and generally wonky data, Anya and Jayne have also been tidying data where they can and putting in calls to team:Ian where they can’t. If this project fails to deliver, it will all have been worth it for the data dictionary alone, says Anya. Thereby cursing us all.

Off the back of the data dictionary work, Graeme has taken the spreadsheet and started to annotate his wireframes with the names of attributes to use. All of which means developer Jon now has a much better idea of what he’s expected to do and how defensive his code needs to be. Very defensive, it would seem.

As well as turning up dodgy attribute associations, Anya and Jayne have also uncovered quite a few attributes we should have included in our designs and a smaller number of things that might one day appear on an object page, should information management practice flex slightly. All of which makes more work for Graeme, when he probably thought he was done. A lesson for us all: always do your data analysis before putting pixels to page. One lives, one learns.

Back in the backend, our Jianhan has been plugging away at upgrading the bits he can upgrade. Ably assisted by Librarian Ned, this time in a testing role. The data that was in Solr version 3 - yes, really - has been ported successfully to Solr version 9. Tweaks to the indexing rules are still required, but - before we can do that - Young Robert and Jon are planning to combine heads and start building a test suite for our different Solr APIs. Our Jianhan has also managed to get the data that was in our OWLIM - yes, really - triplestore into GraphDB 8, some wonky URIs being tidied en route. Librarian Ned once again undertaking testing duties. Ideally, we still need to get up to version 10, but Jianhan reports the free version of that is not quite so easy to work with.

All of which puts us well along the path to upgrading our assorted components. Unfortunately, coupling those components together is not going quite so well. Especially given the complications of our assorted environments. It’s a little like having a new washing machine, a new electricity supply and new plumbing, but no pipes or cables. Next Tuesday sees yet another meeting - it may even be a workshop - at which we aim to sketch out where our assorted components live and how they connect together. Wish us luck.

In a small item of good plumbing news, Jon and Jianhan have combined efforts and Jon’s code is now pointing at Jianhan’s upgraded Solr. Have a fiddle with the query parameter here if you don’t believe us. We feel sure you’ll be amazed.

People, places, parties - future facing

Splendid progress continues to be made on planning for the next general election. Thanks to researcher Neil, we now know when constituencies begin and end. Which turns out to be quite different to what our database thinks. MNIS has new constituencies beginning on the date of the general election following the making of the Order in Council creating the boundary change in the country the constituency lives in. Putting database definitions to one side and donning his statutory interpretation cap, Neil reports that the legislation says new constituencies come into being on the date of dissolution following the passage of the Order in Council etc. Quite different. Really quite different. It just goes to show, one should never create a database without first reading the legislation.

By way of a compromise, we’ve decided to set constituency end dates to the date of dissolution and constituency start dates to the day following the date of dissolution. This because our database doesn’t handle times and we don’t want to accidentally fall into the trap of having 1300 constituencies active on dissolution day. Neil has signed off on this compromise, which leaves us with some data to tidy. Quite a lot of data to be honest. Luckily, we have handy lists of parliament periods and dissolutions to rely on and the pattern of errors is - for once - quite predictable. So we’re hoping the kindly machines might take care of the problem without librarians having to soil their hands and chip their nails.

Librarian Phil has been busy investigating the ‘standing down’ boolean in MNIS. It turns out the checkbox has been ticked 151 times with most Members having the end reason ‘retired’. The next job is to find out if all these records are indeed legitimate. Did these 151 MPs actually announce they were standing down and did they not run in the following election? Once we know that, we can start to backfill the data for all the other Members in MNIS who’ve announced they were standing down. A job made much easier by Librarian Emily having recently completed an enquiry asking for a list of all standing-downers since 1997. Top work Emily.

Still with Librarian Emily, her lovingly compiled list of reasons why a Member might leave the House of Commons has been passed to the red carpet contingent for thoughts and feedback. Not only that - being nothing if not bicameral in nature - librarians Anna and Emily, together with computational buffet-bowler Michael, offered their data mapping services to the upper House. Which is why last Tuesday saw the three of them joined in pixels by Ms McAskill and Mr Korris for an all too short session on why peers might disappear over the legislature event horizon. Aside, perhaps - and it is a very big perhaps - from the Insolvency Act 1986 and the inevitability of death, not much was found in common. Work continues.

Over with data scientist Louie, initial tests of the general election data import process are nearly complete, although there has been one rather important change. Thanks to the tireless efforts of colleagues Ann and Steve, the old candidates database tool has been resurrected and improved. This means we can seed that tool with data from Democracy Club, taking care of what used to be the most time-consuming part of general election data preparation, whilst also using technology that is already battle-tested with Library staff on election night.

People, places, parties - past tense

A decision to decommission our election results website provided an opportunity for the Library to take a look and re-envisage the uses we could put election data to. Given the data starts its parliamentary life in the House of Commons Library and given House of Commons Library researchers are the people doing much of the analysis, data scientist Louie now finds himself building an election data visualisation tool for the Library website.

Visualisation tools are all well and good, but we are always aware there are people out there who want their data raw not cooked. Given there’s no authoritative source for general election data, we found ourselves facing the question - why not us? This is how computational trundler Michael found himself attempting to replace the current web pages with something not too dissimilar, on a budget of five English pounds and a toffee apple.

As good luck would have it, the Library publishes a briefing following every general election, those briefings are accompanied by spreadsheets and those spreadsheets are kept up to date with all the latest numbers. So far we’ve taken the spreadsheets from 2015, 2017 and 2019, parsed them into Postgres and started to make a website. Over time, we aim to make something that also covers by-election results and - as more data gets tidied - increase our historical coverage.

The actual election part is pretty easy - though there is librarian work to do to standardise the spreadsheets and add MNIS Member IDs and Electoral Commission party IDs where appropriate. At this point, we take a short break from typing and cast a hopeful glance in the direction of Librarians Anna and Emily …

The more difficult work is defining constituency areas, their boundary sets and the legislation creating those boundary sets. Not wanting to fall into a boundary change hole, we’re engaging the mighty brains of both statistician Carl and researcher Neil. We’re currently grappling with interim boundary reviews and whatever the hell happened to Milton Keynes under article 2 of The Parliamentary Constituencies (England) (Miscellaneous Changes) Order 1990. The words ‘Miscellaneous Changes’ never inspire confidence. Stay tuned for further adventures in electoral geography.

Tweaty twacking

Way back - way, way back - in week 36, our eagle-eyed reader will have spotted Librarians Ayesha, Claire and Jayne’s valiant attempts to pin down our CRaG treaty procedure map and decorate it with committee correspondence steps. At least for the Commons. This week, we’re happy to report, equivalent steps have been added for House of Lords committees. At which point, we take a wee moment to offer a jaunty salute in the direction of the International Agreements Committee and the European Affairs Committee. We are, as ever, happy to be of service. Taking time out from endless meetings, and with help from computational ball-tamperer Michael’s committee papers application - try it, you might like it - Librarian Jayne has even managed to actualise those steps. Marvellous.

Because, in the House of Lords, treaties are scrutinised by either the IAC or the EAC, we can’t point our dear reader to a single timeline covering both. If you take a look at the timeline for the Free Trade Agreement, done at London on 8 July 2021, between Iceland, the Principality of Liechtenstein and the Kingdom of Norway and the United Kingdom of Great Britain and Northern Ireland, you’ll see the ITC on full stamp-licking duties. A quick glance down the timeline for the Council of Europe Convention, done at Istanbul on 11 May 2011, on Preventing and Combating Violence against Women and Domestic Violence, reveals the same for the IAC. So much correspondence, though, let’s face it, that is where most of the action is. We hope our user is pleased with our efforts. Looking at you Arabella.

It’s been quite a while since we attempted to wrap our heads around the matter of legislative consent motions. Our dear reader may well remember time well-spent with colleagues in Belfast, Cardiff and Edinburgh, mapping out their procedures. The matter of mapping such matters first came to our attention at the Study of Parliament Group annual conference. It resurfaced when we found ourselves on the receiving end of a spreadsheet created by Legislation Office Liam, attempting to align timings between devolved legislatures - or perhaps, more accurately, devolved governments - and Westminster. At which point, we started to wonder if our widely celebrated map making and actualisation skills might come in handy.

Unfortunately, our LCM maps still only exist in the form of pixels, never having made it as far as software and data. Equally unfortunately, our cartographic efforts took place quite some time back, which means we’d forgotten more than we ever thought we knew. One thing we’d forgotten was how the UK Government signify their opinion as to whether a given bill engages legislative consent. The answer being, it’s all in the explanatory notes. Obviously.

Librarian Jayne and computational bunny batter Michael spent a pleasant hour with Liam on Thursday, digging back into maps and cross-checking the semantics of Liam’s spreadsheet, all to establish where their thinking aligns. We now know we need to capture any requirements for LCMs outlined in explanatory notes, undo some of our self-precluding steps - LCMs being a thing that can happen more than once during the passage of a bill - and possibly add in some ‘communication with Westminster’ steps. Again, our lovely little bill papers application may come in handy here.

Before we can do any of that, we really need to adapt both the procedure editor application and the procedure editor database to cope with things presented as well as things laid. And in order to do that, we definitely need some of our Jianhan’s time. Which, at the moment - at any moment to be honest - is quite difficult to get. Please rest assured, should any progress be made, our dear reader will be the first to know.

Bots to blue skies

Given the architectural constraints in the rest of our lives, we’ve frequently found an outlet for our computational talents in the form of bot making. Bot making not requiring High Level Designs to be typed. Most of our bots started life on Twitter, because that’s where both the eyeballs and the best tools were. At least before the new owner took over. These days, Twitter makes it almost impossible to get the API access necessary for bot making. And, anecdotally at least, there don’t appear to be nearly so many eyeballs there. And also, just ugh, Twitter.

For that reason, we turned our attention first to Mastodon; all of our many and varied bots are posting quite happily over there. Recently, we’ve also dipped a finger in the waters of Bluesky, with the first of our written answer bots now being piped through the ATProtocol. And what a palaver that is. The requirement to identify link locations by byteStart and byteEnd is one hurdle we weren’t expecting. The hurdle we were expecting being the difficulty of registering new accounts given Bluesky is currently invite only. For that reason, we’d like to thank Karl, Ant and Paul for their very generous invite code donations. Thanks lads.
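For anyone facing the same hurdle, here is a minimal sketch of the byteStart and byteEnd business. It assumes links are found with a simple regular expression and that offsets are counted in UTF-8 bytes rather than characters; the function name and the example text are ours for illustration and this is not the code our bots actually run.

```python
import re

# A deliberately simple URL matcher; real posts may need something sturdier.
URL_PATTERN = re.compile(r"https?://\S+")


def link_facets(text: str) -> list[dict]:
    """Build link facets for every URL in the text.

    Offsets are measured in UTF-8 bytes, so any non-ASCII character before
    a link pushes byteStart beyond the character position.
    """
    facets = []
    for match in URL_PATTERN.finditer(text):
        url = match.group(0)
        byte_start = len(text[: match.start()].encode("utf-8"))
        byte_end = byte_start + len(url.encode("utf-8"))
        facets.append(
            {
                "index": {"byteStart": byte_start, "byteEnd": byte_end},
                "features": [
                    {"$type": "app.bsky.richtext.facet#link", "uri": url}
                ],
            }
        )
    return facets


# The accented character takes two bytes, so the link starts at byte 10,
# not at character position 9.
example = "Slàinte! https://example.org/answer"
print(link_facets(example)[0]["index"])
```

The resulting list would sit alongside the post text in the record being created. Counting characters instead of bytes works right up until the first accent or emoji, at which point links start landing in the wrong place.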

Exciting regnal year news

Some weeks back, computational left-arm orthodox spinner Michael caught the tailwind of #AnyasCold and was forced to take to his bed for a day or two. Being easily bored, he decided to entertain himself by writing a regnal years citation generator. Niche, even by our standards. We’re happy to announce that the resulting code has now been signed off by Professor Paul. On the matter of historical regnal year session citations, it’s hard to imagine you’d get a better sign-off than that.
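By way of illustration, the core arithmetic might be sketched as below. This is not the signed-off code. It assumes a regnal year runs from the accession date to the day before its anniversary, that a session sits within a single reign, and that the session dates in the example are illustrative rather than checked against the record. Function names are hypothetical.

```python
from datetime import date


def regnal_year(day: date, accession: date) -> int:
    """Return the regnal year in which a day falls, counting from 1."""
    if day < accession:
        raise ValueError("day falls before the accession date")
    years = day.year - accession.year
    # Not yet reached this calendar year's anniversary of the accession.
    if (day.month, day.day) < (accession.month, accession.day):
        years -= 1
    return years + 1


def session_citation(start: date, end: date, accession: date, abbreviation: str) -> str:
    """Format a session citation such as '1 & 2 Eliz. 2'.

    Sessions spanning two reigns are out of scope for this sketch.
    """
    first = regnal_year(start, accession)
    last = regnal_year(end, accession)
    years = [str(year) for year in range(first, last + 1)]
    if len(years) == 1:
        joined = years[0]
    else:
        joined = ", ".join(years[:-1]) + " & " + years[-1]
    return f"{joined} {abbreviation}"


# Accession of Elizabeth II: 6 February 1952. Session dates are illustrative.
print(session_citation(date(1952, 11, 4), date(1953, 10, 29), date(1952, 2, 6), "Eliz. 2"))
```

The abbreviation is passed in rather than looked up, which is exactly why a data source for monarch abbreviations turned out to be needed.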

In the course of making our regnal year citation generator, we noticed we were missing a data source for monarch abbreviations, having failed to notice such things in our earlier peerage explorations. Shedcode James has kindly plugged that gap, adding a new column to our peerage database. A column which has since been populated by Librarian Ned. Thanking you kindly, Mr Jefferies. Top work, Librarian Ned.

Facts / figures

Traditionally speaking, our crack team of librarians have fought shy of the limelight. Not for them the fame and indeed fortune of publishing Research Briefings, that job being firmly in the hands of the research teams upstairs. That changed this year, when the Parliamentary Facts and Figures publications found themselves in need of a new home. You lot do computers, someone said, fancy taking on some spreadsheets. Why yes, answered our crack librarians, ever eager to help. Which is why this week came to see the publication of a revised and improved PFF on the subject of hybrid bills receiving Royal Assent since 1979. Lovely to see.

Prorogation transformation

Parliamentary rumours for once proving true, following a 17-month session, Parliament was finally prorogued on Thursday. Our ‘what to do when’ manual having been perused under P for Prorogation, the usual operations went into effect. Spreadsheets were updated, downloaded and pumped into our beloved egg-timer, which is now prorogation compliant.

In days gone by, this would have been only the start of our workload. Prorogation obviously affects sitting days and sitting days affect the end dates of scrutiny periods. With over 100 instruments before Parliament at any one time, that used to mean one hundred and odd clock end steps to update, our poor librarians retiring from the field with very sore fingers. As prorogation closed in, we had 122 instrument clocks to update. Not a prospect that sparks joy.

At this point, our eagle-eyed reader may well be saying, wait a moment. Back in week 29 you told us you’d wired up the egg-timer to the procedure editor and that updating clock end steps would no longer be a problem. Dear reader, we did and, dear reader, it was not a problem. Data flowed seamlessly from Google Calendars to egg-timer and from egg-timer to procedure editor and from procedure editor to triplestore and from triplestore to SPARQL and from SPARQL to website. 122 clock end steps updated with no work required.
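To give a flavour of why a change to the calendar ripples through to every clock, here is a much simplified sketch of a day-counting calculation. It is emphatically not the egg-timer’s actual logic: the real rules for which days count vary by procedure, and the assumptions here (the start date counts as day one, excluded days are supplied as a plain set, all dates hypothetical) are ours for illustration.

```python
from datetime import date, timedelta


def clock_end(start: date, days_required: int, excluded: set[date]) -> date:
    """Return the date on which the required number of counted days is reached.

    Days in the excluded set (for instance, days lost to prorogation) do not
    count towards the total.
    """
    if days_required < 1:
        raise ValueError("days_required must be at least 1")
    counted = 0
    current = start
    while True:
        if current not in excluded:
            counted += 1
            if counted == days_required:
                return current
        current += timedelta(days=1)


start = date(2030, 3, 1)  # hypothetical laying date

# Before the calendar changes: five counted days, nothing excluded.
print(clock_end(start, 5, set()))

# After two days drop out of the calendar, the end date moves back by two.
print(clock_end(start, 5, {date(2030, 3, 3), date(2030, 3, 4)}))
```

Multiply that recalculation by 122 instruments and the appeal of having machines do it becomes clear.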

We’re never entirely sure what folks mean when they talk about the old ‘digital transformation’, but taking scrutiny end date calculations out of JO Jane’s abacus and many and varied email chains and into a single system that reliably calculates end dates, pushing that data to the web and integrating the whole thing to operate like the proverbial clockwork must have saved many hours of effort for both librarians and clerkly colleagues. At this point, Jayne, Jianhan and Michael retire to a public house to congratulate themselves. Slàinte.

Wikidata meetup

Because we don’t get invited to enough facilitated workshops, sometimes we just have to organise our own. Which is how Monday, 25th September found your regular correspondents ensconced in a Death Star-like meeting room at Broadcasting House for yet another Wikidata meetup. Chatter was mainly confined to elections, constituencies and blasted boundary changes, though Tom and Duncan found time to show us a couple of nifty, Wikidata-powered demos. Thanks for hosting, Vanessa, Jeremy, Duncan and Tom. Lovely to see you, as ever, Andrew.

Mods / rockers

SW1 and W1A are all well and good, but they’re not particularly exotic. As the weather closed in, London wasn’t really looking its best - it rarely does these days - and we all felt the need to escape to more maritime climes. Which is how we came to turn up in Brighton to pay a call on our Silver. Announced this time, we can’t keep abusing his hospitality.

The topic for the day was ‘an information architecture for the Commons Library’ - or a ‘single-subject view of the Library’ as it’s come to be known. Present were librarians Anya and Susannah alongside computational fine leg ticklers Young Robert and Michael. Silver had been kind enough to book a meeting room on the beach. It was a stormy day - the best stories start this way - and with the windows open and the sound of seagulls circling and waves crashing over pebbles, virtual markers were put to virtual whiteboards and a plan was formed. Loosely speaking. At the end of the day, half-decent models and half-decent information management is about the only plan you need. The rest being exploration. Anyway …

… work done for the day, we downed markers and headed off to meet data scientist Louie in one of Brighton’s premier public houses. Because of course we did.