ontologies

2024 - Week 5

A bit of a mixed bag of a fortnight, half of it spent slogging away on new, old search, the other half packed full of outreach opportunities. Let’s start with the latter.

Outreach opportunities

Anyone who knows us would vouch for the fact we’re heads down kinda guys. Occasionally though, one has to glance up from one’s laptop and engage with one’s community of practice. After all, as Young Robert would no doubt point out, how is one to populate one’s benefit realisation matrix dashboard if one doesn’t engage with one’s community of practice?

First up in our many and varied outreach opportunities was a meeting with Jennifer and Martin from the History of Parliament Trust. Present and correct were Librarians Anya, Anna and Susannah, computational mic controller Michael and Shedcode James. Jennifer has recently taken over the helm of HoPT from Professor Paul, her arrival coinciding with a request for finance for the continuation of our Rush database normalisation work. Sorry Jennifer. Subjects for discussion were the Rush database, the Peerages database and some work James had been doing for Paul on division records both ancient and modern. A fine time was had by all and we look forward to working together in the future.

No sooner had telephone receivers been placed back in their cradles than Anya and Michael found themselves in a meeting with John from the British History Online project. John had got in touch off the back of our constant quest for Bluesky invite codes, necessary for bot making purposes - weeknotes passim. Kindly donating a handful of codes, he suggested a chat would also be good to explore where our interests overlap. Not only that, it turned out John is in fact our reader. Remarkable. We meet at last. Hiya John.

Unfortunately, aside from the aforementioned Rush and Peerage databases, we don’t have an awful lot of historical data, Parliamentary Search only covering the last 40 years or so. Almost inevitably, the subject of Historic Hansard came up and - feeling himself quite out of his depth - Michael paged Young Robert who promptly joined the call. Somehow or other we seemed to squeeze most of the production process of Historic Hansard - from scanning to OCR to aborted attempts at corrections - into the last ten minutes of the meeting. We hope it was of some use to John. It certainly seemed to give food for thought. Lovely to meet you John. Hoping our paths cross again soon.

Those with long memories may well recall our efforts to make persistently citable standing orders. That from way back in 2020. My word. Since then, the code has been handed over to Shedcode James to add editing functionality and a general veneer of professionalism. A project that sadly stalled when we ran out of money to pay him. More below. The standing orders work was made possible by Rad and the ParlRules team kindly making their lovingly-compiled data available under a Creative Commons licence. Rad got back in touch after Christmas to talk through his funding bid for the next stage of the ParlRules project: using machine learning techniques to identify powers, duties and the agents holding them in House of Commons public standing orders. The proposed project appears to complement a project being run by John from The National Archives, looking to do notionally similar work with legislation. So we were, of course, keen to support. A letter to that effect has now been drafted, signed by Sarah, Eve and Grant, and dispatched in the direction of Oxford. Thanks all.

On orders being standing

If you’re one of the very many people on the waiting list for a website with persistently citable standing orders, please bear with us. The delay has not been caused by any tardiness on our part, nor on the part of Shedcode James. Public sector procurement rules being public sector procurement rules, at some point we ran clean out of ways to pay the lad. And we all know what Yorkshire folks are like …

Thanks to the diligent efforts of Librarian Susannah, this slight obstacle has now been resolved. James has reopened his text editor and resumed work. Changes made so far include sorting out the editor authentication, removing the unused and confusing draft status and changing the word destroy to the much more friendly delete. At some point fairly soon - this year at least, we hope - there should finally be something to show. Though, obviously, we have said that before.

We are sad to report that designer Graeme has decided to abandon us and move on to pastures new. He’s been a pleasure to work with and we’ll miss him greatly. Best of luck for whatever’s next Graeme.

Whilst Parliament is still paying him, we thought it remiss to not reward the cash with the dignity of labour. So Graeme’s been spending his final hours looking at “treatments” - sorry - for result page “furniture” - more sorry. He’s also been looking at error pages in the form of both 404s and 500s. Reviewing the 404 work caused Young Robert to comment that any messaging that a page cannot be found always seemed odd when a page was clearly being displayed. Thanks for the departure into metaphysics, Young Robert. We know we can rely on you.

Librarians Jayne and Ned are having great fun checking and rechecking Developer Jon’s search object pages. For some definition of fun. The Librarian Slack instance is currently lit up with cards moving from the doing pile to the checking pile, back to the doing pile and so on and so forth. A Librarian / Developer version of ping-pong, if you will. At some point soon, we may well have an object page or two that everyone’s happy with. Stay tuned.

Elsewhere, organ grinder Anya together with her faithful monkey Michael have been trying to get ahead of the design and development game. For once. Not wanting work to start on search results until they had a better grasp of the materials at hand, they’ve created yet another spreadsheet, this one documenting what they think should appear on a search result for a given content type. If you’re a user of Parliamentary Search, why not take a look? The sooner we get feedback, the sooner we can course correct where necessary. There being nothing more valuable than an actionable insight from one’s community of practice, as Young Robert might put it.

On the subject of Young Robert, he’s also been busy, roping in computational odd-job man Michael to cobble together Cucumber and RSpec tests for our Solr upgrade. Since our Jianhan made the leap from Solr 3.5 to Solr 9, we’ve known that some of the result listings returned have been different. Different better or different worse, we don’t know. Just different. The plan is to write a set of tests that compare the number of results returned for a given set of queries and, once that’s done, a set of tests to compare result positions for a given set of queries. So far, they’ve managed to get the machines to check they can connect to the internet, check they can connect to old Solr, check they can connect to new Solr and check they can authenticate to the former. Not much, perhaps, but, as a great man once said, not nothing.
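For the curious, the two comparisons described above might look something like the sketch below. It is written in Python rather than the Cucumber and RSpec actually in use, and it is only an illustration: the function names are made up, and the two search callables stand in for whatever fetches a JSON response from old and new Solr. The one thing borrowed from Solr itself is the response.numFound field in its JSON output.

```python
from typing import Callable, Dict, Iterable, List, NamedTuple, Optional, Tuple


class CountDiff(NamedTuple):
    """A query whose hit count differs between the two Solr instances."""
    query: str
    old_count: int
    new_count: int


def num_found(solr_json: Dict) -> int:
    """Pull the hit count out of a Solr JSON response."""
    return int(solr_json["response"]["numFound"])


def compare_counts(
    queries: Iterable[str],
    old_search: Callable[[str], Dict],  # hypothetical: queries old Solr
    new_search: Callable[[str], Dict],  # hypothetical: queries new Solr
) -> List[CountDiff]:
    """Return one CountDiff for every query whose hit counts disagree."""
    diffs = []
    for query in queries:
        old_n = num_found(old_search(query))
        new_n = num_found(new_search(query))
        if old_n != new_n:
            diffs.append(CountDiff(query, old_n, new_n))
    return diffs


def position_changes(
    old_ids: List[str], new_ids: List[str]
) -> Dict[str, Tuple[int, Optional[int]]]:
    """Map each document whose position changed to (old, new) positions.

    A new position of None means the document dropped out of the new listing.
    """
    new_pos = {doc_id: i for i, doc_id in enumerate(new_ids)}
    changes = {}
    for i, doc_id in enumerate(old_ids):
        j = new_pos.get(doc_id)
        if j != i:
            changes[doc_id] = (i, j)
    return changes
```

Neither function says whether a difference is different better or different worse. That judgement still needs a Librarian.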

Once we have tests in place and once we have the ability to differentiate between different better and different worse and once we can fix up the different worse stuff without making the different better stuff worse, well, then we have a plan. A short-term plan admittedly, but nevertheless, a plan. Different worse being mitigated, we plan to swap out the antiquated Solr sitting behind our current search interface(s) for Jianhan’s upgraded efforts. This may, in fact, be a medium-term plan. It’s rather hard to tell. We’d also like to have rid of the current external service and make the service that’s currently internal-only, externally available. We hope you’re following.

As part of this ‘plan’, we’ve had a card knocking around for some months to describe mappings between the old external URLs and the new external URLs. This in the hope of one day writing an htaccess file to redirect from what was to what will be. As part of the last search backend meeting, we took a look and, my word, what a mess. Michael would like to astrally project his best Paddington Bear stare at whoever thought Parameters.Fields.house was a reasonable name for a House parameter. We recognise that not everyone has the same feelings about URLs as Young Robert and Michael, but still we challenge you to look at this URL and not weep.
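To give a flavour of the mapping exercise, here is a rough sketch in Python of renaming query parameters on the way from an old URL to a new one. It is not the htaccess file, and almost everything in it is an assumption: only Parameters.Fields.house comes from the real URLs, while the replacement name house and the example hostnames are invented for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Old parameter name -> new parameter name.
# Only the key is real; the value "house" is a hypothetical replacement.
PARAM_RENAMES = {"Parameters.Fields.house": "house"}


def rewrite_url(old_url: str, new_host: str) -> str:
    """Swap the host and rename any mapped query parameters."""
    parts = urlsplit(old_url)
    pairs = [
        (PARAM_RENAMES.get(name, name), value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
    ]
    return urlunsplit(
        (parts.scheme, new_host, parts.path, urlencode(pairs), parts.fragment)
    )


# Hostnames below are placeholders, not the real services.
print(rewrite_url(
    "https://old.example/search?q=fish&Parameters.Fields.house=Lords",
    "new.example",
))
```

Parameters not listed in the mapping pass through untouched, which seems the safer default while the card is still being filled in.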

In terms of content types supported, a decision has been made. Wonders will never etc. Some years ago, we forget quite when, the mists of time being quite thick, we undertook a short-lived and ultimately unsuccessful attempt to load petitions of the electronic type - or e-Petitions as we still quaintly call them - into Parliamentary Search. There to be subject indexed and interlinked with debates and whatnot by our crack team of librarians. It was all done with the best of intentions but, unfortunately, our creaking infrastructure could not support the volume of data generated. The whole thing coming to a grinding halt, leaving behind the vestigial stump of an unmaintained content set. In consultation with the Petitions Committee, it was decided to remove said stump from Parliamentary Search, from the Solr instance that underpins Parliamentary Search and from the triplestore that underpins Solr. Since then, our Jianhan has popped on his best pinafore and scrubbed the decks clean. Searchwise, e-Petitions are now a goner. Quite dead. Stone dead. With calm winds, a flat sea and a more seaworthy vessel, we hope to one day return to the upper reaches of the e-Petition passage. To that end, Jianhan has also taken a backup of what we did have. Of course he has.

In late breaking news, it would appear that our Jianhan has also cleared a couple of other cards. The first change was a simple one: hide a tab from the interface, but not the URL the link in that tab points to. That tab is now hidden and that URL still works. Marvellous. We also have a problem with some content in the Search and Indexing triplestore having no public web presence. Some papers, for example, are laid before Parliament but never get published to the open web. Or are unavailable for HTTP dereference over an unauthenticated connection, as Young Robert might say. Such papers may - or indeed may not - be available somewhere on the intranet but, if you don’t have a Parliament login, you’re not gonna find them. Given the old public search excludes these documents but the new one will not, we needed some way of flagging to users that clicking the link may or may not work for them. Our Jianhan has also added that messaging. Top work, Jianhan.

I am a procedural cartographer - to the tune of the Palace Brothers

With Librarian Jayne busy testing search, we have very little to report in terms of map making this week. That said, Librarians Ayesha and Claire have been beavering away in the background, remapping committee consideration steps to take account of both causal relationships when instruments are laid and allowed relationships when instruments are withdrawn. And if that means anything to you, we’d be mightily surprised. Tune in next time, when we might have more to say. Or might not.

People, places, parties

Over in psephologyland, we have a couple of small changes to report. First off, a new Order in Council has been added, creating four new boundary sets - one in England, one in Wales, one in Scotland and one in Northern Ireland. Or rather it will do once dissolution happens. According to the legislation, new constituencies - and, by extension, the boundary sets containing them - come into being the day after dissolution day. Since we don’t yet know when dissolution day will fall, it’s not yet possible to apply end dates to current boundary sets, nor start dates to the new ones.
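The date rule is simple enough to sketch in a few lines of Python. The start date of the new boundary sets follows the legislation as described above. The end date of the current sets is our assumption here, taken to be dissolution day itself, and the function name is invented for illustration.

```python
from datetime import date, timedelta
from typing import Optional, Tuple


def boundary_set_dates(
    dissolution_day: Optional[date],
) -> Tuple[Optional[date], Optional[date]]:
    """Return (end date of current sets, start date of new sets).

    Both are None while dissolution day is still unknown.
    """
    if dissolution_day is None:
        return None, None
    # Assumption: current sets end on dissolution day itself.
    # New sets come into being the day after.
    return dissolution_day, dissolution_day + timedelta(days=1)
```

Until someone names the day, both dates stay empty, which is exactly the state our data is in.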

We’ve also taken the spreadsheet of new constituencies lovingly compiled by Librarians Anna and Emily, imported them and matched them up to our new boundary sets. Which means our full list of constituencies is somewhat confusing, containing a mixed list of constituencies with start dates but no end dates and constituencies with neither start dates nor end dates. Conversationally, we’re sticking with distinguishing between current constituencies and ‘new’ constituencies, though the ‘new’ constituencies aren’t really ‘new’ yet. Rather in some Schrödinger state between proposal and actuality. Gestating constituencies might be closer to the truth.

Said gestating constituencies were just about in place when the telephone rang and news of the publication of notional results flooded in. These being some approximation of what might have happened in the 2019 general election if the gestating constituencies had been in place back then. For the psephologically-minded amongst us, the notional results were published by the BBC last week alongside a PDF explainer. Imagine our excitement, as we clicked and downloaded. Excitement only tempered when we double clicked and opened. My word. What a treat. An 80 column wide spreadsheet is quite the sight to behold.

But you know us. We’re not amongst life’s complainers. Picking up his trusty SQL spanner, Michael set about importing the data. Or at least trying to. Which is why, if you’d been in Dartmouth Street last week, you’d have heard mournful whimpers from meeting room 11 and possibly seen tears leaking under the door. 80 column wide spreadsheets and code are just not meant to meet. No-one has that many fingers.

Still, we are nothing if not team players. Can I borrow your brain? Michael asked Young Robert. Also your fingers. A meeting was set up. If those two computational allrounders can’t count to 70, who can? They were just about to get stuck in with notepaper and abacus when Data Scientist Louie got in touch. That spreadsheet you were complaining about, he said, I’ve transposed it. And lo! What had been 80 columns and 650 rows is now 16 columns and 4253 rows. One row per ‘party’ / constituency combination. A thing of beauty that must have saved several days of work. If you find yourself working with notional result data and only have ten fingers, please do download Louie’s spreadsheet. You will not regret it, we promise. Thank you Louie. You have rescued us from madness and deserve a beer at the very least.
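We don’t know exactly how Louie did it, but the general wide-to-long trick can be sketched in Python along the following lines. The column names are hypothetical, as is the choice to skip blank cells on the assumption that a blank means the ‘party’ has no notional result in that constituency.

```python
from typing import Dict, Iterable, Iterator, List


def wide_to_long(
    rows: Iterable[Dict[str, str]],
    id_columns: List[str],     # columns copied onto every output row
    party_columns: List[str],  # one wide column per 'party'
) -> Iterator[Dict[str, str]]:
    """Yield one row per 'party' / constituency combination."""
    for row in rows:
        for party in party_columns:
            votes = row.get(party, "")
            if votes in ("", None):
                continue  # assumption: blank means no result for this party here
            long_row = {col: row[col] for col in id_columns}
            long_row["party"] = party
            long_row["votes"] = votes
            yield long_row


# Made-up data, purely to show the shape of the output.
wide = [
    {"constituency": "A", "Con": "10", "Lab": "20", "Grn": ""},
    {"constituency": "B", "Con": "5", "Lab": "", "Grn": "7"},
]
for long_row in wide_to_long(wide, ["constituency"], ["Con", "Lab", "Grn"]):
    print(long_row)
```

Fewer columns, more rows, and nobody has to count past ten.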