ontologies

2023 - Week 46

Librarians of the Week

Librarians of the Week being much like buses, you can wait a wee while, then they all turn up at once. Last time out Librarians Anna, Emily and Jianhan were lucky enough to get their mitts on our much-prized trophy. This week, it’s the turn of Librarians Ayesha, Claire and Deanne to step up to the podium, Prosecco corks popping.

For the last few weeks they’ve had their heads buried in committee reports and secondary legislation maps, attempting to capture the assorted ways in which the Joint Committee on Statutory Instruments might report on an instrument. Their fine, fine work is now complete, but may require some explanation. Read on, dear reader.

All statutory instruments laid before both Houses and subject to procedure are examined by the JCSI. Actually, that’s not quite true but we’re not about to read the standing orders right now. So let’s say ‘most’ rather than ‘all’. The JCSI is not a policy committee. It does not scrutinise the impact of an instrument on wider society. That work happens elsewhere. Rather, the JCSI examines the legal aspects of the instrument and is empowered to report under eight criteria. Being a Joint Committee, these criteria are set out twice: once under Standing Order 74 in the House of Lords public business, once under Standing Order 151 in the House of Commons public business. The criteria include things like defective drafting and the appearance of unusual or unexpected use of delegated powers. Verging on the ultra vires, as we say in these parts.

Instruments reported under one or more of the eight criteria are of interest, both inside and outside Parliament. We’d long planned to capture such procedural steps as data, allowing our dear user to get real-time lists of instruments so reported. Indeed, the JCSI themselves asked for such a report, but, until recently, we haven’t been able to do much about it. The procedural timelines on our statutory instruments website must manage a balance between breadth and depth, and JO Jane decreed that JCSI reporting criteria were a detail too far. Not being able to separate actualised steps - that is, things that have occurred - that should appear on the website from actualised steps that shouldn’t, our options were limited to zero.

All of this changed back in week 38, when we added step collections to our procedure model, placing what does and does not appear on our website timelines firmly in the hands of our crack team of librarians. Since then, Ayesha and Claire have been slaving away updating our secondary legislation maps to reflect reporting criteria. Not only that, with thimblettes on repeat order, they’ve been leafing through JCSI reports since 2017 - back to day zero of our much-loved statutory instrument service. Particularly fine work Ayesha and Claire and particularly deserving of our much-coveted Librarian of the Week award. Or LOTW as our acronym-loving colleagues might say.

For reasons aforementioned, you won’t find such steps on our website timelines. But, for those interested, Librarian Jayne has added a whole new set of queries to our much celebrated SPARQL library. I mean, would you expect anything less? At this point, you may well be asking, why only seven queries when there are eight criteria? Having checked in with Committee Clerk Jonathan, we were informed that two of the categories came up so infrequently, they were hardly worth bothering with. So we collapsed two of the criteria into one overarching ‘other reasons’ category. Should the committee start reporting under such criteria, then, obviously, we will split the combined query into two.

At this point we know exactly what our dear reader - ever the pedantic proceduralist - is thinking: what about instruments that are laid before the Commons only and are scrutinised by the Select Committee on Statutory Instruments? Fear not, dear reader, all that work is done too, as Jayne’s SCSI queries testify. As if we’d let you down.

New, old search - frontend

We’re delighted to report that data scientist Louie has completed the second - and possibly final - phase of the data analysis work. Which means our well-thumbed data dictionary now comes with occurrence counts for attributes across content types. Louie reports that NumPy is really quite fast. So fast he sounds almost surprised.

Designer Graeme uses the data dictionary to annotate our wireframes with which attributes to use in which slots, the combination of the annotated wireframes and the data dictionary informing the fine, fine work of developer Jon. Object page code comes along in leaps and bounds and we’re really starting to see the shape of the thing.

Still with Jon, he’s also added a dash of code to turn our person labels - think Smith, Mark E. - into something more human friendly for display. Cases where flipping on the comma just won’t do - looking at you Archbishops, Bishops, Earls, Dukes and the rest - have been spotted, a change to the style guide has been approved by team:Thesaurus, and Librarian Ned has rattled through a revamp of the related data. Stay tuned for the addition of Cha-Cha, Marquis of. People with brackets in their names lurk on our backlog.
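For the curious, the simple comma flip might look something like the Python sketch below. This is not Jon's code, which we haven't reproduced here; the function name `display_name` is our own invention and the sketch only covers the plain 'Surname, Forenames' case. Anything grander would still need whatever handling the style guide calls for.

```python
def display_name(label: str) -> str:
    """Turn a 'Surname, Forenames' label into 'Forenames Surname'.

    Hypothetical helper for illustration only: it splits on the first
    comma and makes no attempt to handle titles or bracketed names.
    """
    surname, sep, forenames = label.partition(",")
    if not sep:
        # No comma present, so there is nothing to flip.
        return label.strip()
    # Drop any empty part so a trailing comma doesn't leave stray spaces.
    parts = [forenames.strip(), surname.strip()]
    return " ".join(part for part in parts if part)


print(display_name("Smith, Mark E."))  # Mark E. Smith
```

Splitting on the first comma only is a deliberate simplification; labels with more than one comma would need a rule of their own.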

Not one to put his feet up, Jon has also chipped in with some unexpected discretionary labour, adding a rudimentary search box and search results to our prototype. A search prototype one can search? Whatever next?

In the background, boss brarian Anya spent a large part of her weekend donning her sorting hat and applying herself to the problem of sorting. Particularly to the problem of sorting what have come to be known as the ‘secondary attributes’ on our object pages. Michael suspects his colleagues may have started to say ‘secondary attributes’ in some attempt to not say ‘metadata’ in front of him. A word that makes our mild-mannered computational “expert” extremely cross. Furious actually. We now have a definitive sort order for said attributes, which should hopefully make both Jon’s life and the handling of future feedback much easier. Unfortunately, it’s on Sharepoint, so it cannot, by definition, be shared. Nominative misdeterminism. Nominative misdirection?

New, old search - the computational backend

Over in backend world, our Jianhan reports yet more success. The work to redact personal information - for some very liberal definition of personal - from our upgraded version of Solr, has now been completed. The pipes between our triplestore and new Solr have also been fettled to filter out the flow of such attributes in the future. Because there’s no point cleaning your bath, then filling it with dirty water.

Meanwhile, Young Robert has made a bit of a start on an RSpec test suite to compare what comes out of our old Solr 3 with what comes out of our new Solr 9. Michael had half planned to lend a helping hand but came down with a bad case of toothache, forcing him back to his daybed. Stay tuned for more test-driven news.

New, old search - the real backend

Our more computationally-minded colleagues use the word ‘backend’ to describe the many and varied flavours of boxes in which data is stored. But we librarians and our computationally-adjacent colleagues take a different view. The real backend is the information management and the policies that underpin it. After all, as the old saying goes, one does not simply hack the code, one fixes the data.

To this end, Librarian Claire has taken some unexpected outliers in the data dictionary, slipped into her best pinny, rolled up her sleeves and planned out a much-needed autumnal clean out. For reasons lost in the mists, our observations on petitions used the ‘answering Member’ field for the Minister providing the observation and the ‘lead Member’ field for the Member presenting the petition. All a little topsy-turvy. Henceforth, it has been decided to update both information management policy and data. The ‘lead Member’ field will be used for the Minister providing the observation and the ‘answering Member’ field will not be used. For those interested in the Member presenting the petition, the petition itself and its presenting Member are but a hypertext hop, skip and a jump away. No librarian being an island - and with quite a fat chunk of data to tidy - Librarian Claire has identified where the issues lie, broken it all down by session and the tidying work is already underway - thanks Ayesha, Emma and Martin! Excellent stuff.
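Expressed as code rather than elbow grease, the policy change for a single observation record might be sketched as below. To be clear, the tidying is being done by librarians, not by this script. The dictionary keys `lead_member` and `answering_member`, the function name and the sample values are all assumptions made for illustration and do not reflect the real field names in our systems.

```python
from typing import Optional


def tidy_observation(record: dict) -> dict:
    """Return a copy of an observation record under the new policy.

    Hypothetical sketch: the Minister moves from 'answering Member' to
    'lead Member', and 'answering Member' is left empty.
    """
    minister: Optional[str] = record.get("answering_member")
    if minister is None:
        # Already tidied, or never populated: leave the record alone
        # so the function is safe to run more than once.
        return dict(record)
    tidied = dict(record)
    # Old policy held the Minister in 'answering Member'.
    tidied["lead_member"] = minister
    # New policy: 'answering Member' is not used for observations.
    tidied["answering_member"] = None
    return tidied


# Made-up sample record in the old, topsy-turvy shape.
old = {"lead_member": "Presenting Member", "answering_member": "Minister"}
print(tidy_observation(old))
```

The presenting Member drops out of the observation record in this sketch, on the basis that they remain available on the petition itself.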

How’s poor Robert?

No better if we’re perfectly honest. Still, TD MVP HLDs won’t write themselves.

The work of DECADES

In yet more Parliament thesaurus news, our mammoth Capitalised Organisations data tidying task, on our books for nearly two decades - yes, you read that right, two actual decades - has finally been completed. To provide some context, our crack team of librarians are responsible for Parliament’s Thesaurus - the terms used to index Parliamentary business. This thesaurus has been around in some form since 1979.

For reasons pre-dating the working lives of anyone still here, in bygone days organisation terms were capitalised. So, BRITISH GAS, not British Gas. All a little shouty. Work to remediate this defunct data management policy has been sitting on our backlog for almost as long as some of our younger colleagues have been alive. Now normally, backlogs are where work goes to die, but not for our diligent team of crack librarians. They’ve been doing their best to chip away at the problem during quieter recess moments, but, even with their best efforts, there were still upwards of 2500 CAPITALISED ORGANISATIONS to deal with.

Unfortunately, turning off the CAPS LOCK key was not the kind of job one can automate by throwing a .titleize at the problem. Any organisation term still being capitalised meant it hadn’t been used since the turn of the century or was only used in the Library Catalogue. Before amending any term, we needed to check if the term was still in use, whether it had been used appropriately in indexing Parliamentary business, if it was still accurate and whether it was in line with our style guide. We also needed to check whether such organisation terms were missing relationships to other terms in the Thesaurus.
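Quite apart from the checks listed above, a blanket case change has problems of its own. The Python sketch below uses the built-in `str.title()` as a stand-in for .titleize, which is an assumption on our part, as the two are not the same method. Only BRITISH GAS comes from these notes; the other two terms are made up for illustration and may not be in the Thesaurus at all.

```python
def naive_title_case(term: str) -> str:
    """Blanket case change: capitalise every word, lower-case the rest."""
    return term.title()


# BRITISH GAS is the example from the notes; the others are hypothetical.
terms = ["BRITISH GAS", "BBC WORLD SERVICE", "DEPARTMENT OF HEALTH"]

for term in terms:
    print(f"{term} -> {naive_title_case(term)}")

# BRITISH GAS -> British Gas
# BBC WORLD SERVICE -> Bbc World Service
# DEPARTMENT OF HEALTH -> Department Of Health
```

The first comes out fine. The second mangles an acronym and the third capitalises a word that a style guide would more usually leave in lower case, which is roughly why a human with a style guide was needed.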

We are delighted to report that - some years on - this mammoth task is now complete. The remaining 2500 terms tidied thanks to a quite magnificent effort from Librarian Deanne who single-handedly reviewed and edited over 1250 terms. Amazing work, quite worthy of our Librarian of the Week award.

People, places, parties

As we reported only last week, the Library spreadsheets on which we’re basing our elections website are tip-top in isolation but somewhat inconsistent taken as a whole. The 2015 and 2017 offerings being markedly different to the ones published in 2019. We had written some code to cope with this, but - given we also wanted to amend the spreadsheets to add MNIS and Electoral Commission IDs - the decision was taken to bite the bullet and bring all spreadsheets into the same format. After all, when faced with inconsistent data, what does one do? We’ll only give one clue here. ONE DOES NOT HACK THE CODE.

Last week, Librarians Anna and Emily scored a Librarian of the Week award for sterling efforts tidying and identifier-enriching the 2019 spreadsheets. This week, they’ve gone one step further and done the same job for the 2017 spreadsheets. Which means our election website is now running with both the 2017 and the 2019 general election results, and the whole thing is filling out quite nicely. Top work Librarian Anna. Top work Librarian Emily.

In another example of premier league librarianship, Librarian Ned has also backfilled more of our boundary set adjacent legislation data. Computational heavy-hitter Michael has taken that data and one of Ned’s earlier efforts listing Parliament periods and re-run his import script. He’s also added a left join or two to cope with the problem of Ireland. Which means our election website now comes with a list of Parliament periods and a more expansive list of legislation. All going to show that computational shenanigans are all quite simple provided you’ve assembled a crack team of crack librarians before you start. Maybe a lack of librarians is where most computational projects go wrong? In this Ted Talk, I will…

In additional code news, Michael spent another pleasant train journey - toothache aside - adding new data, new models and new views to link from English constituency areas to their Library dashboard equivalents. Please believe him when he says he found the experience of “deep linking” to the dentistry offering in his home constituency particularly troubling. That’s another 800 quid up the wall, he sighed to himself. Unfortunately, such “deep linking” is only possible for the three new dashboards, written - or so your correspondents believe - in Shiny. The older dashboards being Power BI powered and not equipped with the power of hypertext. And I think we all have opinions on that. C’mon man, it’s 2023.

I am a procedural cartographer - to the tune of the Palace Brothers

The thing about Librarian Jayne is she never knows when to stop. It’s almost guaranteed that one can be finishing up weeknotes - or ‘putting them to bed’ as we say in these parts - and she’ll bung another card in the Trello weeknotes column. Like playing Whac-A-Mole with a filing obsessive. Still, she has youth on her side. She’ll be as tired as we are one day. Anyway …

last time we reported in at C-Suite level - hi, dear reader - Jayne had added a step to our proposed negative statutory instrument procedure map to describe the publishing of further information by the Secondary Legislation Scrutiny Committee. This week, she’s been head down in our Constitutional Reform and Governance Act treaty procedure map, adding ‘treaty noted’ steps for the International Agreements Committee.

By way of background, in the House of Lords, treaties subject to the CRaG procedure are considered by either the IAC or the European Affairs Committee. Traditionally, upon conclusion of committee consideration, a report would be published. Back in October, the Procedure and Privileges Committee confirmed the Liaison Committee’s proposal to update the terms of reference of the IAC, allowing them to note their consideration in lieu of issuing a full report. A committee’s way of saying, yup, we looked, but there was nowt much to write home about. If you will.

New step mapped and actualised, such an ‘IAC noted’ can be found on everybody’s favourite treaty timeline - the Protocol, done at Palma de Mallorca on 18 November 2019, to amend the International Convention for the Conservation of Atlantic Tunas. Should our dear reader prefer their information conveyed in the medium of queries rather than website pixels, Librarian Jayne has added a new query to our ever-growing SPARQL library. Because of course she has.