ontologies

2022 - Week 9

Following on from the excitement of last week, this one was quieter. Much quieter. So quiet in fact that Librarian Jayne managed to sneak in a much needed vacation. That said, work happened, as work always does.

Tidying up behind ourselves

Before heading north of the border, Librarian Jayne churned through a ton of work occasioned by our mass remapping exercise. This had caused some steps to be merged and some steps to disappear altogether. The list is too long and too tedious to reproduce here but, should you be interested, a quick glance at the top four cards in our done column will reveal all.

With the data tidied, it only remained to take a feather duster to our code. We know there are developers in the world who pick up their computational spanners and flee the scene of the crime leaving a hot mess of widgets, dials, springs and sprockets in their wake. Reader, we are not that kind of developer. And, quite frankly, we abhor such behaviour. Which is why poor Jianhan found himself on housekeeping duties, tidying away the wood shavings of recent endeavours. Now that we have procedural maps with typed steps, we no longer need our old implementation of typed routes. Jianhan has taken to the staging application, database, orchestration and triple store and removed all vestiges of routes having types. We await the return of Librarian Jayne to check whether this has had any adverse effect on the test websites. And, if so, how much. Our gut feeling is that some of our SPARQL queries may also need a spring clean. Once Jayne is happy, we get to repeat the exercise in live.
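For the terminally curious, the sort of spring-clean check we have in mind looks roughly like the sketch below. It is an illustration only: it assumes a SPARQL 1.1 endpoint and uses Python with SPARQLWrapper, the endpoint URL is a placeholder, and the string we match on is a guess at naming rather than a term from our actual schema.

```python
from SPARQLWrapper import SPARQLWrapper, JSON

# Placeholder endpoint; not the address of our actual triple store.
sparql = SPARQLWrapper("https://staging.example.org/sparql")
sparql.setReturnFormat(JSON)

# Look for any predicate whose IRI still mentions route types. The string we
# filter on is illustrative, not a reference to a real schema term.
sparql.setQuery("""
    SELECT ?predicate (COUNT(*) AS ?uses)
    WHERE {
        ?subject ?predicate ?object .
        FILTER(CONTAINS(LCASE(STR(?predicate)), "routetype"))
    }
    GROUP BY ?predicate
""")

results = sparql.query().convert()
for binding in results["results"]["bindings"]:
    print(binding["predicate"]["value"], binding["uses"]["value"])
```

An empty result set would suggest the vestiges really have gone; anything else tells us which queries still need the feather duster.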

Questions of cardinality

On the subject of developer criminality, young Robert and Michael have used a little downtime to scrape every pixel they can find from the Foreign, Commonwealth and Development Office’s treaty database. A task made no easier by the developers of said system seemingly throwing everything they knew about Representational State Transfer up in the air and seeing where the pieces landed. The resulting website does not actually have URLs for treaties. Or indeed anything else. Instead, it takes the user request, plonks it into local storage, renders a results page, queries the local storage and somehow manages to return something that looks roughly like a web page. But isn’t. Scraping this monstrosity is only made possible by spoofing cookies that expire at the exact moment you really don’t want them to.
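The general shape of the workaround looks something like the sketch below. This is a sketch only: the base URL, the search path and the test for an expired session are hypothetical stand-ins, not details of the actual scraper.

```python
import requests

# Hypothetical base URL, for illustration only.
BASE = "https://treaties.example.gov.uk"

def fetch_results(session: requests.Session, params: dict) -> requests.Response:
    """Fetch a results page, re-priming the session if the cookies have lapsed."""
    response = session.get(f"{BASE}/search", params=params)
    # The real expiry signal is murkier; a status code check stands in for it here.
    if response.status_code in (401, 403):
        session.get(BASE)  # visit the landing page again to be issued fresh cookies
        response = session.get(f"{BASE}/search", params=params)
    response.raise_for_status()
    return response

with requests.Session() as session:
    session.get(BASE)  # prime the cookie jar before paging through results
    for page in range(1, 4):
        print(fetch_results(session, {"page": page}).status_code)
```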

This is but part of the problem. For reasons as yet unclear, each agreement is present in the data anything up to one hundred times. Which makes the resulting data voluminous, to say the least. We now have around one gigabyte of data and think we’re about a third of the way through. In more positive news, we’ve also learnt about the maximum size of a git commit. Reader, it is not 1GB.

Cogs churn, cookies expire, home broadband chokes, and we don’t yet have an awful lot to show for these efforts. We’ve taken a brief squint at the data and can’t say it looks too pleasant. The signing location data alone is enough to send a librarian running for the smelling salts. At the risk of repeating ourselves: in our quite considerable experience, 90% of technical and information debt results from an incorrect understanding of cardinality. ‘How many of those things can that thing have?’ is one of the best questions you can ever ask. Clearly, this project skipped that step.
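To make the cardinality point slightly more concrete, the check we tend to reach for is as simple as the sketch below. The file name and field names are hypothetical stand-ins for whatever the scrape eventually settles on.

```python
import json
from collections import defaultdict

# How many distinct signing locations does each treaty turn out to have?
# "treaties.jsonl", "treaty_id" and "signing_location" are illustrative names only.
locations_per_treaty = defaultdict(set)

with open("treaties.jsonl") as dump:
    for line in dump:
        record = json.loads(line)
        locations_per_treaty[record["treaty_id"]].add(record["signing_location"])

# The distribution of counts answers the question directly: can a treaty have
# more than one signing location and, if so, how many?
counts = sorted((len(locations) for locations in locations_per_treaty.values()), reverse=True)
print(counts[:10])
```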

Return to bill mountain

Last week saw Librarian Jayne and her computational helpmate Michael combing through the standing orders of the Scottish Parliament and decorating our new and shiny legislative consent motion map with appropriate citations. This week, it was time to get their homework marked. Another call with Mhairi and Gael saw them pass with flying colours. Hoorah! It only remains to add what we’ve learnt to the machines and we’ll be one third of the way through one thousandth of public bill procedure. Or something roughly like that.

On the laying of papers

Back in week 7, we were delighted to be joined by House of Lords Kath, House of Commons Robi and House of Commons Librarian Corie to chat about all things laid paper related. Questions were posed about who can lay papers, when papers can be laid, how papers get laid and why papers get laid. The why question being the most interesting. As ever. Coming out of that meeting, it appeared we had two main classes to deal with: papers laid by command (or under prerogative powers, if you will) and papers laid according to some duty set out in legislation, commonly referred to as Act papers, although we believe the duty may also form part of a piece of secondary legislation. Since then, email tennis has continued and we’ve also met with Vote Office Bernadette and Ryan. It now seems we have three classes of papers to deal with: the aforementioned two and papers laid in response to opposed and unopposed returns. Our diagram still requires a deal of work, the House of Commons Papers class seemingly being orthogonal to authority of laying.
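As a sketch of where our thinking has got to, and very much not the ontology itself, the orthogonality problem looks something like the snippet below: authority of laying appears to be one axis, membership of a series such as House of Commons Papers another, which is why the series sits as a separate property rather than a subclass.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

# A sketch of our current understanding, not the ontology itself.
class LayingAuthority(Enum):
    COMMAND = "laid by command, under prerogative powers"
    ACT = "laid under a duty in primary or secondary legislation"
    RETURN = "laid in response to an opposed or unopposed return"

@dataclass
class LaidPaper:
    title: str
    authority: LayingAuthority
    # Series membership seems to sit on a separate axis from authority of laying.
    series: Optional[str] = None  # e.g. "House of Commons Papers"

paper = LaidPaper(
    title="An illustrative paper",
    authority=LayingAuthority.ACT,
    series="House of Commons Papers",
)
print(paper)
```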

How hard can things be?

Anya and Silver continue to plug away at how and why we might provide a ‘single subject view of the Library’. This is to cover inputs in the form of research requests, process flows in the form of enquiry routing and reuse, and outputs in the form of both completed enquiries and research briefings. Not to mention a whole host of other things, such as the cataloguing of online materials from elsewhere and the actual book collection. Not starting from here seems to be the main lesson so far.

Wednesday morning saw Anya, Silver and Michael meet over pixels to chat categories in general and prototype theory in particular. Thursday afternoon saw the same three joined by both Edward and an assortment of library researchers to test out Silver’s PICO-based suggestion for enquiry indexing. We suspect there’s still a gap between the proposal and the actuality, which looks more and more to have a legislation-based slant. More news soon.