ontologies

2022 - Week 12

A single subject view of The Library. Or toward an information architecture

Librarians Anya, Silver, Ned and Phil continue to prod away at providing House of Commons Library researchers, front of house librarians, Members and their staff with a single-subject view of Library inputs, processes and outputs. The better to gather all material pertinent to an enquiry in one place. The current state of things is, erm, well, somewhat messy. Let’s just leave it at somewhat messy.

Enquiries - the bread and butter of Library services - are not subject indexed at all. Occasionally, a librarian or researcher may attach a free-text tag or two. But there’s no process that one might describe as systematic in place. Research Briefings, on the other hand, get subject indexed three times against three different vocabularies. First time by the authors against a short list of topic terms derived from - but forked from - the Parliament Thesaurus. The second time is not really an indexing, more a munging of topic terms into a new shape to make the WordPress website category pages. And the third time when librarians take out fine toothcombs and index everything against the Thesaurus proper.

Over in the world of the physical book collection, a team once managed by our Phil also subject index acquisitions, again against the full Parliament Thesaurus. Elsewhere, another team scour the web for useful articles and subject index them with free text terms.

But this is not where the subject indexing of things stops. The Library researchers are also subject indexed. In a fashion. The subject specialist directory is a list of Library specialists indexed against their specialisms. Actually, there’s two lists, but let’s skip over that. It gets sent out to front of house Library staff, Members and Members’ staff as a sort of Yellow Pages for those in the market for a statistical or policy expert. Rather than, say, a plumber. Two lists means two different taxonomies, again not integrated with any of the other taxonomies.

Now one does not need to be a Wardley mapping obsessive to see there’s some redundancy in this system. Almost worse than redundancy, it’s also a system with sizeable gaps in both coverage and interoperability. It’s rather like a library indexing its hardbacks against Library of Congress subject headings and its softbacks against Universal Decimal Classification and wondering why no one can find anything. But if the railways could fix breaks of gauge, a library can damn sure fix breaks in subject indexing.

Not that fixing the railways was a trivial matter. If someone wanted to build their railway to 4 ft 8 1⁄2 in, and some other chap decided that 7 ft 1⁄4 in would provide a more comfortable travelling experience, there was little anyone could do to knock heads together and bang out a deal. It was only when the railways stopped being systems and started to become a system of systems that carting barrels of beer from one train to another started to feel a little labour intensive. And this, dear reader, is a long drawn out metaphor for how software is built. And more importantly commissioned. Only when you wrap your head round the fact that you’re not commissioning a system, but rather a component in a larger system, do economies of scale start to happen and your individual investments start to yield dividends that are larger than the sum of their parts.

Our crack team of librarians have spent the last few months busying themselves with building a prototype that integrates the many and varied taxonomies and brings much of the information together. Ned has been churning out spreadsheets to index specialists with concepts from the Parliament Thesaurus, Silver has been hand indexing a sample set of enquiries against the same vocabulary and Anya has been squeezing indexed Research Briefings out of Parliamentary Search. All of which has found a home in a nifty little data integration tool that Silver’s colleagues at Data Language have put together. The team are now inching toward the conclusion of the project, presenting findings and heading into report writing territory. Unfortunately, we can’t share the report with you yet. Because (a) it isn’t finished and (b) no one’s read it yet. We can, however, share some slides and end on the usual note that feedback is always welcome.

Return to bill mountain

Keeping heads firmly down for fear of catching taxonomic shrapnel, Librarian Jayne and her computational helpmate Michael once more left bill mountain basecamp for another assault on legislative consent motions. The LCM procedure for the Scottish Parliament is now safely inside the machines. Once more, the machines have been kind enough to draw it back to us. We cannot say their penmanship improves any. In fairness, it feels fairly safe to assume that no machine has ever tried to draw a legislative consent motion procedure map before, so there really isn’t much in the way of prior art for the poor machines to draw upon. They do their best. As do we all.

Prodding procedural parsing

In last week’s breaking news we reported on a suspected bug in our C# procedure parsing code. The machines having tipped past the point of trying hard to the point of trying rather too hard. Procedural routes that had been fully parsed continued to be parsed. If not to infinity, to a reasonable approximation thereof. Which led to all kinds of procedural steps appearing in quite the wrong future possibility buckets. Our Jianhan has now fixed this and all appears to be working as expected.
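We can't show you Jianhan's actual fix here, so what follows is only a minimal sketch of one common way to stop a parser revisiting routes it has already finished with. It is in Python rather than C#, and everything in it (the `parse_routes` function, the `outbound` mapping, the route names) is hypothetical and not taken from the real parsing code. The idea is simply to remember which routes have been parsed and refuse to parse them a second time.

```python
def parse_routes(start, outbound):
    """Visit every route reachable from `start` exactly once.

    `outbound` is a hypothetical mapping from a route name to the
    routes that follow it. It may contain loops.
    """
    parsed = set()   # routes we have already fully parsed
    order = []       # the order in which routes were parsed
    stack = [start]
    while stack:
        route = stack.pop()
        if route in parsed:
            continue  # already parsed, so do not parse it again
        parsed.add(route)
        order.append(route)
        stack.extend(outbound.get(route, []))
    return order


# A made-up procedure with a loop back to the start.
looping = {"a": ["b"], "b": ["c"], "c": ["a"]}
print(parse_routes("a", looping))
```

Without the `parsed` check, the made-up looping procedure above would be parsed, if not to infinity, to a reasonable approximation thereof.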

Whilst Jianhan’s head was buried in parsing code, he took the opportunity to make another tweak to our logic. We’d always said that decision steps - designed to modify behaviour from a business step being caused to a business step being allowed - would only ever sit directly in front of said business step. Which had the unfortunate effect of preventing us from saying that a step may be caused in some circumstances and allowed in others. To put this into some sort of procedural context, the laying of a statutory instrument into both Houses causes the Joint Committee on Statutory Instruments to consider that instrument. If the instrument is subsequently withdrawn before the JCSI have considered it, the JCSI are still allowed to consider it - who, after all, would stop them - but the relationship is no longer causal. So we’ve updated our design notes to say an OR step can now accept an input with status ALLOWS and updated both our Ruby and C# parsing code to do likewise.
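To make the caused-in-some-circumstances, allowed-in-others idea a little more concrete, here is a minimal sketch in Python. It is not the team's Ruby or C# code, and it does not reproduce the truth tables in the design notes. The three statuses and the precedence between them (TRUE beats ALLOWS, ALLOWS beats FALSE) are assumptions made purely for illustration, and the real parser deals with further statuses that are left out here.

```python
from enum import Enum


class Status(Enum):
    TRUE = "TRUE"      # the step is caused
    ALLOWS = "ALLOWS"  # the step is allowed, but not caused
    FALSE = "FALSE"    # the step is neither caused nor allowed


# Assumed precedence for this sketch only: the stronger input wins.
_RANK = {Status.FALSE: 0, Status.ALLOWS: 1, Status.TRUE: 2}


def or_step(a: Status, b: Status) -> Status:
    """Combine two inputs to a hypothetical OR step."""
    return max(a, b, key=_RANK.__getitem__)


# Instrument laid and not withdrawn: the causal input is TRUE,
# so JCSI consideration is caused.
print(or_step(Status.TRUE, Status.ALLOWS))

# Instrument withdrawn before consideration: the causal input is FALSE,
# but the ALLOWS input still lets the JCSI consider it.
print(or_step(Status.FALSE, Status.ALLOWS))
```

Under these assumptions, a step sitting behind the OR comes out as caused when the causal input holds and as merely allowed when only the ALLOWS input does.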

In the meantime, a new bug has emerged around the propagation of untraversability. Jayne, young Robert and Michael combined brains for yet another full rewrite of both design notes and Ruby parsing code. A new ticket has been created with the memorable title of “Untraversability of cancelled deferred divisions is polluting routes to motion agreed, disagreed and deferred in draft affirmative procedure”. If you have an abiding passion for how one parses a procedure where standing orders have since been rescinded, please tune in next time. Though we can fully appreciate why even our regular reader might pass up this opportunity.

More rants about cardinality

If you’ve been following along from home you’ll know by now that young Robert and Michael have been attempting to scrape data from the Foreign, Commonwealth and Development Office treaty website. Only distracted by occasional whines into weeknotes about the state of the information model and the state of the information management. Reader, our whining is over. We have no further whines for the week. If you’d like to peruse the results of our efforts please click here. As they say. Feedback much more than welcome.

Fettling Rush

Librarian Anna continues to make steady progress on tidying up the Rush database reference data. Computational expert James following in her wake to normalise newly tidied data into new tables. Which means this week we have freshly normalised titles and surname types for Members’ alternative names. Lovely stuff.