ontologies

2024 - Week 7

Library open day

Wednesday saw the return of the ever-popular House of Commons Library Open Day, with librarians from all over the country grabbing cardigans, polishing their cateye specs, wrapping ham sandwiches (cut diagonally naturally) in tinfoil, and climbing aboard charabancs bound for Westminster. There to be entertained by a morning of talks and an afternoon tour of the Members’ Library - the bit that we suspect they really signed up for.

On duty from our crack team were librarians Anya, Ayesha, Jayne and Susannah. Susannah gave a talk on the Library’s planning operation for the upcoming general election. After which all four of them staffed the Indexing and Data Management stall, handing out a plethora of flyers and stickers to delighted librarians and answering questions on subject indexing, vocabulary management and the alignment of library and archival metadata modelling. Or at least attempting to. Well done all.

Home alone

Across town computational midfield pivots Young Robert and Michael were left to mind the good ship IDMS. Fixing printers, adjusting the height of monitors - or display screen equipment, as Young Robert likes to call them - and cleaning keyboards with cotton buds. Or whatever it is their job descriptions require them to do. We’re never entirely sure.

Being easily distracted when unsupervised, Michael took the opportunity to decorate one of our walls with paper, string and sellotape. This in some attempt to map models for the many and varied types of paper and the many and varied ways in which they’re made available to Parliament. Librarians returned from the Open Day to find him looking absolutely delighted with his efforts. They complimented him on his endeavours whilst making mental notes to never leave a toddler alone in a stationery shop.

Still, it wasn’t all creative playtime. Covering walls with paper and string turned up a small gap where some string should have been. Which caused us to wonder aloud if the range of a presentation is a bill work whilst the range of an order to print is a bill expression. If you’re a clerkly type having opinions on such matters, please send answers on a postcard to Dartmouth Street, Westminster.

People, places, parties

It would be quite remiss to skip over by-elections in a week with two by-elections. As Friday morning dawned and most of the team were still mid-yawn and mid-stretch, Librarians Anna, Emily and Phil had already downed their cups of ambition and were hard at work. Trello cards flying across the board as details of our two new Members were added to various systems. By now a well-drilled routine, all went well, with zeroes and ones bouncing brightly along the computational piping. Welcome Gen and Damien. Top work Anna, Emily and Phil.

It goes without saying that work continues on new, old search. Of course it does. Unfortunately it’s not work that’s easy to type about, being something of a game of Trello table-tennis - played at a frankly disturbing speed - between librarians Jayne and Ned and developer Jon. Given the speed of top-spin shots from the back of the court, we can barely keep our eye on the ball. We are delighted to report that all 35 templates for the object views - or item pages, as we’re calling them this week - have been checked, with six of them fully checked, amended, checked again and signed off. These being bills, European deposited documents, impact assessments, private Acts, public Acts and last, but by no means least, everybody’s favourites - statutory instruments. Excellent work Jayne, Ned and Jon.

Meanwhile, Librarian Anya and her ever present scullery maid Michael have been pressing on, trying to get ahead of the design and development game for search result pages. They now have a first pass description of all attributes to appear in a search result for all 34 content types, around half of which have had both labels and occurrence counts checked and signed off. The next bit of the plan involves an in-person meeting - or a facilitated workshop, as Young Robert likes to call them - some walls, some more paper and some post-it notes. Though sadly, as far as Michael is concerned, no string or sellotape. We can’t risk him immobilising himself. Again.

Because new, old search still needs a fair bit of work, the team have been busy making contingency plans to ship what we can ship when we can ship it. After all, as Young Robert would no doubt say, one should always be shipping. Excellent product, as he’d probably add. The plan had been to put live our upgraded Solr instance, turn off the completely inadequate search service we offer to external users and provide the much better search service - currently requiring parliamentary network access - to anyone who wanted it. Then some computer words we didn’t quite follow were used in a meeting and as a result, that is no longer the plan. Instead we intend to go live first with the Solr upgrade and make no changes to either the public or internal user interfaces, with unifying and the new user interface to follow on from that.

With Stage 1 not being ‘actionable’ - as Young Robert might say - until our crack team of librarians are content that the Solr upgrade has had no unforeseen side-effects, Young Robert and Michael appear to have found themselves in the unlikely and slightly uncomfortable position of being sat slap-bang in the middle of the ‘critical path’. Not a situation that happens often. One can make out the vague panic on their faces, like a pair of moles facing down an out of control wheelbarrow. Not that they’re shirking their responsibilities. Work on automated testing continues and the basics are now mostly there. More interestingly, Young Robert has been pulling apart assorted emails from Librarian Ned and turning them into Cucumber. Conversion to RSpec to follow shortly. On the subject of which, if anyone knows of any kind of Cucumber plugin that might work for Michael’s colour-divergent eyes, please get in touch. An alternative to the use of green and red for ‘passed’ and ‘failed’ would be most useful. Without this he remains, quite frankly, baffled.

Over in Parliamentary Computational Section Towers, Project Manager Lydia has been busying herself with the form filling required for Stage 1 to happen. This alongside corralling a mixed bag of people - or a multidisciplinary team, as Young Robert would no doubt say - into pixel-based meetings to work out a rough plan for bringing the public and internal searches together. That last bit being made much smoother by the arrival of Robert, Michael and Jianhan’s new boss Diana, who brings a pleasing sense of beginning, middle and end to proceedings.

Meanwhile, our Jianhan has been twiddling with the mini-CMS wrapped around our search service to create individual accounts for our crack team of librarians. More importantly, Jianhan has also made a test version of our external search service, running on top of our upgraded Solr 9 instance. Work that was introduced by the decoupling of the Solr 9 upgrade work from the unified search work. It went, we are told, swimmingly.

And finally, Librarians Claire and Steve have been casting an eye over our “indexer notes”, just to check if there’s anything in there we’d really rather not expose to the public. These are the equivalent of notes-to-self for our crack team of indexers, pointing out things they’re unsure of and things to come back to. Whilst the existing search application does not make use of this text and new, old Parliamentary Search won’t expose this data either, should we ever expose a public API this field will be available to all and sundry. Our dear reader will be pleased to learn that their fine toothcombs picked up nothing that any librarian would ever be ashamed of.

Thesaurus fiddling

Our loyal reader will be well aware that we work with a toolset that’s barely been maintained, never mind updated, over the last 13 years. Our live Solr is six major versions behind the current release, and our triplestore is the oldest the vendor has ever seen in the wild and something of an archaeological, technical marvel. The tools our librarians depend on are creaking and complaining, and pretty unpleasant to use. We’ve got a lot to do, and that means doing what we can, where we can. To that end, we’re also in the midst of updating our thesaurus management software. Or at least attempting to - the dependency graph looks not unlike overcooked linguine.

Whilst we’re only two major versions behind - we’re running version 3, the current release is version 5.6 - it’s still a big enough gap to make upgrading more difficult than would be ideal. Our current software is based on the Zthes standard, whilst versions 4 onwards are based on SKOS-XL. Which means the last time our Jianhan attempted the upgrade, some of the data went missing. This including a good decade’s worth of version history. Oops. Luckily there is a workaround which didn’t prove too onerous. Our Jianhan has now taken the circular and scenic route to first upgrade to version 4 and from there to version 5. All data appears to be present and correct and ready for a once over from our keen eyed librarians.

Return to model mountain

For the past few years - no, let’s say for the last few decades, it feels like decades - we’ve been designing new models for our putative new data platform. Imagine then our surprise, when we took another look and realised we’d not made a motion model. Quite the oversight. Realisation struck outside of work hours, with Anya and Michael ensconced in a house of the public variety. A beer mat was grabbed, a pen purloined and the resulting scribble is now drawn up. The next step is the usual trip to Messrs. Hennessy and Korris for the purposes of idiot checking. At which point, we’ll know if it was worth taking the beer mat home or whether we should have left it sitting in a puddle of Guinness.

People, places, parties (slight return)

In preparation for the upcoming general election, team:Phil have been working with colleagues in the Parliamentary Computational Section on changes to the Members part of the Parliament website. Of particular concern is how we handle the boundary changes which will come into effect upon dissolution. The website has pages for former constituencies - at least for fairly recent ones - but there was no link to them from Member pages. This is now fixed. A quick visit to Yvette Cooper’s parliamentary career page proves the point, her representation of Pontefract and Castleford being now dereferenceable. So that’s one problem solved.

A blue pencil has been taken to both our constituency and our Member pages. Where once a constituency without a current Member was described as vacant, the seat is now described as vacant. And, where once a Member having represented a constituency in consecutive Parliaments was said to have done so continuously, they are now said to have done so continually. This last tweak we hope will please at least one retired Clerk of our acquaintance.

The next problem is harder to pick apart, representation cards on the Members website smushing together constituencies with the same name but different boundaries. Check Theresa May’s page and you’ll see what we mean. She’s listed as representing Maidenhead from the 1st May 1997 to the present date, despite the boundary change in 2010 meaning we’re dealing with two different Maidenheads here. In the future, we plan to unsmush constituencies and list each unique constituency representation separately. Lovely stuff.
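For readers who prefer their smushing spelled out, here is a minimal sketch of the difference in Python. It is not how the Members website works. The class, the function and the constituency identifiers are all invented for illustration, and the 2010 dates are placeholders rather than anything lifted from the Members data. The only assumption that matters is that each set of boundaries gets its own identifier, even when the name stays the same.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, Optional


@dataclass(frozen=True)
class Representation:
    constituency_id: str    # hypothetical: one identifier per set of boundaries
    constituency_name: str
    start: date
    end: Optional[date]     # None means the Member still holds the seat


def representation_cards(reps: list[Representation],
                         key: Callable[[Representation], str]):
    """Build one card per distinct key, spanning first start to last end."""
    cards: dict[str, tuple[str, date, Optional[date]]] = {}
    for rep in sorted(reps, key=lambda r: r.start):
        k = key(rep)
        if k in cards:
            # Same key seen before: stretch the existing card.
            name, start, _ = cards[k]
            cards[k] = (name, start, rep.end)
        else:
            cards[k] = (rep.constituency_name, rep.start, rep.end)
    return list(cards.values())


# Two Maidenheads, same name, different boundaries.
# The identifiers and the 2010 dates are illustrative placeholders.
reps = [
    Representation("maidenhead-pre-2010", "Maidenhead", date(1997, 5, 1), date(2010, 4, 12)),
    Representation("maidenhead-post-2010", "Maidenhead", date(2010, 5, 6), None),
]

smushed = representation_cards(reps, key=lambda r: r.constituency_name)
unsmushed = representation_cards(reps, key=lambda r: r.constituency_id)
```

Keyed on name, the two representations collapse into a single Maidenhead card running from 1997 to the present. Keyed on the boundary-specific identifier, each Maidenhead gets a card of its own.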

Niche? What? Us?

At some point late last year we put our regnal year calculator website live. Being both finishers and completers, with an unhealthy obsession with symmetry, we’re pleased to announce that our beloved egg timer now also comes complete with regnal year session citations. For which Jayne would like to thank Michael and Michael would like to thank Jayne. Let’s face it, no one else is gonna thank them.
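For the curious, the arithmetic behind a regnal year is simple enough to sketch in Python. This is not the code behind our calculator, and the function names are made up for illustration. It assumes the usual convention that regnal year 1 begins on the day of accession and each later year begins on the anniversary, and it assumes a single monarch throughout, so sessions straddling a change of reign are out of scope.

```python
from datetime import date


def regnal_year(accession: date, on: date) -> int:
    """Regnal year containing the date `on`, counting from 1 at accession."""
    if on < accession:
        raise ValueError("date falls before the accession")
    years = on.year - accession.year
    # Knock one off if this year's anniversary has not yet arrived.
    if (on.month, on.day) < (accession.month, accession.day):
        years -= 1
    return years + 1


def session_regnal_years(accession: date, start: date, end: date) -> list[int]:
    """Every regnal year touched by a session running from start to end."""
    first = regnal_year(accession, start)
    last = regnal_year(accession, end)
    return list(range(first, last + 1))


# Accession of Charles III: 8 September 2022.
CHARLES_III = date(2022, 9, 8)

# The session dates below are illustrative, not taken from any sitting calendar.
example = session_regnal_years(CHARLES_III, date(2023, 11, 7), date(2024, 10, 1))
```

A session that crosses an accession anniversary touches two regnal years, which is why session citations so often come as a pair.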

Bots to blue skies

Shortly before Christmas, a reader got in touch - we’ve counted at least two now - congratulating us on our written answer bots and asking if we could do the same for written statements. Some weeks later we can finally reveal our written statement bots on both Mastodon and Bluesky. We’d like to do the same for Twitter but API access has been heavily locked down by the current owner, in some attempt to crack down on porn bots and bitcoin scammers. A crackdown that does not appear to be bearing fruit. At least anecdotally speaking.