ontologies

2023 - Week 48

New, old search - frontend

Not much to report in the world of the new, old search frontend this week. Not to say that work isn’t happening, but it’s all a bit of a plod. Data scientist Louie has once more lent a much-needed hand with the data analysis work, decanting more attribute distribution data into our rightly-famed 3D data dictionary. Librarians Anya, Jayne and Ned have been poring over the results, bringing their considerable domain expertise to analyse the analysis. All of which means poor designer Graeme has been forced to reopen Figma and pay what might well be his third - possibly fourth - visit to designs he’d half thought done. Which, in turn, has knock-on effects on developer Jon, the “spec” he’s working to changing on what feels like an hourly basis.

In a more ideal world, we would not have put pixels to pages before the data analysis work had been done. After all, even the humble carpenter needs to know the character of the materials they are working with. But we do not live in an ideal world and are always happy to play the hand we’ve been dealt. Chins up and smiles on faces.

Endless circling aside, progress has been made. Librarian Anya popped the bit between her teeth and decided that our very limited coverage of what we still call e(lectronic)-petitions did not warrant further bother. Ingesting e-petitions was work done last time we attempted to make a data platform, and again in the intervening period, but it never really went very far. This later attempt mostly failed because our aged computers couldn’t quite cope with the volume of changes. At some point, we gave up on the idea and the pipes were turned off. Which meant our Search and Indexing triplestore ended up with a residual stump of e-petition data that wasn’t much use to anyone. Anya has decided to take a backup of the indexed data and purge triplestore, designs and code of any further mention. At least until we can do it properly. Well, it’s one less sheet of pixels to design and one template Jon won’t have to make.

A decision has also been taken on how we handle untitled things. Or at least the things that turn up in Solr with no title. Some things have no title at source, and some of those things are subjected to the indignity of having fake titles applied somewhere in the pipes. Quite where this happens needing further investigation. Needless to say, it’s all a bit of a mess.

On a decision-making roll, Anya has also concluded that we should not flag ‘prorogation answers’ on our written question and answer pages. Prorogation answers crop up when a Member tables a question for written answer toward the end of a session. The answering body may find that there is not enough time to provide a full answer, so instead responds with some boilerplate text to the effect that the egg-timer has run short of time. Our crack team of librarians flag such answers, which is a very useful query feature for filtering out the boilerplate and focussing on the substantive, but as the distinction isn’t really ‘procedural’ we won’t be displaying it on our answer pages.

In templating news, Jon has added a couple of helpers to present our data in a slightly better shape. First off, any and all use of pipes to separate items has now been replaced with line breaks. Which is easier to look at and should also please screenreader users. He’s also found a neat way to handle disambiguation of Member names. Additional information is added to the name string - yes, we know, but we are where we are - so that you, dear reader, might understand which Gareth Thomas it is. It looked a little odd in pixels. It looks much less odd now.
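
For the curious, the two helpers described above might look something like the sketch below. This is not Jon’s code. The function names, the use of Python and the bracketed qualifier format are all our assumptions, there purely to illustrate the idea of splitting piped values into separate lines and tacking extra information onto a name.

```python
from typing import List, Optional


def pipes_to_lines(value: str) -> List[str]:
    """Split a pipe-separated string into trimmed, non-empty items.

    Each item can then be rendered on its own line by the template.
    (Hypothetical helper, not the real one.)
    """
    return [item.strip() for item in value.split("|") if item.strip()]


def disambiguated_name(name: str, qualifier: Optional[str] = None) -> str:
    """Append a qualifier in brackets when one is supplied.

    The bracketed format is an assumption made for illustration.
    """
    return f"{name} ({qualifier})" if qualifier else name


# Illustrative values only.
items = pipes_to_lines("Health | Education|Transport")
label = disambiguated_name("Gareth Thomas", "Harrow West")
```

Keeping the split and the rendering apart means the template decides how a line break is expressed, which is the bit screenreaders care about.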

In the midst of life, we’re in technical debt, etcetera

Most of the new, old search backend work now being complete, our Jianhan has turned his attentions to upgrading our taxonomy management software and accompanying API. A job that sounds simple. At least on paper. And a job that is not so simple in reality.

We knew we had two APIs, an internal one and a redacted external one. “A redacted taxonomy?” our dear reader may well exclaim. “That doesn’t sound at all right. What could they possibly wish to redact from a humble taxonomy?” Well. It turns out that when our elderly search service was built, a number of other systems were built as part of the same project. Or programme, to use the collective noun. Amongst them, the application used to upload Library Research Briefings. Now, one cannot have every Tom, Dick or Harriet uploading briefings, and for reasons lost in the mists, a novel approach was taken to application authentication. In other words, encoding authentication credentials in our poor taxonomy. Sighs. We do not suppose this is the first time a boat has been spoiled for a ha’porth of tar, and we do not suppose it will be the last. But, my word, the work that decision has resulted in.

One preceding computational expert we will never curse is Joe, whose helping hands are inexhaustible. Jianhan and Michael had the unexpected pleasure of spending an hour on Wednesday pushing around dependency-pixels with the lad. At which point, it turned out we have more dependencies than even Jianhan had realised. Dependencies longer than our arms, it would seem. Not only is there an internal API and an external API, there’s also a “simplified” internal API called something like the “indexing API”; an API that isn’t used by the indexing application. Obviously. And also it isn’t simple, taking some form of feed from the triplestore. News that caused all concerned to wince.

Not wanting to get blindsided by more unexpected taxonomy usage, team:Thesaurus has done the legwork to compile a list of all the ways in which we use our taxonomy system. Some more unusual than others, but none so unusual as an authentication system. Luckily.

All that said, the week has not been all bad news. Far from it. Our Jianhan has managed to install the current version of our taxonomy management software, export all the data from the none too recent version we’ve been running - think Visual Basic but possibly uglier - and load all of that data - even the weird stuff - into the upgraded version. Which gives our crack team of librarians some source of comfort. Top work, as ever, Jianhan.

How’s poor Robert?

It’s hard to say, to be honest. He has the thousand yard stare of a man who’s fresh from the pixel trenches, having just put the finishing touches to his TD search MVP HLD. All ably assisted by PM Lydia. Quite the achievement for the pair of them. As far as we understand, the next steps involve taking the HLD to the TWG and - should they decide not to shred it - it’s on to the TDA. Acronyms have been left here at the owner’s risk; your correspondents will not accept any responsibility for confusion, bewilderment or indeed losses. Good luck Lydia and Robert!

People, places, parties

In breaking psephology news, there’s not been much in the way of new code this week. But that shouldn’t really come as much of a surprise. As we’ve always said, given a decent data model and a well thought through information management policy, websites kinda build themselves. It is, in many ways, our mantra.

We do now have much improved layouts for majorities and turnouts under our boundary set pages. Not to mention considerably faster response times. We’ve also added new views for winning candidate vote shares and listings of party performances at boundary set level. No idea how we managed to forget those, but none of us are as young as we used to be. We’ve also added links to party pages wherever a party is mentioned and links to Member pages wherever a Member is mentioned. Because of course we have. If there’s one thing we’re reliably good at, it’s using hypertext.

In more exciting news, Librarian Ned has continued to expand our coverage of boundary set setting legislation, meaning we’re now good back to 1918. Excellent stuff.

In even more exciting news, statistician Carl has been applying his well-honed computational techniques to our boundary set / constituency area problem. Explaining that the same geographic area with the same name, or that an ever so slightly different geographic area with the same name, or some combination thereof, are, in fact, different things, is a perennial bugbear in attempting to explain parliamentary elections to the slightly disinterested. God knows, it’s baffled us and we’re paid to be interested. Carl’s maths skills have been applied to analysing current constituency boundaries against proposed constituency boundaries, returning actual figures for overlaps geographic, household and population in nature. Wow.
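
To give a flavour of what an overlap figure means, here is a minimal sketch. It is emphatically not Carl’s method, which we haven’t described here. It assumes each constituency can be represented as a mapping from small area identifiers to a count (households or population), and that overlap is the share of the current constituency’s count sitting in areas that also appear in the proposed one. The function name, the area identifiers and the numbers are all made up.

```python
from typing import Dict


def overlap_share(current: Dict[str, int], proposed: Dict[str, int]) -> float:
    """Share of the current constituency's count that falls in areas
    also present in the proposed constituency.

    Both arguments map a (hypothetical) small area id to a count of
    households or people. Returns 0.0 for an empty current constituency.
    """
    total = sum(current.values())
    if total == 0:
        return 0.0
    shared = sum(count for area, count in current.items() if area in proposed)
    return shared / total


# Made-up figures: areas A and B form the current seat,
# areas B and C form the proposed one.
current_seat = {"A": 400, "B": 600}
proposed_seat = {"B": 600, "C": 500}
share = overlap_share(current_seat, proposed_seat)
```

Note the measure is not symmetric: the share of the proposed seat drawn from the current one is a different number, which is rather the point when tracing which constituency begat which.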

Off the back of this, we’ve cloned our psephology application, made a pretend dissolution, a pretend Order in Council, loaded the proposed new constituencies and started to expose the data. No longer will the genealogy of Runnymede and Weybridge be a mystery, not even when it starts to colonise bits of Esher. More user friendly views coming soon, one hopes.

It is at this point that we dive back into the purpose of the work. Being modellers of real world things - albeit being trapped in the medium of ‘data’ - none of this work would add up to much if we couldn’t encapsulate it in models. To that end, Ned, Young Robert and Michael have all headed back into ontology land, to capture everything that Carl and Neil have taught us in a more Turtle shaped form. Changes to the geographic area model have already been made, changes to the election model are proposed, and - unless we find a better home for it - a new constituency area genealogy model would appear to be on the horizon. Because one does not make ontologies by sitting in a darkened room with a copy of Protégé, one makes ontologies by chatting to people and using some combination of code and data to prod one’s understanding. At least, that’s what we think.

Assessing impacts

Amongst the many and varied types of document our crack team of librarians index and interlink are impact assessments. For which some background might be considered necessary. Whenever the Government make changes to policy or legislation, they examine the effects the changes will have on public bodies, the private sector and the third sector. The result is an impact assessment. For bills, they are placed with the Public Bill Office and published to the bills website. And our librarians index all of them.

This has never been a process without problems. Or issues, as we say in these parts. First off, notifications of new IAs have been patchy - the old email distribution list being less than reliable. Second off - and we’d be the last people to ever criticise government - but they do seem to lack a style guide for giving titles to these things. The official titles of impact assessments being what one might charitably describe as “all over the place”. For this reason, we’d been adding our own - and better, from a presentation perspective - idea of what the title should be. Third off, for reasons no one can quite remember - but perhaps because IAs were once physically filed somewhere in the Library - we appear to have instituted our own numbering system and completely ignored the reference numbers supplied. Fourth off, we’ve often treated the arrival of IAs as bundles of documents rather than creating a record per IA. Fifth off, publication dates for IAs are less than clear, not all IAs having one. A problem made worse by the aforementioned bundling problem. Sixth off, our bundling approach meant we couldn’t link out to the actual document but only to somewhere roughly nearby.

Now, thanks to diligent librarianship by Librarians Deanne, Jason and Steve, we think we’ve solved the problems. Or most of them. The notification problem is largely solved by our shiny new bill papers application, which churns out RSS feeds for bill-adjacent paper types. We’ve now plugged these feeds into our mailboxes and are on the receiving end of an email every time an IA is uploaded to the bills website.

We’ve also instituted a new titling policy, prioritising accuracy over consistency. We are, after all, not the authors. Should the authors choose long or short, informative or vague titles - who are we to quibble? And, if there is no title - this also happens - then a title will be created in the format [Bill title]: [Impact assessment].
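
Expressed as a rule, the titling policy is short enough to sketch. What follows is an illustration rather than anything our librarians actually run. The function name is invented, and we’ve assumed the second half of the fallback format is the literal words “Impact assessment”.

```python
from typing import Optional


def ia_title(official_title: Optional[str], bill_title: str) -> str:
    """Return the title to record for an impact assessment.

    The authors' own title wins whenever there is one, however long,
    short or vague. Only when it is missing or blank do we fall back
    to "<bill title>: Impact assessment" (assumed literal wording).
    """
    cleaned = (official_title or "").strip()
    if cleaned:
        return cleaned
    return f"{bill_title}: Impact assessment"


# Hypothetical bill title, for illustration only.
fallback = ia_title(None, "Example Bill")
kept = ia_title("Impact Assessment for the Example Bill (revised)", "Example Bill")
```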

Having checked with Librarians Corie and Andrew, we’ve realised our previous policy of assigning numbers to IAs was a pointless exercise, these numbers being used nowhere else in the Library. From here on in, we’ll instead use the identifiers supplied with the documents. At least, where possible, given they don’t appear to be consistently applied.

Perhaps the biggest change is to our bundling regime. IAs will no longer be treated as a bundle, but rather as individual records. Which means we have a long and laborious task to unbundle a great many IAs published during previous sessions and create a new record for each one with a proper link to the thing and a date reflecting when the IA was made available to Parliament. Best of luck with that, Deanne, Jason and Steve.

Facts / figures

A wee while back, our crack team of librarians started along the road towards taking over publishing duties for the House of Commons Library’s facts and figures series. This week Librarian Phil’s first Parliamentary Facts and Figures publication rolled off the production line. This one covering Government Chief Whips and Deputy Whips since 1945. Read and enjoy.

Librarian Claire has also made an update to our PFF on the subject of addresses to Members of both Houses of Parliament, which now includes the recent address by the President of the Republic of Korea. Churning out new and improved spreadsheets is one thing, keeping them up to date, quite another. Top work, Librarian Claire.

Bots to blue skies

We’re delighted to report that our project to port bot accounts to Bluesky continues to gather pace. This week we welcome ten - count ’em - new accounts, providing updates whenever an answering body provides a written answer to a parliamentary question.

Given access to Bluesky is still by invite only, and given we’re not exactly drowning in invitation codes, should you have any spares you’d like to donate to a good cause, please do get in touch. Librarian Anna will more than thank you.