weeknotes.data-search

2018 Week 23

DATOR Day

After a short break, data day returned and turned 40. In its new format it only takes us half a day, but it’s still pretty much all about the data. Aside from the gossip. And the scheming. There was a decent variety of stuff, but we didn’t quite get through it all.

Some stuff we agreed:

More visions

Very little in the way of visions this week. The cheese must be wearing off.

Community

We were joined at data day by Marc Adams from the NAO to chat about stats in general and plan some actual work we might do together. We’re now looking for a guest for the next data day in July. Hands up if that might be you.

Some immediate, practical stuff we agreed to do with Marc:

On a similar subject, our Liz has been helping colleagues in the House of Commons Library set up systems to produce new reporting pages for constituency stats. New features for a constituency dashboard and topic-based stats browsing went live on Thursday. They’re looking really good and people have been positive about them so far. Which doesn’t stop Michael scowling at the URLs. And the “topics”. That aside, the data is ripe for ingesting into the data platform, to allow for display on the beta website constituency pages. And to open up new ways to query parliamentary material that haven’t been anywhere near possible before. A meeting is planned.

Robert had a couple of meetings about developing search functions over websites. One with people from ACAS, one with people from the Legislative Assembly of Ontario. Properly international, if not continental.

Matthieu published his blog post about the trip he took with Sara to TICTeC. If you’re interested in the impact of civic technology, you should read it.

A data scientist called Izzy got in touch with Mike to share her MSc project. It pulls House of Commons divisions data from data.parliament.uk and analyses voting behaviour across Members.

Domain modelling

Alison has been meeting lots of people and having chats with the Collaboration team about all things related to “visiting Parliament”.

Anya and Michael met with Lef Apostolakis from POST to get another view on the work they’ve been doing to model research briefings. They’d already done a couple of passes with assorted people from the House of Commons Library and still need to sit down with House of Lords Library people. The POST session didn’t change an awful lot, which is good news and suggests the model is mostly correct.

On Wednesday, there was yet another meeting on SIs and the tracking thereof. This one with Jane, Jack, Jen, Janya, Jichael and Jalison. Jenna and James being otherwise occupied. After several weeks they think they’ve finally cleared up their confusion around SI procedure clocks, the definition of statutory days and how to model statutory day counts. That said, there’s another meeting next week to definitely, finally, completely agree the last bit.

Anya and Michael went to Brighton to meet Silver. Because he lives there. Or near there. And because it’s the seaside. And because it has dodgems and air hockey. They sat on the beach and planned out a talk they’re due to give to NetIKX. Which was supposed to be an introduction to ontologies for librarians and knowledge management people, but has ended up as a plea to switch from learning ontologies to learning about domain modelling instead. So no one gets their money’s worth there.

They also spent a bit of time talking about the Modelling Parliament talk they’re due to give at the KanDDDinsky conference in Berlin. And some time talking about teaching House of Commons librarians a little more SPARQL. The day ended, as many days do, in the pub, hatching a plan for the first useful chunk of a legislation model. Which has been bothering Michael for some time.

Data platform

In the best news for quite a while, procedure data is now live and happily turning into pages on the beta website. Although for now, and I guess for reasons(?), those pages are still restricted to the parliamentary network.

After herculean data entry efforts by IDMS, some monumental procedure modelling work, an INTERIM DATA SOURCE built from scratch by Chris, visualisations in two and three dimensions built by Raphael, a novel editorial interface built by Wojciech, all ably supported by Mike… we got there.

In the first instance we’re only working with procedures for Statutory Instruments. In the longer term we think any parliamentary procedure could be captured in this way. This excites us, for we are a very niche brand of nerd.

Wojciech’s done some sterling work to describe our Query API using the OpenAPI Specification, aka Swagger. We’re now publishing details of assorted endpoints, query parameters and content negotiation options in a standard, machine-readable format.
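For anyone wondering what "machine-readable" buys you, here is a minimal sketch in Python. It walks an OpenAPI 3 document and lists each operation with its query parameters and the media types it can return. The document below is made up for illustration: the path, the parameter name and the media types are assumptions, not the real Query API description.

```python
import json

# A made-up, cut-down OpenAPI 3 document. The path, parameter name and media
# types are illustrative assumptions, not the real Query API description.
SPEC = json.loads("""
{
  "openapi": "3.0.0",
  "info": {"title": "Example query API", "version": "0.1"},
  "paths": {
    "/example_by_id": {
      "get": {
        "parameters": [
          {"name": "example_id", "in": "query", "required": true,
           "schema": {"type": "string"}}
        ],
        "responses": {
          "200": {
            "description": "OK",
            "content": {"text/turtle": {}, "application/json": {}}
          }
        }
      }
    }
  }
}
""")

# Keys of a path item that name HTTP operations (a path item can hold other keys too).
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}


def list_operations(spec):
    """Return (METHOD, path, query parameter names, response media types) per operation."""
    operations = []
    for path, item in spec.get("paths", {}).items():
        for method, op in item.items():
            if method not in HTTP_METHODS:
                continue
            params = [p["name"] for p in op.get("parameters", []) if p.get("in") == "query"]
            media = sorted(
                media_type
                for response in op.get("responses", {}).values()
                for media_type in response.get("content", {})
            )
            operations.append((method.upper(), path, params, media))
    return operations


operations = list_operations(SPEC)
for operation in operations:
    print(operation)
```

The same loop should work over any OpenAPI 3 description, which is rather the point of publishing one.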

Jianhan has been busy adding fixed query endpoints for questions asked by a Member and questions answered by a Member.

He also updated the OData endpoints to reflect the newly imported questions and answers data. You can now get the total number of questions, total number of answers, questions by a Member, answers by a Member, questions asked on a date, questions asked between two dates, and correcting answers expanded with corrected answers. There’s also a fixed query to return questions by search terms in headings.
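As a rough sketch of how those requests might be put together, here is some Python that builds OData URLs using the standard system query options ($count, $filter, $expand). The base URL, the entity set names (Question, CorrectingAnswer) and the property names (dateAsked, CorrectedAnswer) are all hypothetical, picked to mirror the list above rather than copied from the real endpoints.

```python
from urllib.parse import quote, urlencode

# Hypothetical base URL, for illustration only.
BASE = "https://api.example.org/odata"


def odata_url(entity_set, count=False, **options):
    """Build an OData URL; keyword options become $-prefixed system query options."""
    path = f"{BASE}/{entity_set}" + ("/$count" if count else "")
    # Percent-encode spaces but leave "$", "," and "'" readable in the URL.
    query = urlencode(
        {f"${name}": value for name, value in options.items()},
        quote_via=quote,
        safe="$,'",
    )
    return f"{path}?{query}" if query else path


# Total number of questions (entity set name is an assumption).
total = odata_url("Question", count=True)

# Questions asked between two dates (property name and date type are assumptions).
between = odata_url(
    "Question",
    filter="dateAsked ge 2018-06-04 and dateAsked le 2018-06-08",
)

# Correcting answers expanded with the answers they correct (names are assumptions).
expanded = odata_url("CorrectingAnswer", expand="CorrectedAnswer")

for url in (total, between, expanded):
    print(url)
```

Only the query option names come from the OData standard. Everything else would need swapping for whatever the service metadata actually says.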

Samu made major improvements to the default HTML rendering of data from our API. It now shows meaningful labels for resources, images for member photos, maps for constituency areas, and improved styling for a better table display. Matthieu helped en route with several useful suggestions.

Samu had a busy week. He also added analytics to capture redirects from hansard.millbanksystems.com (the thing that search engines still have indexed) to the new Historic Hansard. Liz has been looking at user IDs. There have been about 22,000 unique users per week in the last month.

The latest version of dotNetRDF was released this week. It’s an open-source software library we both contribute to and rely on. The new release contains a number of contributions by Samu:

Based on the code we’ve contributed, our query API now supports GraphML output for all queries. Here’s Jianhan’s query giving the Parliamentary questions answered by Lucy Frazer MP in a format that can be visualised by software like Gephi.
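If you would rather poke at GraphML without opening Gephi, a minimal sketch using only the Python standard library follows. The sample graph is invented for illustration (the node ids are assumptions, not real API output). Only the GraphML namespace and the node and edge elements come from the format itself.

```python
import xml.etree.ElementTree as ET

# The standard GraphML namespace.
NS = {"g": "http://graphml.graphdrawing.org/xmlns"}

# An invented sample graph; the ids are placeholders, not real API output.
SAMPLE = """<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <graph id="G" edgedefault="directed">
    <node id="question-1"/>
    <node id="answer-1"/>
    <node id="member-1"/>
    <edge source="question-1" target="answer-1"/>
    <edge source="answer-1" target="member-1"/>
  </graph>
</graphml>
"""


def read_graphml(text):
    """Return (node ids, (source, target) edge pairs) from a GraphML string."""
    root = ET.fromstring(text)
    nodes = [node.get("id") for node in root.iterfind(".//g:node", NS)]
    edges = [
        (edge.get("source"), edge.get("target"))
        for edge in root.iterfind(".//g:edge", NS)
    ]
    return nodes, edges


nodes, edges = read_graphml(SAMPLE)
print(len(nodes), "nodes,", len(edges), "edges")
```

Gephi and friends do the same reading for you, then draw the picture.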

Corporate data

Dan’s been working a fair bit with David, Lew, and Noel on all things data integration-y. They’ve been trying to improve the pipeline of work coming in and the quality of requests the pipes contain. They’ve also been trying to keep the endless email chains to a minimum and working out where we go next with our infrastructure.

David went off to the BizTalk 360 Integrate 2018 conference and returned with an assortment of BizTalk stickers. Which at least makes a pleasant change from Users First and Being Bold.

Dan also got a new gig. As of Friday he is now both Head of Data and Search and Service Owner – Interaction Management. I have no idea what interaction management might be, but it sounds super. Well done Dan.

Strolls

No strolls were reported this week. Though Anya, Silver and Michael did walk to the end of Brighton pier. Well, as far as the dodgems anyway. Anya and Michael played air hockey. Anya whipped Michael’s ass.

Things that caught our eye