weeknotes.data-search

2018 Week 15

Dan’s week was mainly spent writing a first draft of the data strategy to be presented at next week’s Data Steering Group.

In related strategy news, Anya, Samu and Aidan met to discuss the vision and strategy for the ‘Data Toolkit and Vocabulary Management’ work. Aidan will update the draft documents and set up a session to discuss them in more detail.

With assistance from Raphael, Jianhan and Chris, Aidan has been putting together a synopsis of the interim data sources currently in place and under development. This will form part of the document Dan’s writing.

Showing. And indeed telling

Chris, Raphael, Michael and Samu gave a joint show and tell about the work they’ve been doing on the procedure model, the application they’ve developed for the general purposes of tyre kicking and some of the ways they’ve begun to visualise procedural and other data.

Some of the procedural clerks from both Houses turned up and the discussion that followed (about decoupling the data platform from business facing application development) was probably more interesting than the showing. Or the telling.

Previous weeknotes have pointed to Raphael’s work on 2D visualisations of procedural data but they’re starting to look better by the day.

Samu made a three.js visualiser for graph data and Chris added a few more features. CORS permitting, you can make a 3D visualisation of any RDF graph on the web, something we’re unaware of existing in such an accessible way anywhere else. This is the OWL ontology visualised. You can remove type statements to make things more legible.
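For the curious, removing type statements before drawing a graph can be sketched in a few lines of Python. This is only an illustration, assuming the graph is already in hand as plain (subject, predicate, object) tuples. The function names and the nodes/links shape are hypothetical and are not taken from the visualiser itself.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
OWL = "http://www.w3.org/2002/07/owl#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"


def drop_type_statements(triples):
    """Return only the triples whose predicate is not rdf:type."""
    return [(s, p, o) for (s, p, o) in triples if p != RDF_TYPE]


def to_nodes_and_links(triples):
    """Turn triples into a node list and a link list for a graph renderer.

    The output shape is an assumption, chosen because many force-directed
    graph renderers accept something similar.
    """
    nodes = sorted({term for (s, _, o) in triples for term in (s, o)})
    links = [{"source": s, "target": o, "label": p} for (s, p, o) in triples]
    return {"nodes": nodes, "links": links}


# A tiny illustrative sample, not the full OWL ontology.
example = [
    (OWL + "Class", RDF_TYPE, RDFS + "Class"),
    (OWL + "Class", RDFS + "subClassOf", RDFS + "Class"),
    (OWL + "ObjectProperty", RDF_TYPE, RDFS + "Class"),
]

graph = to_nodes_and_links(drop_type_statements(example))
```

With the two type statements gone, the sample leaves a single subClassOf link between two nodes, which is roughly why the picture gets more legible.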

They work best if you have a very large touch screen and an ambition to be a less weird Tom Cruise.

A while back, one of our colleagues from another department (name redacted) said that all the Data and Search team do is make flow charts. That isn’t even remotely true, but when we do make flow charts they are interactive and three-dimensional. So there.

Community

Aidan met with De Havilland, who currently scrape Parliament’s website (and those of other Parliaments and institutions) for data. They talked about the new platform, public APIs, upcoming work and long term goals. De Havilland are interested in starting to use our APIs and in joining our community events.

One world, one web, one team

Michael spent some time with Jamie and the Members website product team, looking at proposed designs for written questions and answers and chatting about URLs. The only thing of any concern was identifying the position of the answering minister at the time of the response. Which, given the state of our government positions data, could be tricky.

On a similar note there was a brief email exchange with the GDS Registers team to discuss the need for a register of government positions (and incumbencies (and people)). Please someone make this and make it go backwards.

Samu met Callum to discuss the technical details of introducing Research briefings onto the beta website.

Domain modelling

The second cross-team design session for Statutory Instruments took up a fair chunk of Tuesday. There were some good discussions about the timing of scheduled Business Systems development work and the need for a new procedure to deal with proposed (aka baby) SIs. We also spent some time doing a collaborative review of the design mock-ups created by the SI Website product team against the domain model.

Anya and Michael published a first draft of a very basic legislation model. Like everything else, it will probably need to grow over time, but they think it gives us everything we need for immediate SI work. Anya needs to check how Parliament currently handles coming into force dates and coming into force notes, so those comments are still subject to change.

Data platform

Wojciech made some tentative first steps toward standardised documentation and machine readable specifications for our services on api.parliament.uk.

Matthieu dived deep into SPARQL, hunting down a bug related to binding the existence of alternative property paths to a variable. The issue has been reported and now we’re waiting for a fix. He duck taped an alternative path in the data for the time being, whilst marvelling at the flexibility of both SPARQL and RDF.
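The notes above do not include the query that tripped the bug, so the following is only a guess at the general shape of such a pattern: a SPARQL 1.1 BIND over an EXISTS whose body uses an alternative property path (the `|` operator). It is sketched here as a small Python helper that builds the clause. The helper name, the `ex:` prefix and every predicate and variable name are hypothetical.

```python
def exists_over_alternatives(subject_var, predicates, object_var, flag_var):
    """Build a BIND clause recording whether any alternative path exists.

    `predicates` is a list of prefixed names joined with the SPARQL
    alternative path operator `|`.
    """
    path = "|".join(predicates)
    return (
        f"BIND(EXISTS {{ ?{subject_var} {path} ?{object_var} }} "
        f"AS ?{flag_var})"
    )


# Hypothetical names, for illustration only.
clause = exists_over_alternatives(
    "thing", ["ex:pathA", "ex:pathB"], "value", "hasValue"
)

query = f"""
PREFIX ex: <http://example.org/>
SELECT ?thing ?hasValue WHERE {{
  ?thing a ex:Thing .
  {clause}
}}
"""
```

The data-side workaround mentioned above would correspond to materialising one extra predicate so that the query no longer needs the `|` inside EXISTS.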

Matthieu’s also asked for fixes to the HTTPS remote repository support in VocBench. Armando, the project lead, was very prompt to answer and the matter is being investigated. Suspense is hitting peak levels everywhere.

Since it looks like a vendor special for Matthieu this week, he compiled a wishlist, some of which you may want to support:

Search (and indeed indexing)

Liz tinkered with some data and made visualisations around search terms and concepts from our controlled vocabulary. For now it’s all quite basic. Just for fun, she made some graphs showing top terms used by class type. Which is interesting but not useful (she says).

Sara wrote an R script that outputs all concepts used by the Indexing and Data Management Section in the House of Commons Library in the past month.

Corporate data

In David’s week, the interface with the facilities management system is now working. The interface to link people to their department(s) is in final testing and is scheduled to go live at the end of next week.

Lewis continued with development work on the House of Lords HR system integration.

Matt’s work on the Active Directory integration is now ready for testing. This will pick up data directly from AD and will provide updated phone number and network ID information to People Data. On the corporate reporting front, Matt has started linking sources from external spreadsheets to a single spreadsheet. Once it’s updated it will hold all of the data in one place. So far he’s linked 6 KPIs.

An integration for the online learning system was modified by Matt and some offline testing was done.

Noel is testing the finance side that handles transactions from the House of Commons Library. He’s also provided assistance with Parliament’s Financial Times and Times online email database list.

Capability

Liz and Matthieu did a technical interview with the candidates for our vacant Data Analyst position. Dan and Julie handled the panel interview part. Both bits went well.

On not breaking the web

Dan asked team:Samu to revive the blog on our previous open linked data platform, which had stopped working sometime recently. Samu implemented a solution to redirect all requests to the old blog to the Internet Archive Wayback Machine. He also reviewed related cloud infrastructure to identify resources that are not needed any more. Mike deployed the solution and Wojciech turned off the unnecessary computing resources. Money and electricity were saved.
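A redirect of this kind can be sketched in Python as below. This is not the deployed solution, just a minimal illustration. It relies on the Wayback Machine's public URL pattern of `https://web.archive.org/web/<timestamp>/<original URL>`, which resolves to the capture nearest the timestamp. The function name, the default timestamp, the choice of a 302 status and the example blog address are all assumptions.

```python
WAYBACK_PREFIX = "https://web.archive.org/web"


def wayback_redirect(requested_url, timestamp="20180101000000"):
    """Return (status, headers) sending a request to an archived copy.

    `timestamp` is a 14-digit YYYYMMDDhhmmss value. The default here is a
    placeholder, not the date of any real capture of the old blog.
    """
    location = f"{WAYBACK_PREFIX}/{timestamp}/{requested_url}"
    return 302, {"Location": location}


# Hypothetical address standing in for a post on the old blog.
status, headers = wayback_redirect("http://blog.example.org/2015/some-post")
```

Keeping the original path in the redirect target means old links to individual posts keep landing on the matching archived page rather than on a generic front page.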

Samu met Chris Fryer to discuss the recent Historic Hansard migration, the assorted problems it caused and how we could prevent similar problems in the future. They discussed why such things are difficult to communicate in such a large organisation and how similar difficulties with other services might be prevented if there was less focus on process and more on people actually talking to each other.

Chris and Samu also discussed the potential for longer term cooperation, such as the Data Service using Archive systems as a data source and the Archive adopting identifiers from the Data Service.

Is it possible to get promoted?

Yes. Apparently it is.

Did anybody get promoted?

Let’s not get carried away here.

Was sarcasm deployed?

You’ve got the wrong people, sunshine.

Things that caught our eye