Worknotes - February | Shane’s blog

Throughout November, December and January I was working part-time to focus on some personal projects, making February my first full month back this year. It also coincided with our new CTO’s second month at Altmetric.

Rather than immediately joining back into a development team, I’ve spent the month working closely with our CTO, as there were a few processes at Altmetric that could do with being formalised. By “formalised” I just mean that they’ve been written down somewhere after consultation with the relevant people.

The push for this was an interesting one: the new CTO of Altmetric is the first who didn’t come immediately from a development background. We’ve previously had very technical CTOs who managed to be very hands-on with our technology – useful at the time, as we were still figuring out the technical foundations for collecting the conversation around, and impact of, scholarly research from around the web. We’ve largely got that down now and can comfortably handle the millions of events we process each day. Where we needed more help was in creating a stronger bridge between the commercial teams and development teams. This isn’t to say that our previous CTOs left that bridge abandoned, but it’s certainly true that finding time to maintain it was a stress of the role rather than the highlight of anyone’s day. Now, with a CTO whose priorities aren’t focused entirely on building technical foundations, there are other areas of the tech team to work on.

I’ve focused on a few processes this month: how we recruit developers, our processes around major incidents, detailing a data flow diagram of Altmetric’s systems, cautiously deciding on new recommended hardware, and investigating different ideas around our developer “ladder”.

Recruitment remains a difficult task. Where in the past we received hundreds of CVs, this time around we’ve not been as successful – not nearly. Only half a dozen CVs or so have been worth following up with an initial call. That’s not because we’re being picky. We’re just not getting much response; if you’d like to work with my team, send across your CV. Consider this your foot in the door!

Writing up our current recruitment process had two purposes (like a lot of the work this month): the first was to create a document to give to someone being onboarded to the process as a hiring manager, and the second was to try and debug whether we’re doing something wrong this time around. We’ve concluded that it might be an industry-wide thing, with Ruby developers thin on the ground at the moment.

We have improved our process around reviewing the technical tests from candidates by anonymising the code, allowing us to share it with the wider development team. The aim here is to remove potential bias, of course, but also to safely spread the work out to more people. Recruitment is a huge time investment, but I’m fortunate that the team are all willing to help out.
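As a rough illustration of what anonymising a submission might involve, here is a minimal Python sketch. It is not the process we actually use: the function name, the `submission-` naming scheme and the list of ignored files are all assumptions made for the example. It copies a candidate’s code into a folder named after a random ID and leaves the Git history (which carries author names and emails) behind.

```python
import shutil
import uuid
from pathlib import Path


def anonymise_submission(source_dir, reviews_root):
    """Copy a submission under a random ID, leaving identifying metadata behind.

    Hypothetical helper: the naming scheme and ignore list are assumptions.
    """
    # A short random ID stands in for the candidate's name.
    submission_id = uuid.uuid4().hex[:8]
    destination = Path(reviews_root) / f"submission-{submission_id}"

    # Skip the Git directory, which would reveal the author via commit metadata.
    shutil.copytree(
        source_dir,
        destination,
        ignore=shutil.ignore_patterns(".git", ".DS_Store"),
    )
    return destination
```

A sketch like this only handles folder names and version-control metadata. Names in comments, READMEs or config files would still need a manual check before sharing.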

In the middle of the month we had a major incident fire drill to test out the current process. Although the idea of a roleplay exercise felt quite awkward at first, it ended up being quite a lot of fun. It highlighted some parts of our process which some people weren’t aware of, and some which needed reviewing. To follow this, the largest piece of work I’ve done was writing up a Major Incident Management Plan. Whilst we’ve had incidents in the past, and the development team have rolled with them, rarely leaving an incident alive long enough for a customer to notice, it was clear that people outside the development team weren’t aware of our processes. At times, they felt out of the loop and didn’t want to interrupt to get back on the same page. Writing these processes up, including non-technical staff members, and consulting the wider group left us with an evolved process which is hopefully healthier.

One document we have at Altmetric is a Miro board showing a medium-high level overview of how our event collectors, pipeline, and various datasets work with each other. We use it to onboard new dev team members. It’s more frequently used these days, as our systems have increased in number and no single developer is expected to fit it all into their head anymore. (Not like the good old days when that was a fun challenge to draw on a whiteboard from memory.)

That document still works well, but more rigour could be added. Yes, it lists that the news collector talks to the pipeline, but what kind of data is it sending to the pipeline? How does our Twitter collector know which domains to be filtering for? Dataflow diagrams are great for this. Adding in these arrows quickly shows where the most important systems are, and some bits were surprising. I should say that this is the document whose value I’m least sure of, compared to the original, higher level document. I’m hoping to get further feedback on whether it is used, or whether its complexity causes too many headaches to follow.
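To show what those arrows add, here is a small Python sketch of a dataflow diagram as a list of labelled edges. Every system name and payload in it is a hypothetical placeholder rather than a description of Altmetric’s real systems. Counting how many arrows touch each system gives a quick, rough sense of which ones matter most.

```python
from collections import Counter

# Each edge is (source, destination, data carried). All names and payloads
# here are hypothetical placeholders, not Altmetric's real systems or data.
FLOWS = [
    ("news_collector", "pipeline", "news mentions"),
    ("twitter_collector", "pipeline", "tweets"),
    ("domain_registry", "twitter_collector", "domains to filter for"),
    ("pipeline", "api", "processed mentions"),
]


def connection_counts(flows):
    """Count how many labelled arrows touch each system."""
    counts = Counter()
    for source, destination, _data in flows:
        counts[source] += 1
        counts[destination] += 1
    return counts


def incoming(flows, system):
    """Return (source, data) pairs for everything flowing into `system`."""
    return [(src, data) for src, dst, data in flows if dst == system]


if __name__ == "__main__":
    # Systems with the most arrows are listed first.
    for system, count in connection_counts(FLOWS).most_common():
        print(f"{system}: {count} arrows")
```

Asking `incoming(FLOWS, "twitter_collector")` answers the "how does it know which domains to filter for?" kind of question directly, which a plain box-and-line overview can’t do.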

One poor developer had been using the laptop I had been given when I started at Altmetric seven years ago, silently chugging along until I noticed during a hardware audit of the laptops we have in service. A number of laptops deserved to be upgraded, so I had the easy job a few months ago of picking new MacBook Pros. Since then, I’ve needed to order a few more laptops, but this time I’ve had to struggle with a decision which could backfire: the M1 machines are clearly superior but have the fatal flaw of potentially just not working for development. Docker, apparently, needs some massaging before it will work properly, and work to fix this isn’t yet done. The alternative: buy the trusty Intel machines, which are more expensive and less powerful. I’ve recommended, and ordered, the M1 machines. In the grand scheme of things, it’s a solvable problem if a developer runs into trouble. (Worst case: return the laptop and get another.)

I’ve also spent some time this month considering how, as a manager and also as a developer, career progression should be tracked. We currently have a development ladder, where developers can judge for themselves (and along with their manager) where their skills are right now and where they want them to be. I’ve had some feedback that this is too vague in parts (tasks which developers do every day, and which feel odd sitting at “Level 3”) and too specific in others (skills where the opportunity to demonstrate them might not come up often). Our CTO is familiar with the SFIA framework, which I’ve been looking at. I’m tentatively curious about it; it’s certainly detailed, but quite formal. I’m planning to appraise myself against it and see if it’s useful.

One difficulty with (self-)monitoring career progression is that it doesn’t come for free: there’s a time cost, and often developers feel ticketed work is more important. Adding more complexity to the process certainly can’t help that. We’ll see.

A stunning lack of development work on my part in February. I’m eager to get back to it this month, hopefully, but it hasn’t been totally absent. The rest of the team have been super busy and I’ve been eager to support them wherever I can; one such task was finally destroying our last Old Ubuntu server in favour of a much more modern one. That was possible because we’ve converted our API – the most delicate part of Altmetric, which needs to be highly available and super speedy – to use Docker and be managed by Nomad. The testing was intensive, and done by many people, including myself, before being confidently signed off. With that, a large amount of old infrastructure can be put on a canoe, pushed out to sea, and set alight.