Practical, team-focused operability techniques for distributed systems

A 45-minute Case Study by:

Matthew Skelton

Skelton Thatcher Consulting

Download the slides

The slides used for this session are available to download here.

About this Case Study

Modern software systems increasingly span cloud, on-premise, and remote embedded devices and sensors. These distributed systems bring challenges with data, connectivity, performance, and systems management, so for business success we need to design and build with operability as a first-class property.

In this talk, Matthew Skelton (Skelton Thatcher Consulting) explores five practical, tried-and-tested, real-world techniques for improving the operability of many kinds of software systems, including cloud, serverless, on-premise, and IoT.

  1. Logging as a live diagnostics vector with sparse event IDs
  2. Operational checklists and 'run book dialogue sheets' as a discovery mechanism for teams
  3. Endpoint healthchecks as a way to assess runtime dependencies and complexity
  4. Correlation IDs beyond simple HTTP calls
  5. Lightweight 'User Personas' as drivers for operational dashboards

These techniques work very differently with different technologies. For instance, an IoT device has limited storage, processing power, and I/O, so generating and shipping logs and metrics looks very different from the cloud or 'serverless' case. However, the principles (logging as a live diagnostics vector, event IDs for discovery, and so on) work remarkably well across very different technologies.
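
As an illustration only, and not taken from the talk itself, the first and fourth techniques might be sketched in Python as below. The event names, the ID values, and the `handle_order` function are all hypothetical; the only idea carried over from the list above is that each log line has a stable, sparsely numbered event ID and a correlation ID that follows one unit of work across steps.

```python
import io
import json
import logging
import uuid
from enum import IntEnum


class EventId(IntEnum):
    """Hypothetical sparse event IDs; the gaps leave room for later additions."""
    ORDER_RECEIVED = 1000
    PAYMENT_REQUESTED = 2000
    PAYMENT_FAILED = 2500
    ORDER_COMPLETED = 9000


def make_logger(stream):
    """Build a logger that writes one bare message per line to the given stream."""
    logger = logging.getLogger("operability-sketch")
    logger.setLevel(logging.INFO)
    logger.handlers.clear()
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(message)s"))
    logger.addHandler(handler)
    return logger


def log_event(logger, event, correlation_id, **fields):
    """Emit one JSON log line carrying the event ID and the correlation ID."""
    record = {
        "event_id": int(event),
        "event": event.name,
        "correlation_id": correlation_id,
        **fields,
    }
    logger.info(json.dumps(record, sort_keys=True))


def handle_order(logger, order_id, correlation_id=None):
    """Hypothetical unit of work: reuse the caller's correlation ID or create one."""
    correlation_id = correlation_id or str(uuid.uuid4())
    log_event(logger, EventId.ORDER_RECEIVED, correlation_id, order_id=order_id)
    log_event(logger, EventId.PAYMENT_REQUESTED, correlation_id, order_id=order_id)
    log_event(logger, EventId.ORDER_COMPLETED, correlation_id, order_id=order_id)
    return correlation_id
```

Searching the logs for a single correlation ID then shows the sequence of event IDs for one order, whether the ID arrived on an HTTP request, a queue message, or a device upload. On a constrained IoT device the same idea could be reduced to shipping only the numeric event ID and a short correlation token.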

Based on our work in many industry sectors, we will share our experience of helping teams to improve the operability of their software systems: what works, what doesn't work, and how teams can expand their understanding and awareness of operability through these straightforward, team-friendly techniques.

About the Speaker

Matthew Skelton is co-founder and principal consultant at Skelton Thatcher Consulting. He specialises in helping organisations to adopt and sustain good practices for building and operating software systems: continuous delivery, DevOps, aspects of ITIL, and software operability. Matthew curates the well-known DevOps team topologies patterns and is co-author of the books 'Continuous Delivery with Windows and .NET' (O’Reilly, 2016) and 'Team Guide to Software Operability' (Skelton Thatcher Publications, 2016).


See the full programme