Medical Devices are meant to be reliable and safe. Architecture is key to achieve this. A sound architecture driven by risk analysis will mitigate disasters and their consequences. And a good architecture (especially if that quality is maintained over time) will make a software with fewer bugs, easier to spot, and easier to fix without regressions. So how do we approach medical device architecture?
- Spend time on initial design. I know, Agile scorns BDUF (Big Design Up Front). I think BDUF should be avoided in the sense of defining UML diagrams for every class in the system needed to implement those 5000 requirements. But if you really want to mitigate risks and maximize reuse, some separations are to be really considered at the very beginning – because they are a lot more difficult to achieve later.
- Stay pragmatic
- Beware of dreamed features. I’ve been amazed, in the past few years, how much different the actual evolution of the platform I was responsible for was what we thought at the beginning. Projects come and go, partnerships change, markets mutate. So keep YAGNI (You Ain’t Gonna Need It) in mind. But don’t fall in trap of getting blind and miss the chance to prepare for changes that will really happen in the future.
- Change the architecture when requirements change. Change the design when developers think a particular area of code is smelly. Nothing is sacred. Everybody gets things wrong once in a while, even rock-star architects.
Prove the architecture
- With prototypes. I love the Tracer Bullet project pattern: on your first iteration, implement one only, simplified, core feature of the app, that encompasses all layers and components of the architecture (the metaphor is that this practice, just as a tracer bullet in the army, gives the team enough light to understand the landscape and correct fire in real conditions). Once you have this feature working, you know the architecture works (well, you don’t know yet how it will sustain change over time, edge cases and loads, but you know at least it’s not an impediment to development). Before that, you just hope for it.
- With Stress tests and load tests. They are good judges to validate an architecture. I’ve often been flabbergasted by how much harder getting those tests to pass is, compared to what the team expects. You find so many rare, hard to reproduce bugs during those tests. You don’t want them to jump up in production – always at the worst possible time, by their very nature. As those tests might reveal deep problems rooted to the architecture itself, thus very costly to change, they should be performed as soon as possible.
- With reliability tests. It’s always interesting to control software behavior when things fail. It is especially important when the software might save someone’s life – or ruin it. You should have strategies for handling every kind of failure: network failure, other software failure, device hardware failure, computer hardware failure, OS failure, power supply failure, cyberattack, and even failure in your own software. It’s not in the norms, but you have to go beyond norms as far as the real thing is concerned. And as everything with software, it doesn’t work until it has been tested. So test it. Make sure you don’t lose data in case of blue screen of death (they can be provoked on demand thanks to special drivers). Make sure you don’t lose data when you turn off the power switch (we had to disable several caches to make it work). Make sure you don’t loose biological result when the GUI goes mad (we set an automated to test to kill the GUI during load test).