Architectural ideas worth considering for medical devices

Some architectural ideas are quite classical in the medical device software world, and should be considered in the early stages of a medical device project. I collected some:

  • Split the software to isolate the riskier features. For example, in a radiography device, all code surrounding the manipulation of radioactive substances, their emission, and the alerts around them should be split apart. You want to keep that part small so it can be reviewed thoroughly, and keep its complexity low to avoid bugs. In addition, there is a huge overhead implied by the class C requirements of 62304 – you want to avoid that overhead in the risk-free parts of the app.
    • On the opposite side, risk-free zones (such as the GUI, provided it takes no decisions and holds no data at all) should be split from the rest so they can be restarted at will in case of failure. And GUIs are doomed to mutate forever (to keep up with UX fashion, and to give marketing opportunities for more product launches) – you don’t want to re-validate the automation’s impact on biological phenomena for the sake of an update to the color palette.
  • Isolate real-time automation from the rest. Real-time or near-real-time is tough to get right. It will typically require lower-level languages (C, C++, PLC…) and maybe a special OS (RTOS) or RTSS (Intime, RTX, Preempt-RT…).
    • Having several OSes may have an impact on the electronics (another PC, a dedicated board…) and on the production price. This is a far-reaching decision that has to be made wisely and early.
    • Low-level languages typically imply lower productivity (C vs. C#). And this part of the code will more or less follow the development lifecycle of the hardware: slow to start, a nightmare to tune and fix with all the edge cases and recovery mechanisms, and then nothing – once the device is out, this code will have few reasons to change. But the higher-level part will always be changing – adapting to regulations, markets, and healthcare network protocols.
    • Added bonus: isolating what’s not directly linked to the hardware will make a good basis for reuse on another machine.
    • Another extra for the road: you will need to emulate the hardware (to simulate rare conditions, to minimize usage of costly and scarce real hardware, to speed up tests, to avoid being blocked until the hardware is ready), so have it clearly isolated so you can mock it through a simple interface (see the sketch after this list).
  • Isolate components with cybersecurity risks. Whatever is in contact with networks and USB will typically be the entry point for attackers; therefore, it should have minimum rights – so a successful attacker cannot get much further.
  • Beware of networks. Calling third-party web services is a nice idea for, say, a climate app. But for medical devices, beware. Imagine there’s an earthquake or a war – a situation where the internet might be working very slowly while people requiring urgent attention are pouring into hospitals. Medical devices have to keep working no matter what. So code that Clinical Decision Support algorithm locally.
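To make that hardware-emulation point concrete, here is a minimal C# sketch of the kind of isolation meant above. The IRadiationSource interface and both implementations are hypothetical names for illustration, not code from an actual device: the rest of the software depends only on the interface, so the simulator can stand in for the real board to reproduce rare conditions or to keep developing before the hardware is ready.

```csharp
// Hypothetical illustration: the hardware is only reachable through a small interface,
// so the rest of the software never knows whether it talks to the real device or a simulator.
public interface IRadiationSource
{
    void StartEmission(double doseMilliGray);
    void StopEmission();
    bool IsFaulted { get; }
}

// Real implementation: wraps the low-level driver (details omitted).
public sealed class RealRadiationSource : IRadiationSource
{
    public void StartEmission(double doseMilliGray) { /* talk to the board */ }
    public void StopEmission() { /* talk to the board */ }
    public bool IsFaulted => false;
}

// Simulator: reproduces rare conditions (e.g., a hardware fault) on demand,
// without tying up scarce and costly real hardware.
public sealed class SimulatedRadiationSource : IRadiationSource
{
    public bool SimulateFault { get; set; }
    public void StartEmission(double doseMilliGray) { }
    public void StopEmission() { }
    public bool IsFaulted => SimulateFault;
}
```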

Classical separations

  • Isolate tools. It may sound obvious once more, but don’t ship all those R&D tools (simulators, tests, low-level system testing routines…) in your production code. Medical devices don’t need one more reason to fail. And keep in mind that these devices may be maintained in the field by versatile technicians who may have only basic knowledge of computers and may mess with the device if they can get their hands on powerful but unsafeguarded programs.
  • Design for testability. It has become mainstream, but I still see projects that avoid automated testing. My guess is that their managers think automated testing is costly – and yes, it is! In my experience, deeply automatically tested software costs about twice as much to implement. But you gain so much in saved (and horrible) debugging time (who likes debugging? I’d rather write tests…) and by enabling refactoring thanks to the safety net the tests provide. And are we serious about writing safe and reliable medical device software, or are we not? You can’t be if you run away every time a quality-related activity seems costly. But to be profitable (and to keep costs in a reasonable zone), automated testing has to be thought out for the long run, and code has to be architected in a testable way from the very beginning. My typical advice here would be to heavily use dependency injection, mock all hardware-related components and network interfaces, run the tests on your CI server, and integrate the test results into your traceability matrix to give them legitimacy (a sketch follows below).
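As an illustration of that advice, here is a minimal sketch (hypothetical names, NUnit-style test) of a component that receives its hardware dependency through its constructor, so a test can substitute a hand-written fake and exercise a condition that would be awkward to trigger on real hardware:

```csharp
using NUnit.Framework;

// Hypothetical example: the controller depends on an abstraction, injected via the constructor.
public interface ITemperatureProbe
{
    double ReadCelsius();
}

public sealed class IncubatorController
{
    private readonly ITemperatureProbe _probe;
    public IncubatorController(ITemperatureProbe probe) => _probe = probe;

    // Raises an alarm when the measured temperature leaves the allowed range.
    public bool IsTemperatureAlarmRaised()
    {
        double celsius = _probe.ReadCelsius();
        return celsius < 35.0 || celsius > 39.0;
    }
}

// A hand-written fake: no real hardware needed, rare conditions are trivial to reproduce.
public sealed class FakeTemperatureProbe : ITemperatureProbe
{
    public double NextReading { get; set; }
    public double ReadCelsius() => NextReading;
}

[TestFixture]
public class IncubatorControllerTests
{
    // The test name (or an attribute) can carry a requirement id, e.g. REQ-TEMP-042,
    // so the CI results can be pulled into the traceability matrix.
    [Test]
    public void Alarm_is_raised_when_temperature_is_too_high()
    {
        var controller = new IncubatorController(new FakeTemperatureProbe { NextReading = 42.0 });
        Assert.That(controller.IsTemperatureAlarmRaised(), Is.True);
    }
}
```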

Process hints for architecting safe medical devices

Medical devices are meant to be reliable and safe. Architecture is key to achieving this. A sound architecture driven by risk analysis will mitigate disasters and their consequences. And a good architecture (especially if that quality is maintained over time) will produce software with fewer bugs, which are easier to spot and easier to fix without regressions. So how do we approach medical device architecture?

 


 

  • Spend time on initial design. I know, Agile scorns BDUF (Big Design Up Front). I think BDUF should be avoided in the sense of defining UML diagrams for every class in the system needed to implement those 5000 requirements. But if you really want to mitigate risks and maximize reuse, some separations really have to be considered at the very beginning – because they are a lot more difficult to introduce later.
  • Stay pragmatic
    • Beware of dreamed-up features. I’ve been amazed, over the past few years, at how different the actual evolution of the platform I was responsible for turned out to be from what we imagined at the beginning. Projects come and go, partnerships change, markets mutate. So keep YAGNI (You Ain’t Gonna Need It) in mind. But don’t fall into the trap of going blind and missing the chance to prepare for changes that really will happen.
    • Change the architecture when requirements change. Change the design when developers think a particular area of code is smelly. Nothing is sacred. Everybody gets things wrong once in a while, even rock-star architects.

 

Tracer bullets ricochet off their targets as Japanese Ground Self-Defence Force tanks fire their machine guns during a night session of an annual training exercise in Gotemba

Prove the architecture

  • With prototypes. I love the Tracer Bullet project pattern: in your first iteration, implement only one simplified core feature of the app that encompasses all layers and components of the architecture (the metaphor being that this practice, just like a tracer bullet in the army, gives the team enough light to understand the landscape and correct its fire in real conditions). Once you have this feature working, you know the architecture works (well, you don’t know yet how it will sustain change over time, edge cases, and load, but at least you know it’s not an impediment to development). Before that, you just hope it does.
  • With stress tests and load tests. They are good judges for validating an architecture. I’ve often been flabbergasted by how much harder getting those tests to pass is than the team expects. You find so many rare, hard-to-reproduce bugs during those tests, and you don’t want them to show up in production – always at the worst possible time, by their very nature. As those tests may reveal deep problems rooted in the architecture itself, and thus very costly to fix, they should be performed as soon as possible.
  • With reliability tests. It’s always interesting to check how the software behaves when things fail. It is especially important when the software might save someone’s life – or ruin it. You should have strategies for handling every kind of failure: network failure, other software failures, device hardware failure, computer hardware failure, OS failure, power supply failure, cyberattack, and even failure in your own software. It’s not in the norms, but you have to go beyond the norms as far as the real thing is concerned. And as with everything in software, it doesn’t work until it has been tested. So test it. Make sure you don’t lose data in case of a blue screen of death (they can be provoked on demand thanks to special drivers). Make sure you don’t lose data when the power switch is turned off (we had to disable several caches to make that work). Make sure you don’t lose biological results when the GUI goes mad (we set up an automated test that kills the GUI during the load tests – see the sketch below).
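Here is a minimal sketch of that last kind of reliability test. The process name, the 30-second restart delay, and the result-store placeholders are assumptions for illustration, not the actual test:

```csharp
using System;
using System.Diagnostics;
using System.Threading;

// Hypothetical sketch of a "kill the GUI under load" reliability test.
public static class GuiCrashReliabilityTest
{
    public static void Main()
    {
        const string guiProcessName = "Device.Gui"; // assumed name of the GUI process
        const int resultsSubmitted = 500;

        // 1. Simulate load: submit a known number of results
        //    (in the real test this would drive an instrument simulator).
        SubmitSimulatedResults(resultsSubmitted);

        // 2. Kill the GUI in the middle of the run, as brutally as possible.
        foreach (var gui in Process.GetProcessesByName(guiProcessName))
        {
            gui.Kill();
        }

        // 3. Give the watchdog time to restart the GUI, then check that nothing was lost.
        Thread.Sleep(TimeSpan.FromSeconds(30));
        int persisted = CountPersistedResults();
        if (persisted != resultsSubmitted)
        {
            throw new Exception($"Lost results: expected {resultsSubmitted}, found {persisted}.");
        }
        Console.WriteLine("GUI crash test passed: no result was lost.");
    }

    // Placeholders standing in for the real load generator and the result database query.
    private static int _persistedResults;
    private static void SubmitSimulatedResults(int count) => _persistedResults = count;
    private static int CountPersistedResults() => _persistedResults;
}
```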

Handling quality-related records in practice

Agile medical device software developers must solve a contradiction between two seemingly opposite philosophies:

  • From an agile perspective: go fast, experiment, deliver frequently, embrace change
  • From a regulatory perspective: produce auditable documents, double-check everything, make plans.

These philosophies have indeed been opposed very often (Apple and Google complaining that medical regulations slow down innovation, auditors being very suspicious of early agile projects). See AAMI TIR45 for an enlightening discussion on how to reconcile them.

The rest of this post focuses on practical advice on how to cope with quality-related records so you don’t waste your energy.

 

Automation of the production of recurrent documents


There are two categories of quality-related records:

  • One-shot documents, or documents requiring only minor updates (management plan, quality assurance plan, maintenance plan…). Do them once and for all, early in the project.
  • Recurrent documents (specs, test plans, test reports, design documents, traceability matrices…).

Recurrent documents are strategic since their repetition (especially in an iterative development process) will multiply the load required to produce them. In developed countries labor is expensive and cannot be wasted. What can you automate?


  • Use a document lifecycle management tool for handling validation processes and versioning. They take care of ensuring proper signatures, notifying interested people, and, most importantly, act as a safe to protect your documents for the crazy time span required by regulations (I have heard 7 years after the last device is sold, on top of typical project durations of several years: we’re talking in decades!). Odd as it may seem, I stumbled upon a project in 2014 where project documents and procedures were still signed manually. That’s a guaranteed recipe for losing documents and for holes in your validation process that delighted auditors will love to spot.
  • Use a spec and test tool to handle your requirements (I’ve been using Doors quite successfully, for example, but other good tools exist). Benefits:
    • Be a platform for further automation.
    • Factor out repetitive document introductions, definitions…
    • Handle traceability.
    • Share common requirements, risks, risk mitigation measures, and test plans across projects. Especially useful when you share code.
  • One of my teams wrote a tool to gather info from Doors (requirements, risks, risk mitigation measures, test plans, executed test plans) and from the software factory (automated developer test results, automated GUI tests, automated stress and robustness tests) to generate a full traceability matrix (see the sketch after this list). This matrix is required by regulations (to make sure every requirement has been tested), but it’s also very useful to the team. Only when a requirement has been successfully tested can I be sure that its implementation is done, so this matrix provides good metrics for analyzing project progress. It also helps to pay extra attention to risk mitigation measures: by identifying them as a special kind of requirement, it is easy to track how many are not yet implemented or have failing tests. Automation is the only way to go with traceability matrices when there are thousands of requirements, manual test cases, and automated tests.
  • We also have a project (not yet fulfilled) to generate the list of dependencies and their versions by analyzing the NuGet packages.config files.
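To give an idea of how simple such tooling can be, here is a minimal C# sketch of the traceability-matrix generation described above. The file names and the semicolon-separated formats are assumptions for illustration; a real implementation would read the Doors export and the CI results in whatever formats they actually come in:

```csharp
using System;
using System.IO;
using System.Linq;

// Hypothetical sketch: joins a requirements export with automated test results
// and writes a CSV traceability matrix, flagging untested requirements.
public static class TraceabilityMatrixGenerator
{
    public static void Main()
    {
        // requirements.csv, one line per requirement: "REQ-001;The analyzer shall flag out-of-range results"
        var requirements = File.ReadLines("requirements.csv")
            .Select(line => line.Split(';'))
            .ToDictionary(parts => parts[0], parts => parts[1]);

        // test-results.csv, one line per executed test: "REQ-001;OutOfRangeResultTest;Passed"
        var resultsByRequirement = File.ReadLines("test-results.csv")
            .Select(line => line.Split(';'))
            .ToLookup(parts => parts[0], parts => (Test: parts[1], Outcome: parts[2]));

        using var matrix = new StreamWriter("traceability-matrix.csv");
        matrix.WriteLine("Requirement;Text;TestCount;Status");
        foreach (var requirement in requirements)
        {
            var tests = resultsByRequirement[requirement.Key].ToList();
            string status = tests.Count == 0 ? "NOT TESTED"
                : tests.All(t => t.Outcome == "Passed") ? "PASSED"
                : "FAILED";
            matrix.WriteLine($"{requirement.Key};{requirement.Value};{tests.Count};{status}");
        }
    }
}
```

Run on the CI server, the output of such a tool feeds the metrics mentioned above: untested requirements, and risk mitigation measures whose tests fail.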

 

Document lifecycle

In an agile team, documents are long-lived and evolve constantly. In fact, I believe a document should be considered correct only once its data has been used by the following process step (e.g., a spec is correct once it has been implemented, an architecture document is correct once load and stress tests pass, a test plan is correct once it has been executed). It’s a fact of life. So don’t get stuck waiting for document approval in the general case. Instead, work on everything in parallel and have people collaborate – they are better at optimizing complex, fine-grained interactions than any process can be.

Yeast lifecycle: a metaphor for real-life document workflow

Write documents at the time when they are useful. I’ve seen projects stay blissfully ignorant of regulations until the end, when the documentation required by regulations is hastily written. This is nonsense. Minimizing document-authoring effort can be the enemy of minimizing overall project effort. Quality-related documents are often very useful, if written properly and at the right time.


  • Write the documents framing the entire program (such as high-level marketing needs) very early. They are likely to generate a lot of heat (political struggles) in the enterprise and take a long time to stabilize (when someone has won the battle). It’s very risky to start developing before they are settled – but it’s a good time for feasibility studies, finding and tuning the right process, choosing tools and languages, writing foundation frameworks, and hiring teams.
  • Think about architecture and risk analysis at the very beginning, when things are easy to change. Write them down in documents to set a clear vision that may last for years. These documents will be read by every newcomer joining the team, saving the architect days of training – more time than it takes to write and maintain the docs.
  • Coding guidelines (hopefully enforced via tools) are to be enacted at the very beginning of implementation – if not, you will have to refactor the existing codebase to abide by them.
  • Specifications are to be written before coding. It’s a lot cheaper to change text than code. If you can’t write a sentence explaining what the software is supposed to do, it means you are not sure yet. Developers often think they know what the program should do – except they lack intimate client knowledge and perspective.
  • Manual test plans are to be written before they are executed. Free testing is a powerful tool, but that is another story.

 

Document approval


  • As far as I understand, for most documents, the minimal approval process is one author and one other person playing the roles of reviewer and approver. That should be the default approval process, to minimize waste. I’ve seen documents with more than a dozen people involved in the review process. Guess what? Everybody feels it’s useless to review the document carefully because the others will spot the errors. The review ends up being shallower than with a single reviewer who is fully responsible and committed.
  • The most efficient reviewer is the person who uses the document’s data as input: he or she has to read it carefully anyway, and has the skills to really understand it. That person should be the reviewer and approver of choice.
  • As documents change until the work is done, validate them only once the job is done – at the end of the iteration (notable exceptions: project management plans and high-level marketing needs).
  • Validating documents once or twice a year should be sufficient (provided you explain it in your project management plan) if your validation process is costly (for example if you have a manual, paper-based validation process, or if your document lifecycle management tool has poor ergonomics and performance). You can’t waste time validating them at every iteration.

Regulatory Quality vs Product Quality

I have found it very valuable over the years to make a clear distinction between Regulatory Quality and Product Quality. Regulatory Quality means you can hand the authorities a documentation package that proves you have followed their norms and standards. Product Quality means users like the features and the ergonomics, that there are very few bugs, and especially none in any area that could harm somebody.

Quality yin yang

There is no equivalence between the two concepts. While Regulatory Quality can have a very positive effect on Product Quality (62366 will definitely help define sound ergonomics, 14971 will help keep risks under control), it is entirely possible to issue a very bad product (ugly, full of bugs, poorly architected, with silly features) and still hastily write a nice retro-documentation that fits the bill. Conversely, many companies write excellent software in the consumer area (websites, video games, operating systems) without using our norms – have you ever heard of a project team outside the medical device world saying: “Hey, I’ve been using 62304, it’s incredible how much more productive we are and how many fewer defects we find, it’s awesome, check this out!”? I’ve always found it disturbing that practices that seem to me so crucial for good software – refactoring, automated testing, automated coding standards, load tests – are not emphasized in the norms or, worse, blissfully ignored. Maybe it’s a good thing – there are legacy projects out there that certainly couldn’t use these techniques, and norms are handicapped by the least-common-denominator syndrome. But it demonstrates that Regulatory Quality and Product Quality are two different things.

Both areas are judged in very different ways. Auditors will generally not read the code or execute the app, because that would mean making subjective judgments about projects, which is not tolerable – only the compliance of the audit-trail documentation with a norm is an objective criterion. But users are not objective. Users have feelings about your product. They will hate every all-too-familiar bug in their guts, and they will complain about that impenetrable screen to every fellow user.

So my recommendation would be to treat them separately. Provide auditors with the documents they need, in the way they like. But don’t stop there. Sure, medical device norms and processes will definitely help Product Quality – especially if you perform these activities early, honestly, and with the right amount of energy. But they are not enough. Norms and standards take years to reach consensus, be validated, and be widely implemented. The software world changes much more rapidly. Every year, new practices, new languages, and new architectural styles emerge everywhere. You should stay tuned to what’s happening out there and try to apply it in our regulated world.

Satellite view of Silicon Valley

A nice advantage of splitting these concerns is that it maximizes efficiency. Regulatory Quality implies the heavy burden of document templates, approval processes, tool validations, and many other activities that are meant for the auditor but not for the team; they are a strong incentive NOT to experiment, take risks, fail, or start something new. So what’s in the realm of the audit trail should be kept to a minimum. And there should be another, underground, agile world where lots of good practices are used to make good software. The downside is that the auditor will never know about all the good things we do that he or she might like. But if we’ve done a good job preparing our quality-related records, he or she will be happy – and if not, you have a problem.

Handling regulations

The medical industry is heavily regulated. That’s because bugs that kill people are to be handled with definitely more care than bugs that force a web page to reload. But guess what? That’s good for established manufacturers – a barrier against cheap and fast new entrants. Stop complaining about regulations; adapt to them and take advantage of them.

Carefully study regulations, norms, and standards. They change all the time. New countries write their own (Anvisa, CFDA…). Worldwide manufacturers must infer from them a meta-regulation that bundles the worst (i.e., the most stringent) of them all and that is relatively unstable, because it changes whenever any underlying regulation changes. Usually, organizations set up RA (Regulatory Affairs) teams for that purpose.

But don’t let specialized quality teams write procedures. Procedures must be written by the people who execute them (with proper RA supervision) if you want the interpretation of norms to be productive (fast to execute, lean, no waste) and adaptive (changing frequently). It’s easy to ask for a stupid, lengthy, repetitive task when you’re not going to do it yourself.


Having a 6-month approval procedure for procedure changes, with 10 senior managers involved, will definitely discourage change. The procedure for writing procedures must enable evolution and empowerment.

Challenge regulations. Sometimes they can be interpreted in a variety of ways.

  • Take, for example, NF EN 62304, which presents software development activities in numerical order, subtly implying that you should follow the evil waterfall model. But that is not explicitly written. It took AAMI TIR45 to explicitly legitimize Agile.
  • Regulations never talk about the amount of work to be done. Two pages or 200 for a document? Challenge your impulse to be thorough. From what I’ve heard, auditors get mad when something is totally missing, but are open to negotiation when it’s small. You can be lean by providing the bare minimum if you don’t find the activity really useful – but have a rationale ready to justify your priorities.
  • Challenge RA people. When they say developers should adopt a best practice because of “regulations”, ask to read the text of the article of the regulation that actually imposes the constraint. Always come back to the text – it’s the core principle, it’s the real constraint. It’s too easy to invoke hazy “regulations” to justify any excessive demand. If it’s not mandatory, if we’re talking about best practices, then it must be decided by the development team. Best practices are only known by people who practice. Just to drive the point home: whenever something brought up in the name of “regulations” doesn’t feel right, always come back to the text and challenge its interpretation.


Remember, regulators don’t want you to drown in papers – they want medical devices to be safe, and incidentally their design to be auditable. They are reasonable people. If something seems completely silly, there must be a more sensible interpretation.

One useful technique my teams use is to write a regulation up as a spec and trace its implementation to our own specs and risk mitigation measures. It works well for technical guides such as CLSI AUTO9 and CLSI AUTO11. Going fully traceable by writing procedures as specs seemed a little excessive to us, but why not? The good thing about this technique is that you can challenge any legal obligation, and it can help you in case of an audit, by capturing your decisions about how you implement the regulation and by showing how organized you are about it.