What about SOUPs?

The writers of IEC 62304 have put a lot of energy into standardizing how to handle SOUPs (Software Of Unknown Provenance) for software of classes B and C (software that is in a position to potentially harm people in a non-benign way). The definition says: “Software that is already developed and generally available and that has not been developed for the purpose of being incorporated into the MEDICAL DEVICE (also known as “off-the-shelf software”) or software previously developed for which adequate records of the development PROCESSES are not available”. To sum up: everything that hasn’t been built according to the norm.

What I’ve seen in the trenches indicates that this distrust in SOUPs is a bit misplaced: in my projects, carefully chosen libraries contain several dozen times fewer bugs than home-made code before verification. Why?

  • Released libraries are finished software.
  • They are used by many more developers than code being called in only one context and thus have a higher probability that bugs have already been found and fixed.
  • The rise of open source software along with excellent processes (automated builds, TDD, gitflow with systematic reviews of pull requests…) and psychological motivators (the name of developers permanently and publicly attached to every commit incentivizes perfection in code) has dramatically increased the quality of free libraries compared to ten years ago, when 62304 was first released.

But I understand the theoretical need of regulators: if there were no SOUP policy, it would be too easy to pretend that a major part of the code is a SOUP and not apply the regulation at all. And I can’t imagine a norm conceding that what comes from outside of its jurisdiction could be better.

Norms are norms and auditors are paid to verify compliance with a norm, not to argue about how well or badly the norm was written. I’ve heard that SOUPs are one of the favorite areas for auditors to look for defects in your implementation of IEC 62304 (the other one being risk analysis): be warned.

Indiana Jones and the Temple of Doom. Nice SOUP.

 

So how do we handle this mandatory and not-so-useful activity? Here are a few hints to maximize productivity.

  • You need a list of dependencies and their versions. In some programming environments (NuGet, Bower, npm…), there is a clear list of these dependencies and their versions (packages.config, bower.json, package.json…): try to generate the SOUP list from these files.

    Dependencies in a bower.json file
  • It’s a good idea to take advantage of this list to perform a thorough inventory of licenses and do what’s required to be clear. For example, many open source libraries require your software documentation (typically online help) to quote them. And maybe you’ll find one or two that are not free for commercial use and that need to be replaced – the sooner the better.
  • 62304 requires specifications for SOUPs, including performance criteria. This is a tricky business: some of your SOUPs are a lot more complex than your medical device (the base library of your favorite language, the OS): you can’t possibly retro-spec them entirely. My preferred approach is to document the requirements of the main behaviors of the library that you actually use – a projection of its features on your special use case.
  • You should always try to wrap the external dependencies of your code in, well, wrapper classes. This prevents the external namespace from creeping all over your code. It makes it easy to replace the library with another functionally similar implementation someday. In the context of SOUPs, the public interface of the wrapper makes very clear which part of the SOUP you use, and which part you don’t. This can serve as a boundary to limit your SOUP specification effort.

    The SOUP wrapper acts as a Facade and helps limit the specification and unit testing effort to features that are really used.
  • 62304 requires you to test these requirements. That’s something developers spontaneously do when choosing a library: make sure the library works, test a few edge cases. But you need to do it again every time you upgrade the library. For the latter reason, I strongly suggest unit tests that you can link to the specification (so that they end up in the traceability matrix) and use to test the mandatory performance requirements (for example by using the MaxTime attribute in NUnit). These unit tests will help you make sure the next version of the library works with very little extra effort.
  • When they are available, you could run the unit tests of the library itself and use their results as a proof of quality. You will still have to deal with writing your own requirements and linking them to the tests. In practice my teams often have had problems with libraries having a few failed tests related to features we didn’t use, which triggered cumbersome justification; in this case we just skipped the library unit tests.
  • You are required to perform a risk analysis of your SOUPs and add mitigation strategies as required. This is theoretically a good idea, but I’ve often found it very difficult to put in practice with general-purpose libraries, because their impact cannot be bound to a single feature. In some cases – databases, ORMs, mappers – almost any feature could potentially be compromised. As always with risk analysis, there is a temptation to assess every possible failure mode, which would lead to an overwhelming analysis that never gets finished. My advice here would be to trust your gut feeling and choose a handful of risks where the brainpower consumed performing risk analysis will be most valuable. There are fewer failure modes in SOUPs than in your code; use your time on the risks that really threaten patients. Don’t get stuck in an impossibly thorough analysis of everything that could possibly go wrong in things that are more complex than what you produce.
  • You are also required to compile a list of known bugs and assess the risk your software incurs because of them. It’s a demanding endeavor: in practice my projects tend to use dozens of libraries, some of them have hundreds of bugs, and others don’t publish bugs at all; when they do, it is often difficult to tell which versions of the library have the bug without testing them. I suggest you don’t waste your time with this before the end of the project because you are likely to upgrade your libraries until then and because more known bugs are likely to be closed with newer versions. The ROI of this activity seems very low. I would be glad if this requirement were stripped out of the next version, or adapted to be more cost-effective.
  • Operating Systems are a special kind of SOUP. Of course you don’t want to retro-specify and test what it took your vendor decades of work with thousands of developers. But there is an alternative approach. These days, a lot of emphasis has been put on cybersecurity for medical devices – and this is good, patient data is sacred and hackers are on the brink of cyberwar to get it. You must harden your OS – and maybe brand it along the way. My recommendation would be for you to specify, document and test the hardened OS and not the base OS. This way, the OS spec is really useful and has a realistic scope.
  • SOUPs of SOUPs. Developers often ask how they should handle SOUPs of SOUPs – the dependencies of the libraries themselves. Of course you can’t handle all dependencies recursively, you would be overwhelmed. Treat your direct dependencies; their own dependencies are an implementation detail. The tests that verify the requirements you wrote for what you use in the SOUP will exercise the lines of code of the SOUP dependencies that you actually use. Their possible failure would be ways to trigger the failure mode of the level 1 SOUP that you already considered in your risk analysis; you don’t need to analyze them separately.
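To make the wrapper and unit-test hints above more concrete, here is a minimal sketch in Python. It is an illustration under assumptions, not a prescribed implementation: the standard `json` module stands in for the SOUP, the class name `JsonSoupWrapper`, the requirement IDs `SOUP-REQ-001` and `SOUP-REQ-002`, and the 2-second budget are all hypothetical. The time check plays the role that the MaxTime attribute plays in NUnit.

```python
import json
import time


class JsonSoupWrapper:
    """Facade over the `json` module (standing in for a SOUP).

    The application only calls these two methods, so they define the
    boundary of what has to be specified and tested.
    """

    def serialize(self, record: dict) -> str:
        # sort_keys gives deterministic output, which keeps tests stable.
        return json.dumps(record, sort_keys=True)

    def parse(self, text: str) -> dict:
        return json.loads(text)


def check_round_trip(wrapper: JsonSoupWrapper) -> bool:
    """SOUP-REQ-001 (hypothetical ID): parse(serialize(x)) returns x."""
    record = {"patient_id": "P-001", "dose_mg": 12.5}
    return wrapper.parse(wrapper.serialize(record)) == record


def check_time_budget(wrapper: JsonSoupWrapper, max_seconds: float = 2.0) -> bool:
    """SOUP-REQ-002 (hypothetical ID): 1000 serializations fit the budget."""
    start = time.perf_counter()
    for i in range(1000):
        wrapper.serialize({"patient_id": f"P-{i}", "dose_mg": 12.5})
    return time.perf_counter() - start <= max_seconds


if __name__ == "__main__":
    wrapper = JsonSoupWrapper()
    # One line per requirement, easy to feed into a traceability matrix.
    print("SOUP-REQ-001", "PASS" if check_round_trip(wrapper) else "FAIL")
    print("SOUP-REQ-002", "PASS" if check_time_budget(wrapper) else "FAIL")
```

Rerunning these checks after each library upgrade is what keeps the re-verification effort small; swapping the library only means rewriting the two wrapper methods.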
Don’t reinvent the wheel. Source: John Kim, https://www.linkedin.com/pulse/how-reinvent-wheel-john-kim

 

Whatever the hardships in producing the required documentation, resist the temptation to code for yourself what others have done. Reinventing the wheel is a waste of time. Remember, your goal in agile development is customer feedback and delight, not library writing. The thrill of writing cutting-edge technical code is what I suspect entices many developers into rolling their own version of existing stuff, and not good project governance; this is an area where a responsible mindset – adult self-supervision – is of particular importance. Developing with an agile mindset implies going as fast as you can by removing waste; missed opportunities of good reuse are horrendous waste. Your immature code will always be buggier and more poorly designed than a library written by people dedicated to it, maybe working on it full-time for several years, and that has lived in production for several releases in many different contexts. In this regard I think that the writers of 62304 have done something very dangerous in discouraging people from using reliable libraries and creating an incentive to write brittle home-made code instead, which would have a very negative effect on overall medical device reliability and safety. A few months ago I stumbled upon a concrete example of this: a developer I know decided to write his own XML generation routine to avoid the lengthy, boring and absurd (according to him) process of documenting an off-the-shelf library. Don’t ever do this. SOUPs are good. Always use SOUPs when they make sense. Accept the pointless burden (automating as much as you can) and write the required doc.

Let me take this opportunity to deeply thank all the open source contributors in the world.

Some of my favorite libraries. Thank you guys! You rock!

Project pace regulation

Faith Dickey in the rain with high heels
Amazing Faith Dickey demonstrating what caution is

Good project pace results from two conflicting forces: market or financial pressure to go fast (typically relayed by management) and technical pressure to do things right (typically relayed by architects and developers).

This conflict is not symmetrical, for several reasons.

  • Management has organizational power and excellent communication skills – compared to developers who tend to emerge from the ideal world of their IDE after hours lost in abstraction in a kind of semi-conscious, almost hungover state, barely able to talk to human beings 🙂 And management is always interested in having more bang for the buck and in shipping earlier to build or strengthen a market position.
  • Technical issues make your product explode in the long run, not today, and as such are easy to sacrifice. By technical issues I mean not only rotting architecture, but also documentation and regulatory issues: taking them too lightly will not cause an earthquake today, but years later. Examples of disasters: technical bankruptcy (throwing away that entire unmaintainable code base and starting again from scratch), denial of authorization to sell your medical device by the regulator of a certain market, a patient getting killed using your product. And I have seen managers who, either because of incompetence or of sheer cynicism, are perfectly able to make decisions that have catastrophic long-term consequences for the sake of short-term political advantage – some people are amazingly able to lie their way out of any situation.

This conflict is asymmetrical but it doesn’t mean that technical people are always right. A friend of mine worked in a startup without adult supervision: developers happily spent three years refactoring the code without adding any new feature. No kidding. Three years of getting high on code. Gold plating can go very far. Another story from the trenches: I knew a software architect who convinced his management that building a tool to refactor the code was required (ugly Borland Delphi 6 was really unstable and unproductive) and he spent two years writing this tool alone instead of taking care of the codebase he was responsible for, in particular database concurrency issues that caused much trouble at the end of the project – he clearly worked for interests that were not those of the company, but of personal pleasure. The thing is that usually technical people don’t care much about their organization making money: they just want to enjoy coding well-done stuff and avoid becoming obsolete. If the company goes bankrupt or the project fails, they just move to another company where they shine with these new skills they honed instead of working on what was required to get that project done. Don’t get me wrong: I’m not saying that all developers are selfish and not interested in moving projects forward, but that some of them are, and that there exists a natural tendency to favor thrill over duty that must be contained.

Fast vs careful

 

A simple model to help find the balance

So how do we find the appropriate balance between conflicting forces? It’s not easy to find; it has to be managed. Over the years, I have come up with a model designed to understand and manage each force, and it has proved useful.

Real quality won’t happen by chance, or only thanks to the sacrifice of your teammates spontaneously working nights and weekends, but because time has been allocated for it (assuming the right processes and mindset are already in place). On the other end of the spectrum, there has to be a clear focus on delivering story points to avoid getting lost in gold plating. My preferred approach is to set up an iteration model where I allocate time for quality-related activities (stabilization phase size, refactoring proportion during construction phase) and set a story points goal for each iteration to keep everybody focused on delivering customer value.

  • Model for quality forces.
    • Duration of construction and stabilization phases. They might not be constant all project long: as the maintenance burden increases, stabilization phases may get longer. Remember, stabilization is time devoted to bug fixing and documentation. It’s quality time.
    • Proportion of construction time allocated to technical tasks. I’m not alluding to the mandatory technical tasks that deserve user stories of their own (such as sending error reports, writing an installer, getting that load test to pass), but to unpredictable refactorings. Be careful to think it through. Without dedicated time, refactoring won’t happen at the magnitude required for quality projects. This technical time is also the oxygen skilled technical people breathe: it helps you hire and retain them. Set that value to zero, and you will likely accumulate technical debt and scare away gifted technicians. On the other hand, set it to 100% and the project will stop moving forward. And once again, this value should not be constant: high at project start (say 50%) when frameworks and practices are not established, medium in the middle of the project (say 25%), low at project end when everybody struggles to finish that version (say 15%).
  • Model for production
    • I find it necessary to have a target of accepted story points for each iteration. This is the sum of user stories and technical stories (refactoring and anything related to inner quality that no user will see).
      • It is best measured as the average of the total story points of accepted user stories (accepted by testers, with a little allowance for a few minor bugs) over the last few iterations. This measurement is essential to feed the model with reality (total team velocity captures a great deal of variables that are impossible to model: estimate errors, organizational overhead, tooling problems, motivation, quality of the personnel, maintenance burden, architectural issues…).
      • Aligning the goal with the measurement is a delicate choice if the team produces less than expected: it can be invaluable to predict an accurate project end date, but maintaining expectations helps fight gold-plating tendencies and keeps commitments firm. In my experience, the target should be maintained just a little above average measured velocity – insufficient production must be fought by the team, not too easily accepted.
      • This target and the proportion of construction time devoted to refactoring make it easy to calculate the estimated user story points target and the budget for technical stories.
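The arithmetic behind this model can be sketched in a few lines of Python. This is only a sketch under assumptions: the function names, the three-iteration averaging window and the 5% stretch above measured velocity are hypothetical choices, and the technical proportion is applied directly to story points for simplicity.

```python
def rolling_velocity(accepted_points, window=3):
    """Average accepted story points over the last `window` iterations."""
    recent = accepted_points[-window:]
    if not recent:
        raise ValueError("need at least one completed iteration")
    return sum(recent) / len(recent)


def iteration_budget(target_points, technical_share):
    """Split an iteration target into (user_story_points, technical_points)."""
    if not 0.0 <= technical_share <= 1.0:
        raise ValueError("technical_share must be between 0 and 1")
    technical = target_points * technical_share
    return target_points - technical, technical


if __name__ == "__main__":
    history = [38, 42, 40]            # accepted points, made-up sample data
    velocity = rolling_velocity(history)
    target = velocity * 1.05          # hypothetical: aim slightly above average
    # Proportions quoted in the text: 50% at start, 25% mid-project, 15% at end.
    for phase, share in [("start", 0.50), ("middle", 0.25), ("end", 0.15)]:
        user, tech = iteration_budget(target, share)
        print(f"{phase}: user stories {user:.1f} pts, technical {tech:.1f} pts")
```

Keeping the measured velocity and the target as separate numbers makes the gap between them visible, which is the point of maintaining the target slightly above the average.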

 

Simple iteration model

Here’s a sample spreadsheet to help clarify my intent. I’m not saying it’s perfect: you should probably design yours; just consider mine a starting point. Simple iteration management model

 

A model for unknown team velocity

 

Measuring actual team velocity and feeding it into the model is very powerful. But sometimes it’s not practical:

  • When the team is just starting and has no history
  • When projects are long (several years in medical device development). In particular, the maintenance burden will likely get heavier and turnover will happen.
  • When there are lots of variations in team size. This is especially the case when management asks: “I want this project to be ready by date X, what do you need to make it happen?”. Tricky question: team dynamics are way more complex than a simple multiplication – such as thinking that doubling the headcount will double the throughput. Choose your answers wisely. This is a very strategic issue: if you can predict very early that a project is late and add the necessary staff soon enough so that it is profitable (people should stay at least one year to offset training costs), you will save the deadline. Do it too late, and haphazard late staffing efforts will finish off the team: “adding manpower to a late software project makes it later”.
  • When projects have not started yet. To approve a project, management needs to know its scope, its duration and its cost. The best way to do this with an agile mindset would be to build a product map to have a backlog and estimate user stories to have a project size, then estimate the velocity of the team to deduce time and cost.

 

So I have designed another model to estimate team velocity when there is no empirical data. Here are the main variables:

  • Real time
    • Average daily velocity per developer. The number of ideal days in a real day. Could be 100% if you worked alone in a monastery with absolute concentration, no task switching, and perfect estimation skills. In reality, there are useful and useless meetings, coffee breaks, errors of all kinds. I have made many measurements and find that people are often around 70% in this respect.
    • Working day factor. You have to take into account that a week day is not a working day: people are sick, have holidays, get trained. In my current environment (France, where holidays are plentiful and sacred), people work around 220 days a year. That’s not 52*5=260.
  • Real workforce. Don’t just count people. Take into account:
    • Turnover. I usually count that one person out of ten leaves every year, and that it takes six months to replace them (so I lose 5% of the workforce). In other environments, you might have a higher attrition rate or shorter recruitment delays.
    • Training. Newbies are not as productive as historical team members. Beginners are often not as productive as principal engineers. There has to be some ramp-up in the workforce when someone arrives (50% productivity the first month, complete productivity after 3 to 6 months depending on experience and the complexity of the work environment).
    • Communication and management overhead. My rule of thumb: every new person in the team eats 20% of the time of the equivalent of a person. One person is as productive as one. Two people are as productive as 1.8. Six people are as productive as 5. This factor is very important when you start computing the effect on the deadline of various staffing scenarios. For very big teams, this factor might be higher.
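As a rough illustration, the variables above can be combined as follows. This is a sketch, not the actual spreadsheet: the names are hypothetical, multiplying the factors together is an assumption about how they interact, and the training ramp-up is deliberately left out to keep the example short. The default values are the ones quoted in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TeamAssumptions:
    focus_factor: float = 0.70               # ideal days per real day
    working_days_per_year: int = 220         # not 52 * 5 = 260
    turnover_loss: float = 0.05              # 1 in 10 leaves, 6 months to replace
    overhead_per_extra_person: float = 0.20  # communication and management


def effective_headcount(headcount: int, overhead: float) -> float:
    """Each person beyond the first costs `overhead` of one person's time."""
    if headcount <= 0:
        return 0.0
    return headcount - overhead * (headcount - 1)


def yearly_ideal_days(headcount: int, a: TeamAssumptions = TeamAssumptions()) -> float:
    """Estimated ideal days delivered by the team in one year."""
    people = effective_headcount(headcount, a.overhead_per_extra_person)
    people *= 1.0 - a.turnover_loss
    return people * a.working_days_per_year * a.focus_factor


if __name__ == "__main__":
    # Compare staffing scenarios: throughput grows slower than headcount.
    for n in (1, 2, 6, 12):
        print(f"{n:2d} developers: {yearly_ideal_days(n):7.1f} ideal days per year")
```

With these defaults, two people count as 1.8 and six as 5, matching the rule of thumb above. Dividing the backlog size (in ideal days) by the yearly figure gives a first estimate of project duration for each staffing scenario.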

 


Here’s the sample spreadsheet: Team velocity estimation model

 

This might sound a little complicated and over-engineered (don’t complain, I spared you the spreadsheet where I mix both models with release burnup graphs and macros to generate user story cards 🙂 – you won’t need it if you have decent agile tooling, which was not my case). But having sound predictions of what will happen in the future is a prerequisite to act upon that future. When budgets are scary and the deadline is years away, a spreadsheet with many parameters and experimental data will prove a good way to negotiate with top management. And once people realize that what you predicted one year ago proved true, they will listen very carefully to what you say will happen in two years, and maybe grant those two additional developers you need. This might also help you to slow down if quality gets out of hand: the automatic adjustment feature of the two-phase iteration (automatic stabilization phase extension which leads to decreased team velocity in the long run) will help justify why team throughput decreases – quality is just a priority.

Putting bugs under control

What’s that bug – Angela DiTerlizzi & Brendan Wenzel

Bugs are bad. And especially for software engaged in serious business – such as saving lives. A few reasons why:

  • Quality medical devices have few known bugs, and of low severity. Thus, having bugs prevents you from shipping. You need to be able to ship at the end of every iteration to your customers (to get feedback), or more realistically, considering integration and product registration, to a system integration team.
  • Bug backlogs are inventories and as such are a form of waste. Lean Software Development advocates having low inventories since a bug left open will incur additional costs:
    • Bugs in the software might provoke very complex system bugs when mixed with hardware and bioware. These bugs take an awful lot of time to investigate and are often blocking the entire project plan. You don’t want this to happen with a bug known to the software team that could have been fixed a long time ago.
    • Eliciting workarounds and teaching them (by documentation or face-to-face) takes time.
    • It is always more expensive to fix bugs in the future, when knowledge fades away in people’s heads or vanishes when they go.
    • Bug backlog engineering (prioritization, endless reviews, risk analysis…) typically takes the time of several experts at the same time. A tremendous waste of energy when the bug backlog is large.
    • Bug duplicates imply wasteful investigations. A closed bug has no duplicates.
  • Bugs in medical devices have the potential to do harm to people. As such they must be considered with horror and dealt with accordingly. And the best way is to fix them ASAP. Don’t give them a chance to slip through your processes.
  • I’ve seen projects with a huge bug count (close to 1000) a couple of times. You know what? They never recovered. They stayed at 1000 bug count forever. Maybe because of the cost of all this waste, maybe because it spread a sense of bad quality and failure in everybody’s hearts.
Bug count graph of a real-life project. After a quick phase of exponential growth, bug count stayed in the six hundreds. In spite of two heroic campaigns of bug fixing, the end was inevitable: the flat part of the curve on the right is the clinical death of the project (brutal end, no production, millions lost).

 

Moral of the story: never get high on bug count, or your feet may never touch the ground again.

Guy swallowing bugs
What happens when you let bugs free…

Conclusion: a good bug is a bug killed. Bug count should be close to zero.

I won’t write about bug detection here, but only about what you do with bugs once you know about them. Let’s assume you already have a good testing system in place.

 

Bug count threshold and two-phase iteration

So how do we actually manage known bug count? Simple. Set a threshold. Respect it.

  • Set the threshold at the start of the project. Write it down in your Project Plan and have everybody sign it. You’ll still be able to change it, but it’s motivating to give it some official existence.
  • Recommended values for the threshold:
    • More than zero (or you might seriously delay shipping for minor issues)
    • Fewer than a couple of dozen. The maximum threshold will depend on the size of the team and its ability to fix bugs. I suggest the max bug threshold doesn’t exceed what your team is able to fix in a few days if it’s its sole focus.
    • Split limits by bug severity.
    • For example, typical thresholds I use: 0 blocking bugs, 3 majors, 20 minors.
  • Bug count evolution is easier to understand inside the two-phase iteration framework. During construction, you take risks, you build, you refactor: bug count goes up. During stabilization, you stop taking risks, you fix bugs: bug count goes down. There will be a delay due to the lengthy manual testing processes: you will discover the real extent of the bug count some time after the bugs are introduced in your code.
Iteration 0 regulation
Total bug count (orange curve) is below the threshold at the end of iteration 0 (inside the green circle)

 

  • What’s important is what you do when the threshold is not respected. My advice: don’t deliver. Hold the version back until more bugs are fixed. You can’t release into the wild a version that will waste precious integration time or harm patients. You would be ashamed of it. Take the blame for the delay. Put bug count in your information radiators so that everybody gets used to the fact that it’s important. When you have trouble respecting the threshold, talk about it around you and in your team retrospectives. It’s serious. Find solutions.

 

Iteration 1 regulation
Bug threshold is not respected at the end of stabilization of iteration 1 (red circle). An extra stabilization is added until the quality criterion is met (new green circle).

 

  • What’s also important is what happens to the iteration following the iteration that went wrong. Here’s where the two-phase iteration comes in handy. If iteration N has too many bugs and if Stabilization phase N takes 3 more days, Construction Phase N+1 will be 3 days shorter. It means that a few user stories will have to be removed from iteration N+1. It also means that since iteration N+1 is smaller, it should be a little easier to get right, so Stabilization N+1 should run more smoothly. There is an automatic short-term regulation effect in the two-phase iteration framework.

 

Iteration 2 regulation
Iteration 2 has a shorter construction, with fewer features, refactorings and bug creation than usual.

 

 

  • In the long run, if you encounter this situation on a regular basis, consider increasing Stabilization phase proportion. That is the beauty of the two-phase iteration: it also embeds a long-term regulation system. Take an extreme example: 1 day of construction, 29 days of stabilization. Plenty of time to fix bugs and get the doc right, no? It should not be a problem. This means that there exists a good proportion between construction and stabilization phase durations that will allow you to finish iterations with bugs below the threshold and documentation in good shape. Your job is to find that proportion.
  • This regulation system is vital to any project. What do you do when your car engine gets hot and spits steam? You slow down. The same with a team. If a project pace is so fast that quality gets out of control, you must slow down. Remember the agile belief that quality is not negotiable? Now show your true colors. Negotiate time.
  • By the way, it is quite logical for a project to slow down after a while, as maintenance effort increases. You might expect an increase in Stabilization size over time.
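The threshold rule and the short-term regulation described above boil down to two small computations, sketched here in Python. The thresholds are the typical values quoted earlier (0 blocking, 3 major, 20 minor); the function names and the dictionary layout are hypothetical.

```python
# Typical thresholds from the text, split by severity.
THRESHOLDS = {"blocking": 0, "major": 3, "minor": 20}


def can_deliver(open_bugs, thresholds=THRESHOLDS):
    """True when every severity is at or below its threshold."""
    return all(open_bugs.get(severity, 0) <= limit
               for severity, limit in thresholds.items())


def next_construction_days(planned_days, stabilization_overrun_days):
    """Construction N+1 shrinks by the days Stabilization N ran over."""
    return max(0, planned_days - stabilization_overrun_days)


if __name__ == "__main__":
    end_of_stabilization = {"blocking": 0, "major": 5, "minor": 12}
    if not can_deliver(end_of_stabilization):
        # Hold the version back and extend stabilization, here by 3 days.
        print("Do not deliver; extend stabilization")
        print("Next construction:", next_construction_days(15, 3), "days")
```

Running the gate automatically at the end of each stabilization phase, and publishing its result on the information radiators, keeps the decision to hold a version back from depending on who shouts loudest that day.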

I’ve used these techniques on projects of a respectable size (several years, several dozen people, several thousand bugs created and fixed) and they have proved to work well: known bug count never stayed above the threshold for long.

Handling quality-related records in practice

Agile medical device software developers must solve a contradiction between two seemingly opposite philosophies:

  • From an agile perspective: go fast, experiment, deliver frequently, embrace change
  • From a regulatory perspective: produce auditable documents, double-check everything, make plans.

These philosophies have indeed been opposed very often (Apple and Google complaining that medical regulations slow down innovation, auditors being very suspicious of early agile projects). See AAMI TIR 45 for an enlightening discussion on how to reconcile them.

The rest of this post focuses on practical ways to cope with quality-related records so you don’t waste your energy.

 

Automation of the production of recurrent documents


There are two categories of quality-related records:

  • One-shot documents, or documents only requiring minor updates (management plan, quality assurance plan, maintenance plan…). Do them once and for all, early in the project.
  • Recurrent documents (specs, test plans, test reports, design documents, traceability matrices…).

Recurrent documents are strategic since their repetition (especially in an iterative development process) will multiply the load required to produce them. In developed countries labor is expensive and cannot be wasted. What can you automate?


  • Use a document lifecycle management tool for handling validation processes and versioning. Such tools take care of ensuring proper signatures, notifying interested people, and most importantly act as a safe to protect your documents for the crazy time required by regulations (I heard 7 years after the last device is sold, after typical project times of several years: we’re talking in decades!). Odd as it may seem, I stumbled upon a project in 2014 where project documents and procedures were still manually signed. That’s a guaranteed recipe for losing documents and having holes in your validation process that exhilarated auditors will love to spot.
  • Use a spec and test tool to handle your requirements (I’ve been using Doors quite successfully for example, but other good tools exist). Benefits:
    • Provide a platform for further automation.
    • Factor out repetitive document introduction, definitions…
    • Handle traceability
    • Share common requirements, risks, risk mitigation measures, test plans across projects. Especially useful when you share code.
  • One of my teams wrote a tool to gather info from Doors (requirements, risks, risk mitigation measures, test plans, executed test plans) and from the software factory (automated developer test results, automated GUI tests, automated stress and robustness tests) to generate a full traceability matrix. This matrix is required by regulations (to make sure every requirement has been tested), but it’s also very useful to the team. Only when a requirement has been successfully tested can I be sure that its implementation is done. So this matrix provides good metrics to analyze project progress. It helps to pay extra attention to risk mitigation measures: by identifying them as a special kind of requirement, it is easy to track how many are not yet implemented, or have failing tests. Automation is the only way to go with traceability matrices when there are thousands of requirements, manual test cases and automated tests.
  • We have a project (not yet fulfilled) of generating a list of dependencies and versions by analyzing the NuGet packages.config files.
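For the dependency list, a minimal sketch of what such a generator could look like is shown below, in Python. It is not the tool mentioned above (which was never finished); it only assumes the usual NuGet `packages.config` layout, with one `<package id="…" version="…"/>` element per dependency. The function name and the sample entries are illustrative.

```python
import xml.etree.ElementTree as ET

# Illustrative sample in the packages.config layout.
SAMPLE = """<packages>
  <package id="Newtonsoft.Json" version="9.0.1" targetFramework="net45" />
  <package id="NUnit" version="3.4.1" targetFramework="net45" />
</packages>"""


def soup_list(packages_config_xml):
    """Return (id, version) pairs sorted by package id, case-insensitively."""
    root = ET.fromstring(packages_config_xml)
    entries = [(pkg.get("id"), pkg.get("version"))
               for pkg in root.iter("package")]
    return sorted(entries, key=lambda entry: entry[0].lower())


if __name__ == "__main__":
    # One tab-separated line per SOUP, ready to paste into the SOUP document.
    for name, version in soup_list(SAMPLE):
        print(f"{name}\t{version}")
```

A real project would read every `packages.config` file in the solution and merge the results, flagging packages that appear with two different versions.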

 

Document lifecycle

In an agile team, documents are long-lived and evolve constantly. In fact, I believe a document should be considered correct only once its data has been used by the following process (e.g. a spec is correct once it has been implemented, an architecture document is correct once load and stress tests pass, a test plan is correct once it has been executed). It’s a fact of life. So don’t get stuck waiting for document approval in the general case. Instead, work on everything in parallel and have people collaborate – they master how to optimize complex, fine-grained interactions better than any process can.

[Figure: yeast life cycle, a metaphor for real-life document workflow]

Write documents at the time when they are useful. I’ve seen projects blissfully ignorant of regulations until the end, when the documentation required by regulations is hastily written. This is nonsense. Minimizing the documentation effort can work against minimizing the overall project effort. Quality-related documents are often very useful if written properly, at the right time.


  • Write the documents that frame the entire program (such as high-level marketing needs) very early. They are likely to generate a lot of heat (political struggles) in the enterprise and take a long time to stabilize (when someone has won the battle). It’s very risky to start developing before they stabilize, but it’s a good time for feasibility studies, finding and tuning the right process, choosing tools and languages, writing foundation frameworks, and hiring teams.
  • Think about architecture and risk analysis at the very beginning, when things are easy to change. Write it down in documents to set a clear vision that may last for years. These documents will be read by every newcomer joining the team, saving the architect days of training, which is more time than it takes to write and maintain the docs.
  • Coding guidelines (hopefully enforced via tools) are to be enacted at the very beginning of implementation – if not, you will have to refactor the existing codebase to abide by them.
  • Specifications are to be written before coding. It’s a lot cheaper to change text than code. If you can’t write a sentence explaining what the software is supposed to do, it means you are not sure yet. Developers often think they know what the program should do – except they lack intimate client knowledge and perspective.
  • Manual test plans are to be written before they are executed. Free testing is a powerful tool, but that is another story.

 

Document approval


  • As far as I understand, for most documents, the minimal approval process is one author, and one other person playing the roles of reviewer and approver. That should be the default approval process to minimize waste. I’ve seen documents with more than a dozen people involved in the review process. Guess what? Everybody feels it’s useless to review the document because the others will spot errors. The review ends up being shallower than with a single reviewer who is fully responsible and committed.
  • The most efficient reviewer is the person who uses the document as input: he or she has to read it carefully anyway, and has the skills to really understand it. That person should be the reviewer and approver of choice.
  • As documents change until the work is done, validate documents only once the job is done – the end of the iteration (notable exception: project management plans, high-level marketing needs).
  • Validating documents once or twice a year should be sufficient (provided you explain it in your project management plan) if your validation process is costly (for example if you have a manual, paper-based validation process, or if your document lifecycle management tool has poor ergonomics and performance). You can’t waste time validating them at every iteration.

Regulatory Quality vs Product Quality

I have found it very valuable over the years to make a clear distinction between Regulatory Quality and Product Quality. Regulatory Quality means you can hand authorities a documentation package that proves you have followed their norms and standards. Product Quality means users like the features and ergonomics, and that there are very few bugs, and especially none in any area that can harm somebody.

[Figure: Quality yin yang]

There is no equivalence between the two concepts. While Regulatory Quality can have a very positive effect on Product Quality (IEC 62366 will definitely help to define sound ergonomics, ISO 14971 will help keep risks under control), it is definitely possible to issue a very bad product (ugly, full of bugs, poorly architected, with silly features) but still hastily write a nice retro-documentation that will fit the bill. Conversely, many companies write excellent software in the consumer area (websites, video games, operating systems) without using our norms. Have you ever heard a project team outside the medical device world say: “Hey, I’ve been using 62304, it’s incredible how much more productive we are, how many fewer defects we find, it’s awesome, check this out!”? I’ve always found it disturbing that practices that seem so crucial to me for good software (refactoring, automated testing, automated coding standards, load tests) are not emphasized in norms, or worse, blissfully ignored. Maybe it’s a good thing: there are legacy projects out there that certainly couldn’t use these techniques, and norms are handicapped by the least common denominator syndrome. But this demonstrates that Regulatory Quality and Product Quality are two different things.

Both areas are judged in very different ways. Auditors will generally not read the code or run the app, because that would make their judgment of a project subjective, which is not acceptable: only the compliance of audit trail documentation with a norm is an objective criterion. But users are not objective. Users have a feeling about your product. They will hate every too-well-known bug in their guts, and they will complain about that impenetrable screen to every fellow user.

So my recommendation would be to treat them separately. Provide auditors with the documents they need in the way they like. But don’t stop there. Sure, medical device norms and processes will definitely help Product Quality, especially if you perform these activities early, honestly, and with the right amount of energy. But they are not enough. Norms and standards take years to reach consensus, be validated, and be widely implemented. The software world changes much more rapidly. Every year, new practices, new languages and new architectural styles emerge everywhere. You should stay tuned to what’s happening out there and try to apply it in our regulated world.

[Figure: satellite view of Silicon Valley]

A nice advantage of splitting these concerns is that it maximizes efficiency. Regulatory Quality implies the heavy burden of document templates, approval processes, tool validations, and many other activities that are meant for the auditor rather than for the team. That burden is a strong incentive NOT to experiment, take risks, fail, or start something new. So what’s in the realm of the audit trail should be kept to a minimum. And there should be another underground, agile world where lots of good practices are used to make good software. The downside is that the auditor will never know about all the good things we do that he or she might like. But if we’ve done a good job preparing our quality-related records, he or she will be happy; if not, you have a problem.

Handling regulations

The medical industry is heavily regulated. That’s because bugs that kill people are to be handled with definitely more care than bugs that force a web page to reload. But guess what? That’s good for established manufacturers: it is a barrier to entry for cheap and fast newcomers. Stop complaining about regulations, adapt to them, take advantage of them.

Carefully study regulations, norms and standards. They change all the time. New countries write their own (ANVISA, CFDA…). Worldwide manufacturers must infer from them a meta-regulation that bundles the worst (i.e., the most stringent) of them all and that is relatively unstable, because it changes whenever any underlying regulation changes. Usually, organizations set up RA (Regulatory Affairs) teams for that purpose.

But don’t let specialized quality teams write procedures. Procedures must be written by the people who execute them (with proper RA supervision) if you want the interpretation of norms to be productive (fast to execute, lean, no waste) and adaptive (changing frequently). It’s easy to ask for a stupid, lengthy, repetitive task when you’re not going to do it yourself.


A six-month approval procedure for procedure changes, with 10 senior managers involved, will definitely discourage change. The procedure for writing procedures must enable evolution and empowerment.

Challenge regulations. Sometimes they can be interpreted in a variety of ways.

  • Take for example NF EN 62304, which presents software development activities in numerical order, subtly implying you should follow the evil waterfall model. But that is not explicitly written. It took AAMI TIR45 to explicitly legitimize Agile.
  • Regulations never talk about the amount of work to be done. 2 pages or 200 for a document? Challenge your impulse to be thorough. From what I’ve heard, auditors get mad when something is totally missing, but are open to negotiation when it is there but small. You can be lean by providing the bare minimum if you don’t find the activity really useful, but have a rationale ready to justify your priorities.
  • Challenge RA people. When they say developers should add a best practice because of “regulations”, ask to read the text of the article of the regulation that really puts a constraint. Always come back to the text: it’s the core principle, it’s the real constraint. It’s too easy to invoke a hazy “regulations” to justify any excessive demand. If it’s not mandatory, if we’re talking about best practices, then it must be decided by the development team. Best practices are only known by people who practice. Just to bring the point home: whenever something brought up by “regulations” doesn’t feel right, always come back to the text, and challenge its interpretation.


Remember, regulators don’t want you to drown in paperwork: they want medical devices to be safe and, incidentally, their design to be auditable. They are reasonable people. If something seems completely silly, there must be a more sensible interpretation.

One useful technique my teams use is to write regulations as a spec, and trace its implementation to our specs and risk mitigation measures. It works well for technical guides such as CLSI AUTO9 and CLSI AUTO11. Going fully traceable by writing procedures as specs has seemed a little excessive to us, but why not? The good thing about this technique is that you can challenge any legal obligation, and it can help you in case of an audit by capturing your decisions about how each regulation is implemented, and by showing how organized you are about them.
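
To make the idea concrete, here is a small sketch of tracing regulation clauses to requirements and test results. It is an illustration under assumptions, not the tool my teams built: the clause and requirement identifiers are hypothetical, test results are reduced to (requirement id, passed) pairs, and a requirement counts as done only when at least one test exists and all its tests pass.

```python
from dataclasses import dataclass, field


@dataclass
class Requirement:
    req_id: str
    implements: list = field(default_factory=list)  # clause ids this requirement answers
    is_risk_mitigation: bool = False


def requirement_status(req_id, test_results):
    """Return 'untested', 'failed' or 'passed' for one requirement."""
    outcomes = [passed for rid, passed in test_results if rid == req_id]
    if not outcomes:
        return "untested"
    return "passed" if all(outcomes) else "failed"


def trace_clauses(clause_ids, requirements, test_results):
    """Map each clause id to [(req_id, status)]; an empty list means no coverage."""
    matrix = {cid: [] for cid in clause_ids}
    for req in requirements:
        status = requirement_status(req.req_id, test_results)
        for cid in req.implements:
            matrix.setdefault(cid, []).append((req.req_id, status))
    return matrix


def open_clauses(matrix):
    """Clauses with no requirement, or with a requirement that is not yet passed."""
    return sorted(cid for cid, rows in matrix.items()
                  if not rows or any(status != "passed" for _, status in rows))


def pending_mitigations(requirements, test_results):
    """Risk mitigation measures that are not yet proven by passing tests."""
    return [r.req_id for r in requirements
            if r.is_risk_mitigation
            and requirement_status(r.req_id, test_results) != "passed"]
```

A clause that ends up in `open_clauses` is either a gap or a conscious decision not to implement it; in the second case, the rationale is exactly what is worth recording for the auditor.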