Medical devices software testing: automated tests that require human supervision

This article is part of a series of posts about medical devices software testing.


Some tests are mostly automated but can’t be run in the build, for two reasons:

  • Somebody needs to start them and interpret the results
  • They need to run on the target computer and OS

They are the purple ones in the classification of the introduction:

Test classification - white - tests that require supervision.png

Haltéprophile.jpg
Source: http://invictus-physicalcoaching.com/2014/02/analyse-descriptive-des-mouvements-halterophiles-33/

Load tests. Load tests are typically a special kind of integration test. The idea is to ensure that your app meets its performance criteria at maximum capacity by testing it in conditions an order of magnitude more demanding. Both performance and capacity criteria should be precisely defined in the specifications and tied to a real user need, otherwise the temptation to lower the bar will be great. These tests are difficult to get to pass and as such deserve a user story of their own. But they are excellent bug finders: you will encounter those rare conditions that would otherwise happen only at client sites. They are a must-have. While I recommend running some load tests in the build (which allows you to break the build as soon as a toxic commit kills performance, and to easily monitor the duration of these tests over time), at some point you must run them on the target hardware and OS to guarantee that the performance objectives will be met in production conditions.
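
To make the idea concrete, here is a minimal sketch of such a load test in Python. Everything in it is an assumption made for illustration: process_order stands in for the real operation under test, and the capacity and latency figures are invented where a real project would take them from its specifications.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical figures: a real project would take them from its specifications.
EXPECTED_PEAK_REQUESTS = 20   # assumed maximum capacity in production
LOAD_FACTOR = 10              # "an order of magnitude more demanding"
MAX_P95_SECONDS = 0.5         # assumed performance criterion


def process_order(order_id: int) -> int:
    """Stand-in for the operation under test (hypothetical)."""
    time.sleep(0.001)  # simulate a little work
    return order_id * 2


def timed_call(order_id: int) -> float:
    """Run one operation and return its duration in seconds."""
    start = time.perf_counter()
    process_order(order_id)
    return time.perf_counter() - start


def run_load_test(workers: int = 20) -> dict:
    """Fire LOAD_FACTOR times the expected peak and summarise the latencies."""
    total = EXPECTED_PEAK_REQUESTS * LOAD_FACTOR
    with ThreadPoolExecutor(max_workers=workers) as pool:
        durations = list(pool.map(timed_call, range(total)))
    p95 = statistics.quantiles(durations, n=20)[-1]  # 95th percentile
    return {"requests": total, "p95": p95, "passed": p95 <= MAX_P95_SECONDS}


if __name__ == "__main__":
    print(run_load_test())
```

Reporting a percentile rather than an average keeps the rare slow calls visible, and those are precisely the ones this kind of test is meant to surface.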

Stamina tests. This time, the device operates at normal capacity, but for a long time. For example, I work on a medical device that is meant to run with one shutdown per week: it has to be tested for one week of continuous work (or more). Other kinds of interesting and surprising bugs can be spotted here, such as tiny memory leaks or hidden timeouts – what happens if the app is left open without activity for more than twenty-four hours? A very interesting way of performing stamina testing is to capture real production activity and to repeat it – in which case it also serves as regression testing.
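
The capture-and-replay approach can be sketched as follows. This is only an illustration under assumptions: handle_request is a hypothetical stand-in for the system under test, and the two captured events are made up. A real capture would come from production logs and would be replayed for days rather than for a few loops.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CapturedEvent:
    """One request recorded in production, with the response observed then."""
    request: str
    expected_response: str


def handle_request(request: str) -> str:
    """Stand-in for the system under test (hypothetical)."""
    return request.upper()


def replay(events, cycles=1):
    """Replay the capture `cycles` times and return the mismatches found."""
    mismatches = []
    for cycle in range(cycles):
        for index, event in enumerate(events):
            actual = handle_request(event.request)
            if actual != event.expected_response:
                mismatches.append((cycle, index, event.request, actual))
    return mismatches


# Made-up capture, for illustration only.
capture = [
    CapturedEvent("order:cbc", "ORDER:CBC"),
    CapturedEvent("result:ok", "RESULT:OK"),
]
```

Because each replayed request is compared with the response recorded at the time, any mismatch points to a behaviour change, which is what makes the same run double as a regression test.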

Automated tests written by testers. Automated testing tools such as Ranorex or TestComplete allow testers to set up tests that exercise the GUI as an end user would and check what is displayed. As such, they are fairly broad integration tests. They can be run in the build, or on the same computer as the medical device to make them even broader (which is why I’ve classified them here). The easiest way for testers to write them is to record manual test sessions. But in the long run we have found it more productive to write large portions of them directly in code, which helps to reuse them across projects. We have also found that such automated GUI tests can be very handy to automate the setup of manual tests (putting the SUT in a certain state before starting the test). Let’s face it: the entry cost is real. You need to study the market and choose the right tool (vendor lock-in is very likely given the cost of your tests and their coupling to the specifics of the tool), get acquainted with the tool, create a team responsible for these automated UI tests, set up good communication channels between this team and the dev teams to avoid breaking tests when controls are renamed or refactored, tweak your UI to help the tool find controls, integrate everything into the software factory, link the tests to requirement numbers and get their results into the traceability matrix, and budget for the maintenance costs of these tests. But the rewards are great: imagine you could run your whole manual test plan in one night on your servers, imagine you could transform wasteful test execution time into an investment (repeatable test procedures), imagine you could thus free up enough time for your testers to perform free testing – wouldn’t that be a lot more productive and interesting? Wouldn’t you find more bugs?
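
Linking tests to requirement numbers does not have to depend on the GUI tool. The sketch below shows one possible way to do it in plain Python. It is not how Ranorex or TestComplete handle traceability: the covers decorator, the requirement numbers and the two placeholder tests are all hypothetical.

```python
from collections import defaultdict

_REGISTRY = []  # every test registered through the decorator below


def covers(*requirement_ids):
    """Tag a test function with the requirement numbers it verifies."""
    def decorator(test_func):
        test_func.requirement_ids = requirement_ids
        _REGISTRY.append(test_func)
        return test_func
    return decorator


# Placeholder bodies: real tests would drive the GUI and check the display.
@covers("REQ-012")
def test_login_screen_shows_operator_name():
    assert "operator".upper() == "OPERATOR"


@covers("REQ-012", "REQ-031")
def test_result_screen_flags_out_of_range_value():
    assert 12.5 > 10.0


def build_traceability_matrix():
    """Run every registered test and group pass/fail results by requirement."""
    matrix = defaultdict(dict)
    for test_func in _REGISTRY:
        try:
            test_func()
            outcome = "pass"
        except AssertionError:
            outcome = "fail"
        for requirement_id in test_func.requirement_ids:
            matrix[requirement_id][test_func.__name__] = outcome
    return dict(matrix)
```

Keeping the requirement numbers next to the test code means a renamed or deleted test shows up as a hole in the matrix instead of silently disappearing.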

TestComplete test editor

 

Tests with network peers. Modern medical devices can no longer be off the grid: medical and laboratory staff need them to receive orders and send results through the healthcare organization network (with protocols such as HL7, ASTM, DICOM…), they may interact with non-embedded software that is easier to develop and maintain (Data Managers, for example), and they might interact with an IoT solution that helps keep them in optimal operating conditions and adds remote services (such as allowing patients to inspect their medical records, or applying artificial intelligence techniques to provide clinical decision support). All these wonderful features relying on networks need to be carefully tested. While low-level interactions (such as protocol handling or basic conversations between servers) should be tested in the build to provide fast feedback to developers and extensive edge case testing, at some point the integration must be tested in more realistic conditions: real hardened operating systems, real network cards, real cables or Wi-Fi, real firewalls and NATs, real DNS servers, real intrusion detection systems, real production server hardware… I’ve classified network peer testing in the semi-automated category since you can typically script part of the test scenario, but you will still need some degree of human intervention to set the tests up, start them and analyze the results (until the day the healthcare industry has fully moved to continuous delivery, but we still have a long way to go when embedded systems are involved).
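
As an example of the low-level protocol handling that belongs in the build, here is a small sketch of MLLP framing, the envelope commonly used to carry HL7 v2 messages over TCP: a start byte 0x0B, the message, then 0x1C 0x0D. The function names are my own for illustration, and a real receiver would also need timeouts and acknowledgements.

```python
START_BLOCK = b"\x0b"     # <VT>, opens an MLLP frame
END_BLOCK = b"\x1c\x0d"   # <FS><CR>, closes an MLLP frame


def frame(message: bytes) -> bytes:
    """Wrap one HL7 v2 message in an MLLP envelope."""
    return START_BLOCK + message + END_BLOCK


def extract_messages(buffer: bytes):
    """Split a receive buffer into complete messages plus the unread remainder."""
    messages = []
    while True:
        start = buffer.find(START_BLOCK)
        if start == -1:
            return messages, b""  # no frame start at all: discard the noise
        end = buffer.find(END_BLOCK, start + 1)
        if end == -1:
            return messages, buffer[start:]  # incomplete frame: keep it for later
        messages.append(buffer[start + 1:end])
        buffer = buffer[end + len(END_BLOCK):]
```

Edge cases such as two messages arriving in one read, or one message split across two reads, are cheap to cover at this level and painful to reproduce with real cables and firewalls.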

Security testing. Network peers lead us to cybersecurity. Like anything else, security doesn’t work until it has been tested. There is much more to security than intrusion testing, but the latter at least guarantees that your security measures were able to repel one professional attacker. It doesn’t guarantee that your device can’t be hacked, but it shows that it’s not that easy. To perform the attack properly, the auditor will need access to the device in real conditions (real computer, real hardened OS with real vulnerabilities). Trying to hack a system involves the expert use of tools (such as Metasploit) that automate attacks or entire catalogs of attacks and gather results to show vulnerabilities to the dev teams. I wouldn’t run such tools in a build system since they can easily compromise your IT infrastructure – they should be run only in a controlled, totally disconnected environment.


Highlights from the BIOMEDevice Boston 2016 conference

I attended the BIOMEDevice conference on the 13th and 14th of April 2016. The conference was packed with suppliers from the medical device space, especially from Massachusetts. Two sessions especially struck a chord with me, and I thought I would drop my notes here so that everybody can get a feel for what was said:

  • Patient Privacy & Data Security in the Cloud Communication Age
  • Winning Over the Hospital Value Analysis Committees

BIOMEDevice 2016 conference hall

Patient Privacy & Data Security in the Cloud Communication Age

  • Our technology is advancing faster than we can protect it. How can we keep up with the cloud communication age and build sustainable data protection?
  • Understanding FDA’s evolving guidelines and standards to address cyber security
  • How is HIPAA playing an increasingly pervasive role in health data management?
  • Cloud-enabled utilities and solutions – what are the pros, cons, and security risks of storing data in the cloud?
  • Advances in safely transmitting data across various healthcare applications and protecting data from cyber attacks

Michael McNeil, Global Product & Security Services Officer, PHILIPS HEALTHCARE

 

Philips has a HealthSuite IoT architecture based on AWS (EC2, S3, Glacier, Lambda, SNS).

http://www.usa.philips.com/healthcare/innovation/about-health-suite

They have a way to make sure data does not leave a country’s borders where that is forbidden.

Industry challenges:

  • Patient safety (ethical hackers have demonstrated threats)
  • Data integrity and availability – required by care
  • Legal and regulatory obligations
  • Protecting intellectual property – especially when expanding into emerging markets

 

Best practices:

  • Design security at every stage of development
  • Take advantage of well-known techniques (encryption, salting, rate limiting)
  • Train employees
  • Integrate security by design. Security built into the development process.
  • External security testing and assessment.
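
To illustrate two of the well-known techniques mentioned above, here is a minimal sketch of salted password hashing and login rate limiting using only the Python standard library. It is not taken from the talk: the iteration count, the limits and the class name are assumptions to adapt to each product.

```python
import hashlib
import hmac
import secrets
import time

ITERATIONS = 200_000  # assumed work factor; tune it for the target hardware


def hash_password(password: str, salt: bytes = None):
    """Return (salt, digest); a fresh random salt is drawn for each password."""
    salt = salt if salt is not None else secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the digest and compare it in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)


class LoginRateLimiter:
    """Allow at most `max_attempts` per `window` seconds for each user."""

    def __init__(self, max_attempts=5, window=60.0, clock=time.monotonic):
        self.max_attempts, self.window, self.clock = max_attempts, window, clock
        self.attempts = {}

    def allow(self, user: str) -> bool:
        now = self.clock()
        recent = [t for t in self.attempts.get(user, []) if now - t < self.window]
        allowed = len(recent) < self.max_attempts
        if allowed:
            recent.append(now)  # only accepted attempts are counted
        self.attempts[user] = recent
        return allowed
```

The salt ensures that two users with the same password get different digests, and the limiter slows down password guessing. The clock is injectable so that the time window can be tested without waiting.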

 

Medical device challenges:

  • Portable and mobile devices (storage medium encryption, hard to remove without tools)
  • Access to device and settings
  • Firewall controls
  • Malware controls (whitelist solutions take away the need for daily updates)
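
The remark about whitelists can be illustrated with a toy sketch. It assumes the allowlist is a set of SHA-256 fingerprints of approved executables. The class name is hypothetical, and commercial solutions enforce this at the operating system level rather than in application code.

```python
import hashlib


def fingerprint(executable: bytes) -> str:
    """SHA-256 fingerprint of an executable's content."""
    return hashlib.sha256(executable).hexdigest()


class ExecutionAllowlist:
    """Only executables whose fingerprint was approved may run."""

    def __init__(self, approved_executables):
        self._approved = {fingerprint(content) for content in approved_executables}

    def may_run(self, executable: bytes) -> bool:
        return fingerprint(executable) in self._approved
```

The list only changes when the device software itself changes, whereas a signature database of known malware has to be refreshed constantly.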

 

Avoid 3 deadly sins of medical device vulnerabilities

  • Uncontrolled distribution of passwords (fixed, default, hard-coded)
  • Failure to provide timely security software updates and patch management
  • Security vulnerability in off-the-shelf software designed to prevent unauthorized device or network access

 

The FDA has clearly stated that you don’t have to go through the entire re-submission process to address security updates (validation responsibility still applies, though).

 

Establish a policy for providers and SOUPs (embed checkpoints in vendor selection, update the procurement process, establish monitoring criteria [frequency of scans and pen testing…]).

 

Define a responsible disclosure process for incidents (they will happen!)

 

Conclusion:

  • Continuous threat monitoring of the healthcare landscape is critical
  • Transparency, accountability and responsiveness must be ongoing features
  • Wider dialogue between medical device makers, hospitals, regulators and security professionals will advance innovation in security in the healthcare industry

 

Winning Over the Hospital Value Analysis Committees

  • Overview of the changing marketplace and how to position your product in this tight economic environment
  • USA vs. Europe – what are the hospitals looking for?
  • Important questions you should be able to answer
  • Looking at devices and assessing value – from a physician standpoint
  • Discussing value added services in products
  • Understanding the necessity of usability and how it can determine widespread adoption

Moderator:
David J. Dykeman, Attorney, GREENBERG TRAURIG, LLP

Panelists:
Eric T. Pierce, MD, PhD, Physician Director of Anesthesia Bioengineering, Supply & Technical Support, Department of Anesthesia, Critical Care & Pain Medicine, MASSACHUSETTS GENERAL HOSPITAL
Michael Fraai, Executive Director- Biomedical Engineering & Device Integration, BRIGHAM AND WOMEN’S HOSPITAL
David J. Berkowitz, Vice President, Healthcare Insights and Analytics, ECRI INSTITUTE

 

Value Analysis Committees are now the gatekeepers for bringing a technology into hospitals. Decisions are increasingly based on financial factors; clinical benefits are no longer the paramount factor.

Their considerations:

  • What do they do with the former product if there is a replacement?
  • Cost – upfront and maintenance. TCO is king.
  • Clinical outcome – if backed by solid evidence.

 

Eric T. Pierce, MD, PhD: how we select devices

Eric is involved in product selection for the Massachusetts General Hospital – especially for anesthesia

The selection process is always changing.

Value in Medical devices = Quality (outcome, safety, clinician satisfaction) / TCO

Traditionally, physicians were big drivers of device selection. Their influence is steadily decreasing.

When a product might be controversial, limited trials are set up.

For complex and expensive products, the process is the following:

  • An ad-hoc evaluation group is formed (physician director, bioengineers, clinician advocates, division leaders, frequent users)
  • Review all viable product options
  • Apply selection criteria (TCO, compatibility & continuity, ease of operation, serviceability, product support)
  • Narrow the choice to 2 or 3 products
  • Focused trial of top choices in-service
  • Comparative financial analysis, purchasing folks negotiate
  • Review, recommendation, decision

 

The whole process takes weeks or months.

Ease of operation criteria (very important):

  • Intuitive design
  • Simple interface
  • Cleanability (they recently had a device whose screen was damaged by cleaning solutions)
  • Battery life
  • Boot-up time (because of emergencies). They measure it.
  • Portability (big issue for them: portable devices get stolen)
  • Mounts

 

Winning over the value analysis committee – David J. Berkowitz, Vice President, Healthcare Insights and Analytics, ECRI INSTITUTE

 

We are moving from a volume-based healthcare system to a value-based healthcare system

The absence of evidence (as far as clinical benefits are concerned) is a showstopper

 

Michael Fraai, Executive Director- Biomedical Engineering & Device Integration, BRIGHAM AND WOMEN’S HOSPITAL

Network security is a huge topic before devices are authorized onto a hospital’s network.

They don’t buy a quote. They buy a solution to deliver safe & efficient care.

There is an awareness of real cost.

Factors in the TCO: purchase cost, backfill cost, training cost, device integration, software cost, warranty cost, implementation cost, parts, accessories.
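
To see how these factors feed into the value ratio quoted in Eric Pierce’s talk (Value = Quality / TCO), here is a small worked sketch. The two devices, their cost figures and the quality score are entirely made up for illustration.

```python
def total_cost_of_ownership(costs: dict) -> float:
    """Sum every cost factor over the period being compared."""
    return sum(costs.values())


def value_score(quality: float, costs: dict) -> float:
    """Value = Quality / TCO, the ratio quoted in the talk."""
    return quality / total_cost_of_ownership(costs)


# Made-up figures for two imaginary devices, in thousands of dollars.
device_a = {"purchase": 100, "training": 10, "integration": 30, "parts": 20}
device_b = {"purchase": 70, "training": 25, "integration": 60, "parts": 45}
```

With these numbers, device B is cheaper to buy (70 versus 100) but more expensive to own (200 versus 160), so at equal quality it ends up with the lower value score.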

It becomes more and more costly to integrate products into EHRs.

 

Panel discussion

 

Mistakes companies and salespeople make:

  • Adding too many features
  • Eliminating features that users do like
  • Not doing enough outcome research
  • Not understanding the user’s work environment (screens too small or difficult to read). Send your designers to the environment where the device will be used.
  • Introducing too many variables or deals
  • Not supporting interoperability (ICE standards)
  • Not respecting the clients’ time and objectives
  • Not being environmentally responsible

 

There is an EPP (Environmentally Preferable Purchasing) movement happening in the supply chain space.

 

Advice for manufacturers:

  • How do you reduce downtime?
  • Think about helping institutions to compute the TCO
  • Analyze error logs and fix errors. Provide backup capabilities.
  • Have a real value dossier with all the stuff discussed above ready for the value committee.