Project pace regulation

Faith Dickey in the rain with high heels
Amazing Faith Dickey demonstrating what caution is

A good project pace results from two conflicting forces: market or financial pressure to go fast (typically relayed by management) and technical pressure to do things right (typically relayed by architects and developers).

This conflict is not symmetrical, for several reasons.

  • Management has organizational power and excellent communication skills – compared to developers who tend to emerge from the ideal world of their IDE after hours lost in abstraction in a kind of semi-conscious, almost hungover state, barely able to talk to human beings 🙂 And management is always interested in having more bang for the buck and in shipping earlier to build or strengthen a market position.
  • Technical issues make your product explode in the long run, not today, and as such are easy to sacrifice. By technical issues I mean not only rotting architecture, but also documentation and regulatory issues: taking them too lightly will not cause an earthquake today, but years later. Examples of disasters: technical bankruptcy (throwing away an entire unmaintainable code base and starting again from scratch), denial of authorization to sell your medical device by the regulator of a certain market, a patient killed using your product. And I have seen managers who, whether out of incompetence or sheer cynicism, are perfectly able to make decisions with catastrophic long-term consequences for the sake of short-term political advantage – some people are amazingly able to lie their way out of any situation.

The conflict being asymmetrical doesn’t mean that technical people are always right. A friend of mine worked in a startup without adult supervision: developers happily spent three years refactoring the code without adding any new feature. No joking. Three years of getting high with code. Gold plating can go very far. Another story from the trenches: I knew a software architect who convinced his management that building a code-refactoring tool was required (ugly Borland Delphi 6 was really unstable and unproductive), then spent two years writing this tool alone instead of taking care of the codebase he was responsible for – in particular the database concurrency issues that caused much trouble at the end of the project. He clearly served his own pleasure rather than the interests of the company. The thing is, technical people usually don’t care much about their organization making money: they just want to enjoy coding well-done stuff and avoid becoming obsolete. If the company goes bankrupt or the project fails, they just move to another company where they shine with the new skills they honed instead of working on what was required to get the project done. Don’t get me wrong: I’m not saying that all developers are selfish and uninterested in moving projects forward, but some of them are, and there exists a natural tendency to privilege thrill over duty that must be contained.

Fast vs careful


A simple model to help find the balance

So how do we find the appropriate balance between these conflicting forces? It doesn’t come by itself: it has to be managed. Over the years, I have come up with a model designed to understand and manage each force, and it has proved useful.

Real quality won’t happen by chance, or only thanks to the sacrifice of teammates spontaneously working nights and weekends, but because time has been allocated for it (assuming the right processes and mindset are already in place). On the other end of the spectrum, there has to be a clear focus on delivering story points to avoid getting lost in gold plating. My preferred approach is to set up an iteration model where I allocate time for quality-related activities (stabilization phase size, refactoring proportion during the construction phase) and set a story points goal for each iteration to keep everybody focused on delivering customer value.

  • Model for quality forces.
    • Duration of construction and stabilization phases. They might not be constant throughout the project: as the maintenance burden increases, stabilization phases may get longer. Remember, stabilization is time devoted to bug fixing and documentation. It’s quality time.
    • Proportion of construction time allocated to technical tasks. I’m not alluding to the mandatory technical tasks that deserve user stories of their own (such as sending error reports, writing an installer, making that load test pass), but to unpredictable refactorings. Be careful to think it through. Without dedicated time, refactoring won’t happen at the magnitude required for quality projects. This technical time is also the oxygen skilled technical people breathe: it helps you hire and retain them. Set that value to zero, and you will likely accumulate technical debt and scare away gifted technicians. On the other hand, set it to 100% and the project will stop moving forward. And once again, this value should not be constant: high at project start (say 50%) when frameworks and practices are not established, medium in the middle of the project (say 25%), low at project end when everybody struggles to finish that version (say 15%).
  • Model for production forces.
    • I find it necessary to have a target of accepted story points for each iteration. This is the sum of user stories and technical stories (refactoring and anything related to inner quality that no user will see).
      • It is best measured as the average of the total story points of accepted user stories (accepted by testers, with a little allowance for a few minor bugs) over the last few iterations. This measurement is essential to feed the model with reality (total team velocity captures a great deal of variables that are impossible to model: estimation errors, organizational overhead, tooling problems, motivation, quality of the personnel, maintenance burden, architectural issues…).
      • Aligning the target with the measurement is a delicate choice when the team produces less than expected: it is invaluable for predicting an accurate project end date, but keeping expectations high helps fight gold-plating tendencies and holds commitments firm. In my experience, the target should be kept just a little above the average measured velocity – insufficient production must be fought by the team, not too easily accepted.
      • This target and the proportion of construction time devoted to refactoring make it easy to calculate the estimated user story points target and the budget for technical stories (see the sketch right after this list).
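To make the arithmetic concrete, here is a minimal sketch of this part of the model in Python. The function names, the 5% stretch and the example numbers are illustrative assumptions of mine; the spreadsheet below remains the real starting point.

```python
# Minimal sketch: derive iteration targets from measured velocity.
# All names and numbers are illustrative, not taken from the spreadsheet.

def iteration_targets(measured_velocities, refactoring_share, stretch=1.05):
    """Split an iteration's story point target between user stories and
    technical stories, based on recent measured velocity."""
    average = sum(measured_velocities) / len(measured_velocities)
    total_target = average * stretch          # keep the target a little above average
    technical_budget = total_target * refactoring_share
    user_story_target = total_target - technical_budget
    return round(total_target), round(user_story_target), round(technical_budget)

# Example: the last three iterations delivered 40, 46 and 43 points,
# with a mid-project refactoring share of 25%.
print(iteration_targets([40, 46, 43], 0.25))  # (45, 34, 11)
```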


Simple iteration model

Here’s a sample spreadsheet to help clarify my intent. I’m not saying it’s perfect: you should probably design your own; just consider mine a starting point: Simple iteration management model


A model for unknown team velocity


Measuring actual team velocity and feeding it into the model is very powerful. But sometimes it’s not practical:

  • When the team is just starting and has no history
  • When projects are long (several years in medical device development). In particular, the maintenance burden will likely get heavier and turnover will happen.
  • When there are lots of variations in team size. This is especially the case when management asks: “I want this project to be ready by date X, what do you need to make it happen?”. Tricky question: team dynamics are far more complex than a simple multiplication – such as assuming that doubling the headcount will double the throughput. Choose your answers wisely. This is a very strategic issue: if you can predict very early that a project is late and add the necessary staff soon enough for it to be profitable (people should stay at least one year to offset training costs), you will save the deadline. Do it too late, and haphazard late staffing efforts will wreck the team: “adding manpower to a late software project makes it later”.
  • When projects have not started yet. To approve a project, management needs to know its scope, its duration and its cost. The best way to do this with an agile mindset is to build a product map to get a backlog, estimate the user stories to get a project size, then estimate the velocity of the team to deduce time and cost.


So I have designed another model to estimate team velocity when there is no empirical data. Here are the main variables:

  • Real time
    • Average daily velocity per developer. The number of ideal days in a real day. It could be 100% if you worked alone in a monastery with absolute concentration, no task switching, and perfect estimation skills. In reality, there are useful and useless meetings, coffee breaks, errors of all kinds. I have made many measurements and found that people are often around 70% in this respect.
    • Working day factor. You have to take into account that a weekday is not necessarily a working day: people get sick, take holidays, attend training. In my current environment (France, where holidays are plentiful and sacred), people work around 220 days a year. That’s not 52*5=260.
  • Real workforce. Don’t just count people. Take into account:
    • Turnover. I usually assume that one person out of ten leaves every year, and that it takes six months to replace them (so I lose 5% of the workforce). In other environments, you might have higher attrition rates or shorter recruitment delays.
    • Training. Newcomers are not as productive as long-standing team members, and beginners are not as productive as principal engineers. Plan for some ramp-up when someone arrives (50% productivity the first month, full productivity after 3 to 6 months depending on experience and the complexity of the work environment).
    • Communication and management overhead. My rule of thumb: every additional person on the team eats 20% of the time of the equivalent of one person. One person is as productive as one; two people are as productive as 1.8; six people are as productive as 5. This factor is very important when you start computing the effect of various staffing scenarios on the deadline. For very big teams, it might be higher. (A sketch of the whole model follows below.)
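Here is a minimal Python sketch of this model, using the rules of thumb above; the function names are my own and the output is only as trustworthy as the parameters you feed it.

```python
# Minimal sketch of the velocity estimation model, using the rules of
# thumb above: 70% daily velocity, 220 working days/year, 5% turnover
# loss, 20% communication overhead per additional person.

def effective_headcount(people, overhead_per_extra_person=0.20):
    """Every person beyond the first eats ~20% of one person's time."""
    return people - overhead_per_extra_person * (people - 1)

def yearly_ideal_days(people, daily_velocity=0.70,
                      working_days_per_year=220, turnover_loss=0.05):
    """Ideal (estimation) days of work the team can deliver in a year."""
    workforce = effective_headcount(people) * (1 - turnover_loss)
    return workforce * working_days_per_year * daily_velocity

for team_size in (1, 2, 6):
    print(f"{team_size} developers: {yearly_ideal_days(team_size):.0f} ideal days/year")
# Roughly 146, 263 and 731 ideal days/year.
```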



Here’s the sample spreadsheet: Team velocity estimation model


This might sound a little complicated and over-engineered (don’t complain, I spared you the spreadsheet where I mix both models with release burnup graphs and macros to generate user story cards 🙂 – you won’t need it if you have decent agile tooling, which was not my case). But having sound predictions of what will happen in the future is a prerequisite to acting upon that future. When budgets are scary and the deadline is years away, a spreadsheet with many parameters and experimental data is a good way to negotiate with top management. And once people realize that what you predicted one year ago proved true, they will listen very carefully to what you say will happen in two years, and maybe grant you those two additional developers you need. This model can also help you slow down if quality gets out of hand: the automatic adjustment feature of the two-phase iteration (stabilization phase extension, which decreases team velocity in the long run) will help justify why team throughput decreases – quality is simply a priority.

The two-phase iteration

Developing software for medical devices brings two special constraints:

  • Documentation becomes important. Regulations require a ton of documentation, and failing to produce it might hamper the market access of your medical device. Conclusion: documentation becomes part of the definition of done, and you have to take some distance from the second value of the agile manifesto (favoring working software over comprehensive documentation). The problem is, developers in general hate documentation (but have all kinds of justifications to make that look like a conscious choice, such as “documentation is always outdated” or “just read the code”). But when you have to do something you don’t like and there’s no way around it, I believe you should get at it and do it on a regular basis. Don’t postpone documentation until the end of the project, when people leave or are assigned to new endeavors and three quarters of the knowledge has evaporated.
  • Leaving bugs is not an option. And an excellent technique for finding bugs is manual testing. Even on my projects with the strongest automated testing effort, we still find 7 bugs a day through manual testing. And some tests are just too complicated to automate (examples: recovery after a power outage, intrusion testing). Manual testing has a very direct consequence: at some point, you have to deliver a version, testers will test it and it will take time, and when they get their hands on the next version, they don’t want it crippled by regressions due to new developments. Manual testers don’t have the patience of a continuous integration system.


A good way to adjust to these constraints is to use two-phase iterations. Phase 1: build. Phase 2: stabilize.

Build & stabilize

To elaborate a bit more:

  • Phase 1: construction.
    • Build a product increment. In this phase, you take risks. You write that ambitious new feature. You refactor that awfully complex engine. You change the build system.
    • You maintain quality (automated tests, bug fixing), but it’s not your main concern. You might let bugs and broken tests pile up a little (but not too much).
    • Little manual testing occurs (only functional challenges by those who wrote the specs, weekly general regression testing), but testers prepare for the next phase by honing test procedures and test strategies (according to impact analysis).
    • The final days of the construction phase are a little tense. There has to be a feeling of deadline around this date. Everybody works hard to be on time.
    • The most concrete consequence of the end of the construction phase is that a stabilization branch for this iteration is created.
  • Phase 2: stabilization.
    • Finish the product increment.
      • Testers test.
      • Developers fix bugs and broken tests. They are not allowed to take risks on the stabilization branch.
      • Several versions are issued and tested until the last one meets quality standards (I like to set a low total known bugs threshold – more on this in a dedicated post).
    • Write the documentation. It’s a good time: after the rush, and while things are still fresh in everyone’s head. Write the mandatory one-shot documentation for the iteration (test report, formal reviews…). Update long-term documentation (e.g. architecture documents).
    • Prepare for the next iteration. Select the candidate user stories. Have functional people explain them to developers. Extract the requirements related to the user stories and make sure the evil details are taken into account (remember, by then, specs should be ready. The months of talking and studying and analyzing features are over. If the spec is not ready, the feature is not mature enough for this iteration. If you can’t write it down, you can’t code it). Developers and architects should throw their first design ideas on whiteboards and start negotiating solutions. Then formal planning occurs (detailed planning poker with tasks). Product Owners compare task estimates with personnel availability and estimated throughput, then choose which user stories will be part of the iteration, and which will not.


Two-phase iteration
Detailed activities inside the two-phase iteration

Practical considerations

  • Iteration size. Although Scrum advocates 2-to-4-week iterations, for this kind of process I’ve experimented with values between 4 and 8 weeks and settled on 6 weeks. This seems to me a good compromise: big enough for the content to justify the iteration overhead (documentation, planning, manual test campaign), small enough to be manageable. Of course, this value works in my environment; you should try several and see what works for you.
  • Phase size. My biggest project started with a 2/3–1/3 proportion (that is, four weeks of construction and two weeks of stabilization). Then, as maintenance cost increased, I extended stabilization to 12 days (leaving 18 for construction) and plan to extend it again soon. It may seem long, but in my usual environment it takes about 4 deliveries to get a software version of good enough quality. This is a key feature of the two-phase iteration: by adjusting the relative size of construction and stabilization, you have a built-in mechanism for regulating speed and quality (see the sketch right after this list). More on this in a dedicated article.
  • Be tough about the construction end date. If a user story is not ready, it ships in the next iteration. (Of course, if everybody at system level waits for it, you may want to be – or be forced to be – a little flexible. But if that user story is so important, why did it slip until the end of the construction phase? Couldn’t it have been planned earlier? You should always have a buffer of user stories or technical tasks of lesser importance, ready to be sacrificed if something important goes out of hand.)
  • Be tough about quality at the end of stabilization. If the product is buggy, it’s not shippable. The immediate course of action is to fix it and ship it. The following construction phase will be shorter than usual, with fewer features. That’s a smaller problem than buggy software – your users are more important than your bosses.
  • On the day the stabilization phase begins, a stab branch should be created in the repository (named, for example, iteration_XX_stab). Why?
    • Dev on trunk/master: sometimes it is reasonable to allow a developer to start construction N+1 during stabilization N. Example: a huge refactoring with lots of impact, best performed when the rate of change on the code is lower (during stabilization). That developer should work on the trunk/master while everybody else remains on the stab branch.
    • Psychological reasons: having to switch from trunk/master to the stab branch helps developers internalize that activities will be different. Being able to perform minor tasks on the trunk helps keep the stab branch clean (bug fixing only!). For example: fix that bug in the stab branch the quick and dirty way to avoid unnecessary regressions, but merge the fix at once into the trunk, then refactor it there until the design doesn’t make you blush anymore.
    • Version maintenance. Imagine you have to fix a bug in the software version of iteration N, 3 months or 3 years from now: pick up stab branch N just where you left it.
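Here is a rough sketch, with made-up numbers, of the regulating mechanism mentioned in the phase size item above: lengthening stabilization mechanically throttles the story points an iteration can absorb.

```python
# Rough sketch of the built-in speed/quality regulation. A 6-week
# iteration is about 30 working days; 1.5 points per construction day
# is a made-up throughput, not a measured one.

def construction_capacity(iteration_days, stabilization_days, points_per_day):
    """Story points an iteration can take once stabilization is carved out."""
    return (iteration_days - stabilization_days) * points_per_day

print(construction_capacity(30, 10, 1.5))  # early project: 30.0 points
print(construction_capacity(30, 12, 1.5))  # heavier maintenance: 27.0 points
```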

Agile medical device system design

The Agile revolution has definitely transformed the way software is built, to such an extent that it has become mainstream: it just works better. There are several factors behind this success: empowerment, which helps get the best out of people; automation, which reduces costs, cycle time and errors. But to me, the most powerful practice of the agile toolbox is incremental product design, which reduces risks at all levels:

  • Integration risk: you integrate sooner (all the time, in fact), so the long-dreaded integration phase of the eighties (which could last for years and often ended in project failure) becomes an everyday, routine task.
  • User needs risk: by implementing the most important features first and putting them into the hands of end users ASAP, you gain field feedback on what the users really need and want. You decrease the risk of building totally useless or barely usable features (80% of software features are said to be never or seldom used).
  • Project risks: by finishing the product often and measuring the team velocity, you know your real project pace and can adjust to it. Your team’s average velocity over the last three iterations is a good predictor of the team’s pace until the end of the project. I’m a big fan of this down-to-earth wisdom of measuring what’s too complex to be predicted and changing course accordingly.

When working on medical device projects with my colleagues from the hardware, electronics, reagent or system teams, I’ve often wondered why they wouldn’t use iterative development to their advantage. The counterarguments they gave me were usually the following:

  • Our iterations are too long. When designing hardware, the time needed to finish plans, order parts all over the world and receive them, test them and send them back for defects once in a while, assemble them, is ridiculously long – up to six months. The same with electronics if suppliers are expected to design and produce boards. Reagent teams may perform stability tests that last for years.
  • Our iterations cost too much. Big hardware prototypes can cost the price of several brand-new cars. Moulds are awfully expensive. Reagent production lines are a luxury item. Physical stuff cannot just be made and destroyed without a sizeable monetary footprint.

These hardships entice specialists to optimize their own business with a typical waterfall process: long requirements elicitation, one-shot production of what they think should be made, oops we forgot something, some supplier is late, the schedule is doomed. Local optimization is the enemy of the global optimization endeavor that is a systems project. I believe systems design must be iterative and thought of as such from the very start.


System iterations
Hardware V1 and electronics V2 are combined with embedded software V5 to build embedded system V3. After some integration testing, embedded system V3 is combined with non-embedded software V6 and reagents V1 to perform the first round of tests of the complete system. This will lead to new insights and subsequent changes in the next iterations of all sub-components – a long time before the end of the project.


Software item iterations will likely always be shorter. But that doesn’t mean that other specialties can’t plan iterations too. Some techniques that could make it possible:

  • First hardware and electronics iterations can be made with prototyping material (for example: B&R automation products) that has unrealistic production cost or size but allows fast creation of first versions. If first tests prove that the design is good, the next iterations can focus on production cost, maintainability, assembly lines, multi-sourcing of providers, while the overall system keeps on its journey.
  • Hardware stubs. First iterations can also use the technique we software developers know as stubs. For example, the first version of an automated, temperature-regulated drawer for reagent storage could be made without temperature regulation at all, and without automation (only fixed-position reagents, hard-coded in the code or loaded into the database via a script) – see the sketch right after this list.
  • Design and usability are a big concern for marketing departments and regulators as well. I would suggest meeting your end users ASAP by quickly manufacturing prototypes of all external interfaces – for example with 3D printers, cardboard models or foam models. Have end-user representatives execute typical usage scenarios with them. What do they think? I remember using this technique for a device with a bar-code reader: we printed a 3D version of the casing in a matter of days, only to realize that the bar-code reader was positioned in such a way that end users would almost have to break their wrist to use it. So we moved it to the opposite side very easily (no need to redesign all the internal parts of the device, no constraints!).
  • Reagent design is complex and slow. Help these guys by giving them a system prototype to test their stuff ASAP. They don’t care about chassis production cost, cybersecurity or triple-sourcing of electronic components. They just need good biological performance.
  • Assembling subsystems is difficult. Something that has never been tested never works. So be sure to plan an integration and system debugging session every time you produce a system iteration, before downstream activities (such as biological performance tuning) can start.
  • As explained by the eXtreme Manufacturing movement, to plan for iterative, incremental system design, the priority would be to think carefully about the internal interfaces of the system and divide it into subsystems. Subsystems can evolve independently as long as they respect the interfaces – thus achieving fast-paced design.
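To make the stub idea concrete on the software side, here is a minimal sketch; the drawer interface, names and values are hypothetical illustrations, not a real device driver.

```python
# Minimal sketch of the hardware-stub idea for the reagent drawer
# example above. Interface, names and values are hypothetical.
from typing import Protocol

class ReagentDrawer(Protocol):
    def position_of(self, reagent_id: str) -> int: ...
    def temperature_celsius(self) -> float: ...

class StubReagentDrawer:
    """First-iteration stub: fixed reagent positions, no regulation, no motors."""
    _POSITIONS = {"REAGENT_A": 1, "REAGENT_B": 2}  # hard-coded layout

    def position_of(self, reagent_id: str) -> int:
        return self._POSITIONS[reagent_id]

    def temperature_celsius(self) -> float:
        return 21.0  # ambient; the real drawer will regulate temperature

# The rest of the system can integrate against the interface right away,
# while the real automated, temperature-regulated drawer is being built.
drawer: ReagentDrawer = StubReagentDrawer()
print(drawer.position_of("REAGENT_A"), drawer.temperature_celsius())
```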


ScrumInc eXtreme Manufacturing car
The modules that make up an extreme manufacturing build party at ScrumInc

This is no easy task. But a necessary one to tackle the top risks of a medical device project: biological risk and registration risk.

  • You should produce a functional system as fast as you can to tackle the biological risk – living matter is so unpredictable that you are better off observing how it behaves (just like project dynamics, by the way).
  • And once you have a complete system able to perform its biological task (stripped of the bells and whistles), i.e. once you have tackled the biological risk, consider handling the registration risk by registering this minimalistic system. This will take lots of time (typically 2 years in China). The registration teams should be able to define the contours of a system that can be registered officially all over the world, but that you probably won’t sell (it’s ugly, it can’t be maintained, it has no advanced software features, but it performs its core biological mission pretty well). Meanwhile, you will prepare a second version with all the nice-to-have features, registered as a simple product evolution with lower risk and delay, which might well reach the market shortly after the first version is ready.

Lean Startup thinking promotes trying your concept with a Minimum Viable Product that you put into the hands of your end users. Product registration authorities are a kind of VIP end user. Maybe you should shape your entire project plan to build them a dedicated MVP that addresses the registration risk right after the biological risk is under control.