
A story was shared this week.

A story of hope for the hard-pressed NHS: its patients, its staff, its managers and its leaders.

A story that says “We can learn how to fix the NHS ourselves”.

And the story comes with evidence; hard, objective, scientific, statistically significant evidence.

The story starts almost exactly three years ago when a Clinical Commissioning Group (CCG) in England made a bold strategic decision to invest in improvement, or as they termed it “Achieving Clinical Excellence” (ACE).

They invited proposals from their local practices with the “carrot” of enough funding to allow GPs to carve-out protected time to do the work.  And a handful of proposals were selected and financially supported.

This is the story of one of those proposals, which came from three practices in Sutton that chose to work together on a common problem – unplanned hospital admissions in their over-70s.

Their objective was clear and measurable: “To reduce the cost of unplanned admissions in the 70+ age group by working with hospital to reduce length of stay.”

Did they achieve their objective?

Yes, they did.  But there is more to this story than that.  Much more.

One innovative step they took was to invest in learning how to diagnose why the current ‘system’ was costing what it was; then learning how to design an improvement; and then learning how to deliver that improvement.

They invested in developing their own improvement science skills first.

They did not assume they already knew how to do this and they engaged an experienced health care systems engineer (HCSE) to show them how to do it (i.e. not to do it for them).

Another innovative step was to create a blog to make it easier to share what they were learning with their colleagues; and to invite feedback and suggestions; and to provide a journal that captured the story as it unfolded.

And they took measurements both before and after they made any changes, so that they could quantify the impact and assess the evidence scientifically.

And that was actually quite easy because the CCG was already measuring what they needed to know: admissions, length of stay, cost, and outcomes.

All they needed to learn was how to present and interpret that data in a meaningful way.  And as part of their improvement science (IS) training, they learned how to use system behaviour charts, or SBCs.

By Jan 2015 they had learned enough of the HCSE techniques and tools to establish the diagnosis and start making changes to the parts of the system that they could influence.

Two years later they subjected their before-and-after data to robust statistical analysis and they had a surprise. A big one!

Reducing hospital mortality was not a stated objective of their ACE project, and they only checked the mortality data to be sure that it had not changed.

But it had, and the “p=0.014” part of the statement above means that, if nothing had really changed, the probability of seeing a reduction in hospital mortality at least as large as this 20.0% through random chance alone is 1.4%.  [This is well below the 5% threshold that we usually accept as “statistically significant” in a clinical trial.]

But …

This was not a randomised controlled trial.  This was an intervention in a complicated, ever-changing system; so they needed to check that the hospital mortality for comparable patients who were not their patients had not changed as well.

And the statistical analysis of the hospital mortality for the ‘other’ practices for the same patient group, and the same period of time confirmed that there had been no statistically significant change in their hospital mortality.
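The post does not say which statistical test the team used, so the following is only a minimal sketch of one common way to get a p-value for a before-and-after comparison of this kind: a two-proportion z-test. The function name and every count below are invented for illustration; they are not the Sutton data.

```python
import math

def two_proportion_p_value(deaths_before, admissions_before,
                           deaths_after, admissions_after):
    """Two-sided p-value for a difference between two proportions (z-test)."""
    rate_before = deaths_before / admissions_before
    rate_after = deaths_after / admissions_after
    # Pooled rate: the best single estimate if nothing had really changed
    pooled = (deaths_before + deaths_after) / (admissions_before + admissions_after)
    std_err = math.sqrt(pooled * (1 - pooled)
                        * (1 / admissions_before + 1 / admissions_after))
    if std_err == 0:
        return 1.0  # no variation at all, so no evidence of a difference
    z = (rate_before - rate_after) / std_err
    # Probability of a difference at least this large, in either direction
    return math.erfc(abs(z) / math.sqrt(2))

# Invented counts: the same 20% relative reduction at two different sample sizes
p_small = two_proportion_p_value(100, 1_000, 80, 1_000)
p_large = two_proportion_p_value(1_000, 10_000, 800, 10_000)
print(f"small sample p = {p_small:.3f}, large sample p = {p_large:.2g}")
```

The same relative reduction gives very different p-values at different sample sizes, which is why the size of the effect and the p-value need to be read together. Running the same function on the comparison practices is the check described above.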

So, it appears that what the Sutton ACE Team did to reduce length of stay (and cost) had also, unintentionally, reduced hospital mortality. A lot!

And this unexpected outcome raises a whole raft of questions …

If you would like to read their full story then you can do so … here.

It is a story of hunger for improvement, of humility to learn, of hard work and of hope for the future.

Bob Jekyll was already sitting at a table, sipping a pint of Black Sheep and nibbling on a bowl of peanuts when Hugh and Louise arrived.

<Hugh> Hello, are you Bob?

<Bob> Yes, indeed! You must be Hugh and Louise. Can I get you a thirst quencher?

<Louise> Lime and soda for me please.

<Hugh> I’ll have the same as you, a Black Sheep.

<Bob> On the way.

<Hugh> Hello Louise, I’m Hugh Lewis.  I am the ops manager for acute medicine at St. Elsewhere’s Hospital. It is good to meet you at last. I have seen your name on emails and performance reports.

<Louise> Good to meet you too Hugh. I am senior data analyst for St. Elsewhere’s and I think we may have met before, but I’m not sure when.  Do you know what this is about? Your invitation was a bit mysterious.

<Hugh> Yes. Sorry about that. I was chatting to a friend of mine at the golf club last week, Dr Bill Hyde who is one of our local GPs.  As you might expect, we got to talking about the chronic pressure we are all under in both primary and secondary care.  He said he has recently crossed paths with an old chum of his from university days who he’d had a very interesting conversation with in this very pub, and he recommended I email him. So I did. And that led to a phone conversation with Bob Jekyll. I have to say he asked some very interesting questions that left me feeling a mixture of curiosity and discomfort. After we talked Bob suggested that we meet for a longer chat and that I invite my senior data analyst along. So here we are.

<Louise> I have to say my curiosity was pricked by your invitation, specifically the phrase ‘system behaviour charts’. That is a new one on me and I have been working in the NHS for some time now. It is too many years to mention since I started as junior data analyst, fresh from university!

<Hugh> That is the term Bob used, and I confess it was new to me too.

<Bob> Here we are, Black Sheep, lime soda and more peanuts.  Thank you both for coming, so shall we talk about the niggle that Hugh raised when we spoke on the phone?

<Hugh> Ah! Louise, please accept my apologies in advance. I think Bob might be referring to when I said that “90% of the performance reports don’t make any sense to me”.

<Louise> There is no need to apologise Hugh. I am actually reassured that you said that. They don’t make any sense to me either! We only produce them that way because that is what we are asked for.  My original degree was geography and I discovered that I loved data analysis! My grandfather was a doctor so I guess that’s how I ended up doing health care data analysis. But I must confess, some days I do not feel like I am adding much value.

<Hugh> Really? I believe we are in heated agreement! Some days I feel the same way.  Is that why you invited us both Bob?

<Bob> Yes.  It was some of the things that Hugh said when we talked on the phone.  They rang some warning bells for me because, in my line of work, I have seen many people fall into a whole minefield of data analysis traps that leave them feeling confused and frustrated.

<Louise> What exactly is your line of work, Bob?

<Bob> I am a systems engineer.  I design, build, verify, integrate, implement and validate systems. Fit-for-purpose systems.

<Louise> In health care?

<Bob> Not until last week when I bumped into Bill Hyde, my old chum from university.  But so far the health care system looks just like all the other ones I have worked in, so I suspect some of the lessons from other systems are transferable.

<Hugh> That sounds interesting. Can you give us an example?

<Bob> OK.  Hugh, in our first conversation, you often used the words “demand”  and “capacity”. What do you mean by those terms?

<Hugh> Well, demand is what comes through the door, the flow of requests, the workload we are expected to manage.  And capacity is the resources that we have to deliver the work and to meet our performance targets.  Capacity is the staff, the skills, the equipment, the chairs, and the beds. The stuff that costs money to provide.  As a manager, I am required to stay in-budget and that consumes a big part of my day!

<Bob> OK. Speaking as an engineer I would like to know the units of measurement of “demand” and “capacity”?

<Hugh> Oh! Um. Let me think. Er. I have never been asked that question before. Help me out here Louise.  I told you Bob asks tricky questions!

<Louise> I think I see what Bob is getting at.  We use these terms frequently but rather loosely. On reflection they are not precisely defined, especially “capacity”. There are different sorts of capacity, all of which will be measured in different ways and so have different units. No wonder we spend so much time discussing and debating the question of whether we have enough capacity to meet the demand.  We are probably all assuming different things.  Beds cannot be equated to staff, but too often we just seem to lump everything together when we talk about “capacity”.  So by doing that what we are really asking is “do we have enough cash in the budget to pay for the stuff we think we need?”. And if we are failing one target or another we just assume that the answer is “No” and we shout for “more cash”.

<Bob> Exactly my point. And this was one of the warning bells.  Lack of clarity on these fundamental definitions opens up a minefield of other traps like the “Flaw of Averages” and “Time equals Money”.  And if we are making those errors then they will, unwittingly, become incorporated into our data analysis.

<Louise> But we use averages all the time! What is wrong with an average?

<Bob> I can sense you are feeling a bit defensive Louise.  There is no need to.  An average is perfectly OK and is a very useful tool.  The “flaw” is when it is used inappropriately.  Have you heard of Little’s Law?

<Louise> No. What’s that?

<Bob> It is the mathematically proven relationship between flow, work-in-progress and lead time.  It is a fundamental law of flow physics and it uses averages. So averages are OK.
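Little’s Law is usually stated as: average work-in-progress = average flow × average lead time. A minimal sketch of the rearranged form, with a hypothetical function name and invented ward figures:

```python
def average_lead_time(average_wip, average_flow):
    """Little's Law rearranged: lead time = work-in-progress / flow."""
    if average_flow <= 0:
        raise ValueError("average flow must be positive")
    return average_wip / average_flow

# Invented example: 28 patients in beds on average, 7 discharged per day on average
print(average_lead_time(28, 7))  # average length of stay in days: prints 4.0
```

Every quantity here is an average over a stable period, which is Bob’s point: the law says nothing about how much any individual patient’s lead time varies.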

<Hugh> So what is the “Flaw of Averages”?

<Bob> It is easier to demonstrate it than to describe it.  Let us play a game.  I have some dice and we have a big bowl of peanuts.  Let us simulate a simple two step process.  Hugh you are Step One and Louise you are Step Two.  I will be the source of demand.

I will throw a dice and count that many peanuts out of the bowl and pass them to Hugh.  Hugh, you then throw the dice and move that many peanuts from your heap to Louise, then Louise throws the dice and moves that many from her pile to the final heap which we will call activity.

<Hugh> Sounds easy enough.  If we all use the same dice then the average flow through each step will be the same so after say ten rounds we should have, um …

<Louise> … thirty five peanuts in the activity heap.  On average.

<Bob> OK.  That’s the theory, let’s see what happens in reality.  And no eating the nuts-in-progress please.

They play the game and after a few minutes they have completed the ten rounds.

<Hugh> That’s odd.  There are only 30 nuts in the activity heap and we expected 35.  Nobody nibbled any nuts so it’s just chance I suppose.  Let’s play again. It should average out.

…..  …..

<Louise> Thirty four this time which is better, but is still below the predicted average.  That could still be a chance effect though.  Let us run the ‘nutty’ game a few more times.

….. …..

<Hugh> We have run the same game six times with the same nuts and the same dice and we delivered activities of 30, 34, 30, 24, 23 and 31 and there are usually nuts stuck in the process at the end of each game, so it is not due to a lack of demand.  We are consistently under-performing compared with our theoretical prediction.  That is weird.  My head says we were just unlucky but I have a niggling doubt that there is more to it.

<Louise> Is this the Flaw of Averages?

<Bob> Yes, it is one of them. If we set our average future flow-capacity to the average historical demand and there is any variation anywhere in the process then we will see this effect.
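The peanut game is easy to replay many times in software. This is a minimal sketch, assuming a fair six-sided die and that each player may pass on nuts received earlier in the same round (which is how the rules above read); the function and variable names are made up for the illustration.

```python
import random

def play_game(rounds=10, rng=random):
    """One game: Bob feeds Hugh, Hugh feeds Louise, Louise feeds the activity heap."""
    hugh_heap = louise_heap = activity = demand = 0
    for _ in range(rounds):
        arrivals = rng.randint(1, 6)                # Bob's throw: new demand
        demand += arrivals
        hugh_heap += arrivals
        moved = min(rng.randint(1, 6), hugh_heap)   # Hugh cannot move nuts he does not have
        hugh_heap -= moved
        louise_heap += moved
        done = min(rng.randint(1, 6), louise_heap)  # the same constraint applies to Louise
        louise_heap -= done
        activity += done
    return demand, activity, hugh_heap + louise_heap

rng = random.Random(1)  # fixed seed so the run is repeatable
games = [play_game(10, rng) for _ in range(10_000)]
mean_demand = sum(g[0] for g in games) / len(games)
mean_activity = sum(g[1] for g in games) / len(games)
mean_stuck = sum(g[2] for g in games) / len(games)
print(f"demand {mean_demand:.1f}, activity {mean_activity:.1f}, "
      f"stuck in process {mean_stuck:.1f}")
```

Average demand comes out close to 35 per game, but average activity falls short of it, and the difference is the pile of nuts still stuck in the process. A low throw loses capacity that a later high throw can only recover if there is a heap waiting to be worked on.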

<Hugh> H’mmm.  But we do this all the time because we assume that the variation will average out over time. Intuitively it must average out over time.  What would happen if we kept going for more cycles?

<Bob> That is a very good question.  And your intuition is correct.  It does average out eventually but there is a catch.

<Hugh> What is the catch?

<Bob> The number of peanuts in the process and the time it takes for one peanut to get through are both very variable.

<Louise> Is there any pattern to the variation? Is it predictable?

<Bob> Another excellent question.  Yes, there is a pattern.  It is called “chaos”.  Predictable chaos if you like.

<Hugh> So is that the reason you said on the phone that we should present our metrics as time-series charts?

<Bob> Yes, one of them.  The appearance of chaotic system behaviour is very characteristic on a time-series chart.

<Louise> And if we see the chaos pattern on our charts then we could conclude that we have made the Flaw of Averages error?

<Bob> That would be a reasonable hypothesis.

<Hugh> I think I understand the reason you invited us to a face-to-face demonstration.  It would not have worked if you had just described it.  You have to experience it because it feels so counter-intuitive.  And this is starting to feel horribly familiar; perpetual chaos about sums up my working week!

<Louise> You also mentioned something you referred to as the “time equals money” trap.  Is that somehow linked to this?

<Bob> Yes.  We often equate time and money but they do not behave the same way.  If I have five pounds today and I only spend four pounds then I can save the remaining one pound for tomorrow and spend it then – so the Law of Averages works.  But if I have five minutes today and I only use four minutes then the other minute cannot be saved and used tomorrow, it is lost forever.  That is why the Law of Averages does not work for time.
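Bob’s pounds-versus-minutes point can be sketched with a few lines of arithmetic. The function name and the demand figures are invented; the only difference between the two runs is whether unused supply can be carried forward to the next day.

```python
def serve(demand_per_day, supply_per_day, can_bank):
    """Return (total served, backlog left over) for a fixed daily supply."""
    saved = 0      # unused supply carried forward (only possible if it can be banked)
    backlog = 0    # unmet demand carried forward to the next day
    served_total = 0
    for demand in demand_per_day:
        available = supply_per_day + saved
        wanted = demand + backlog
        served = min(available, wanted)
        served_total += served
        backlog = wanted - served
        saved = (available - served) if can_bank else 0
    return served_total, backlog

demand = [4, 6, 5, 3, 7]  # invented numbers; the average is 5 per day
print(serve(demand, 5, can_bank=True))   # money-like supply: prints (25, 0)
print(serve(demand, 5, can_bank=False))  # time-like supply:  prints (23, 2)
```

With supply set to the average demand, the bankable resource clears everything, while the perishable one loses its quiet-day surplus and ends the week with a backlog.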

<Hugh> But that means if we set our budgets based on the average demand and the cost of people’s time then not only will we have queues, delays and chaos, we will also consistently overspend the budget too.  This is sounding more and more familiar by the minute!  This is nuts, if you will excuse the pun.

<Louise> So what is the solution?  I hope you would not have invited us here if there was no solution.

<Bob> Part of the solution is to develop our knowledge of system behaviour and how we need to present it in a visual format. With that we develop a deeper understanding of what the system behaviour charts are saying to us.  With that we can develop our ability to make wiser decisions that will lead to effective actions which will eliminate the queues, delays, chaos and cost-pressures.

<Hugh> This is possible?

<Bob> Yes. It is called systems engineering. That’s what I do.

<Louise> When do we start?

<Bob> We have started.

Dr Bill Hyde was already at the bar when Bob Jekyll arrived.

Bill and  Bob had first met at university and had become firm friends, but their careers had diverged and it was only by pure chance that their paths had crossed again recently.

They had arranged to meet up for a beer and to catch up on what had happened in the 25 years since they had enjoyed the “good old times” in the university bar.

<Dr Bill> Hi Bob, what can I get you? If I remember correctly it was anything resembling real ale. Will this “Black Sheep” do?

<Bob> Hi Bill, Perfect! I’ll get the nibbles. Plain nuts OK for you?

<Dr Bill> My favourite! So what are you up to now? What doors did your engineering degree open?

<Bob> Lots!  I’ve done all sorts – mechanical, electrical, software, hardware, process, all except civil engineering. And I love it. What I do now is a sort of synthesis of all of them.  And you? Where did your medical degree lead?

<Dr Bill> To my heart’s desire, the wonderful Mrs Hyde, and of course to primary care. I am a GP. I always wanted to be a GP since I was knee-high to a grasshopper.

<Bob> Yes, you always had that “I’m going to save the world one patient at a time!” passion. That must be so rewarding! Helping people who are scared witless by the health horror stories that the media pump out.  I had a fright last year when I found a lump.  My GP was great, she confidently diagnosed a “hernia” and I was all sorted in a matter of weeks with a bit of nifty day case surgery. I was convinced my time had come. It just shows how damaging the fear of the unknown can be!

<Dr Bill> Being a GP is amazingly rewarding. I love my job. But …

<Bob> But what? Are you alright Bill? You suddenly look really depressed.

<Dr Bill> Sorry Bob. I don’t want to be a damp squib. It is good to see you again, and chat about the old days when we were teased about our names.  And it is great to hear that you are enjoying your work so much. I admit I am feeling low, and frankly I welcome the opportunity to talk to someone I know and trust who is not part of the health care system. If you know what I mean?

<Bob> I know exactly what you mean.  Well, I can certainly offer an ear, “a problem shared is a problem halved” as they say. I can’t promise to do any more than that, but feel free to tell me the story, from the beginning. No blood-and-guts gory details though please!

<Dr Bill> Ha! “Tell me the story from the beginning” is what I say to my patients. OK, here goes. I feel increasingly overwhelmed and I feel like I am drowning under a deluge of patients who are banging on the practice door for appointments to see me. My intuition tells me that the problem is not the people, it is the process, but I can’t seem to see through the fog of frustration and chaos to a clear way forward.

<Bob> OK. I confess I know nothing about how your system works, so can you give me a bit more context.

<Dr Bill> Sorry. Yes, of course. I am what is called a single-handed GP and I have a list of about 1500 registered patients and I am contracted to provide primary care for them. I don’t have to do that 24 x 7, the urgent stuff that happens in the evenings and weekends is diverted to services that are designed for that. I work Monday to Friday from 9 AM to 5 PM, and I am contracted to provide what is needed for my patients, and that means face-to-face appointments.

<Bob> OK. When you say “contracted” what does that mean exactly?

<Dr Bill> Basically, the St. Elsewhere’s® Practice is like a small business. Its annual income is a fixed amount per year for each patient on the registration list, and I have to provide the primary care service for them from that pot of cash. And that includes all the costs, including my income, our practice nurse, and the amazing Mrs H. She is the practice receptionist, manager, administrator and all-round fixer-of-anything.

<Bob> Wow! What a great design. No need to spend money on marketing, research, new product development, or advertising! Just 100% pure service delivery of tried-and-tested medical know-how to a captive audience for a guaranteed income. I have commercial customers who would cut off their right arms for an offer like that!

<Dr Bill> Really? It doesn’t feel like that to me. It feels like the more I offer, the more the patients expect. The demand is a bottomless well of wants, but the income is capped and my time is finite!

<Bob> H’mm. Tell me more about the details of how the process works.

<Dr Bill> Basically, I am a problem-solving engine. Patients phone for an appointment, Mrs H books one, the patient comes at the appointed time, I see them, and I diagnose and treat the problem, or I refer on to a specialist if it’s more complicated. That’s basically it.

<Bob> OK. Sounds a lot simpler than 99% of the processes that I’m usually involved with. So what’s the problem?

<Dr Bill> I don’t have enough capacity! After all the appointments for the day are booked Mrs H has to say “Sorry, please try again tomorrow” to every patient who phones in after that.  The patients who can’t get an appointment are not very happy and some can get quite angry. They are anxious and frustrated and I fully understand how they feel. I feel the same.

<Bob> We will come back to what you mean by “capacity”. Can you outline for me exactly how a patient is expected to get an appointment?

<Dr Bill> We tell them to phone at 8 AM for an appointment, there is a fixed number of bookable appointments, and it is first-come-first-served.  That is the only way I can protect myself from being swamped and is the fairest solution for patients.  It wasn’t my idea; it is called Advanced Access. Each morning at 8 AM we switch on the phones and brace ourselves for the daily deluge.

<Bob> You must be pulling my leg! This design is a batch-and-queue phone-in appointment booking lottery!  I guess that is one definition of “fair”.  How many patients get an appointment on the first attempt?

<Dr Bill> Not many.  The appointments are usually all gone by 9 AM and a lot are to people who have been trying to get one for several days. When they do eventually get to see me they are usually grumpy and then spring the trump card “And while I’m here doctor I have a few other things that I’ve been saving up to ask you about”. I help if I can but more often than not I have to say, “I’m sorry, you’ll have to book another appointment!”.

<Bob> I’m not surprised your patients are grumpy. I would be too. And my recollection of seeing my GP with my scary lump wasn’t like that at all. I phoned at lunch time and got an appointment the same day. Maybe I was just lucky, or maybe my GP was as worried as me. But it all felt very calm. When I arrived there was only one other patient waiting, and I was in and out in less than ten minutes – and mightily reassured I can tell you! It felt like a high quality service that I could trust if-and-when I needed it, which fortunately is very infrequently.

<Dr Bill> I dream of being able to offer a service like that! I am prepared to bet you are registered with a group practice and you see whoever is available rather than your own GP. Single-handed GPs like me who offer the old fashioned personal service are a rarity, and I can see why. We must be suckers!

<Bob> OK, so I’m starting to get a sense of this now. Has it been like this for a long time?

<Dr Bill> Yes, it has. When I was younger I was more resilient and I did not mind going the extra mile.  But the pressure is relentless and maybe I’m just getting older and grumpier.  My real fear is I end up sounding like the burned-out cynics that I’ve heard at the local GP meetings; the ones who crow about how they are counting down the days to when they can retire and gloat.

<Bob> You’re the same age as me Bill so I don’t think either of us can use retirement as an exit route, and anyway, that’s not your style. You were never a quitter at university. Your motto was always “when the going gets tough the tough get going”.

<Dr Bill> Yeah I know. That’s why it feels so frustrating. I think I lost my mojo a long time back. Maybe I should just cave in and join up with the big group practice down the road, and accept the inevitable loss of the personal service. They said they would welcome me, and my list of 1500 patients, with open arms.

<Bob> OK. That would appear to be an option, or maybe a compromise, but I’m not sure we’ve exhausted all the other options yet.  Tell me, how do you decide how long a patient needs for you to solve their problem?

<Dr Bill> That’s easy. It is ten minutes. That is the time recommended in the Royal College Guidelines.

<Bob> Eh? All patients require exactly ten minutes?

<Dr Bill> No, of course not!  That is the average time that patients need.  The Royal College did a big survey and that was what most GPs said they needed.

<Bob> Please tell me if I have got this right.  You work 9-to-5, and you carve up your day into 10-minute time-slots called “appointments” and, assuming you are allowed time to have lunch and a pee, that would be six per hour for seven hours which is 42 appointments per day that can be booked?

<Dr Bill> No. That wouldn’t work because I have other stuff to do as well as see patients. There are only 25 bookable 10-minute appointments per day.

<Bob> OK, that makes more sense. So where does 25 come from?

<Dr Bill> Ah! That comes from a big national audit. For an average GP with an average list of 1,500 patients, the average number of patients seeking an appointment per day was found to be 25, and our practice population is typical of the national average in terms of age and deprivation.  So I set the upper limit at 25. The workload is manageable but it seems to generate a lot of unhappy patients and I dare not increase the slots because I’d be overwhelmed with the extra workload and I’m barely coping now.  I feel stuck between a rock and a hard place!

<Bob> So you have set the maximum slot-capacity to the average demand?

<Dr Bill> Yes. That’s OK isn’t it? It will average out over time. That is what average means! But it doesn’t feel like that. The chaos and pressure never seem to go away.

There is a long pause while Bob mulls over what he has heard, sips his pint of Black Sheep and nibbles on the dwindling bowl of peanuts.  Eventually he speaks.

<Bob> Bill, I have some good news and some not-so-good news and then some more good news.

<Dr Bill> Oh dear, you sound just like me when I have to share the results of tests with one of my patients at their follow up appointment. You had better give me the “bad news sandwich”!

<Bob> OK. The first bit of good news is that this is a very common, and easily treatable flow problem.  The not-so-good news is that you will need to change some things.  The second bit of good news is that the changes will not cost anything and will work very quickly.

<Dr Bill> What! You cannot be serious!! Until ten minutes ago you said that you knew nothing about how my practice works and now you are telling me that there is a quick, easy, zero cost solution.  Forgive me for doubting your engineering know-how but I’ll need a bit more convincing than that!

<Bob> And I would too if I were in your position.  The clues to the diagnosis are in the story. You said the process problem was long-standing; you said that you set the maximum slot-capacity to the average demand; and you said that you have a fixed appointment time that was decided by a subjective consensus.  From an engineering perspective, this is a perfect recipe for generating chronic chaos, which is exactly the symptoms you are describing.

<Dr Bill> Is it? OMG. You said this is well understood and resolvable? So what do I do?

<Bob> Give me a minute.  You said the average demand is 25 per day. What sort of service would you like your patients to experience? Would “90% can expect a same day appointment on the first call” be good enough as a starter?

<Dr Bill> That would be game changing!  Mrs H would be over the moon to be able to say “Yes” that often. I would feel much less anxious too, because I know the current system is a potentially dangerous lottery. And my patients would be delighted and relieved to be able to see me that easily and quickly.

<Bob> OK. Let me work this out. Based on what you’ve said, some assumptions, and a bit of flow engineering know-how; you would need to offer up to 31 appointments per day.
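The story does not reveal Bob’s beer-mat arithmetic, so this is only one simplified way to get a figure of the same order. It assumes (and this is an assumption, not something stated above) that the number of requests per day follows a Poisson distribution with a mean of 25, and it looks for the smallest number of slots that covers the whole day’s demand on at least 90% of days. That is a simpler target than “90% of patients get a same-day appointment on the first call”, and it ignores the variation in how long each patient needs.

```python
import math

def slots_needed(mean_daily_demand, service_level):
    """Smallest slot count that covers a Poisson day's demand with the given probability."""
    if not 0 < service_level < 1:
        raise ValueError("service_level must be between 0 and 1 (exclusive)")
    probability = math.exp(-mean_daily_demand)    # P(demand = 0)
    cumulative = probability                      # P(demand <= slots)
    slots = 0
    while cumulative < service_level:
        slots += 1
        probability *= mean_daily_demand / slots  # P(demand = slots) from the previous term
        cumulative += probability
    return slots

print(slots_needed(25, 0.90))  # hypothetical target: demand fully covered on 90% of days
```

Under these assumptions the answer lands in the low thirties, several slots above the average of 25. Those extra slots are headroom that will often go unused, which is why Bob goes on to say that they would not all be filled every day.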

<Dr Bill> What! That’s impossible!!! I told you it would be impossible! That would be another hour a day of face-to-face appointments. When would I do the other stuff? And how did you work that out anyway?

<Bob> I did not say they would have to all be 10-minute appointments, and I did not say you would expect to fill them all every day. I did however say you would have to change some things.  And I did say this is a well understood flow engineering problem.  It is called “resilience design”. That’s how I was able to work it out on the back of this Black Sheep beer mat.

<Dr Bill> H’mm. That is starting to sound a bit more reasonable. What things would I have to change? Specifically?

<Bob> I’m not sure what specifically yet.  I think in your language we would say “I have taken a history, and I have a differential diagnosis, so next I’ll need to examine the patient, and then maybe do some tests to establish the actual diagnosis and to design and decide the treatment plan”.

<Dr Bill> You are learning the medical lingo fast! What do I need to do first? Brace myself for the forensic rubber-gloved digital examination?

<Bob> Alas, not yet and certainly not here. Shall we start with the vital signs? Height, weight, pulse, blood pressure, and temperature? That’s what my GP did when I went with my scary lump.  The patient here is not you, it is your St. Elsewhere’s® Practice, and we will need to translate the medical-speak into engineering-speak.  So one thing you’ll need to learn is a bit of the lingua-franca of systems engineering.  By the way, that’s what I do now. I am a systems engineer, or maybe now a health care systems engineer?

<Dr Bill> Point me in the direction of the HCSE dictionary! The next round is on me. And the nuts!

<Bob> Excellent. I’ll have another Black Sheep and some of those chilli-coated ones. We have work to do.  Let me start by explaining what “capacity” actually means to an engineer. Buckle up. This ride might get a bit bumpy.

This story is fictional, but the subject matter is factual.

Bob’s diagnosis and recommendations are realistic and reasonable.

Chapter 1 of the HCSE dictionary can be found here.

And if you are a GP who recognises these “symptoms” then this may be of interest.

Sometimes change is dramatic. A big improvement appears very quickly. And when that happens we are caught by surprise (and delight).

Our emotional reaction is much faster than our logical response. “Wow! That’s a miracle!”

Our logical Tortoise eventually catches up with our emotional Hare and says “Hare, we both know that there is no such thing as miracles and magic. There must be a rational explanation. What is it?”

And Hare replies “I have no idea, Tortoise.  If I did then it would not have been such a delightful surprise. You are such a kill-joy! Can’t you just relish the relief without analyzing the life out of it?”

Tortoise feels hurt. “But I just want to understand so that I can explain to others. So that they can do it and get the same improvement.  Not everyone has a ‘nothing-ventured-nothing-gained’ attitude like you! Most of us are too fearful of failing to risk trusting the wild claims of improvement evangelists. We have had our fingers burned too often.”

The apparent miracle is real and recent … here is a snippet of the feedback:

Notice carefully the last sentence. It took a year of discussion to get an “OK” and a month of planning to prepare the “GO”.

That is not a miracle and some magic … that took a lot of hard work!

The evangelist is the customer. The supplier is an engineer.

The context is the chronic niggle of patients trying to get an appointment with their GP, and the chronic niggle of GPs feeling overwhelmed with work.

Here is the back story …

In the opening weeks of the 21st Century, the National Primary Care Development Team (NPDT) was formed.  Primary care was a high priority and the government had allocated £168m of investment in the NHS Plan, £48m of which was earmarked to improve GP access.

The approach the NPDT chose was:

harvest best practice +
use a panel of experts +
disseminate best practice.

Dr (later Sir) John Oldham was the innovator and figure-head.  The best practice was copied from Dr Mark Murray from Kaiser Permanente in the USA – the Advanced Access model.  The dissemination method was copied from Dr Don Berwick’s Institute for Healthcare Improvement (IHI) in Boston – the Collaborative Model.

The principle of Advanced Access is “today’s-work-today” which means that all the requests for a GP appointment are handled the same day.  And the proponents of the model outlined the key elements to achieving this:

1. Measure daily demand.
2. Set capacity so that it is sufficient to meet the daily demand.
3. Simple booking rule: “phone today for a decision today”.

But that is not what was rolled out. The design was modified somewhere between aspiration and implementation and in two important ways.

First, by adding a policy of “Phone at 08:00 for an appointment”, and second by adding a policy of “carving out” appointment slots into labelled pots such as ‘Dr X’ or ‘see in 2 weeks’ or ‘annual reviews’.

Subsequent studies suggest that the tweaking happened at the GP practice level and was driven by the fear that, by reducing the waiting time, they would attract more work.

In other words: an assumption that demand for health care is supply-led, and without some form of access barrier, the system would be overwhelmed and never be able to cope.

The result of this well-intended tampering with the Advanced Access design was to invalidate it. Oops!

To a systems engineer this meddling was counter-productive.

The “today’s work today” specification is called a demand-led design and, if implemented competently, will lead to shorter waits for everyone, no need for urgent/routine prioritization and slot carve-out, and a simpler, safer, calmer, more efficient, higher quality, more productive system.

In this context it does not mean “see every patient today” it means “assess and decide a plan for every patient today”.

In reality, the actual demand for GP appointments is not known at the start; which is why the first step is to implement continuous measurement of the daily number and category of requests for appointments.

The second step is to feed back this daily demand information in a visual format called a time-series chart.

The third step is to use this visual tool for planning future flow-capacity, and for monitoring for ‘signals’, such as spikes, shifts, cycles and slopes.
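As a rough illustration of how such a chart can flag a ‘signal’, here is a minimal sketch in Python. It assumes one common charting convention, the XmR (individuals) chart, where the limits are the average plus or minus 2.66 times the average moving range; that choice is an assumption and is not something specified by the Advanced Access design. The daily request counts and the function names are made up for the example.

```python
from statistics import mean


def xmr_limits(values):
    """Return (centre, lower, upper) natural process limits for an XmR chart."""
    if len(values) < 2:
        raise ValueError("need at least two points to compute a moving range")
    centre = mean(values)
    # Moving range = absolute difference between consecutive points.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    avg_mr = mean(moving_ranges)
    # 2.66 is the conventional XmR scaling constant (3 / 1.128).
    return centre, centre - 2.66 * avg_mr, centre + 2.66 * avg_mr


def find_spikes(values, baseline_n):
    """Indices of points outside limits set from the first baseline_n points."""
    _, lower, upper = xmr_limits(values[:baseline_n])
    return [i for i, v in enumerate(values) if v < lower or v > upper]


# Hypothetical daily counts of appointment requests (not real data).
daily_requests = [102, 98, 105, 99, 101, 97, 103, 100, 140, 99]

print(xmr_limits(daily_requests[:8]))
print(find_spikes(daily_requests, baseline_n=8))
```

A sketch like this only detects spikes; shifts, cycles and slopes need additional run rules, and a picture of the data over time remains the primary tool.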

That was not part of the modified design, so the reasonable fear expressed by GPs was (and still is) that by attempting to do today’s-work-today they would unleash a deluge of unmet need … and be swamped/drowned.

So a flood defense barrier was bolted on: the policy of “phone at 08:00 for an appointment today”, and then the policy of channeling the overspill into pots of “embargoed slots”.

The combined effect of this error of omission (omitting the measured demand visual feedback loop) and these errors of commission (the 08:00 policy and appointment slot carve-out policy) effectively prevented the benefits of the Advanced Access design being achieved.  It was a predictable failure.

But no one seemed to realize that at the time.  Perhaps because of the political haste that was driving the process, and perhaps because there were no systems engineers on the panel-of-experts to point out the risks of diluting the design.

It is also interesting to note that the strategic aim of the NPDT was to develop a self-sustaining culture of quality improvement (QI) in primary care. That didn’t seem to have happened either.

The roll out of Advanced Access was not the success that had been hoped for. This is the conclusion from the 300+ page research report published in 2007.

The “Miracle on Tavanagh Avenue” that was experienced this week by both patients and staff was the expected effect of this tampering finally being corrected; and the true potential of the original demand-led design being released – for all to experience.

Remember the essential ingredients?

1. Measure daily demand and feed it back as a visual time-series chart.
2. Set capacity so that it is sufficient to meet the daily demand.
3. Use a simple booking rule: “phone anytime for a decision today”.

But there is also an extra design ingredient that has been added in this case, one that was not part of the original Advanced Access specification, one that frees up GP time to provide the required “resilience” to sustain a same-day service.

And that “secret” ingredient is how the new design worked so quickly and feels like a miracle – safe, calm, enjoyable and productive.

This is health care systems engineering (HCSE) in action.

So congratulations to Harry Longman, the whole team at GP Access, and to Dr Philip Lusty and the team at Riverside Practice, Tavanagh Avenue, Portadown, NI.

You have demonstrated what was always possible.

The fear of failure prevented it before, just as it prevented you doing this until you were so desperate you had no other choices.

To read the fuller story click here.

PS. Keep a close eye on the demand time-series chart and if it starts to rise then investigate the root cause … immediately.

Phil and Pete are having a coffee and a chat.  They both work in the NHS and have been friends for years.

They have different jobs. Phil is a commissioner and an accountant by training, Pete is a consultant and a doctor by training.

They are discussing a challenge that affects them both on a daily basis: unscheduled care.

Both Phil and Pete want to see significant and sustained improvements and how to achieve them is often the focus of their coffee chats.

<Phil> We are agreed that we both want improvement, both from my perspective as a commissioner and from your perspective as a clinician. And we agree that what we want to see is improvement in patient safety, waiting, outcomes, experience for both patients and staff, and use of our limited NHS resources.

<Pete> Yes. Our common purpose, the “what” and “why”, has never been an issue.  Where we seem to get stuck is the “how”.  We have both tried many things but, despite our good intentions, it feels like things are getting worse!

<Phil> I agree. It may be that what we have implemented has had a positive impact and we would have been even worse off if we had done nothing. But I do not know. We clearly have much to learn and, while I believe we are making progress, we do not appear to be learning fast enough.  And I think this knowledge gap exposes another “how” issue: After we have intervened, how do we know that we have (a) improved, (b) not changed or (c) worsened?

<Pete> That is a very good question.  And all that I have to offer as an answer is to share what we do in medicine when we ask a similar question: “How do I know that treatment A is better than treatment B?”  It is the essence of medical research; the quest to find better treatments that deliver better outcomes and at lower cost.  The similarities are strong.

<Phil> OK. How do you do that? How do you know that “Treatment A is better than Treatment B” in a way that anyone will trust the answer?

<Pete> We use a science that is actually very recent on the scientific timeline; it was only firmly established in the first half of the 20th century. One reason for that is that it is rather a counter-intuitive science, and so it requires tools that have been designed and demonstrated to work but whose inner workings most of us do not really understand. They are a bit like magic black boxes.

<Phil> H’mm. Please forgive me for sounding skeptical but that sounds like a big opportunity for making mistakes! If there are lots of these “magic black box” tools then how do you decide which one to use and how do you know you have used it correctly?

<Pete> Those are good questions! Very often we don’t know and in our collective confusion we generate a lot of unproductive discussion.  This is why we are often forced to accept the advice of experts but, I confess, very often we don’t understand what they are saying either! They seem like the medieval Magi.

<Phil> H’mm. So these experts are like ‘magicians’ – they claim to understand the inner workings of the black magic boxes but are unable, or unwilling, to explain in a language that a ‘muggle’ would understand?

<Pete> Very well put. That is just how it feels.

<Phil> So can you explain what you do understand about this magical process? That would be a start.

<Pete> OK, I will do my best.  The first thing we learn in medical research is that we need to be clear about what it is we are looking to improve, and we need to be able to measure it objectively and accurately.

<Phil> That makes sense. Let us say we want to improve the patient’s subjective quality of the A&E experience and objectively we want to reduce the time they spend in A&E. We measure how long they wait.

<Pete> The next thing is that we need to decide how much improvement we need. What would be worthwhile? So in the example you have offered we know that reducing the average time patients spend in A&E by just 30 minutes would have a significant effect on the quality of the patient and staff experience, and as a by-product it would also dramatically improve the 4-hour target performance.

<Phil> OK.  From the commissioning perspective there are lots of things we can do, such as commissioning alternative paths for specific groups of patients; in effect diverting some of the unscheduled demand away from A&E to a more appropriate service provider.  But these are the sorts of thing we have been experimenting with for years, and it brings us back to the question: How do we know that any change we implement has had the impact we intended? The system seems, well, complicated.

<Pete> In medical research we are very aware that the system we are changing is very complicated and that we do not have the power of omniscience.  We cannot know everything.  Realistically, all we can do is to focus on objective outcomes, collect small samples from the data ocean, and use those in an attempt to draw conclusions we can trust. We have to design our experiment with care!

<Phil> That makes sense. Surely we just need to measure the stuff that will tell us if our impact matches our intent. That sounds easy enough. What’s the problem?

<Pete> The problem we encounter is that when we measure “stuff” we observe patient-to-patient variation, and that is before we have made any changes.  Any impact that we may have is obscured by this “noise”.

<Phil> Ah, I see.  So if our intervention generates a small impact then it will be more difficult to see amidst this background noise. Like trying to see fine detail in a fuzzy picture.

<Pete> Yes, exactly like that.  And it raises the issue of “errors”.  In medical research we talk about two different types of error; we make the first type of error when our actual impact is zero but we conclude from our data that we have made a difference; and we make the second type of error when we have made an impact but we conclude from our data that we have not.

<Phil> OK. So does that imply that the more “noise” we observe in our measure-for-improvement before we make the change, the more likely we are to make one or other error?

<Pete> Precisely! So before we do the experiment we need to design it so that we reduce the probability of making both of these errors to an acceptably low level.  So that we can be assured that any conclusion we draw can be trusted.

<Phil> OK. So how exactly do you do that?

<Pete> We know that whenever there is “noise” and whenever we use samples then there will always be some risk of making one or other of the two types of error.  So we need to set a threshold for both. We have to state clearly how much confidence we need in our conclusion. For example, we often use the convention that we are willing to accept a 1 in 20 chance of making the Type I error.

<Phil> Let me check if I have heard you correctly. Suppose that, in reality, our change has no impact and we have set the risk threshold for a Type I error at 1 in 20, and suppose we repeat the same experiment 100 times – are you saying that we should expect about five of our experiments to show data that says our change has had the intended impact when in reality it has not?

<Pete> Yes. That is exactly it.

<Phil> OK.  But in practice we cannot repeat the experiment 100 times, so we just have to accept the 1 in 20 chance that we will make a Type I error, and we won’t know we have made it if we do. That feels a bit chancy. So why don’t we just set the threshold to 1 in 100 or 1 in 1000?

<Pete> We could, but doing that has a consequence.  If we reduce the risk of making a Type I error by setting our threshold lower, then we will increase the risk of making a Type II error.

<Phil> Ah! I see. The old swings-and-roundabouts problem. By the way, do these two errors have different names that would make it easier to remember and to explain?

<Pete> Yes. The Type I error is called a False Positive. It is like concluding that a patient has a specific diagnosis when in reality they do not.

<Phil> And the Type II error is called a False Negative?

<Pete> Yes.  And we want to avoid both of them, and to do that we have to specify a separate risk threshold for each error.  The convention is to call the threshold for the false positive the alpha level, and the threshold for the false negative the beta level.

<Phil> OK. So now we have three things we need to be clear on before we can do our experiment: the size of the change that we need, the risk of the false positive that we are willing to accept, and the risk of a false negative that we are willing to accept.  Is that all we need?

<Pete> In medical research we learn that we need six pieces of the experimental design jigsaw before we can proceed. We only have three pieces so far.

<Phil> What are the other three pieces then?

<Pete> We need to know the average value of the metric we are intending to improve, because that is our baseline from which improvement is measured.  Improvements are often framed as a percentage improvement over the baseline.  And we need to know the spread of the data around that average, the “noise” that we referred to earlier.

<Phil> Ah, yes!  I forgot about the noise.  But that is only five pieces of the jigsaw. What is the last piece?

<Pete> The size of the sample.

<Phil> Eh?  Can’t we just go with whatever data we can realistically get?

<Pete> Sadly, no.  The size of the sample is how we control the risk of a false negative error.  The more data we have the lower the risk. This is referred to as the power of the experimental design.

<Phil> OK. That feels familiar. I know that the more experience I have of something the better my judgement gets. Is this the same thing?

<Pete> Yes. Exactly the same thing.

<Phil> OK. So let me see if I have got this. To know if the impact of the intervention matches our intention we need to design our experiment carefully. We need all six pieces of the experimental design jigsaw and they must all fall inside our circle of control. We can measure the baseline average and spread; we can specify the impact we will accept as useful; we can specify the risks we are prepared to accept of making the false positive and false negative errors; and we can collect the required amount of data after we have made the intervention so that we can trust our conclusion.

<Pete> Perfect! That is how we are taught to design research studies so that we can trust our results, and so that others can trust them too.

<Phil> So how do we decide how big the post-implementation data sample needs to be? I can see we need to collect enough data to avoid a false negative but we have to be pragmatic too. There would appear to be little value in collecting more data than we need. It would cost more and could delay knowing the answer to our question.

<Pete> That is precisely the trap that many inexperienced medical researchers fall into. They set their sample size according to what is achievable and affordable, and then they hope for the best!

<Phil> Well, we do the same. We analyse the data we have and we hope for the best.  In the magical metaphor we are asking our data analysts to pull a white rabbit out of the hat.  It sounds rather irrational and unpredictable when described like that! Have medical researchers learned a way to avoid this trap?

<Pete> Yes, it is a tool called a power calculator.

<Phil> Ooooo … a power tool … I like the sound of that … that would be a cool tool to have in our commissioning bag of tricks. It would be like a magic wand. Do you have such a thing?

<Pete> Yes.

<Phil> And do you understand how the power tool magic works well enough to explain to a “muggle”?

<Pete> Not really. To do that means learning some rather unfamiliar language and some rather counter-intuitive concepts.

<Phil> Is that the magical stuff I hear lurks between the covers of a medical statistics textbook?

<Pete> Yes. Scary looking mathematical symbols and unfathomable spells!

<Phil> Oh dear!  Is there another way to gain a working understanding of this magic? Something a bit more pragmatic? A path that a ‘statistical muggle’ might be able to follow?

<Pete> Yes. It is called a simulator.

<Phil> You mean like a flight simulator that pilots use to learn how to control a jumbo jet before ever taking a real one out for a trip?

<Pete> Exactly like that.

<Phil> Do you have one?

<Pete> Yes. It was how I learned about this “stuff” … pragmatically.

<Phil> Can you show me?

<Pete> Of course.  But to do that we will need a bit more time, another coffee, and maybe a couple of those tasty looking Danish pastries.

<Phil> A wise investment I’d say.  I’ll get the coffee and pastries, if you fire up the engines of the simulator.
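For the curious, here is a rough sketch of the kind of simulator Pete might fire up. It is an illustration only, not the actual tool. It repeats a before-and-after experiment many times and counts how often the data says “improved”. The 30-minute improvement and the 1-in-20 threshold come from the conversation above; the baseline average of 180 minutes, the spread of 60 minutes, the normally distributed waits, the simple z-test and all the function names are assumptions made for the example.

```python
import random
from statistics import NormalDist, mean, stdev


def one_experiment(rng, baseline_mean, sd, true_effect, n, alpha):
    """Simulate one before/after study; return True if it reports a change."""
    before = [rng.gauss(baseline_mean, sd) for _ in range(n)]
    after = [rng.gauss(baseline_mean - true_effect, sd) for _ in range(n)]
    # Standard error of the difference between the two sample means.
    std_err = (stdev(before) ** 2 / n + stdev(after) ** 2 / n) ** 0.5
    z = (mean(before) - mean(after)) / std_err
    # Two-sided p-value from a normal approximation (an assumed, simple test).
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_value < alpha


def detection_rate(true_effect, n, runs=2000, alpha=0.05,
                   baseline_mean=180.0, sd=60.0, seed=1):
    """Fraction of simulated experiments that conclude 'there was a change'."""
    rng = random.Random(seed)
    hits = sum(
        one_experiment(rng, baseline_mean, sd, true_effect, n, alpha)
        for _ in range(runs)
    )
    return hits / runs


# No real impact: the detection rate is the false positive (Type I) rate.
print("false positive rate:", detection_rate(true_effect=0.0, n=50))

# Real 30-minute impact: the detection rate is the power (1 - beta).
for sample_size in (25, 50, 100):
    print(sample_size, "patients per group, power:",
          detection_rate(true_effect=30.0, n=sample_size))
```

With no real impact the detection rate should hover around the 1-in-20 threshold. With a real impact it rises as the sample size grows, which is the sixth piece of the jigsaw at work.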

The immortal words from Apollo 13 that alerted us to an evolving catastrophe …

… and that is what we are seeing in the UK health and social care system … using the thermometer of A&E 4-hour performance. England is the red line.


The chart shows that this is not a sudden change, it has been developing over quite a long period of time … so why does it feel like an unpleasant surprise?

One reason may be that NHS England is using performance management techniques that were out of date in the 1980’s and are obsolete in the 2010’s!

Let me show you what I mean. This is a snapshot from the NHS England Board Minutes for November 2016.

RAG stands for Red-Amber-Green and what we want to see on a Risk Assessment is Green for the most important stuff like safety, flow, quality and affordability.

We are not seeing that.  We are seeing Red/Amber for all of them. It is an evolving catastrophe.

A risk RAG chart is an obsolete performance management tool.

Here is another snippet …


This demonstrates the usual mix of single point aggregates for the most recent month (October 2016); an arbitrary target (4 hours) used as a threshold to decide failure/not failure; two-point comparisons (October 2016 versus October 2015); and a sprinkling of ratios. Not a single time-series chart in sight. No pictures that tell a story.

Click here for the full document (which does also include some very sensible plans to maintain hospital flow through the bank holiday period).

The risk of this way of presenting system performance data is that it is a minefield of intuitive traps for the unwary.  Invisible pitfalls that can lead to invalid conclusions, unwise decisions, potentially ineffective and/or counter-productive actions, and failure to improve. These methods are risky and that is why they should be obsolete.

And if NHSE is using obsolete tools then what hope do CCGs and Trusts have?

Much better tools have been designed.  Tools that are used by organisations that are innovative, resilient, commercially successful and that deliver safety, on-time delivery, quality and value for money. At the same time.

And the older tools are obsolete outside the NHS because in the competitive context of the dog-eat-dog real world, organisations do not survive if they do not innovate, improve and learn as fast as their competitors.  They do not have the luxury of being shielded from reality by having a central tax-funded monopoly!

And please do not misinterpret my message here; I am a 100% raving fan of the NHS ethos of “available to all and free at the point of delivery” and an NHS that is funded centrally and fairly. That is not my issue.

My issue is the continued use of obsolete performance management tools in the NHS.

Q: So what are the alternatives? What do the successful commercial organisations use instead?

A: System behaviour charts.

SBCs are pictures of how the system is behaving over time – pictures that tell a story – pictures that have meaning – pictures that we can use to diagnose, design and deliver a better outcome than the one we are heading towards.

Pictures like the A&E performance-over-time chart above.

Click here for more on how and why.

Therefore, if the DoH, NHSE, NHSI, STPs, CCGs and Trust Boards want to achieve their stated visions and missions then the writing-on-the-wall says that they will need to muster some humility and learn how successful organisations do this.

This is not a comfortable message to hear and it is easier to be defensive than receptive.

The NHS has to change if it wants to survive and continue to serve the people who pay the salaries. And time is running out. Continuing as we are is not an option. Complaining and blaming are not options. Doing nothing is not an option.

Learning is the only option.

Anyone can learn to use system behaviour charts.  No one needs to rely on averages, two-point comparisons, ratios, targets, and the combination of failure-metrics and us-versus-them-benchmarking that leads to the chronic mediocrity trap.

And there is hope for those with enough hunger and humility, and who are prepared to do the hard work of developing their personal, team, department and organisational capability to use better management methods.

Apollo 13 is a true story.  The catastrophe was averted.  The astronauts were brought home safely.  The film retells the story of how that miracle was achieved. Perhaps watching the whole film would be somewhere to start, because it holds many valuable lessons for us all – lessons on how effective teams behave.


About 25 years ago a paper was published in the Harvard Business Review with the interesting title of “Teaching Smart People How To Learn”.

The uncomfortable message was that many people who are top of the intellectual rankings are actually very poor learners.

This sounds like a paradox.  How can people be high-achievers and yet be unable to learn?

Health care systems are stuffed full of super-smart, high-achieving professionals. The cream of the educational crop. The top 2%. They are called “doctors”.

And we have a problem with improvement in health care … a big problem … the safety, delivery, quality and affordability of the NHS is getting worse. Not better.

Improvement implies change and change implies learning, so if smart people struggle to learn then could that explain why health care systems find self-improvement so difficult?

This paragraph from the 1991 HBR paper feels uncomfortably familiar:


The author, Chris Argyris, refers to something called “single-loop learning” and if we translate this management-speak into the language of medicine it would come out as “treating the symptom and ignoring the disease“.  That is poor medicine.

Chris also suggests an antidote to this problem and gave it the label “double-loop learning” which if translated into medical speak becomes “diagnosis“.  And that is something that doctors can relate to because without a diagnosis, a justifiable treatment is difficult to formulate.

We need to diagnose the root cause(s) of the NHS disease.

The 1991 HBR paper refers back to an earlier 1977 HBR paper called Double Loop Learning in Organisations where we find the theory that underpins it.

The proposed hypothesis is that we all have cognitive models that we use to decide our actions (and in-actions), what I have referred to before as ChimpWare.  In it is a reference to a table published in a 1974 book and the message is that Single-Loop learning is a manifestation of a Model I theory-in-action.


And if we consider the task that doctors are expected to do then we can empathize with their dominant Model I approach.  Health care is a dangerous business.  Doctors can cause a lot of unintentional harm – both physical and psychological.  Doctors are dealing with a very, very complex system – a human body – that they only partially understand.  No two patients are exactly the same and illness is a dynamic process.  Everyone’s expectations are high. We have come a long way since the days of blood-letting and leeches!  Failure is not tolerated.

Doctors are intelligent and competitive … they had to be to win the education race.

Doctors must make tough decisions and have to have tough conversations … many, many times … and yet not be consumed in the process.  They often have to suppress emotions to be effective.

Doctors feel the need to protect patients from harm – both physical and emotional.

And collectively they do a very good job.  Doctors are respected and trusted professionals.

But …  to quote Chris Argyris …

“Model I blinds people to their weaknesses. For instance, the six corporate presidents were unable to realize how incapable they were of questioning their assumptions and breaking through to fresh understanding. They were under the illusion that they could learn, when in reality they just kept running around the same track.”

This blindness is self-reinforcing because …

“All parties withheld information that was potentially threatening to themselves or to others, and the act of cover-up itself was closed to discussion.”

How many times have we seen this in the NHS?

The Mid-Staffordshire Hospital debacle that led to the Francis Report is all the evidence we need.

So what is the way out of this double-bind?

Chris gives us some hints with his Model II theory-in-use.

  1. Valid information – Study.
  2. Free and informed choice – Plan.
  3. Constant monitoring of the implementation – Do.

The skill required is to question assumptions and break through to fresh understanding and we can do that with a design-led approach because that is what designers do.

They bring their unconscious assumptions up to awareness and ask “Is that valid?” and “What if” questions.

It is called Improvement-by-Design.

And the good news is that this Model II approach works in health care, and we know that because the evidence is accumulating.


Many of the challenges that we face in delivering effective and affordable health care do not have well understood and generally accepted solutions.

If they did there would be no discussion or debate about what to do and the results would speak for themselves.

This lack of understanding is leading us to try to solve a complicated system design challenge in our heads.  Intuitively.

And trying to do it this way is fraught with frustration and risk because our intuition tricks us. It was this sort of challenge that led Professor Rubik to invent his famous 3D Magic Cube puzzle.

It is difficult enough to learn how to solve the Magic Cube puzzle by trial and error; it is even more difficult to attempt to do it inside our heads! Intuitively.

And we know the Rubik Cube puzzle is solvable, so all we need are some techniques, tools and training to improve our Rubik Cube solving capability.  We can all learn how to do it.

Returning to the challenge of safe and affordable health care, and to the specific problem of unscheduled care, A&E targets, delayed transfers of care (DTOC), finance, fragmentation and chronic frustration.

This is a systems engineering challenge so we need some systems engineering techniques, tools and training before attempting it.  Not after failing repeatedly.


One technique that a systems engineer will use is called a Vee Diagram such as the one shown above.  It shows the sequence of steps in the generic problem solving process and it has the same sequence that we use in medicine for solving problems that patients present to us …

Diagnose, Design and Deliver

which is also known as …

Study, Plan, Do.

Notice that there are three words in the diagram that start with the letter V … value, verify and validate.  These are probably the three most important words in the vocabulary of a systems engineer.

One tool that a systems engineer always uses is a model of the system under consideration.

Models come in many forms from conceptual to physical and are used in two main ways:

  1. To assist the understanding of the past (diagnosis)
  2. To predict the behaviour in the future (prognosis)

And the process of creating a system model, the sequence of steps, is shown in the Vee Diagram.  The systems engineer’s objective is a validated model that can be trusted to make good-enough predictions; ones that support making wiser decisions of which design options to implement, and which not to.

So if a systems engineer presented us with a conceptual model that is intended to assist our understanding, then we will require some evidence that all stages of the Vee Diagram process have been completed.  Evidence that provides assurance that the model predictions can be trusted.  And the scope over which they can be trusted.

Last month a report was published by the Nuffield Trust that is entitled “Understanding patient flow in hospitals”  and it asserts that traffic flow on a motorway is a valid conceptual model of patient flow through a hospital.  Here is a direct quote from the second paragraph in the Executive Summary:

Unfortunately, no evidence is provided in the report to support the validity of the statement and that omission should ring an alarm bell.

The observation that “the hospitals with the least free space struggle the most” is not a validation of the conceptual model.  Validation requires a concrete experiment.

To illustrate why observation is not validation let us consider a scenario where I have a headache and I take a paracetamol and my headache goes away.  I now have some evidence that shows a temporal association between what I did (take paracetamol) and what I got (a reduction in head pain).

But this is not a valid experiment because I have not considered the other seven possible combinations of headache before (Y/N), paracetamol (Y/N) and headache after (Y/N).

An association cannot be used to prove causation; not even a temporal association.

When I do not understand the cause, and I am without evidence from a well-designed experiment, then I might be tempted to intuitively jump to the (invalid) conclusion that “headaches are caused by lack of paracetamol!” and if untested this invalid judgement may persist and even become a belief.

Understanding causality requires an approach called counterfactual analysis; otherwise known as “What if?” And we can start that process with a thought experiment using our rhetorical model.  But we must remember that we must always validate the outcome with a real experiment. That is how good science works.

A famous thought experiment was conducted by Albert Einstein when he asked the question “If I were sitting on a light beam and moving at the speed of light what would I see?” This question led him to the Theory of Relativity which completely changed the way we now think about space and time.  Einstein’s model has been repeatedly validated by careful experiment, and has allowed engineers to design and deliver valuable tools such as the Global Positioning System which uses relativity theory to achieve high positional precision and accuracy.

So let us conduct a thought experiment to explore the ‘faster movement requires more space‘ statement in the case of patient flow in a hospital.

First, we need to define what we mean by the words we are using.

The phrase ‘faster movement’ is ambiguous.  Does it mean higher flow (more patients per day being admitted and discharged) or does it mean shorter length of stay (the interval between the admission and discharge events for individual patients)?

The phrase ‘more space’ is also ambiguous. In a hospital that implies physical space i.e. floor-space that may be occupied by corridors, chairs, cubicles, trolleys, and beds.  So are we actually referring to flow-space or storage-space?

What we have in this over-simplified statement is the conflation of two concepts: flow-capacity and space-capacity. They are different things. They have different units. And the result of conflating them is meaningless and confusing.

However, our stated goal is to improve understanding so let us consider one combination, and let us be careful to be more precise with our terminology, “higher flow always requires more beds“. Does it? Can we disprove this assertion with an example where higher flow required fewer beds (i.e. less space-capacity)?

The relationship between flow and space-capacity is well understood.

The starting point is Little’s Law which was proven mathematically in 1961 by J.D.C. Little and it states:

Average work in progress = Average lead time × Average flow.

In the hospital context, work in progress is the number of occupied beds, lead time is the length of stay and flow is admissions or discharges per time interval (which must be the same on average over a long period of time).

(NB. Engineers are rather pedantic about units so let us check that this makes sense: the unit of WIP is ‘patients’, the unit of lead time is ‘days’, and the unit of flow is ‘patients per day’ so ‘patients’ = ‘days’ * ‘patients / day’. Correct. Verified. Tick.)

So, is there a situation where flow can increase and WIP can decrease? Yes. When lead time decreases. Little’s Law says that is possible. We have disproved the assertion.

Let us take the other interpretation of ‘faster movement’ as shorter length of stay: i.e. shorter length of stay always requires more beds.  Is this correct? No. If flow remains the same then Little’s Law states that we will require fewer beds. This assertion is disproved as well.
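The two disproofs above can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function name and the numbers (a 5-day average stay and 20 admissions per day) are assumptions chosen to make the sums easy, not data from any real hospital.

```python
def occupied_beds(length_of_stay_days: float, flow_patients_per_day: float) -> float:
    """Little's Law: average work in progress = average lead time * average flow."""
    return length_of_stay_days * flow_patients_per_day


# Hypothetical baseline: 5-day average stay, 20 admissions (and discharges) per day.
baseline = occupied_beds(5.0, 20.0)

# Assertion 1: "higher flow always requires more beds".
# Counterexample: flow rises to 22 per day while the stay falls to 4 days.
higher_flow_shorter_stay = occupied_beds(4.0, 22.0)

# Assertion 2: "shorter length of stay always requires more beds".
# Counterexample: flow is unchanged and the stay falls to 4 days.
same_flow_shorter_stay = occupied_beds(4.0, 20.0)

if __name__ == "__main__":
    print(baseline)                  # 100.0 beds occupied on average
    print(higher_flow_shorter_stay)  # 88.0 beds, despite the higher flow
    print(same_flow_shorter_stay)    # 80.0 beds
```

One counterexample is enough to disprove each "always" statement, which is all this sketch sets out to show.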

And we need to remember that Little’s Law is proven to be valid for averages. Does that shed any light on the source of our confusion? Could the assertion about flow and beds actually be about the variation in flow over time and not about the average flow?

And this is also well understood. The original work on it was done almost exactly 100 years ago by Agner Krarup Erlang and the problem he looked at was the quality of customer service of the early telephone exchanges. Specifically, how likely was the caller to get the “all lines are busy, please try later” response.

What Erlang showed was that there is a mathematical relationship between the number of calls being made (the demand), the probability of a call being connected first time (the service quality) and the number of telephone circuits and switchboard operators available (the service cost).

So it appears that we already have a validated mathematical model that links flow, quality and cost that we might use if we substitute ‘patients’ for ‘calls’, ‘beds’ for ‘telephone circuits’, and ‘being connected’ for ‘being admitted’.
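To make that substitution concrete, here is a minimal sketch of one of Erlang's results, the loss formula usually called Erlang B. It assumes random (Poisson) arrivals and that a patient who finds every bed full is turned away rather than queued, which is a simplification for a hospital. The function names and the example figures are hypothetical and are not taken from the report or from any real hospital.

```python
def erlang_b(beds: int, offered_load: float) -> float:
    """Probability that an arrival finds all beds occupied (Erlang B).

    offered_load = average arrivals per day * average length of stay in days.
    Uses the standard recursion B(0) = 1, B(n) = a*B(n-1) / (n + a*B(n-1)).
    """
    blocking = 1.0
    for n in range(1, beds + 1):
        blocking = offered_load * blocking / (n + offered_load * blocking)
    return blocking


def beds_needed(offered_load: float, max_blocking: float) -> int:
    """Smallest number of beds whose blocking probability is at or below max_blocking."""
    if max_blocking <= 0.0:
        raise ValueError("max_blocking must be greater than zero")
    beds = 0
    while erlang_b(beds, offered_load) > max_blocking:
        beds += 1
    return beds


if __name__ == "__main__":
    # Hypothetical demand: 20 admissions per day with a 5-day average stay.
    load = 20.0 * 5.0
    for beds in (100, 110, 120):
        print(beds, round(erlang_b(beds, load), 4))
    print(beds_needed(load, 0.01))
```

The point of the sketch is qualitative: when demand varies, the number of beds needed to keep "sorry, we are full" rare is larger than the average load, and how much larger depends on the service quality we ask for.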

And this topic of patient flow, A&E performance and Erlang queues has been explored already … here.

So a telephone exchange is a more valid model of a hospital than a motorway.

We are now making progress in deepening our understanding.

The use of an invalid, untested, conceptual model is sloppy systems engineering.

So if the engineering is sloppy we would be unwise to fully trust the conclusions.

And I share this feedback in the spirit of black box thinking because I believe that there are some valuable lessons to be learned here – by us all.

To vote for this topic please click here.
To subscribe to the blog newsletter please click here.
To email the author please click here.

[Beep] Bob’s computer alerted him to Leslie signing on to the Webex session.

<Bob> Good afternoon Leslie, how are you? It seems a long time since we last chatted.

<Leslie> Hi Bob. I am well and it has been a long time. If you remember, I had to loop out of the Health Care Systems Engineering training because I changed job, and it has taken me a while to bring a lot of fresh skeptics around to the idea of improvement-by-design.

<Bob> Good to hear, and I assume you did that by demonstrating what was possible by doing it, delivering results, and describing the approach.

<Leslie> Yup. And as you know, even with objective evidence of improvement it can take a while because that exposes another gap, the one between intent and impact.  Many people get rather defensive at that point, so I have had to take it slowly. Some people get really fired up though.

<Bob> Yes. Respect, challenge, patience and persistence are all needed. So, where shall we pick up?

<Leslie> The old chestnut of winter pressures and A&E targets.  Except that it is an all-year problem now and according to what I read in the news, everyone is predicting a ‘melt-down’.

<Bob> Did you see last week’s IS blog on that very topic?

<Leslie> Yes, I did!  And that is what prompted me to contact you and to re-start my CHIPs coaching.  It was a real eye opener.  I liked the black swan code-named “RC9” story, it makes it sound like a James Bond film!

<Bob> I wonder how many people dug deeper into how “RC9” achieved that rock-steady A&E performance despite a rising tide of arrivals and admissions?

<Leslie> I did, and I saw several examples of anti-carve-out design.  I have read through my notes and we have talked about carve out many times.

<Bob> Excellent. Being able to see the signs of competent design is just as important as the symptoms of inept design. So, what shall we talk about?

<Leslie> Well, by coincidence I was sent a copy of a report entitled “Understanding patient flow in hospitals” published by one of the leading Think Tanks and I confess it made no sense to me.  Can we talk about that?

<Bob> OK. Can you describe the essence of the report for me?

<Leslie> Well, in a nutshell it said that flow needs space so if we want hospitals to flow better we need more space, in other words more beds.

<Bob> And what evidence was presented to support that hypothesis?

<Leslie> The authors equated the flow of patients through a hospital to the flow of traffic on a motorway. They presented a table of numbers that made no sense to me, I think partly because there are no units stated for some of the numbers … I’ll email you a picture.


<Bob> I agree this is not a very informative table.  I am not sure what the definition of “capacity” is here and it may be that the authors are equating “hospital bed” to “area of tarmac”.  Anyway, the assertion that hospital flow is equivalent to motorway flow is inaccurate.  There are some similarities and traffic engineering is an interesting subject, but they are not equivalent.  A hospital is more like a busy city with junctions, cross-roads, traffic lights, roundabouts, zebra crossings, pelican crossings and all manner of unpredictable factors such as cyclists and pedestrians. Motorways are intentionally designed without these “impediments”, for obvious reasons! A complex adaptive flow system like a hospital cannot be equated to a motorway. It is a dangerous over-simplification.

<Leslie> So, if the hospital-motorway analogy is invalid then the conclusions are also invalid?

<Bob> Sometimes, by accident, we get a valid conclusion from an invalid method. What were the conclusions?

<Leslie> That the solution to improving A&E performance is more space (i.e. hospital beds) but there is no more money to build them or people to staff them.  So the recommendations are to reduce volume, redesign rehabilitation and discharge processes, and improve IT systems.

<Bob> So just re-iterating the habitual exhortations and nothing about using well-understood systems engineering methods to accurately diagnose the actual root cause of the ‘symptoms’, which is likely to be the endemic carveoutosis multiforme, and then treat accordingly?

<Leslie> No. I could not find the term “carve out” anywhere in the document.

<Bob> Oh dear.  Based on that observation, I do not believe this latest Think Tank report is going to be any more effective than the previous ones.  Perhaps asking “RC9” to write an account of what they did and how they learned to do it would be more informative?  They did not reduce volume, and I doubt they opened more beds, and their annual report suggests they identified some space and flow carveoutosis and treated it. That is what a competent systems engineer would do.

<Leslie> Thanks Bob. Very helpful as always. What is my next step?

<Bob> Some ISP-2 brain-teasers, a juicy ISP-2 project, and some one day training workshops for your all-fired-up CHIPs.

<Leslie> Bring it on!

For more posts like this please vote here.
For more information please subscribe here.

An effective way to improve is to learn from others who have demonstrated the capability to achieve what we seek.  To learn from success.

Another effective way to improve is to learn from those who are not succeeding … to learn from failures … and that means … to learn from our own failings.

But from an early age we are socially programmed with a fear of failure.

The training starts at school where failure is not tolerated, nor is challenging the given dogma.  Paradoxically, the effect of our fear of failure is that our ability to inquire, experiment, learn, adapt, and to be resilient to change is severely impaired!

So further failure in the future becomes more likely, not less likely. Oops!

Fortunately, we can develop a healthier attitude to failure and we can learn how to harness the gap between intent and impact as a source of energy, creativity, innovation, experimentation, learning, improvement and growing success.

And health care provides us with ample opportunities to explore this unfamiliar terrain. The creative domain of the designer and engineer.

The scatter plot below is a snapshot of the A&E 4 hr target yield for all NHS Trusts in England for the month of July 2016.  The “constitutional” performance requirement is better than 95%.  The delivered whole system average is 85%.  The majority of Trusts are failing, and the Trust-to-Trust variation is rather wide. Oops!

This stark picture of the gap between intent (95%) and impact (85%) prompts some uncomfortable questions:

Q1: How can one Trust achieve 98% and yet another can do no better than 64%?

Q2: What can all Trusts learn from these high and low flying outliers?

[NB. I have not asked the question “Who should we blame for the failures?” because the name-shame-blame-game is also a predictable consequence of our fear-of-failure mindset.]

Let us dig a bit deeper into the information mine, and as we do that we need to be aware of a trap:

A snapshot-in-time tells us very little about how the system and the set of interconnected parts is behaving-over-time.

We need to examine the time-series charts of the outliers, just as we would ask for the temperature, blood pressure and heart rate charts of our patients.

Here are the A&E 4 hr charts, by month for the last six years, for a sample of the high-fliers. They are all slightly different and we get the impression that the lower two are struggling more than the upper two to stay aloft … especially in winter.

And here are the last six years by month A&E 4 hr charts for a sample of the low-fliers.  The Mark I Eyeball Test results are clear … these swans are falling out of the sky!

So we need to generate some testable hypotheses to explain these visible differences, and then we need to examine the available evidence to test them.

One hypothesis is “rising demand”.  It says that “the reason our A&E is failing is because demand on A&E is rising“.

Another hypothesis is “slow flow”.  It says that “the reason our A&E is failing is because of the slow flow through the hospital because of delayed transfers of care (DTOCs)“.

So, if these hypotheses account for the behaviour we are observing then we would predict that the “high fliers” are (a) diverting A&E arrivals elsewhere, and (b) reducing admissions to free up beds to hold the DTOCs.

Let us look at the freely available data for the highest flyer … the green dot on the scatter gram … code-named “RC9”.

The top chart is the A&E arrivals per month.

The middle chart is the A&E 4 hr target yield per month.

The bottom chart is the emergency admissions per month.

Both arrivals and admissions are increasing, while the A&E 4 hr target yield is rock steady!

And arranging the charts this way allows us to see the temporal patterns more easily (and the images are deliberately arranged to show the overall pattern-over-time).

Patterns like the change-for-the-better that appears in the middle of the winter of 2013 (i.e. when many other trusts were complaining that their sagging A&E performance was caused by “winter pressures”).

The objective evidence seems to disprove the “rising demand”, “slow flow” and “winter pressure” hypotheses!

So what can we learn from our failure to adequately explain the reality we are seeing?

The trust code-named “RC9” is Luton and Dunstable, and it is an average district general hospital, on the surface.  So to reveal some clues about what actually happened there, we need to read their Annual Report for 2013-14.  It is a public document and it can be downloaded here.

This is just a snippet …

… and there are lots more knowledge nuggets like this in there …

… it is a treasure trove of well-known examples of good system flow design.

The results speak for themselves!

Q: How many black swans does it take to disprove the hypothesis that “all swans are white”?

A: Just one.

“RC9” is a black swan. An outlier. A positive deviant. “RC9” has disproved the “impossibility” hypothesis.

And there is another flock of black swans living in the North East … in the Newcastle area … so the “Big cities are different” hypothesis does not hold water either.

The challenge here is a human one.  A human factor.  Our learned fear of failure.

Learning-how-to-fail is the way to avoid failing-how-to-learn.

And to read more about that radical idea I strongly recommend reading the recently published book called Black Box Thinking by Matthew Syed.

It starts with a powerful story about the impact of human factors in health care … and here is a short video of Martin Bromiley describing what happened.

The “black box” that both Martin and Matthew refer to is the one that is used in air accident investigations to learn from what happened, and to use that learning to design safer aviation systems.

Martin Bromiley has founded a charity to support the promotion of human factors in clinical training, the Clinical Human Factors Group.

So if we can muster the courage and humility to learn how to do this in health care for patient safety, then we can also learn how to do it for flow, quality and productivity.

Our black swan called “RC9” has demonstrated that this goal is attainable.

And the body of knowledge needed to do this already exists … it is called Health and Social Care Systems Engineering (HSCSE).


Postscript: And I am pleased to share that Luton & Dunstable features in the House of Commons Health Committee report entitled Winter Pressures in A&E Departments that was published on 3rd Nov 2016.

Here is part of what L&D shared to explain their deviant performance:


These points describe rather well the essential elements of a pull design, which is the antidote to the rather more prevalent pressure cooker design.

On 5th July 2018, the NHS will be 70 years old, and like many of those it was created to serve, it has become elderly and frail.

We live much longer, on average, than we used to and the growing population of frail elderly are presenting an unprecedented health and social care challenge that the NHS was never designed to manage.

The creases and cracks are showing, and each year feels more pressured than the last.

This week a story that illustrates this challenge was shared with me along with permission to broadcast …

“My mother-in-law is 91, in general she is amazingly self-sufficient, able to arrange most of her life with reasonable care at home via a council tendered care provider.

She has had Parkinson’s for years, needing regular medication to enable her to walk and eat (it affects her jaw and swallowing capability). So the care provision is time critical, to get up, have lunch, have tea and get to bed.

She’s also going deaf, profoundly in one ear, pretty bad in the other. She wears a single ‘in-ear’ aid, which has a micro-switch on/off toggle, far too small for her to see or operate. Most of the carers can’t put it in, and fail to switch it off.

Her care package is well drafted, but rarely adhered to. It should be 45 minutes in the morning, 30, 15, 30 through the day. Each time administering the medications from the dossette box. Despite the register in/out process from the carers, many visits are far less time than designed (and paid for by the council), with some lasting 8 minutes instead of 30!

Most carers don’t ensure she takes her meds, which sometimes leads to dropped pills on the floor, with no hope of picking them up!

While the care is supposedly ‘time critical’ the provider don’t manage it via allocated time slots, they simply provide lists, that imply the order of work, but don’t make it clear. My mother-in-law (Mum) cannot be certain when the visit will occur, which makes going out very difficult.

The carers won’t cook food, but will micro-wave it, thus if a cooked meal is to happen, my Mum will start it, with the view of the carers serving it. If they arrive early, the food is under-cooked (“Just put vinegar on it, it will taste better”) and if they arrive late, either she’ll try to get it out herself, or it will be dried out / cremated.

Her medication pattern should be every 4 to 5 hours in the day, with a 11:40 lunch visit, and a 17:45 tea visit, followed by a 19:30 bed prep visit, she finishes up with too long between meds, followed by far too close together. Her GP has stated that this is making her health and Parkinson’s worse.

Mum also rarely drinks enough through the day, in the hot weather she tends to dehydrate, which we try to persuade her must be avoided. Part of the problem is Parkinson’s related, part the hassle of getting to the toilet more often. Parkinson’s affects swallowing, so she tends to sip, rather than gulp. By sipping often, she deludes herself that she is drinking enough.

She also is stubbornly not adjusting methods to align to issues. She drinks tea and water from her lovely bone china cups. Because her grip is not good and her hand shakes, we can’t fill those cups very high, so her ‘cup of tea’ is only a fraction of what it could be.

As she can walk around most days, there’s no way of telling whether she drinks enough, and she frequently has several different carers in a day.

When Mum gets dehydrated, it affects her memory and her reasoning, similar to the onset of dementia. It also seems to increase her probability of falling, perhaps due to forgetting to be defensive.

When she falls, she cannot get up, thus usually presses her alarm dongle, resulting in me going round to get her up, check for concussion, and check for other injuries, prior to settling her down again. These can be ten weeks apart, through to a few in a week.

When she starts to hallucinate, we do our very best to increase drinking, seeking to re-hydrate.

On Sunday, something exceptional happened, Mum fell out of bed and didn’t press her alarm. The carer found her and immediately called the paramedics and her GP, who later called us in. For the first time ever she was not sufficiently mentally alert to press her alarm switch.

After initial assessment, she was taken to A&E, luckily being early on Sunday morning it was initially quite quiet.


The Hospital is on the boundary between two counties, within a large town, a mixture of new build elements, between aging structures. There has been considerable investment within A&E, X-ray etc. due partly to that growth industry and partly due to the closures of cottage hospitals and reducing GP services out of hours.

It took some persuasion to have Mum put on a drip, as she hadn’t had breakfast or any fluids, and dehydration was a probable primary cause of her visit. They took bloods, an X-ray of her chest (to check for fall related damage) and a CT scan of her head, to see if there were issues.

I called the carers to tell them to suspend visits, but the phone simply rang without being answered (not for the first time.)

After about six hours, during which time she was awake, but not very lucid, she was transferred to the day ward, where after assessment she was given some meds, a sandwich and another drip.

Later that evening we were informed she was to be kept on a drip for 24 hours.

The next day (Bank Holiday Monday) she was transferred to another ward. When we arrived she was not on a drip, so their decisions had been reversed.

I spoke at length with her assigned staff nurse, and was told the following: Mum could come out soon if she had a 24/7 care package, and that as well as the known issues mum now has COPD. When I asked her what COPD was, she clearly didn’t know, but flustered a ‘it is a form of heart failure that affects breathing’. (I looked it up on my phone a few minutes later.)

So, to get mum out, I had to arrange a 24/7 care package, and nowhere was open until the next day.

Trying to escalate care isn’t going to be easy, even in the short term. My emails to ‘usually very good’ social care people achieved nothing to start with on Tuesday, and their phone was on the ‘out of hours’ setting for evenings and weekends, despite being during the day of a normal working week.

Eventually I was told that there would be nothing to achieve until the hospital processed the correct exit papers to Social Care.

When we went in to the hospital (on Tuesday) a more senior nurse was on duty. She explained that mum was now medically fit to leave hospital if care can be re-established. I told her that I was trying to set up 24/7 care as advised. She looked through the notes and said 24/7 care was not needed, the normal 4 x a day was enough. (She was clearly angry).

I then explained that the newly diagnosed COPD may be part of the problem, she said that she’s worked with COPD patients for 16 years, and mum definitely doesn’t have COPD. While she was amending the notes, I noticed that mum’s allergy to aspirin wasn’t there, despite us advising that on entry. The nurse also explained that as the hospital is in one county, but almost half their patients are from another, they are always stymied on ‘joined up working’.

While we were talking with mum, her meds came round and she was only given paracetamol for her pain, but NOT her meds for Parkinson’s. I asked that nurse why that was the case, and she said that was not on her meds sheet. So I went back to the more senior nurse, she checked the meds as ordered and Parkinson’s was required 4 x a day, but it was NOT transferred onto the administration sheet. The doctor next to us said she would do it straight away, and I was told, “Thank God you are here to get this right!”

Mum was given her food, it consisted of some soup, which she couldn’t spoon due to lack of meds and a dry tough lump of gammon and some mashed sweet potato, which she couldn’t chew.

When I asked why meds were given at five, after the delivery of food, they said ‘That’s our system!’, when I suggested that administering Parkinson’s meds an hour before food would increase the ability to eat the food they said “that’s a really good idea, we should do that!”

On Wednesday I spoke with Social Care to try to re-start care to enable mum to get out. At that time the social worker could neither get through to the hospital nor the carers. We spoke again after I had arrived in hospital, but before I could do anything.

On arrival at the hospital I was amazed to see the white-board declaring that mum would be discharged for noon on Monday (in five days’ time!). I spoke with the assigned staff nurse who said, “That’s the earliest that her carers can re-start, and anyway it’s nearly the weekend”.

I said that “mum was medically OK for discharge on Tuesday, after only two days in the hospital, and you are complacent to block the bed for another six days, have you spoken with the discharge team?”

She replied, “No they’ll have gone home by now, and I’ve not seen them all day” I told her that they work shifts, and that they will be here, and made it quite clear if she didn’t contact SHEDs that I’d go walkabout to find them. A few minutes later she told me a SHED member would be with me in 20 minutes.

While the hospital had resolved her medical issues, she was stuck in a ward, with no help to walk, the only TV via a complex pay-for system she had no hope of understanding, with no day room, so no entertainment, no exercise, just boredom encouraged to lay in bed, wear a pad because she won’t be taken to the loo in time.

When the SHED worker arrived I explained the staff nurse attitude, she said she would try to improve those thinking processes. She took lots of details, then said that so long as mum can walk with assistance, she could be released after noon, to have NHS carer support, 4 times a day, from the afternoon. She walked around the ward for the first time since being admitted, and while shaky was fine.

Hopefully all will be better now?”

This story is not exceptional … I have heard it many times from many people in many different parts of the UK.  It is the norm rather than the exception.

It is the story of a fragmented and fractured system of health and social care.

It is the story of frustration for everyone – patients, family, carers, NHS staff, commissioners, and tax-payers.  A fractured care system is unsafe, chaotic, frustrating and expensive.

There are no winners here.  It is not a trade off, compromise or best possible.

It is just poor system design.

What we want has a name … it is called a Frail Safe design … and this is not a new idea.  It is achievable. It has been achieved.

So why is this still happening?

The reason is simple – the NHS does not know any other way.  It does not know how to design itself to be safe, calm, efficient, high quality and affordable.

It does not know how to do this because it has never learned that this is possible.

But it is possible to do, and it is possible to learn, and that learning does not take very long or cost very much.

And the return vastly outweighs the investment.

The title of this blog is Righteous Indignation

… if your frail elderly parents, relatives or friends were forced to endure a system that is far from frail safe; and you learned that this situation was avoidable and that a safer design would be less expensive; and all you hear is “can’t do” and “too busy” and “not enough money” and “not my job” …  wouldn’t you feel a sense of righteous indignation?

I do.


The late Russell Ackoff used to tell a great story. It goes like this:

“A team set themselves the stretch goal of building the World’s Best Car.  So they put their heads together and came up with a plan.

First they talked to drivers and drew up a list of all the things that the World’s Best Car would need to have. Safety, speed, low fuel consumption, comfort, good looks, low emissions and so on.

Then they drew up a list of all the components that go into building a car. The engine, the wheels, the bodywork, the seats, and so on.

Then they set out on a quest … to search the world for the best components … and to bring the best one of each back.

Then they could build the World’s Best Car.

Or could they?

No.  All they built was a pile of incompatible parts. The WBC did not work. It was a futile exercise.

Then the penny dropped. The features in their wish-list were not associated with any of the separate parts. Their desired performance emerged from the way the parts worked together. The working relationships between the parts were as necessary as the parts themselves.

And a pile of average parts that work together will deliver a better performance than a pile of best parts that do not.

So the relationships were more important than the parts!

From this they learned that the quickest, easiest and cheapest way to degrade performance is to make working-well-together a bit more difficult.  Irrespective of the quality of the parts.

Q: So how do we reverse this degradation of performance?

A: Add more failure-avoidance targets of course!

But we have just discovered that performance is the effect of how well the parts work together.  Will another failure-metric-fueled performance target help? How will each part know what it needs to do differently – if anything?  How will each part know if the changes it has made are having the intended impact?

Fragmentation has a cost.  Fear, frustration, futility and ultimately financial failure.

So if performance is fading … the quality of the working relationships is a good place to look for opportunities for improvement.

Imagine this scenario:

You develop some non-specific symptoms.

You see your GP who refers you urgently to a 2 week clinic.

You are seen, assessed, investigated and informed that … you have cancer!

The shock, denial, anger, blame, bargaining, depression, acceptance sequence kicks off … it is sometimes called the Kübler-Ross grief reaction … and it is a normal part of the human psyche.

But there is better news. You also learn that your condition is probably treatable, but that it will require chemotherapy, and that there are no guarantees of success.

You know that time is of the essence … the cancer is growing.

And time has a new relevance for you … it is called life time … and you know that you may not have as much left as you had hoped.  Every hour is precious.

So now imagine your reaction when you attend your local chemotherapy day unit (CDU) for your first dose of chemotherapy and have to wait four hours for the toxic but potentially life-saving drugs.

They are very expensive and they have a short shelf-life so the NHS cannot afford to waste any.   The Aseptic Unit team wait until all the safety checks are OK before they proceed to prepare your chemotherapy.  That all takes time, about four hours.

Once the team get to know you it will go quicker. Hopefully.

It doesn’t.

The delays are not the result of unfamiliarity … they are the result of the design of the process.

All your fellow patients seem to suffer repeated waiting too, and you learn that they have been doing so for a long time.  That seems to be the way it is.  The waiting room is well used.

Everyone seems resigned to the belief that this is the best it can be.

They are not happy about it but they feel powerless to do anything.

Then one day someone demonstrates that it is not the best it can be.

It can be better.  A lot better!

And they demonstrate that this better way can be designed.

And they demonstrate that they can learn how to design this better way.

And they demonstrate what happens when they apply their new learning …

… by doing it and by sharing their story of “what-we-did-and-how-we-did-it“.


If life time is so precious, why waste it?

And perhaps the most surprising outcome was that their safer, quicker, calmer design was also 20% more productive.

We form emotional attachments to places where we have lived and worked.  And it catches our attention when we see them in the news.

So this headline caught my eye, because I was a surgical SHO in Portsmouth in the closing years of the Second Millennium.  The good old days when we still did 1:2 on call rotas (i.e. up to 104 hours per week) and we were paid 70% LESS for the on call hours than the Mon-Fri 9-5 work.  We also had stable ‘firms’, superhuman senior registrars, a canteen that served hot food and strong coffee around the clock, and doctors’ mess parties that were … well … messy!  A lot has changed.  And not all for the better.

Here is the link to the fuller story about the emergency failures.

And from it we get the impression that this is a recent problem.  And with a bit of a smack and some name-shame-blame-game feedback from the CQC, then all will be restored to robust health. H’mm. I am not so sure that is the full story.

Here is the monthly aggregate A&E 4-hour target performance chart for Portsmouth from 2010 to date.

It says “this is not a new problem“.

It also says that the ‘patient’ has been deteriorating spasmodically over six years and is now critically-ill.

And giving a critically-ill hospital a “good telling off” is about as effective as telling a critically-ill patient to “pull themselves together“.  Inept management.

In A&E a critically-ill patient requires competent resuscitation using a tried-and-tested process of ABC.  Airway, Breathing, Circulation.

Also, the A&E 4-hour performance is only a symptom of the sickness in the whole urgent care system.  It is the reading on an emotometer inserted into the A&E orifice of the acute hospital!  Just one piece in a much bigger flow jigsaw.

It only tells us the degree of distress … not the diagnosis … nor the required treatment.

So what level of A&E health can we realistically expect to be able to achieve? What is possible in the current climate of austerity? Just how chilled-out can the A&E cucumber run?


This is the corresponding A&E emotometer chart for a different district general hospital somewhere else in NHS England.

Luton & Dunstable Hospital to be specific.

This A&E happiness chart looks a lot healthier and it seems to be getting even healthier over time too.  So this is possible.

Yes, but … if our hospital deteriorates enough to be put on the ‘critical list’ then we need to call in an Emergency Care Intensive Support Team (ECIST) to resuscitate us.

A very good idea.

And how do their critically-ill patients fare?

Here is the chart of one of them. The significant improvement following the ‘resuscitation’ is impressive to be sure!

But, disappointingly, it was not sustained and the patient ‘crashed’ again. Perhaps they were just too poorly? Perhaps the first resuscitation call was sent out too late? But at least they tried their best.

An experienced clinician might comment: Those are indeed plausible explanations, but before we conclude that is the actual cause, can I check that we did not just treat the symptoms and miss the disease?

Q: So is it actually possible to resuscitate and repair a sick hospital?  Is it possible to restore it to sustained health, by diagnosing and treating the cause, and not just the symptoms?

Here is the corresponding A&E emotometer chart of yet another hospital.

It shows the same pattern of deteriorating health. And it shows a dramatic improvement.  It appears to have responded to some form of intervention.

And this time the significant improvement has been sustained. The patient did not crash-and-burn again.

So what has happened here that explains this different picture?

This hospital had enough insight and humility to seek the assistance of someone who knew what to do and who had a proven track record of doing it.  Dr Kate Silvester to be specific.  A dual-trained doctor and manufacturing systems engineer.

Dr Kate is now a health care systems engineer (HCSE), and an experienced ‘hospital doctor’.

Dr Kate helped them to learn how to diagnose the root causes of their A&E 4-hr fever, and then she showed them how to design an effective treatment plan.

They did the re-design; they tested it; and they delivered their new design. Because they owned it, they understood it, and they trusted their own diagnosis-and-design competence.

And the evidence of their impact matching their intent speaks for itself.

There is a very easy and quick-to-cook recipe for chaos.

All we have to do is to ensure that the maximum number of jobs that we can do in a given time is set equal to the average number of jobs that we are required to do in the same period of time.


That does not make sense.  Our intuition says that looks like the perfect recipe for a hyper-efficient, zero-waste, zero idle-time design which is what we want.

I know it does, but it isn’t.  Our intuition is tricking us.

It is the recipe for chaos – and to prove it all we have to do is a real world experiment – because to prove it using maths is really difficult. So difficult in fact that the formula was not revealed until 1962 – by a mathematician called John Kingman while a postgraduate student at Pembroke College, Cambridge.
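For the curious, the formula is usually quoted today as an approximation for the average wait in a single-step, single-operator queue. This is a sketch in common textbook notation; the symbols are my assumption and are not taken from Kingman's paper or from anything above.

```latex
\[
  \mathrm{E}[W_q] \;\approx\;
  \left(\frac{\rho}{1-\rho}\right)
  \left(\frac{c_a^{2} + c_s^{2}}{2}\right)
  \tau
\]
% \tau : average time to do one job
% \rho : utilisation = average demand / maximum capacity
% c_a  : coefficient of variation of the time between arrivals
% c_s  : coefficient of variation of the time to do a job
```

Read it like this: with no variation at all (both c terms zero) the predicted wait is zero for any utilisation below 100%. With even a little variation, the first bracket grows without limit as utilisation approaches 100%, which is exactly the design the recipe asks for.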

The empirical experiment is very easy to do – all we need is a single step process – and a stream of jobs to do.

And we could do it for real, or we can simulate it using an Excel spreadsheet – which is much quicker.

So we set up our spreadsheet to simulate a new job arriving every X minutes and each job taking X minutes to complete.

Our operator can only do one job at a time so if a job arrives and the operator is busy the job joins the back of a queue of jobs and waits.

When the operator finishes a job it takes the next one from the front of the queue, the one that has been waiting longest.

And if there is no queue the operator will wait until the next job arrives.


And when we run the simulation we see that there is indeed no queue, no jobs waiting and the operator is always busy (i.e. 100% utilised). Perfection!

BUT ….

This is not a realistic scenario.  In reality there is always some random variation.  Not all jobs require the same length of time, and jobs do not arrive at precisely the right intervals.

No matter, our confident intuition tells us. It will average out.  Swings-and-roundabouts. Give-and-take.

It doesn’t.

And if you do not believe me just build the simple Excel model outlined above, verify that it works, then add some random variation to the time it takes to do each job … and observe what happens to the average waiting time.

What you will discover is that as soon as we add even a small amount of random variation we get a queue, and waiting and idle resources as well!

But not a steady, stable, predictable queue … Oh No! … We get an unsteady, unstable and unpredictable queue … we get chaos.

Try it.
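And if you do not have Excel to hand, the same experiment can be sketched in a few lines of Python. This is only an illustration of the single-step model described above: the function name, the 10-minute interval, the 20% spread and the choice of a uniform distribution for the variation are all my assumptions.

```python
import random


def simulate_queue(n_jobs, interval, mean_service, spread, seed=1):
    """One operator, one step, first-come-first-served.

    A job arrives every `interval` minutes. Each job time is drawn uniformly
    from mean_service*(1-spread) to mean_service*(1+spread), so the average
    job time stays at mean_service. spread=0 means no variation at all.
    Returns (average wait in the queue, operator utilisation).
    """
    rng = random.Random(seed)
    operator_free_at = 0.0
    total_wait = 0.0
    total_work = 0.0
    for i in range(n_jobs):
        arrival = i * interval
        # The job starts when it has arrived AND the operator is free.
        start = max(arrival, operator_free_at)
        service = mean_service * (1 + spread * rng.uniform(-1, 1))
        total_wait += start - arrival
        total_work += service
        operator_free_at = start + service
    utilisation = total_work / operator_free_at
    return total_wait / n_jobs, utilisation


if __name__ == "__main__":
    # Capacity exactly equal to average demand: 10 minutes in, 10 minutes out.
    for spread in (0.0, 0.2):
        for n_jobs in (1_000, 10_000, 100_000):
            wait, used = simulate_queue(n_jobs, 10, 10, spread)
            print(f"spread={spread:.0%} jobs={n_jobs:>7} "
                  f"average wait={wait:8.2f} min utilisation={used:.1%}")
```

Compare the rows with no variation against the rows with variation, and then compare short runs against long ones. If the average wait keeps changing as the run gets longer, rather than settling down, that is the unstable queue described above.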

So what? How does this abstract ‘queue theory’ apply to the real world?

Well, suppose we have a single black box system called ‘a hospital’ – patients arrive and we work hard to diagnose and treat them.  And so long as we have enough resource-time to do all the jobs we are OK. No unstable queues. No unpredictable waiting.

But time-costs-money and we have an annual cost improvement target (CIP) that we are required to meet so we need to ‘trim’ resource-time capacity to push up resource utilisation.  And we will call that an ‘efficiency improvement’ which is good … yes?

It isn’t actually.  I can just as easily push up my ‘utilisation’ by working slower, or doing stuff I do not need to, or by making mistakes that I have to check for and then correct.  I can easily make myself busier and delude myself I am working harder.

And we are also a victim of our own success … the better we do our job … the longer people live and the more workload they put on the health and social care system.

So we have the perfect storm … the perfect recipe for chaos … slowly rising demand … slowly shrinking budgets … and an inefficient ‘business’ design.

And that in a nutshell is the reason the NHS is descending into chaos.

So what is the solution?

Reduce demand? Stop people getting sick? Or make them sicker so they die quicker?

Increase budgets? Where will the money come from? Beg? Borrow? Steal? Economic growth?

Improve the design?  Now there’s a thought. But how? By using the same beliefs and behaviours that have created the current chaos?

Maybe we need to challenge some invalid beliefs and behaviours … and replace those that fail the Reality Test with some more effective ones.

A few weeks ago I raised the undiscussable issue that the NHS feels like it is on a downward trajectory … and that what might be needed are some better engines … and to design, test, build and install them we will need some health care system engineers (HCSEs) … and that we do not appear to have enough of those. None in fact.

The feedback shows that many people resonated with this sentiment.

This week I had the opportunity to peek inside the NHS Cockpit and look at the Dashboard … and this is what I saw on the A&E Performance panel.


This is the monthly aggregate A&E 4-hour performance for England (red), Scotland (purple), Wales (brown) and Northern Ireland (grey) for the last six years.

The trajectory looked alarmingly obvious to me – the NHS is on a predictable path to destruction – a controlled flight into terrain (CFIT).

The repeating up-and-down pattern is the annual cycle of seasons; better in the summer and worse in the winter.  This signal is driven by the celestial clock … the movement of the planets … which is beyond our power to influence.

The downward trajectory is the cumulative effect of our current design … which is the emergent effect of our collective beliefs, behaviours, policies and politics … which are completely within our gift to change.

If we chose to and if we knew how to – which we do not appear to.

Our collective ineptitude is not a topic for discussion. It is a taboo subject.

And I know that because if it were for discussion then this dashboard would be on public view on a website hosted by the NHS.

It isn’t.

It was created by George Donald, a member of the public, a disappointed patient, and a retired IT consultant.  And it was shared, free for all to see and use via Twitter (@GMDonald).

The information source is open, public, shared NHS data, but it takes a lot of work to winkle it out and present it like this.  So well done George … keep up the great work!

Now have a closer look at the Dashboard Display … look at the most recent data for England and Scotland.  What do you see?

Does it look like Scotland is pulling out of the dive and England is heading down even faster?

Hard to say for sure; there are lots of signals and noise all mixed up.

So we need to use some Systems Engineering tools to help us separate the signals from the noise; and for this a statistical process control (SPC) chart is useless.  We need a system behaviour chart (SBC) and its handy helper the deviation from aim (DFA) chart.

I will not bore you with the technical details but, suffice it to say, it is a tried-and-tested technique called the Method of Residuals.

Exhibit #1 is the DFA chart for Scotland.  The middle 4 years (2011-2014) are used to create a ‘predictive model’;  the model projection is then compared with measured performance; and the difference is plotted as the DFA chart.

What this “says” is that the 2015/16 performance in Scotland is significantly better than projected, and the change of direction seemed to start in the first half of 2015.

This evidence seems to support the results of our Mark I Eyeball test.


Exhibit #2 – the DFA for England suggests the 2015/16 performance is significantly worse than projected, and this deterioration appears to have started later in 2015.

Oh dear! I do not believe that was the intention, but it appears to be the impact.
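For those who do want a flavour of the technical details, here is a minimal sketch of the idea behind a DFA chart: fit a model to a baseline period, project it, and plot what is left over. The form of the model (a straight-line trend plus one level per calendar month), the function names and the made-up numbers are all my assumptions for illustration; this is not the actual model or the NHS data behind the exhibits.

```python
def fit_baseline(values, months, start, end):
    """Fit the assumed model  value = slope * i + level[month]  to values[start:end].

    The baseline needs more than 12 points and must include every calendar month.
    """
    idx = range(start, end)
    # Year-on-year differences cancel the seasonal cycle and leave the trend.
    diffs = [(values[i + 12] - values[i]) / 12 for i in idx if i + 12 < end]
    slope = sum(diffs) / len(diffs)
    # One level per calendar month captures the repeating summer/winter pattern.
    level = {}
    for m in set(months[start:end]):
        detrended = [values[i] - slope * i for i in idx if months[i] == m]
        level[m] = sum(detrended) / len(detrended)
    return slope, level


def deviation_from_aim(values, months, start, end):
    """Measured value minus the baseline model's projection, for every point."""
    slope, level = fit_baseline(values, months, start, end)
    return [values[i] - (slope * i + level[months[i]])
            for i in range(len(values))]


# Made-up monthly data: six years, slow decline, better in summer than winter.
months = [i % 12 + 1 for i in range(72)]
season = {m: (2.0 if 5 <= m <= 9 else -1.0) for m in range(1, 13)}
made_up = [95.0 - 0.05 * i + season[months[i]] for i in range(72)]
for i in range(60, 72):
    made_up[i] += 2.0  # a deliberate change of direction in the final year

# Use the middle four years as the baseline, as in the exhibits.
dfa = deviation_from_aim(made_up, months, start=12, end=60)

if __name__ == "__main__":
    for i, d in enumerate(dfa):
        print(f"month {i:2d}  deviation from aim {d:+.2f}")
```

The residuals sit on zero while the system behaves as the baseline model predicts, and they move away from zero when something has changed. Real data are noisy, so on a real chart the question is whether the deviation is bigger than the noise.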

So what are England and Scotland doing differently?
What can we all learn from this?
What can we all do differently in the future?

Isn’t that a question that more people like you, me and George could reasonably ask of those whom we entrust to design, build and fly our NHS?

Isn’t that a reasonable question that could be asked by the 65 million people in the UK who might, at any time, be unlucky enough to require a trip to their local A&E department?

So, let us all grasp the nettle and get the Elephant in the Room into plain view and say in unison “The Emperor Has No Clothes!”

We are suffering from mass ineptitude and hubris, to use Dr Atul Gawande’s language, and we need a better collective strategy.

And there is hope.

Some innovative hospitals have had the courage to grasp the nettle. They have seen what is coming; they have fully accepted the responsibility for their own fate; they have stepped up to the challenge; they have looked-listened-and-learned from others, and they are proving what is possible.

They have a name. They are called positive deviants.

Have a look at this short video … it is jaw-dropping … it is humbling … it is inspiring … and it is challenging … because it shows what has been achieved already.

It shows what is possible. Now, and here in the UK.

Luton and Dunstable



This week I had the great pleasure of watching Dr Don Berwick sharing the story of his own ‘near religious experience’ and his conversion to a belief that a Science of Improvement exists.  In 1986, Don attended one of W. Edwards Deming’s famous 4-day workshops.  It was an emotional roller coaster ride for Don! See here for a link to the whole video … it is worth watching all of it … the best bit is at the end.

Don outlines Deming’s System of Profound Knowledge (SoPK) and explores each part in turn. Here is a summary of SoPK from the Deming website.


W. Edwards Deming was a physicist and statistician by training and his deep understanding of variation and appreciation for a system flow from that.  He was not trained as a biologist, psychologist or educationalist and those parts of the SoPK appear to have emerged later.

Here are the summaries of these parts – psychology first …


Neurobiologists and psychologists now know that we are the product of our experiences and our learning. What we think consciously is just the emergent tip of a much bigger cognitive iceberg. Most of what is happening is operating out of awareness. It is unconscious.  Our outward behaviour is just a visible manifestation of deeply ingrained values and beliefs that we have learned – and reinforced over and over again.  Our conscious thoughts are emergent effects.

So how do we learn?  How do we accumulate these values and beliefs?

This is the summary of Deming’s Theory of Knowledge …


But to a biologist, neuroanatomist, neurophysiologist, doctor, system designer and improvement coach … this does not feel correct.

At the most fundamental biological level we do not learn by starting with a theory; we start with a sensory input.  The simplest element of the animal learning system – the nervous system – is called a reflex arc.

First, we have some form of sensor to gather data from the outside world. Eyes, ears, smell, taste, touch, temperature, pain and so on.  Let us consider pain.

That signal is transmitted via a sensory nerve to the processor, the grey matter in this diagram, where it is filtered, modified, combined with other data, filtered again and a binary output generated. Act or Not.

If the decision is ‘Act’ then this signal is transmitted by a motor nerve to an effector, in this case a muscle, which results in an action.  The muscle twitches or contracts and that modifies the outside world – we pull away from the source of pain.  It is a harm avoidance design. Damage-limitation. Self-preservation.

Another example of this sensor-processor-effector design template is a knee-jerk reflex, so-named because if we tap the tendon just below the knee we can elicit a reflex contraction of the thigh muscle.  It is actually part of a very complicated, dynamic, musculoskeletal stability cybernetic control system that allows us to stand, walk and run … with almost no conscious effort … and no conscious awareness of how we are doing it.

But we are not born able to walk. As youngsters we do not start with a theory of how to walk from which we formulate a plan. We see others do it and we attempt to emulate them. And we fail repeatedly. Waaaaaaah! But we learn.

Human learning starts with study. We then process the sensory data using our internal mental model – our rhetoric; we then decide on an action based on our ‘current theory’; and then we act – on the external world; and then we observe the effect.  And if we sense a difference between our expectation and our experience then that triggers an ‘adjustment’ of our internal model – so next time we may do better because our rhetoric and the reality are more in sync.

The biological sequence is Study-Adjust-Plan-Do-Study-Adjust-Plan-Do and so on, until we have achieved our goal; or until we give up trying to learn.

So where does psychology come in?

Well, sometimes there is a bigger mismatch between our rhetoric and our reality. The world does not behave as we expect and predict. And if the mismatch is too great then we are left with feelings of confusion, disappointment, frustration and fear.  (PS. That is our unconscious mind telling us that there is a big rhetoric-reality mismatch).

We can see the projection of this inner conflict on the face of a child trying to learn to walk.  They screw up their faces in conscious effort, and they fall over, and they hurt themselves and they cry.  But they do not want us to do it for them … they want to learn to do it for themselves. Clumsily at first but better with practice. They get up and try again … and again … learning on each iteration.

Study-Adjust-Plan-Do over and over again.

There is another way to avoid the continual disappointment, frustration and anxiety of learning.  We can distort our sensation of external reality to better fit with our internal rhetoric.  When we do that the inner conflict goes away.

We learn how to tamper with our sensory filters until what we perceive is what we believe. Inner calm is restored (while outer chaos remains or increases). We learn the psychological defense tactics of denial and blame.  And we practice them until they are second-nature. Unconscious habitual reflexes. We build a reality-distortion-system (RDS) and it has a name – the Ladder of Inference.

And then one day, just by chance, somebody or something bypasses our RDS … and that is the experience that Don Berwick describes.

Don went to a 4-day workshop to hear the wisdom of W. Edwards Deming first hand … and he was forced by the reality he saw to adjust his inner model of how the world works. His rhetoric.  It was a stormy transition!

The last part of his story is the most revealing.  It exposes that his unconscious mind got there first … and it was his conscious mind that needed to catch up.

Study-(Adjust)-Plan-Do … over-and-over again.

In Don’s presentation he suggests that Frederick W. Taylor is the architect of the failure of modern management. This is a commonly held belief, and everyone is equally entitled to an opinion; that is a definition of mutual respect.

But before forming an individual opinion on such a fundamental belief we should study the raw evidence. The words written by the person who wrote them, not just the words written by those who filtered the reality through their own perceptual lenses.  Which we all do.

The Harvard Business Review is worth reading because many of its articles challenge deeply held assumptions, and then back up the challenge with the pragmatic experience of those who have succeeded in overcoming the limiting beliefs.

So the heading on the April 2016 copy that awaited me on my return from an Easter break caught my eye: YOU CAN’T FIX CULTURE.



The successful leaders of major corporate transformations are agreed … the cultural change follows the technical change … and then the emergent culture sustains the improvement.

The examples presented include the Ford Motor Company, Delta Airlines, Novartis – so these are not corporate small fry!

The evidence suggests that the belief of “we cannot improve until the culture changes” is the mantra of failure of both leadership and management.

A health care system is characterised by a culture of risk avoidance. And for good reason. It is all too easy to harm while trying to heal!  Primum non nocere is a core tenet – first do no harm.

But, change and improvement implies taking risks – and those leaders of successful transformation know that the bigger risk by far is to become paralysed by fear and to do nothing.  Continual learning from many small successes and many small failures is preferable to crisis learning after a catastrophic failure!

The UK healthcare system is in a state of chronic chaos.  The evidence is there for anyone willing to look.  And waiting for the NHS culture to change, or pushing for culture change first appears to be a guaranteed recipe for further failure.

The HBR article suggests that it is better to stay focussed; to work within our circles of control and influence; to learn from others where knowledge is known, and where it is not – to use small, controlled experiments to explore new ground.

And I know this works because I have done it and I have seen it work.  Just by focussing on what is important to every member on the team; focussing on fixing what we could fix; not expecting or waiting for outside help; gathering and sharing the feedback from patients on a continuous basis; and maintaining patient and team safety while learning and experimenting … we have created a micro-culture of high safety, high efficiency, high trust and high productivity.  And we have shared the evidence via JOIS.

The micro-culture required to maintain the safety, flow, quality and productivity improvements emerged and evolved along with the improvements.

It was part of the effect, not the cause.

So the concept of ‘fix the system design flaws and the continual improvement culture will emerge’ seems to work at macro-system and at micro-system levels.

We just need to learn how to diagnose and treat healthcare system design flaws. And that is known knowledge.

So what is the next excuse?  Too busy?

Safe means avoiding harm, and safety is an emergent property of a well-designed system.

Frail means infirm, poorly, wobbly and at higher risk of harm.

So we want our health care system to be a FrailSafe Design.

But is it? How would we know? And what could we do to improve it?

About ten years ago I was involved in a project to improve the safety design of a specific clinical stream flowing through the hospital that I work in.

The ‘at risk’ group of patients were frail elderly patients admitted as an emergency after a fall and who had suffered a fractured thigh bone. The neck of the femur.

Historically, the outcome for these patients was poor.  Many did not survive, and many of the survivors never returned to independent living. They became even more frail.

The project was undertaken during an organisational transition: the hospital was being ‘taken over’ by a bigger one.  This created a window of opportunity for some disruptive innovation, and the project was labelled as a ‘Lean’ one because we had been inspired by similar work done at Bolton some years before and Lean was the flavour of the month.

The actual change was small: it was a flow design tweak that cost nothing to implement.

First we asked two flow questions:
Q1: How many of these high-risk frail patients do we admit a year?
A1: About one per day on average.
Q2: What is the safety critical time for these patients?
A2: The first four days.  The sooner they have hip surgery and are able to actively mobilise, the better their outcome.

Second we applied Little’s Law which showed the average number of patients in this critical phase is four. This was the ‘work in progress’ or WIP.

And we knew that variation is always present, and we knew that having all these patients in one place would make it much easier for the multi-disciplinary teams to provide timely care and to avoid potentially harmful delays.

So we suggested that one six-bedded bay on one of the trauma wards be designated the Fractured Neck Of Femur bay.

That was the flow diagnosis and design done.
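The arithmetic behind that design can be sketched in a few lines. Little's Law gives the average; the second function is my own illustrative assumption, not part of the original project, and treats the number of patients in the critical phase as Poisson-distributed to show why a bay sized for the average alone would often be too small.

```python
import math


def littles_law_wip(arrival_rate, time_in_system):
    """Little's Law: average work in progress = arrival rate x time in system."""
    return arrival_rate * time_in_system


def prob_bay_big_enough(mean_wip, beds):
    """Chance that the number of patients is no more than `beds`.

    Assumption for illustration only: the number of patients in the critical
    phase at any moment follows a Poisson distribution with mean `mean_wip`.
    """
    return sum(math.exp(-mean_wip) * mean_wip ** k / math.factorial(k)
               for k in range(beds + 1))


# About one admission per day, and a four-day safety-critical phase.
wip = littles_law_wip(arrival_rate=1.0, time_in_system=4.0)
coverage_four = prob_bay_big_enough(wip, beds=4)
coverage_six = prob_bay_big_enough(wip, beds=6)

if __name__ == "__main__":
    print(f"average patients in the critical phase: {wip:.0f}")
    print(f"chance a 4-bed bay is big enough: {coverage_four:.0%}")
    print(f"chance a 6-bed bay is big enough: {coverage_six:.0%}")
```

Under that assumption a four-bed bay would be big enough only about two-thirds of the time, while a six-bed bay copes on nearly nine days out of ten. The average tells us the size of the problem; the variation tells us how much headroom the design needs.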

The safety design was created by the multi-disciplinary teams who looked after these patients: the geriatricians, the anaesthetists, the perioperative emergency care team (PECT), the trauma and orthopaedic team, the physiotherapists, and so on.

They designed checklists to ensure that all #NOF patients got what they needed when they needed it and so that nothing important was left to chance.

And that was basically it.

And the impact was remarkable. The stream flowed. And one measured outcome was a dramatic and highly statistically significant reduction in mortality.

The full paper was published in Injury 2011; 42: 1234-1237.

We had created a FrailSafe Design … which implied that what was happening before was clearly not safe for these frail patients!

And there was an improved outcome for the patients who survived: A far larger proportion rehabilitated and returned to independent living, and a far smaller proportion required long-term institutional care.

By learning how to create and implement a FrailSafe Design we had added both years-to-life and life-to-years.

It cost nothing to achieve and the message was clear, as this quote from the 2011 paper illustrates …


What was a bit disappointing was the gap of four years between delivering this dramatic and highly significant patient safety and quality improvement and the sharing of the story.

What is more exciting is that the concept of FrailSafe is growing, evolving and spreading.

The word pearl is a metaphor for something rare, beautiful, and valuable.

Pearls are formed inside the shell of certain mollusks as a defense mechanism against a potentially threatening irritant.

The mollusk creates a pearl sac to seal off the irritation.

And so it is with change and improvement.  The growth of precious pearls of improvement wisdom – the ones that develop slowly over time – are triggered by an irritant.

Someone asking an uncomfortable question perhaps, or presenting some information that implies that an uncomfortable question needs to be asked.

About seven years ago a question was asked “Would improving healthcare flow and quality result in lower costs?”

It is a good question because some believe that it would and some believe that it would not.  So an experiment to test the hypothesis was needed.

The Health Foundation stepped up to the challenge and funded a three year project to find the answer. The design of the experiment was simple. Take two oysters and introduce an irritant into them and see if pearls of wisdom appeared.

The two ‘oysters’ were Sheffield Hospital and Warwick Hospital and the irritant was Dr Kate Silvester who is a doctor and manufacturing system engineer and who has a bit-of-a-reputation for asking uncomfortable questions and backing them up with irrefutable information.

Two rare and precious pearls did indeed grow.

In Sheffield, it was proved that by improving the design of their elderly care process they improved the outcome for their frail, elderly patients.  More went back to their own homes and fewer left via the mortuary.  That was the quality and safety improvement. They also showed a shorter length of stay and a reduction in the number of beds needed to store the work in progress.  That was the flow and productivity improvement.

What was interesting to observe was how difficult it was to get these profoundly important findings published.  It appeared that a further irritant had been created for the academic peer review oyster!

The case study was eventually published in Age and Ageing 2014; 43: 472-77.

The pearl that grew around this seed is the Sheffield Microsystems Academy.

In Warwick, it was proved that the A&E 4 hour performance could be improved by focussing on improving the design of the processes within the hospital, downstream of A&E.  For example, a redesign of the phlebotomy and laboratory process to ensure that clinical decisions on a ward round are based on today’s blood results.

This specific case study was eventually published as well, but by a different path – one specifically designed for sharing improvement case studies – JOIS 2015; 22:1-30

And the pearls of wisdom that developed as a result of irritating many oysters in the Warwick bed are clearly described by Glen Burley, CEO of Warwick Hospital NHS Trust in this recent video.

Getting the results of all these oyster bed experiments published required irritating the Health Foundation oyster … but a pearl grew there too and emerged as the full Health Foundation report which can be downloaded here.

So if you want to grow a fistful of improvement and a bagful of pearls of wisdom … then you will need to introduce a bit of irritation … and Dr Kate Silvester is a proven source of grit for your oyster!

The first step in the process of improvement is raising awareness, and this has to be done carefully.

Most of us spend most of our time in a mental state called blissful ignorance.  We are happily unaware of the problems, and of their solutions.

Some of us spend some of our time in a different mental state called denial.

And we enter that from yet another mental state called painful awareness.

By raising awareness we are deliberately nudging ourselves, and others, out of our comfort zones.

But suddenly moving from blissful ignorance to painful awareness is not a comfortable transition. It feels like a shock. We feel confused. We feel vulnerable. We feel frightened. And we have a choice: freeze, flee or fight.

Freeze is shock. We feel paralysed by the mismatch between rhetoric and reality.

Flee is denial.  We run away from a new and uncomfortable reality.

Fight is anger. Directed first at others (blame) and then at ourselves (guilt).

It is this anger-passion that we must learn to channel and focus as determination to listen, learn and then lead.

The picture is of a recent awareness-raising event; it happened this week.

The audience is a group of NHS staff from across the depth and breadth of a health and social care system.

On the screen is the ‘Save the NHS Game’.  It is an interactive, dynamic flow simulation of a whole health care system; and its purpose is educational.  It is designed to illustrate the complex and counter-intuitive flow behaviour of a system of interdependent parts: primary care, an acute hospital, intermediate care, residential care, and so on.

We all became aware of a lot of unfamiliar concepts in a short space of time!

We all learned that a flow system can flip from calm to chaotic very quickly.

We all learned that a small change in one part of a system of interdependent parts can have a big effect in another part – either harmful or beneficial and often both.

We all learned that there is often a long time-lag between the change and the effect.

We all learned that we cannot reverse the effect just by reversing the change.

And we all learned that this high sensitivity to small changes is the result of the design of our system; i.e. our design.

Learning all that in one go was a bit of a shock!  Especially the part where we realised that we had, unintentionally, created near perfect conditions for chaos to emerge. Oh dear!

Denial felt like a very reasonable option; as did blame and guilt.

What emerged was a collective sense of determination.  “Let’s Do It!” captured the mood.

The second step in the process of improvement is to show the door to the next phase of learning; the phase called ‘know how’.

This requires demonstrating that there is another way out of the zone of painful awareness.  An alternative to denial.

This is where how-to-diagnose-and-correct-the-design-flaws needs to be illustrated. A step-at-a-time.

And when that happens it feels like a light bulb has been switched on.  What before was obscure and confusing suddenly becomes clear and understandable; and we say ‘Ah ha!’

So, if we deliberately raise awareness about a problem then, as leaders of change and improvement, we also have the responsibility to raise awareness about feasible solutions.

Because only then are we able to ask “Would we like to learn how to do this ourselves?”

And ‘Yes, please’ is what 68% of the people said after attending the awareness raising event.  Only 15% said ‘No, thank you’ and only 17% abstained.

Raising awareness is the first step to improvement.
Choosing the path out of the pain towards knowledge is the second.
And taking the first step on that path is the third.

This week I conducted an experiment – on myself.

I set myself the challenge of measuring the cost of chaos, and it was tougher than I anticipated it would be.

It is easy enough to grasp the concept that fire-fighting to maintain patient safety amidst the chaos of healthcare would cost more in terms of tears and time …

… but it is tricky to translate that concept into hard numbers; i.e. cash.

Chaos is an emergent property of a system.  Safety, delivery, quality and cost are also emergent properties of a system. We can measure cost; our finance departments are very good at that. We can measure quality – we just ask “How did your experience match your expectation?”  We can measure delivery – we have created a whole industry of access target monitoring.  And we can measure safety by checking for things we do not want – near misses and never events.

But while we can feel the chaos we do not have an easy way to measure it. And it is hard to improve something that we cannot measure.

So the experiment was to see if I could create some chaos, then if I could calm it, and then if I could measure the cost of the two designs – the chaotic one and the calm one.  The difference, I reasoned, would be the cost of the chaos.

And to do that I needed a typical chunk of a healthcare system: like an A&E department where the relationship between safety, flow, quality and productivity is rather important (and has been a hot topic for a long time).

But I could not experiment on a real A&E department … so I experimented on a simplified but realistic model of one. A simulation.

What I discovered came as a BIG surprise, or more accurately a sequence of big surprises!

  1. First I discovered that it is rather easy to create a design that generates chaos and danger.  All I needed to do was to assume I understood how the system worked and then use some averaged historical data to configure my model.  I could do this on paper or I could use a spreadsheet to do the sums for me.
  2. Then I discovered that I could calm the chaos by reactively adding lots of extra capacity in terms of time (i.e. more staff) and space (i.e. more cubicles).  The downside of this approach was that my costs sky-rocketed; but at least I had restored safety and calm and I had eliminated the fire-fighting.  Everyone was happy … except the people expected to foot the bill. The finance director, the commissioners, the government and the tax-payer.
  3. Then I got a really big surprise!  My safe-but-expensive design was horribly inefficient.  All my expensive resources were now running at rather low utilisation.  Was that the cost of the chaos I was seeing? But when I trimmed the capacity and costs the chaos and danger reappeared.  So was I stuck between a rock and a hard place?
  4. Then I got a really, really big surprise!!  I hypothesised that the root cause might be the fact that the parts of my system were designed to work independently, and I was curious to see what happened when they worked interdependently. In synergy. And when I changed my design to work that way the chaos and danger did not reappear and the efficiency improved. A lot.
  5. And the biggest surprise of all was how difficult this was to do in my head; and how easy it was to do when I used the theory, techniques and tools of Improvement-by-Design.
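The first surprise in the list above can be illustrated with a toy model. This is a minimal sketch and not the simulation described in this post: it assumes a single queue, a made-up demand of about 100 arrivals per period with random variation, and a fixed capacity per period. The function name `simulate_queue` and all the numbers are hypothetical.

```python
import random

def simulate_queue(demand_per_period, capacity_per_period):
    """Return the queue length at the end of each period.

    Capacity that goes unused in a quiet period is lost; it cannot be
    saved up and spent in a busy period.
    """
    queue = 0
    history = []
    for demand in demand_per_period:
        queue = max(0, queue + demand - capacity_per_period)
        history.append(queue)
    return history

# Hypothetical demand: 200 periods averaging about 100 arrivals each.
rng = random.Random(42)
demand = [round(rng.gauss(100, 15)) for _ in range(200)]

# Capacity set to the historical average, then with some headroom.
for capacity in (100, 115):
    history = simulate_queue(demand, capacity)
    print(f"capacity {capacity}: largest queue {max(history)}, final queue {history[-1]}")
```

When capacity is set equal to average demand, the quiet periods waste capacity that the busy periods cannot recover, so the queue tends to wander upwards rather than settle. Adding headroom calms the queue, but only by paying for capacity that is idle much of the time, which is the trade-off described in surprises 2 and 3.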

So if you are curious to learn more … I have written up the full account of the experiment with rationale, methods, results, conclusions and references and I have published it here.

Hypothesis: Chaotic behaviour of healthcare systems is inevitable without more resources.

This appears to be a rather widely held belief, but what is the evidence?

Can we disprove this hypothesis?

Chaos is a predictable, emergent behaviour of many systems, both natural and man-made, a discovery that was made rather recently, in the 1970s.  Chaotic behaviour is not the same as random behaviour.  The fundamental difference is that random implies independence, while chaos requires the opposite: chaotic systems have interdependent parts.

Chaotic behaviour is complex and counter-intuitive, which may explain why it took so long for the penny to drop.

Chaos is a complex behaviour and it is tempting to assume that complicated structures always lead to complex behaviour.  But they do not.  A mechanical clock is a complicated structure but its behaviour is intentionally very stable and highly predictable – that is the purpose of a clock.  It is a fit-for-purpose design.

The healthcare system has many parts; it too is a complicated system; it has a complicated structure.  It is often seen to demonstrate chaotic behaviour.

So we might propose that a complicated system like healthcare could also be stable and predictable. If it were designed to be.

But there is another critical factor to take into account.

A mechanical clock only has inanimate cogs and springs that only obey the Laws of Physics – and they are neither adaptable nor negotiable.

A healthcare system is different. It is a living structure. It has patients, providers and purchasers as essential components. And the rules of how people work together are both negotiable and adaptable.

So when we are thinking about a healthcare system we are thinking about a complex adaptive system or CAS.

And that changes everything!

The good news is that adaptive behaviour can be a very effective anti-chaos strategy, if it is applied wisely.  The not-so-good news is that if it is not applied wisely then it can actually generate even more chaos.

Which brings us back to our hypothesis.

What if the chaos we are observing in our healthcare system is actually iatrogenic?

What if we are unintentionally and unconsciously generating it?

These questions require an answer because if we are unwittingly contributing to the chaos, with insight, understanding and wisdom we can intentionally calm it too.

These questions also challenge us to study our current way of thinking and working.  And in that challenge we will need to demonstrate a behaviour called humility. An ability to acknowledge that there are gaps in our knowledge and our understanding. A willingness to learn.

This all sounds rather too plausible in theory. What about an example?

Let us consider the highest flow process in healthcare: the outpatient clinic stream.

The typical design is a three-step process called the New-Test-Review design. This sequential design is simpler because the steps are largely independent of each other. And this simplicity is attractive because it is easier to schedule so is less likely to be chaotic. The downsides are the queues and delays between the steps and the risk of getting lost in the system. So if we are worried that a patient may have a serious illness that requires prompt diagnosis and treatment (e.g. cancer), then this simpler design is actually a potentially unsafe design.

A one-stop clinic is a better design because the New-Test-Review steps are completed in one visit, and that is better for everyone. But, a one-stop clinic is a more challenging scheduling problem because all the steps are now interdependent, and that is fertile soil for chaos to emerge.  And chaos is exactly what we often see.

Attending a chaotic one-stop clinic is a frustrating experience for both patients and staff, and it is also a less productive use of resources. So the chaos and cost appear to be the price we are asked to pay for a quicker and safer design.

So is the one stop clinic chaos inevitable, or is it avoidable?

Simple observation of a one stop clinic shows that the chaos is associated with queues – which are visible as a waiting room full of patients and front-of-house staff working very hard to manage the queue and to signpost and soothe the disgruntled patients.

What if the one stop clinic queue and chaos is iatrogenic? What if it was avoidable without investing in more resources? Would the chaos evaporate? Would the quality improve?  Could we have a safer, calmer, higher quality and more productive design?

Last week I shared evidence that proved the one-stop clinic chaos was iatrogenic – by showing it was avoidable.

A team of healthcare staff were shown how to diagnose the cause of the queue and were then able to remove that cause, and to deliver the same outcome without the queue and the associated chaos.

And the most surprising lesson that the team learned was that they achieved this improvement using the same resources as before; and that those resources also felt the benefit of the chaos evaporating. Their work was easier, calmer and more predictable.

The impossible-without-more-resources hypothesis had been disproved.

So, where else in our complicated and complex healthcare system might we apply anti-chaos?


And for more about complexity science see Santa Fe Institute


<Leslie> Hi Bob, I hope I am not interrupting you.  Do you have five minutes?

<Bob> Hi Leslie. I have just finished what I was working on and a chat would be a very welcome break.  Fire away.

<Leslie> I really just wanted to say how much I enjoyed the workshop this week, and so did all the delegates.  They have been emailing me to say how much they learned and thanking me for organising it.

<Bob> Thank you Leslie. I really enjoyed it too … and I learned lots … I always do.

<Leslie> As you know I have been doing the ISP programme for some time, and I have come to believe that you could not surprise me any more … but you did!  I never thought that we could make such a dramatic improvement in waiting times.  The queue just melted away and I still cannot really believe it.  Was it a trick?

<Bob> Ahhhh, the siren-call of the battle-hardened sceptic! It was no trick. What you all saw was real enough. There were no computers, statistics or smoke-and-mirrors used … just squared paper and a few coloured pens. You saw it with your own eyes; you drew the charts; you made the diagnosis; and you re-designed the policy.  All I did was provide the context and a few nudges.

<Leslie> I know, and that is why I think seeing the before and after data would help me. The process felt so much better, but I know I will need to show the hard evidence to convince others, and to convince myself as well, to be brutally honest.  I have the before data … do you have the after data?

<Bob> I do. And I was just plotting it as BaseLine charts to send to you.  So you have pre-empted me.  Here you are.

This is the waiting time run chart for the one stop clinic improvement exercise that you all did.  The leftmost segment is the before, and the rightmost are the after … your two ‘new’ designs.

As you say, the queue and the waiting has melted away despite doing exactly the same work with exactly the same resources.  Surprising and counter-intuitive but there is the evidence.

<Leslie> Wow! That fits exactly with how it felt.  Quick and calm! But I seem to remember that the waiting room was empty, particularly in the case of the design that Team 1 created. How come the waiting is not closer to zero on the chart?

<Bob> You are correct.  This is not just the time in the waiting room, it also includes the time needed to move between the rooms and the changeover time within the rooms.  It is what I call the ‘tween-time.

<Leslie> OK, that makes sense now.  And what also jumps out of the picture for me is the proof that we converted an unstable process into a stable one.  The chaos was calmed.  So what is the root cause of the difference between the two ‘after’ designs?

<Bob> The middle one, the slightly better of the two, is the one where all patients followed the newly designed process.  The rightmost one was where we deliberately threw a spanner in the works by assuming an unpredictable case mix.

<Leslie> Which made very little difference!  The new design was still much, much better than before.

<Bob> Yes. What you are seeing here is the footprint of resilient design. Do you believe it is possible now?

<Leslie> You bet I do!

It was the appointed time for Bob and Leslie’s regular coaching session as part of the improvement science practitioner programme.

<Leslie> Hi Bob, I am feeling rather despondent today so please excuse me in advance if you hear a lot of “Yes, but …” language.

<Bob> I am sorry to hear that Leslie. Do you want to talk about it?

<Leslie> Yes, please.  The trigger for my gloom was being sent on a mandatory training workshop.

<Bob> OK. Training to do what?

<Leslie> Outpatient demand and capacity planning!

<Bob> But you know how to do that already, so what is the reason you were “sent”?

<Leslie> Well, I am no longer sure I know how to do it.  That is why I am feeling so blue.  I went more out of curiosity and I came away utterly confused and with my confidence shattered.

<Bob> Oh dear! We had better start at the beginning.  What was the purpose of the workshop?

<Leslie> To train everyone in how to use an Outpatient Demand and Capacity planning model, an Excel one that we were told to download along with the User Guide.  I think it is part of a national push to improve waiting times for outpatients.

<Bob> OK. On the surface that sounds reasonable. You have designed and built your own Excel flow-models already; so where did the trouble start?

<Leslie> I will attempt to explain.  This was a paragraph in the instructions. I felt OK with this because my Improvement Science training has given me a very good understanding of basic demand and capacity theory.

<Bob> OK.  I am guessing that other delegates may have felt less comfortable with this. Was that the case?

<Leslie> The training workshops are targeted at Operational Managers and the ones I spoke to actually felt that they had a good grasp of the basics.

<Bob> OK. That is encouraging, but a warning bell is ringing for me. So where did the trouble start?

<Leslie> Well, before going to the workshop I decided to read the User Guide so that I had some idea of how this magic tool worked.  This is where I started to wobble – this paragraph specifically …


<Bob> H’mm. What did you make of that?

<Leslie> It was complete gibberish to me and I felt like an idiot for not understanding it.  I went to the workshop in a bit of a panic and hoped that all would become clear. It didn’t.

<Bob> Did the User Guide explain what ‘percentile’ means in this context, ideally with some visual charts to assist?

<Leslie> No and the use of ‘th’ and ‘%’ was really confusing too.  After that I sort of went into a mental fog and none of the workshop made much sense.  It was all about practising using the tool without any understanding of how it worked. Like a black magic box.

<Bob> OK.  I can see why you were confused, and do not worry, you are not an idiot.  It looks like the author of the User Guide has unwittingly used some very confusing and ambiguous terminology here.  So can you talk me through what you have to do to use this magic box?

<Leslie> First we have to enter some of our historical data; the number of new referrals per week for a year; and the referral and appointment dates for all patients for the most recent three months.

<Bob> OK. That sounds very reasonable.  A run chart of historical demand and the raw event data for a Vitals Chart® is where I would start the measurement phase too – so long as the data creates a valid 3 month reporting window.

<Leslie> Yes, I thought so too … but that is not how the black box model seems to work. The weekly demand is used to draw an SPC chart, but the event data seems to disappear into the innards of the black box, and recommendations pop out of it.

<Bob> Ah ha!  And let me guess: the relationship between the term ‘percentile’ and the SPC chart of weekly new demand was not explained?

<Leslie> Spot on.  What does percentile mean?

<Bob> It is statistics jargon. Remember that we have talked about the distribution of the data around the average on a BaseLine chart; and how we use the histogram feature of BaseLine to show it visually.  Like this example.

<Leslie> Yes. I recognise that. This chart shows a stable system of demand with an average of around 150 new referrals per week and the variation distributed above and below the average in a symmetrical pattern, falling off to zero around the upper and lower process limits.  I believe that you said that over 99% will fall within the limits.

<Bob> Good.  The blue histogram on this chart is called a probability distribution function, to use the terminology of a statistician.

<Leslie> OK.

<Bob> So, what would happen if we created a Pareto chart of demand using the number of patients per week as the categories and ignoring the time aspect? We are allowed to do that if the behaviour is stable, as this chart suggests.

<Leslie> Give me a minute, I will need to do a rough sketch. Does this look right?


<Bob> Perfect!  So if you now convert the Y-axis to a percentage scale so that 52 weeks is 100% then where does the average weekly demand of about 150 fall? Read up from the X-axis to the line then across to the Y-axis.

<Leslie> At about 26 weeks or 50% of 52 weeks.  Ah ha!  So that is what a percentile means!  The 50th percentile is the average, the zeroth percentile is around the lower process limit and the 100th percentile is around the upper process limit!

<Bob> In this case the 50th percentile is the average because the distribution is symmetrical; that is not always the case though.  So where is the 85th percentile line?

<Leslie> Um, 52 times 0.85 is 44.2 which, reading across from the Y-axis then down to the X-axis gives a weekly demand of about 170 per week.  That is about the same as the average plus one sigma according to the run chart.

<Bob> Excellent. The Pareto chart that you have drawn is called a cumulative probability distribution function … and that is usually what percentiles refer to. Comparative Statisticians love these but often omit to explain their rationale to non-statisticians!
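The reading-off that Leslie did on the sketch can also be written as a short calculation. This is a minimal sketch, assuming 52 weeks of synthetic demand with an average of 150 and a sigma of 20 (figures taken from the conversation above, not from real referral data). The function name `percentile_by_rank` is hypothetical, and the nearest-rank rule it uses is only one of several conventions for percentiles.

```python
import random

def percentile_by_rank(values, pct):
    """Sort the values and read off the one at the pct% position (nearest rank)."""
    ordered = sorted(values)
    # Position in the sorted list; for 52 weeks, 85% lands at about week 44.
    rank = max(1, round(pct / 100 * len(ordered)))
    return ordered[rank - 1]

# Synthetic data, for illustration only: 52 weeks, mean 150, sigma 20.
rng = random.Random(1)
weekly_demand = [round(rng.gauss(150, 20)) for _ in range(52)]

for pct in (50, 65, 85):
    value = percentile_by_rank(weekly_demand, pct)
    print(f"{pct}th percentile: about {value} referrals per week")
```

Sorting the weeks by demand and counting along them is the same operation as reading across the cumulative chart: the 85th percentile is simply the weekly demand that 85% of the weeks did not exceed.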

<Leslie> Phew!  So, now I can see that the 65th percentile is just above average demand, and 85th percentile is above that.  But in the confusing paragraph how does that relate to the phrase “65% and 85% of the time”?

<Bob> It doesn’t. That is the really, really confusing part of that paragraph. I am not surprised that you looped out at that point!

<Leslie> OK. Let us leave that for another conversation.  If I ignore that bit then does the rest of it make sense?

<Bob> Not yet alas. We need to dig a bit deeper. What would you say are the implications of this message?

<Leslie> Well.  I know that if our flow-capacity is less than our average demand then we will guarantee to create an unstable queue and chaos. That is the Flaw of Averages trap.

<Bob> OK.  The creator of this tool seems to know that.

<Leslie> And my outpatient manager colleagues are always complaining that they do not have enough slots to book into, so I conclude that our current flow-capacity is just above the 50th percentile.

<Bob> A reasonable hypothesis.

<Leslie> So to calm the chaos the message is saying I will need to increase my flow capacity up to the 85th percentile of demand which is from about 150 slots per week to 170 slots per week. An increase of about 13% which implies a 13% increase in costs.

<Bob> Good.  I am pleased that you did not fall into the intuitive trap that an increase from the 50th to the 85th percentile implies a 35/50 or 70% increase! Your estimate of about 13% is a reasonable one.

<Leslie> Well it may be theoretically reasonable but it is not practically possible. We are exhorted to reduce costs by at least that amount.

<Bob> So we have a finance versus governance bun-fight with the operational managers caught in the middle: FOG. That is not the end of the litany of woes … is there anything about Did Not Attends in the model?

<Leslie> Yes indeed! We are required to enter the percentage of DNAs and what we do with them. Do we discharge them or re-book them?

<Bob> OK. Pragmatic reality is always much more interesting than academic rhetoric and this aspect of the real system rather complicates things, at least for a comparative statistician. This is where the smoke and mirrors will appear and they will be hidden inside the black magic box.  To solve this conundrum we need to understand the relationship between demand, capacity, variation and yield … and it is rather counter-intuitive.  So, how would you approach this problem?

<Leslie> I would use the 6M Design® framework and I would start with a map and not with a model; least of all a magic black box one that I did not design, build and verify myself.

<Bob> And how do you know that will work any better?

<Leslie> Because at the One Day ISP Workshop I saw it work with my own eyes. The queues, waits and chaos just evaporated.  And it cost nothing.  We already had more than enough “capacity”.

<Bob> Indeed you did.  So shall we do this one as an ISP-2 project?

<Leslie> An excellent suggestion.  I already feel my confidence flowing back and I am looking forward to this new challenge. Thank you again Bob.

Last week Bob and Leslie were exploring the data analysis trap called a two-points-in-time comparison: as illustrated by the headline “This winter has not been as bad as last … which proves that our winter action plan has worked.”

Actually it doesn’t.

But just saying that is not very helpful. We need to explain the reason why this conclusion is invalid and therefore potentially dangerous.

So here is the continuation of Bob and Leslie’s conversation.

<Bob> Hi Leslie, have you been reflecting on the two-points-in-time challenge?

<Leslie> Yes indeed, and you were correct, I did know the answer … I just didn’t know I knew if you get my drift.

<Bob> Yes, I do. So, are you willing to share your story?

<Leslie> OK, but before I do that I would like to share what happened when I described what we talked about to some colleagues.  They sort of got the idea but got lost in the unfamiliar language of ‘variance’ and I realized that I needed an example to illustrate.

<Bob> Excellent … what example did you choose?

<Leslie> The UK weather – or more specifically the temperature.  My reasons for choosing this were many: first it is something that everyone can relate to; secondly it has a strong seasonal cycle; and thirdly because the data is readily available on the Internet.

<Bob> OK, so what specific question were you trying to answer and what data did you use?

<Leslie> The question was “Are our winters getting warmer?” and my interest in that is because many people assume that the colder the winter the more people suffer from respiratory illness and the more that go to hospital … contributing to the winter A&E and hospital pressures.  The data that I used was the maximum monthly temperature from 1960 to the present recorded at our closest weather station.

<Bob> OK, and what did you do with that data?

<Leslie> Well, what I did not do was to compare this winter with last winter and draw my conclusion from that!  What I did first was just to plot-the-dots … I created a time-series chart … using the BaseLine© software.


And it shows what I expected to see, a strong, regular, 12-month cycle, with peaks in the summer and troughs in the winter.

<Bob> Can you explain what the green and red lines are and why some dots are red?

<Leslie> Sure. The green line is the average for all the data. The red lines are called the upper and lower process limits.  They are calculated from the data and what they say is “if the variation in this data is random then we will expect more than 99% of the points to fall between these two red lines“.
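One common way to calculate process limits of this kind is the XmR (individuals) chart convention: the average plus or minus 2.66 times the average moving range. The sketch below assumes that convention; the conversation does not say which calculation BaseLine© uses, and the readings are made-up numbers for illustration. The function names are hypothetical.

```python
from statistics import mean

def process_limits(values):
    """Return (lower, average, upper) using the XmR chart convention."""
    centre = mean(values)
    # Moving range: the absolute difference between consecutive points.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    spread = 2.66 * mean(moving_ranges)
    return centre - spread, centre, centre + spread

def points_outside(values, lower, upper):
    """Return (index, value) for every point beyond the process limits."""
    return [(i, v) for i, v in enumerate(values) if v < lower or v > upper]

# Made-up readings: a steady series with one exceptional point.
readings = [10, 12, 11, 13, 10, 12, 11, 25, 12, 10]
lower, average, upper = process_limits(readings)
print(f"average {average:.1f}, limits {lower:.1f} to {upper:.1f}")
print("outside the limits:", points_outside(readings, lower, upper))
```

The limits are calculated from the point-to-point variation in the data itself, so a point that falls outside them is a prompt to ask what was different, not a verdict.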

<Bob> So, we have 55 years of monthly data which is nearly 700 points which means we would expect fewer than seven to fall outside these lines … and we clearly have many more than that.  For example, the winter of 1962-63 and the summer of 1976 look exceptional – a run of three consecutive dots outside the red lines. So can we conclude the variation we are seeing is not random?

<Leslie> Yes, and there is more evidence to support that conclusion. First is the reality check … I do not remember either of those exceptionally cold or hot years personally, so I asked Dr Google.

This picture from January 1963 shows copper telephone lines that are so weighed down with ice, and for so long, that they have stretched down to the ground.  In this era of mobile phones we forget this was what telecommunication was like!




And just look at the young Michael Fish in the Summer of ’76! Did people really wear clothes like that?

And there is more evidence on the chart. The red dots that you mentioned are indicators that BaseLine© has detected other non-random patterns.

So the large number of red dots confirms our Mark I Eyeball conclusion … that there are signals mixed up with the noise.

<Bob> Actually, I do remember the Summer of ’76 – it was the year I did my O Levels!  And your signals-in-the-noise phrase reminds me of SETI – the search for extra-terrestrial intelligence!  I really enjoyed the 1997 film of Carl Sagan’s book Contact with Jodie Foster playing the role of the determined scientist who ends up taking a faster-than-light trip through space in a machine designed by ET and built by humans. And especially the line about 10 minutes from the end when those-in-high-places who had discounted her story as “unbelievable” realized they may have made an error … the line ‘Yes, that is interesting isn’t it’.

<Leslie> Ha ha! Yes. I enjoyed that film too. It had lots of great characters – her glory seeking boss; the hyper-suspicious head of national security who militarized the project; the charismatic anti-hero; the ranting radical who blew up the first alien machine; and John Hurt as her guardian angel. I must watch it again.

Anyway, back to the story. The problem we have here is that this type of time-series chart is not designed to extract the overwhelming cyclical, annual pattern so that we can search for any weaker signals … such as a smaller change in winter temperature over a longer period of time.

<Bob> Yes, that is indeed the problem with these statistical process control charts.  SPC charts were designed over 60 years ago for process quality assurance in manufacturing, not as a diagnostic tool in a complex adaptive system such as healthcare. So how did you solve the problem?

<Leslie> I realized that it was the regularity of the cyclical pattern that was the key.  I realized that I could use that to separate out the annual cycle and to expose the weaker signals.  I did that using the rational grouping feature of BaseLine© with the month-of-the-year as the group.


Now I realize why the designers of the software put this feature in! With just one mouse click the story jumped out of the screen!

<Bob> OK. So can you explain what we are looking at here?

<Leslie> Sure. This chart shows the same data as before except that I asked BaseLine© first to group the data by month and then to create a mini-chart for each month-group independently.  Each group has its own average and process limits.  So if we look at the pattern of the averages, the green lines, we can clearly see the annual cycle.  What is very obvious now is that the process limits for each sub-group are much narrower, and that there are now very few red points  … other than in the groups that are coloured red anyway … a niggle that the designers need to nail in my opinion!
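The grouping step that Leslie describes can be sketched outside BaseLine©. This is a minimal sketch under stated assumptions: the temperatures are synthetic (a made-up annual cycle plus random noise, not weather station data), the limits use the common XmR convention of the average plus or minus 2.66 times the average moving range, and the function names are hypothetical.

```python
import math
import random
from statistics import mean

def xmr_limits(values):
    """Return (lower, average, upper) using the XmR chart convention."""
    centre = mean(values)
    spread = 2.66 * mean([abs(b - a) for a, b in zip(values, values[1:])])
    return centre - spread, centre, centre + spread

def group_by_month(series):
    """series: list of (year, month, value). Returns {month: values in year order}."""
    groups = {}
    for year, month, value in sorted(series):
        groups.setdefault(month, []).append(value)
    return groups

# Synthetic monthly maximum temperatures: annual cycle plus random noise.
rng = random.Random(2)
series = [
    (year, month, 14 - 8 * math.cos(2 * math.pi * (month - 1) / 12) + rng.gauss(0, 1.5))
    for year in range(1960, 2015)
    for month in range(1, 13)
]

lower, average, upper = xmr_limits([value for _, _, value in sorted(series)])
print(f"all months together: width of limits {upper - lower:.1f}")
for month, values in sorted(group_by_month(series).items()):
    lower, average, upper = xmr_limits(values)
    print(f"month {month:2d}: average {average:5.1f}, width of limits {upper - lower:.1f}")
```

Each month is compared only with the same month in other years, so the annual cycle shows up in the pattern of monthly averages and no longer inflates the limits.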

<Bob> I will pass on your improvement suggestion! So are you saying that the regular annual cycle has accounted for the majority of the signal in the previous chart and that now we have extracted that signal we can look for weaker signals by looking for red flags in each monthly group?

<Leslie> Exactly so.  And the groups I am most interested in are the November to March ones.  So, next I filtered out the November data and plotted it as a separate chart; and I then used another cool feature of BaseLine© called limit locking.


What that means is that I have used the November maximum temperature data for the first 30 years to get the baseline average and natural process limits … and we can see that there are no red flags in that section, no obvious signals.  Then I locked these limits at 1990 and this tells BaseLine© to compare the subsequent 25 years of data against these projected limits.  That exposed a lot of signal flags, and we can clearly see that most of the points in the later section are above the projected average from the earlier one.  This confirms that there has been a significant increase in November maximum temperature over this 55 year period.
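The idea of limit locking can be sketched in a few lines. This is a minimal sketch, not a description of how BaseLine© implements the feature: it assumes the XmR convention for the limits (average plus or minus 2.66 times the average moving range), and the readings are made-up numbers in which the later values sit higher than the baseline ones. The function names are hypothetical.

```python
from statistics import mean

def xmr_limits(values):
    """Return (lower, average, upper) using the XmR chart convention."""
    centre = mean(values)
    spread = 2.66 * mean([abs(b - a) for a, b in zip(values, values[1:])])
    return centre - spread, centre, centre + spread

def compare_with_locked_limits(values, baseline_length):
    """Calculate limits from the baseline only, then judge the later points against them."""
    lower, centre, upper = xmr_limits(values[:baseline_length])
    later = values[baseline_length:]
    return {
        "limits": (lower, centre, upper),
        "later_points": len(later),
        "outside_limits": [v for v in later if v < lower or v > upper],
        "above_average": sum(1 for v in later if v > centre),
    }

# Made-up readings: six baseline points followed by four later points.
readings = [10, 11, 10, 11, 10, 11, 12, 13, 14, 12]
result = compare_with_locked_limits(readings, baseline_length=6)
print(result)
```

Because the later points play no part in calculating the limits, a sustained shift cannot drag the average along with it; it shows up as points outside the projected limits or as a long run on one side of the projected average.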

<Bob> Excellent! You have answered part of your question. So what about December onwards?

<Leslie> I was on a roll now! I also noticed from my second chart that the December, January and February groups looked rather similar so I filtered that data out and plotted them as a separate chart.

These were indeed almost identical so I lumped them together as a ‘winter’ group and compared the earlier half with the later half using another BaseLine© feature called segmentation.

This showed that the more recent winter months have a higher maximum temperature … on average. The difference is just over one degree Celsius. But it also shows that the month-to-month and year-to-year variation still dominates the picture.

<Bob> Which implies?

<Leslie> That, with data like this, a two-points-in-time comparison is meaningless.  If we do that we are just sampling random noise and there is no useful information in noise. Nothing that we can learn from. Nothing that we can justify a decision with.  This is the reason the ‘this year was better than last year’ statement is meaningless at best; and dangerous at worst.  Dangerous because if we draw an invalid conclusion, then it can lead us to make an unwise decision, then decide a counter-productive action, and then deliver an unintended outcome.

By doing invalid two-point comparisons we can too easily make the problem worse … not better.
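Leslie’s point can be checked with a toy simulation. This is a minimal sketch, assuming a process that never changes and shows only random year-to-year variation; the function name `fraction_looking_better` and the numbers are hypothetical.

```python
import random

def fraction_looking_better(trials, noise_sd, seed=0):
    """How often does 'this year' beat 'last year' when nothing has changed?"""
    rng = random.Random(seed)
    better = 0
    for _ in range(trials):
        # Two readings from the same unchanged process: noise only.
        last_year = rng.gauss(0, noise_sd)
        this_year = rng.gauss(0, noise_sd)
        if this_year < last_year:  # lower is 'better' in this toy example
            better += 1
    return better / trials

share = fraction_looking_better(trials=10_000, noise_sd=1.0)
print(f"'This year was better than last' in {share:.0%} of comparisons")
```

With no change at all in the underlying process, roughly half of the comparisons still look like an improvement and the other half look like a deterioration, so a single two-point comparison cannot tell us whether an action plan worked.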

<Bob> Yes. This is what W. Edwards Deming, an early guru of improvement science, referred to as ‘tampering‘.  He was a student of Walter A. Shewhart who recognized this problem in manufacturing and, in 1924, invented the first control chart to highlight it, and so prevent it.  My grandmother used the term meddling to describe this same behavior … and I now use that term as one of the eight sources of variation. Well done Leslie!