Archive for the ‘Productivity’ Category

The Elephant in the Room is an English-language metaphorical idiom for an obvious problem or risk no one wants to discuss.

An undiscussable topic.

And the undiscussability is also undiscussable.

So the problem or risk persists.

And people come to harm as a result.

Which is not the intended outcome.

So why do we behave this way?

Perhaps it is because the problem looks too big and too complicated to solve in one intuitive leap, and we give up and label it a “wicked problem”.


The well-known quote “When eating an elephant take one bite at a time” is attributed to Creighton Abrams, a former US Army Chief of Staff.


It says that even seemingly “impossible” problems can be solved so long as we proceed slowly and carefully, in small steps, learning as we go.

And the continued decline of NHS unscheduled care performance in the UK seems to be an Elephant-in-the-Room problem, as shown by the monthly A&E 4-hour performance over the last 10 years, and by the fact that this chart is not published by the NHS.

Red = England, Brown = Wales, Grey = N. Ireland, Purple = Scotland.


This week I experienced a bite of this Elephant being taken and chewed on.

The context was a Flow Design – Practical Skills – One Day Workshop and the design challenge posed to the eager delegates was to improve the quality and efficiency of a one stop clinic.

A seemingly impossible task because the delegates reported that the queues, delays and chaos that they experienced in the simulated clinic felt very realistic.

Which means that this experience is accepted as inevitable: impossible to improve without more resources, and since financial cuts prevent that, we just have to accept the waits.


At the end of the day their belief had been shattered.

The queues, delays and chaos had evaporated and the cost to run the new one stop clinic design was actually less than the old one.

And when we combined the quality metrics with the cost metrics and calculated the measured improvement in productivity, the answer was over 70%!

The delegates experienced it all first-hand. They did the diagnosis, design, and delivery using no more than squared-paper and squeaky-pen.

And at the end they were looking at a glaring mismatch between their rhetoric and the reality.

The “impossible to improve without more money” hypothesis lay in tatters – it had been rationally, empirically and scientifically disproved.

I’d call that quite a big bite out of the Elephant-in-the-Room.


So if you have a healthy appetite for Elephant-in-the-Room challenges, and are not afraid to try something different, then there is a whole menu of nutritious food-for-thought at a FISH&CHIPs® practical skills workshop.

This week about thirty managers and clinicians in South Wales conducted two experiments to test the design of the Flow Design Practical Skills One Day Workshop.

Their collective challenge was to diagnose and treat a “chronically sick” clinic and the majority had no prior exposure to health care systems engineering (HCSE) theory, techniques, tools or training.

Two of the group, Chris and Jat, had been delegates at a previous ODWS, and had then completed their Level-1 HCSE training and real-world projects.

They had seen it and done it, so this experiment was to test if they could now teach it.

Could they replicate the “OMG effect” that they had experienced and that fired up their passion for learning and using the science of improvement?


A story was shared this week.

A story of hope for the hard-pressed NHS, its patients, its staff, its managers and its leaders.

A story that says “We can learn how to fix the NHS ourselves”.

And the story comes with evidence; hard, objective, scientific, statistically significant evidence.


The story starts almost exactly three years ago when a Clinical Commissioning Group (CCG) in England made a bold strategic decision to invest in improvement, or as they termed it “Achieving Clinical Excellence” (ACE).

They invited proposals from their local practices with the “carrot” of enough funding to allow GPs to carve-out protected time to do the work.  And a handful of proposals were selected and financially supported.

This is the story of one of those proposals, which came from three practices in Sutton who chose to work together on a common problem – unplanned hospital admissions in their over-70s.

Their objective was clear and measurable: “To reduce the cost of unplanned admissions in the 70+ age group by working with hospital to reduce length of stay.”

Did they achieve their objective?

Yes, they did.  But there is more to this story than that.  Much more.


One innovative step they took was to invest in learning how to diagnose why the current ‘system’ was costing what it was; then learning how to design an improvement; and then learning how to deliver that improvement.

They invested in developing their own improvement science skills first.

They did not assume they already knew how to do this and they engaged an experienced health care systems engineer (HCSE) to show them how to do it (i.e. not to do it for them).

Another innovative step was to create a blog to make it easier to share what they were learning with their colleagues; and to invite feedback and suggestions; and to provide a journal that captured the story as it unfolded.

And they measured stuff before they made any changes, and again afterwards, so that they could gauge the impact and assess the evidence scientifically.

And that was actually quite easy because the CCG was already measuring what they needed to know: admissions, length of stay, cost, and outcomes.

All they needed to learn was how to present and interpret that data in a meaningful way.  And as part of their improvement science (IS) training, they learned how to use system behaviour charts, or SBCs.


By Jan 2015 they had learned enough of the HCSE techniques and tools to establish the diagnosis and start making changes to the parts of the system that they could influence.


Two years later they subjected their before-and-after data to robust statistical analysis and they had a surprise. A big one!

Reducing hospital mortality was not a stated objective of their ACE project, and they only checked the mortality data to be sure that it had not changed.

But it had, and the “p=0.014” part of the statement above means that the probability that this 20.0% reduction in hospital mortality was due to random chance alone … is about 1.4%.  [This is well below the 5% threshold that we usually accept as “statistically significant” in a clinical trial.]

But …

This was not a randomised controlled trial.  This was an intervention in a complicated, ever-changing system; so they needed to check that the hospital mortality for comparable patients who were not their patients had not changed as well.

And the statistical analysis of the hospital mortality for the ‘other’ practices for the same patient group, and the same period of time confirmed that there had been no statistically significant change in their hospital mortality.

So, it appears that what the Sutton ACE Team did to reduce length of stay (and cost) had also, unintentionally, reduced hospital mortality. A lot!
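For readers who like to see the arithmetic, here is a minimal sketch of the kind of two-proportion significance test that sits behind a statement like “p=0.014”.  The admission and death counts in it are invented purely for illustration; they are not the Sutton data.

```python
# Minimal sketch of a two-proportion significance test, the kind of analysis
# behind a statement like "p = 0.014".
# The counts below are hypothetical, for illustration only (not the Sutton data).
from scipy.stats import chi2_contingency

deaths_before, admissions_before = 120, 1000   # hypothetical 'before' period
deaths_after, admissions_after = 96, 1000      # hypothetical 'after' period

# 2x2 contingency table: [died, survived] for each period.
table = [
    [deaths_before, admissions_before - deaths_before],
    [deaths_after, admissions_after - deaths_after],
]

chi2, p_value, dof, expected = chi2_contingency(table)

relative_change = (deaths_after / admissions_after) / (deaths_before / admissions_before) - 1
print(f"Relative change in mortality: {relative_change:.1%}")
print(f"p-value: {p_value:.3f}")
```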


And this unexpected outcome raises a whole raft of questions …


If you would like to read their full story then you can do so … here.

It is a story of hunger for improvement, of humility to learn, of hard work and of hope for the future.

Bob Jekyll was already sitting at a table, sipping a pint of Black Sheep and nibbling on a bowl of peanuts when Hugh and Louise arrived.

<Hugh> Hello, are you Bob?

<Bob> Yes, indeed! You must be Hugh and Louise. Can I get you a thirst quencher?

<Louise> Lime and soda for me please.

<Hugh> I’ll have the same as you, a Black Sheep.

<Bob> On the way.

<Hugh> Hello Louise, I’m Hugh Lewis.  I am the ops manager for acute medicine at St. Elsewhere’s Hospital. It is good to meet you at last. I have seen your name on emails and performance reports.

<Louise> Good to meet you too Hugh. I am senior data analyst for St. Elsewhere’s and I think we may have met before, but I’m not sure when.  Do you know what this is about? Your invitation was a bit mysterious.

<Hugh> Yes. Sorry about that. I was chatting to a friend of mine at the golf club last week, Dr Bill Hyde who is one of our local GPs.  As you might expect, we got to talking about the chronic pressure we are all under in both primary and secondary care.  He said he has recently crossed paths with an old chum of his from university days who he’d had a very interesting conversation with in this very pub, and he recommended I email him. So I did. And that led to a phone conversation with Bob Jekyll. I have to say he asked some very interesting questions that left me feeling a mixture of curiosity and discomfort. After we talked Bob suggested that we meet for a longer chat and that I invite my senior data analyst along. So here we are.

<Louise> I have to say my curiosity was pricked by your invitation, specifically the phrase ‘system behaviour charts’. That is a new one on me and I have been working in the NHS for some time now. It is too many years to mention since I started as a junior data analyst, fresh from university!

<Hugh> That is the term Bob used, and I confess it was new to me too.

<Bob> Here we are, Black Sheep, lime soda and more peanuts.  Thank you both for coming, so shall we talk about the niggle that Hugh raised when we spoke on the phone?

<Hugh> Ah! Louise, please accept my apologies in advance. I think Bob might be referring to when I said that “90% of the performance reports don’t make any sense to me“.

<Louise> There is no need to apologise Hugh. I am actually reassured that you said that. They don’t make any sense to me either! We only produce them that way because that is what we are asked for.  My original degree was in geography and I discovered that I loved data analysis! My grandfather was a doctor so I guess that’s how I ended up doing health care data analysis. But I must confess, some days I do not feel like I am adding much value.

<Hugh> Really? I believe we are in heated agreement! Some days I feel the same way.  Is that why you invited us both Bob?

<Bob> Yes.  It was some of the things that Hugh said when we talked on the phone.  They rang some warning bells for me because, in my line of work, I have seen many people fall into a whole minefield of data analysis traps that leave them feeling confused and frustrated.

<Louise> What exactly is your line of work, Bob?

<Bob> I am a systems engineer.  I design, build, verify, integrate, implement and validate systems. Fit-for-purpose systems.

<Louise> In health care?

<Bob> Not until last week when I bumped into Bill Hyde, my old chum from university.  But so far the health care system looks just like all the other ones I have worked in, so I suspect some of the lessons from other systems are transferable.

<Hugh> That sounds interesting. Can you give us an example?

<Bob> OK.  Hugh, in our first conversation, you often used the words “demand”  and “capacity”. What do you mean by those terms?

<Hugh> Well, demand is what comes through the door, the flow of requests, the workload we are expected to manage.  And capacity is the resources that we have to deliver the work and to meet our performance targets.  Capacity is the staff, the skills, the equipment, the chairs, and the beds. The stuff that costs money to provide.  As a manager, I am required to stay in-budget and that consumes a big part of my day!

<Bob> OK. Speaking as an engineer I would like to know the units of measurement of “demand” and “capacity”?

<Hugh> Oh! Um. Let me think. Er. I have never been asked that question before. Help me out here Louise.  I told you Bob asks tricky questions!

<Louise> I think I see what Bob is getting at.  We use these terms frequently but rather loosely. On reflection they are not precisely defined, especially “capacity”. There are different sorts of capacity, all of which are measured in different ways and so have different units. No wonder we spend so much time discussing and debating the question of whether we have enough capacity to meet the demand.  We are probably all assuming different things.  Beds cannot be equated to staff, but too often we just seem to lump everything together when we talk about “capacity”.  So by doing that what we are really asking is “do we have enough cash in the budget to pay for the stuff we think we need?”. And if we are failing one target or another we just assume that the answer is “No” and we shout for “more cash”.

<Bob> Exactly my point. And this was one of the warning bells.  Lack of clarity on these fundamental definitions opens up a minefield of other traps like the “Flaw of Averages” and “Time equals Money“.  And if we are making those errors then they will, unwittingly, become incorporated into our data analysis.

<Louise> But we use averages all the time! What is wrong with an average?

<Bob> I can sense you are feeling a bit defensive Louise.  There is no need to.  An average is perfectly OK and is a very useful tool.  The “flaw” is when it is used inappropriately.  Have you heard of Little’s Law?

<Louise> No. What’s that?

<Bob> It is the mathematically proven relationship between flow, work-in-progress and lead time.  It is a fundamental law of flow physics and it uses averages. So averages are OK.

<Hugh> So what is the “Flaw of Averages”?

<Bob> It is easier to demonstrate it than to describe it.  Let us play a game.  I have some dice and we have a big bowl of peanuts.  Let us simulate a simple two step process.  Hugh you are Step One and Louise you are Step Two.  I will be the source of demand.

I will throw a dice and count that many peanuts out of the bowl and pass them to Hugh.  Hugh, you then throw the dice and move that many peanuts from your heap to Louise, then Louise throws the dice and moves that many from her pile to the final heap which we will call activity.

<Hugh> Sounds easy enough.  If we all use the same dice then the average flow through each step will be the same so after say ten rounds we should have, um …

<Louise> … thirty five peanuts in the activity heap.  On average.

<Bob> OK.  That’s the theory, let’s see what happens in reality.  And no eating the nuts-in-progress please.


They play the game and after a few minutes they have completed the ten rounds.


<Hugh> That’s odd.  There are only 30 nuts in the activity heap and we expected 35.  Nobody nibbled any nuts so it’s just chance I suppose.  Let’s play again. It should average out.

…..  …..

<Louise> Thirty-four this time, which is better, but still below the predicted average.  That could still be a chance effect though.  Let us run the ‘nutty’ game a few more times.

….. …..

<Hugh> We have run the same game six times with the same nuts and the same dice and we delivered activities of 30, 34, 30, 24, 23 and 31 and there are usually nuts stuck in the process at the end of each game, so it is not due to a lack of demand.  We are consistently under-performing compared with our theoretical prediction.  That is weird.  My head says we were just unlucky but I have a niggling doubt that there is more to it.

<Louise> Is this the Flaw of Averages?

<Bob> Yes, it is one of them. If we set our average future flow-capacity to the average historical demand and there is any variation anywhere in the process then we will see this effect.

<Hugh> H’mmm.  But we do this all the time because we assume that the variation will average out over time. Intuitively it must average out over time.  What would happen if we kept going for more cycles?

<Bob> That is a very good question.  And your intuition is correct.  It does average out eventually but there is a catch.

<Hugh> What is the catch?

<Bob>  The number of peanuts in the process and the time it takes for one peanut to get through are both highly variable.

<Louise> Is there any pattern to the variation? Is it predictable?

<Bob> Another excellent question.  Yes, there is a pattern.  It is called “chaos”.  Predictable chaos if you like.

<Hugh> So is that the reason you said on the phone that we should present our metrics as time-series charts?

<Bob> Yes, one of them.  The appearance of chaotic system behaviour is very characteristic on a time-series chart.

<Louise> And if we see the chaos pattern on our charts then we could conclude that we have made the Flaw of Averages error?

<Bob> That would be a reasonable hypothesis.

<Hugh> I think I understand the reason you invited us to a face-to-face demonstration.  It would not have worked if you had just described it.  You have to experience it because it feels so counter-intuitive.  And this is starting to feel horribly familiar; perpetual chaos about sums up my working week!

<Louise> You also mentioned something you referred to as the “time equals money” trap.  Is that somehow linked to this?

<Bob> Yes.  We often equate time and money but they do not behave the same way.  If I have five pounds today and I only spend four pounds then I can save the remaining one pound for tomorrow and spend it then – so the Law of Averages works.  But if I have five minutes today and I only use four minutes then the other minute cannot be saved and used tomorrow, it is lost forever.  That is why the Law of Averages does not work for time.

<Hugh> But that means if we set our budgets based on the average demand and the cost of people’s time then not only will we have queues, delays and chaos, we will also consistently overspend the budget too.  This is sounding more and more familiar by the minute!  This is nuts, if you will excuse the pun.

<Louise> So what is the solution?  I hope you would not have invited us here if there was no solution.

<Bob> Part of the solution is to develop our knowledge of system behaviour and how we need to present it in a visual format. With that we develop a deeper understanding of what the system behaviour charts are saying to us.  With that we can develop our ability to make wiser decisions that will lead to effective actions which will eliminate the queues, delays, chaos and cost-pressures.

<Hugh> This is possible?

<Bob> Yes. It is called systems engineering. That’s what I do.

<Louise> When do we start?

<Bob> We have started.
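For anyone who wants to replay the ‘nutty game’ without a bowl of peanuts, here is a minimal simulation sketch.  It assumes the three-step, dice-driven design that Bob describes; the number of rounds and the random seed are arbitrary illustrative choices.

```python
# A minimal sketch of Bob's 'nutty game': a source feeding two steps in series,
# each moving a dice-roll's worth of peanuts per round (but never more than it holds).
# Round counts and the seed are arbitrary illustrative choices.
import random

def play(rounds, seed=None):
    rng = random.Random(seed)
    step1 = step2 = activity = 0              # peanut heaps
    wip_history = []
    for _ in range(rounds):
        step1 += rng.randint(1, 6)                 # demand arriving from the source
        moved1 = min(rng.randint(1, 6), step1)     # Hugh can only pass what he holds
        step1 -= moved1; step2 += moved1
        moved2 = min(rng.randint(1, 6), step2)     # Louise can only pass what she holds
        step2 -= moved2; activity += moved2
        wip_history.append(step1 + step2)          # nuts-in-progress at the end of the round
    return activity, wip_history

rounds = 10
activity, wip = play(rounds, seed=42)
print(f"Theoretical average activity after {rounds} rounds: {3.5 * rounds:.0f}")
print(f"Actual activity: {activity}, peanuts stuck in the process: {wip[-1]}")

# Over a long run the average flow does creep back towards 3.5 per round, but the
# work-in-progress (and hence, by Little's Law, the lead time) wanders widely.
activity_long, wip_long = play(100_000, seed=42)
print(f"Long-run average flow per round: {activity_long / 100_000:.2f}")
print(f"Work-in-progress ranged from {min(wip_long)} to {max(wip_long)}")
```

Run it a few times: the short games typically under-deliver against the theoretical 35, while the long run shows the flow slowly averaging out at the price of a wildly wandering work-in-progress, which is the ‘predictable chaos’ that Bob describes.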

Phil and Pete are having a coffee and a chat.  They both work in the NHS and have been friends for years.

They have different jobs. Phil is a commissioner and an accountant by training, Pete is a consultant and a doctor by training.

They are discussing a challenge that affects them both on a daily basis: unscheduled care.

Both Phil and Pete want to see significant and sustained improvements and how to achieve them is often the focus of their coffee chats.


<Phil> We are agreed that we both want improvement, both from my perspective as a commissioner and from your perspective as a clinician. And we agree that we want to see improvements in patient safety, waiting, outcomes, experience for both patients and staff, and use of our limited NHS resources.

<Pete> Yes. Our common purpose, the “what” and “why”, has never been an issue.  Where we seem to get stuck is the “how”.  We have both tried many things but, despite our good intentions, it feels like things are getting worse!

<Phil> I agree. It may be that what we have implemented has had a positive impact and we would have been even worse off if we had done nothing. But I do not know. We clearly have much to learn and, while I believe we are making progress, we do not appear to be learning fast enough.  And I think this knowledge gap exposes another “how” issue: After we have intervened, how do we know that we have (a) improved, (b) not changed or (c) worsened?

<Pete> That is a very good question.  And all that I have to offer as an answer is to share what we do in medicine when we ask a similar question: “How do I know that treatment A is better than treatment B?”  It is the essence of medical research; the quest to find better treatments that deliver better outcomes and at lower cost.  The similarities are strong.

<Phil> OK. How do you do that? How do you know that “Treatment A is better than Treatment B” in a way that anyone will trust the answer?

<Pete> We use a science that is actually very recent on the scientific timeline; it was only firmly established in the first half of the 20th century. One reason for that is that it is a rather counter-intuitive science, and so it requires tools that have been designed and demonstrated to work but whose inner workings most of us do not really understand. They are a bit like magic black boxes.

<Phil> H’mm. Please forgive me for sounding skeptical but that sounds like a big opportunity for making mistakes! If there are lots of these “magic black box” tools then how do you decide which one to use and how do you know you have used it correctly?

<Pete> Those are good questions! Very often we don’t know and in our collective confusion we generate a lot of unproductive discussion.  This is why we are often forced to accept the advice of experts but, I confess, very often we don’t understand what they are saying either! They seem like the medieval Magi.

<Phil> H’mm. So these experts are like ‘magicians’ – they claim to understand the inner workings of the black magic boxes but are unable, or unwilling, to explain in a language that a ‘muggle’ would understand?

<Pete> Very well put. That is just how it feels.

<Phil> So can you explain what you do understand about this magical process? That would be a start.


<Pete> OK, I will do my best.  The first thing we learn in medical research is that we need to be clear about what it is we are looking to improve, and we need to be able to measure it objectively and accurately.

<Phil> That  makes sense. Let us say we want to improve the patient’s subjective quality of the A&E experience and objectively we want to reduce the time they spend in A&E. We measure how long they wait. 

<Pete> The next thing is that we need to decide how much improvement we need. What would be worthwhile? So in the example you have offered we know that reducing the average time patients spend in A&E by just 30 minutes would have a significant effect on the quality of the patient and staff experience, and as a by-product it would also dramatically improve the 4-hour target performance.

<Phil> OK.  From the commissioning perspective there are lots of things we can do, such as commissioning alternative paths for specific groups of patients; in effect diverting some of the unscheduled demand away from A&E to a more appropriate service provider.  But these are the sorts of thing we have been experimenting with for years, and it brings us back to the question: How do we know that any change we implement has had the impact we intended? The system seems, well, complicated.

<Pete> In medical research we are very aware that the system we are changing is very complicated and that we do not have the power of omniscience.  We cannot know everything.  Realistically, all we can do is to focus on objective outcomes and collect small samples from the data ocean and use those in an attempt to draw conclusions we can trust. We have to design our experiment with care!

<Phil> That makes sense. Surely we just need to measure the stuff that will tell us if our impact matches our intent. That sounds easy enough. What’s the problem?

<Pete> The problem we encounter is that when we measure “stuff” we observe patient-to-patient variation, and that is before we have made any changes.  Any impact that we may have is obscured by this “noise”.

<Phil> Ah, I see.  So if our intervention generates a small impact then it will be more difficult to see amidst this background noise. Like trying to see fine detail in a fuzzy picture.

<Pete> Yes, exactly like that.  And it raises the issue of “errors”.  In medical research we talk about two different types of error; we make the first type of error when our actual impact is zero but we conclude from our data that we have made a difference; and we make the second type of error when we have made an impact but we conclude from our data that we have not.

<Phil> OK. So does that imply that the more “noise” we observe in our measure-for-improvement before we make the change, the more likely we are to make one or other error?

<Pete> Precisely! So before we do the experiment we need to design it so that we reduce the probability of making both of these errors to an acceptably low level.  So that we can be assured that any conclusion we draw can be trusted.

<Phil> OK. So how exactly do you do that?

<Pete> We know that whenever there is “noise” and whenever we use samples then there will always be some risk of making one or other of the two types of error.  So we need to set a threshold for both. We have to state clearly how much confidence we need in our conclusion. For example, we often use the convention that we are willing to accept a 1 in 20 chance of making the Type I error.

<Phil> Let me check if I have heard you correctly. Suppose that, in reality, our change has no impact and we have set the risk threshold for a Type 1 error at 1 in 20, and suppose we repeat the same experiment 100 times – are you saying that we should expect about five of our experiments to show data that says our change has had the intended impact when in reality it has not?

<Pete> Yes. That is exactly it.

<Phil> OK.  But in practice we cannot repeat the experiment 100 times, so we just have to accept the 1 in 20 chance that we will make a Type 1 error, and we won’t know we have made it if we do. That feels a bit chancy. So why don’t we just set the threshold to 1 in 100 or 1 in 1000?

<Pete> We could, but doing that has a consequence.  If we reduce the risk of making a Type I error by setting our threshold lower, then we will increase the risk of making a Type II error.

<Phil> Ah! I see. The old swings-and-roundabouts problem. By the way, do these two errors have different names that would make it  easier to remember and to explain?

<Pete> Yes. The Type I error is called a False Positive. It is like concluding that a patient has a specific diagnosis when in reality they do not.

<Phil> And the Type II error is called a False Negative?

<Pete> Yes.  And we want to avoid both of them, and to do that we have to specify a separate risk threshold for each error.  The convention is to call the threshold for the false positive the alpha level, and the threshold for the false negative the beta level.

<Phil> OK. So now we have three things we need to be clear on before we can do our experiment: the size of the change that we need, the risk of the false positive that we are willing to accept, and the risk of a false negative that we are willing to accept.  Is that all we need?

<Pete> In medical research we learn that we need six pieces of the experimental design jigsaw before we can proceed. We only have three pieces so far.

<Phil> What are the other three pieces then?

<Pete> We need to know the average value of the metric we are intending to improve, because that is our baseline from which improvement is measured.  Improvements are often framed as a percentage improvement over the baseline.  And we need to know the spread of the data around that average, the “noise” that we referred to earlier.

<Phil> Ah, yes!  I forgot about the noise.  But that is only five pieces of the jigsaw. What is the last piece?

<Pete> The size of the sample.

<Phil> Eh?  Can’t we just go with whatever data we can realistically get?

<Pete> Sadly, no.  The size of the sample is how we control the risk of a false negative error.  The more data we have the lower the risk. This is referred to as the power of the experimental design.

<Phil> OK. That feels familiar. I know that the more experience I have of something the better my judgement gets. Is this the same thing?

<Pete> Yes. Exactly the same thing.

<Phil> OK. So let me see if I have got this. To know if the impact of the intervention matches our intention we need to design our experiment carefully. We need all six pieces of the experimental design jigsaw and they must all fall inside our circle of control. We can measure the baseline average and spread; we can specify the impact we will accept as useful; we can specify the risks we are prepared to accept of making the false positive and false negative errors; and we can collect the required amount of data after we have made the intervention so that we can trust our conclusion.

<Pete> Perfect! That is how we are taught to design research studies so that we can trust our results, and so that others can trust them too.

<Phil> So how do we decide how big the post-implementation data sample needs to be? I can see we need to collect enough data to avoid a false negative but we have to be pragmatic too. There would appear to be little value in collecting more data than we need. It would cost more and could delay knowing the answer to our question.

<Pete> That is precisely the trap that many inexperienced medical researchers fall into. They set their sample size according to what is achievable and affordable, and then they hope for the best!

<Phil> Well, we do the same. We analyse the data we have and we hope for the best.  In the magical metaphor we are asking our data analysts to pull a white rabbit out of the hat.  It sounds rather irrational and unpredictable when described like that! Have medical researchers learned a way to avoid this trap?

<Pete> Yes, it is a tool called a power calculator.

<Phil> Ooooo … a power tool … I like the sound of that … that would be a cool tool to have in our commissioning bag of tricks. It would be like a magic wand. Do you have such a thing?

<Pete> Yes.

<Phil> And do you understand how the power tool magic works well enough to explain to a “muggle”?

<Pete> Not really. To do that means learning some rather unfamiliar language and some rather counter-intuitive concepts.

<Phil> Is that the magical stuff I hear lurks between the covers of a medical statistics textbook?

<Pete> Yes. Scary looking mathematical symbols and unfathomable spells!

<Phil> Oh dear!  Is there another way to gain a working understanding of this magic? Something a bit more pragmatic? A path that a ‘statistical muggle’ might be able to follow?

<Pete> Yes. It is called a simulator.

<Phil> You mean like a flight simulator that pilots use to learn how to control a jumbo jet before ever taking a real one out for a trip?

<Pete> Exactly like that.

<Phil> Do you have one?

<Pete> Yes. It was how I learned about this “stuff” … pragmatically.

<Phil> Can you show me?

<Pete> Of course.  But to do that we will need a bit more time, another coffee, and maybe a couple of those tasty looking Danish pastries.

<Phil> A wise investment I’d say.  I’ll get the coffee and pastries, if you fire up the engines of the simulator.
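For the statistically curious, here is a minimal sketch of the kind of calculation that Pete’s “power tool” performs, using the standard normal-approximation formula for comparing two group means.  The baseline average, the spread, the worthwhile improvement and the two risk thresholds are all illustrative assumptions, not real A&E data.

```python
# A minimal sketch of a power calculation for comparing two group means
# (before vs after), using the standard normal-approximation formula.
# All six 'jigsaw pieces' below are illustrative assumptions, not real data.
from scipy.stats import norm

baseline_mean = 240       # average time in A&E, minutes (assumed)
baseline_sd = 90          # spread ("noise") around that average, minutes (assumed)
worthwhile_change = 30    # the improvement we would consider worthwhile, minutes
alpha = 0.05              # acceptable risk of a false positive (Type I error)
beta = 0.20               # acceptable risk of a false negative (Type II error), i.e. 80% power

z_alpha = norm.ppf(1 - alpha / 2)   # two-sided test
z_beta = norm.ppf(1 - beta)

# Required sample size per group for a two-sample comparison of means.
n_per_group = 2 * ((z_alpha + z_beta) * baseline_sd / worthwhile_change) ** 2

print(f"Approximate sample size needed per group: {n_per_group:.0f} patients")
```

With these particular assumptions the answer comes out at roughly 140 patients per group; halve the “noise” or double the worthwhile improvement and the required sample size falls by a factor of four, which is exactly why all six pieces of the jigsaw have to be pinned down before the experiment starts.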

An effective way to improve is to learn from others who have demonstrated the capability to achieve what we seek.  To learn from success.

Another effective way to improve is to learn from those who are not succeeding … to learn from failures … and that means … to learn from our own failings.

But from an early age we are socially programmed with a fear of failure.

The training starts at school where failure is not tolerated, nor is challenging the given dogma.  Paradoxically, the effect of our fear of failure is that our ability to inquire, experiment, learn, adapt, and to be resilient to change is severely impaired!

So further failure in the future becomes more likely, not less likely. Oops!


Fortunately, we can develop a healthier attitude to failure and we can learn how to harness the gap between intent and impact as a source of energy, creativity, innovation, experimentation, learning, improvement and growing success.

And health care provides us with ample opportunities to explore this unfamiliar terrain. The creative domain of the designer and engineer.


The scatter plot below is a snapshot of the A&E 4 hr target yield for all NHS Trusts in England for the month of July 2016.  The “constitutional” performance requirement is better than 95%.  The delivered whole-system average is 85%.  The majority of Trusts are failing, and the Trust-to-Trust variation is rather wide. Oops!

This stark picture of the gap between intent (95%) and impact (85%) prompts some uncomfortable questions:

Q1: How can one Trust achieve 98% and yet another can do no better than 64%?

Q2: What can all Trusts learn from these high and low flying outliers?

[NB. I have not asked the question “Who should we blame for the failures?” because the name-shame-blame-game is also a predictable consequence of our fear-of-failure mindset.]


Let us dig a bit deeper into the information mine, and as we do that we need to be aware of a trap:

A snapshot-in-time tells us very little about how the system – the set of interconnected parts – is behaving over time.

We need to examine the time-series charts of the outliers, just as we would ask for the temperature, blood pressure and heart rate charts of our patients.

Here are the last six years of month-by-month A&E 4 hr charts for a sample of the high-fliers. They are all slightly different, and we get the impression that the lower two are struggling to stay aloft more than the upper two … especially in winter.


And here are the last six years of month-by-month A&E 4 hr charts for a sample of the low-fliers.  The Mark I Eyeball Test results are clear … these swans are falling out of the sky!


So we need to generate some testable hypotheses to explain these visible differences, and then we need to examine the available evidence to test them.

One hypothesis is “rising demand”.  It says that “the reason our A&E is failing is because demand on A&E is rising“.

Another hypothesis is “slow flow”.  It says that “the reason our A&E is failing is because of the slow flow through the hospital because of delayed transfers of care (DTOCs)“.

So, if these hypotheses account for the behaviour we are observing then we would predict that the “high fliers” are (a) diverting A&E arrivals elsewhere, and (b) reducing admissions to free up beds to hold the DTOCs.

Let us look at the freely available data for the highest flyer … the green dot on the scatter gram … code-named “RC9”.

The top chart is the A&E arrivals per month.

The middle chart is the A&E 4 hr target yield per month.

The bottom chart is the emergency admissions per month.

Both arrivals and admissions are increasing, while the A&E 4 hr target yield is rock steady!

And arranging the charts this way allows us to see the temporal patterns more easily (and the images are deliberately arranged to show the overall pattern-over-time).

Patterns like the change-for-the-better that appears in the middle of the winter of 2013 (i.e. when many other trusts were complaining that their sagging A&E performance was caused by “winter pressures”).

The objective evidence seems to disprove the “rising demand”, “slow flow” and “winter pressure” hypotheses!

So what can we learn from our failure to adequately explain the reality we are seeing?


The trust code-named “RC9” is Luton and Dunstable, and it is an average district general hospital, on the surface.  So to reveal some clues about what actually happened there, we need to read their Annual Report for 2013-14.  It is a public document and it can be downloaded here.

This is just a snippet …

… and there are lots more knowledge nuggets like this in there …

… it is a treasure trove of well-known examples of good system flow design.

The results speak for themselves!


Q: How many black swans does it take to disprove the hypothesis that “all swans are white”?

A: Just one.

“RC9” is a black swan. An outlier. A positive deviant. “RC9” has disproved the “impossibility” hypothesis.

And there is another flock of black swans living in the North East … in the Newcastle area … so the “Big cities are different” hypothesis does not hold water either.


The challenge here is a human one.  A human factor.  Our learned fear of failure.

Learning-how-to-fail is the way to avoid failing-how-to-learn.

And to read more about that radical idea I strongly recommend reading the recently published book called Black Box Thinking by Matthew Syed.

It starts with a powerful story about the impact of human factors in health care … and here is a short video of Martin Bromiley describing what happened.

The “black box” that both Martin and Matthew refer to is the one that is used in air accident investigations to learn from what happened, and to use that learning to design safer aviation systems.

Martin Bromiley has founded a charity to support the promotion of human factors in clinical training, the Clinical Human Factors Group.

So if we can muster the courage and humility to learn how to do this in health care for patient safety, then we can also learn how to do it for flow, quality and productivity.

Our black swan called “RC9” has demonstrated that this goal is attainable.

And the body of knowledge needed to do this already exists … it is called Health and Social Care Systems Engineering (HSCSE).




Postscript: And I am pleased to share that Luton & Dunstable features in the House of Commons Health Committee report entitled Winter Pressures in A&E Departments that was published on 3rd Nov 2016.

Here is part of what L&D shared to explain their deviant performance:

[Image: the key points shared by L&D]

These points describe rather well the essential elements of a pull design, which is the antidote to the rather more prevalent pressure cooker design.

This 100 second video of the late Russell Ackoff is solid gold!

In it he describes the DIKUW hierarchy – data, information, knowledge, understanding and wisdom – and how it is critical to put effectiveness before efficiency.

A wise objective is a purpose … the intended outcome … and a well designed system will be both effective and efficient.  That is the engineer’s definition of productivity.  Doing the right thing first, and doing it right second.

So how do we transform data into wisdom? What needs to be added or taken away? What is the process?

Data is what we get from our senses.

To convert data into information we add context.

To convert information into knowledge we use memory.

To convert knowledge into understanding we need to learn-by-doing.

And the test of understanding is to be able to teach someone else what we know and to be able to support them developing an understanding through practice.

To convert understanding into wisdom requires years of experience of seeing, doing and teaching.

There are no short cuts.

So the sooner we start learning-by-doing the quicker we will develop the wisdom of purpose, and the understanding of process.



This is a magnified picture of a blood sucking bug called a Red Poultry Mite.

They go red after having gorged themselves on chicken blood.

Their life-cycle is only 7 days so, when conditions are just right, they can quickly cause an infestation – and one that is remarkably difficult to eradicate!  But if it is not dealt with then chicken coop productivity will plummet.


We use the term “bug” for something else … a design error … in a computer program for example.  If the conditions are just right, then software bugs can spread too and can infest a computer system.  They feed on the hardware resources – slurping up processor time and memory space until the whole system slows to a crawl.


And one especially pernicious type of system design error is called an Error of Omission.  These are the things we do not do that would prevent the bloodsucking bugs from breeding and spreading.

Prevention is better than cure.


In the world of health care improvement there are some blood suckers out there, ones who home in on a susceptible host looking for a safe place to establish a colony.  They are masters of the art of mimicry.  They look like and sound like something they are not … they claim to be symbiotic whereas in reality they are parasitic.

The clue to their true nature is that their impact does not match their intent … but by the time that gap is apparent they are entrenched and their spores have already spread.

Unlike the Red Poultry Mites, we do not want to eradicate them … we need to educate them. They only behave like parasites because they are missing a few essential bits of software.  And once those upgrades are installed they can achieve their potential and become symbiotic.

So, let me introduce them, they are called Len, Siggy and Tock and here is their story:

Six Ways Not To Improve Flow

The most useful tool that a busy operational manager can have is a reliable and responsive early warning system (EWS).

One that alerts when something is changing and that, if missed or ignored, will cause a big headache in the future.

Rather like the radar system on an aircraft that beeps if something else is approaching … like another aircraft or the ground!


Operational managers are responsible for delivering stuff on time.  So they need a radar that tells them if they are going to deliver-on-time … or not.

And their on-time-delivery EWS needs to alert them soon enough that they have time to diagnose the ‘threat’, design effective plans to avoid it, decide which plan to use, and deliver it.

So what might an effective EWS for a busy operational manager look like?

  1. It needs to be reliable. No missed threats or false alarms.
  2. It needs to be visible. No tomes of text and tables of numbers.
  3. It needs to be simple. Easy to learn and quick to use.

And what is on offer at the moment?

The RAG Chart
This is a table that is coloured red, amber and green. Red means ‘failing’, green means ‘not failing’ and amber means ‘not sure’.  So this meets the specification of visible and simple, but is it reliable?

It appears not.  RAG charts do not appear to have helped to solve the problem.

A RAG chart is generated using historic data … so it tells us where we are now, not how we got here, where we are going or what else is heading our way.  It is a snapshot. One frame from the movie.  Better than complete blindness perhaps, but not much.

The SPC Chart
This is a statistical process control chart and is a more complicated beast.  It is a chart of how some measure of performance has changed over time in the past.  So, like the RAG chart, it is generated using historic data.  The advantage is that it is not just a snapshot of where we are now, it is a picture of the story of how we got to where we are, so it offers the promise of pointing to where we may be heading.  It meets the specification of visible, and while more complicated than a RAG chart, it is relatively easy to learn and quick to use.

Here is an example. It is the SPC chart of the monthly A&E 4-hour target yield performance of an acute NHS Trust.  The blue lines are the ‘required’ range (95% to 100%), the green line is the average and the red lines are a measure of variation over time.  What this chart says is: “This hospital’s A&E 4-hour target yield performance is currently acceptable, has been so since April 2012, and is improving over time.”

So that is much more helpful than a RAG chart (which in this case would have been green every month because the average was above the minimum acceptable level).
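As an aside, the green and red lines on a chart like this are not arbitrary; they are calculated from the data itself.  Here is a minimal sketch of how the limits of an individuals (XmR) chart, one common form of SPC chart, are constructed; the monthly percentages in it are invented for illustration.

```python
# Minimal sketch of constructing XmR (individuals) chart limits:
# centre line = mean of the values, limits = mean +/- 2.66 * average moving range.
# The monthly percentages below are invented, for illustration only.
values = [95.2, 96.1, 94.8, 95.9, 96.4, 95.1, 96.8, 95.5, 96.0, 95.7, 96.3, 95.4]

mean = sum(values) / len(values)
moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
avg_mr = sum(moving_ranges) / len(moving_ranges)

upper_limit = mean + 2.66 * avg_mr
lower_limit = mean - 2.66 * avg_mr

print(f"Centre line: {mean:.1f}%")
print(f"Natural process limits: {lower_limit:.1f}% to {upper_limit:.1f}%")
```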


So why haven’t SPC charts replaced RAG charts in every NHS Trust Board Report?

Could there be a fly-in-the-ointment?

The answer is “Yes” … there is.

SPC charts are a quality audit tool.  They were designed nearly 100 years ago for monitoring the output quality of a process that is already delivering to specification (like the one above).  They are designed to alert the operator to early signals of deterioration, called ‘assignable cause signals’, and they prompt the operator to pay closer attention and to investigate plausible causes.

SPC charts are not designed for predicting if there is a flow problem looming over the horizon.  They are not designed for flow metrics that exhibit expected cyclical patterns.  They are not designed for monitoring metrics that have very skewed distributions (such as length of stay).  They are not designed for metrics where small shifts generate big cumulative effects.  They are not designed for metrics that change more slowly than the frequency of measurement.

And these are exactly the sorts of metrics that a busy operational manager needs to monitor, in reality, and in real-time.

Demand and activity both show strong cyclical patterns.

Lead-times (e.g. length of stay) are often very skewed by variation in case-mix and task-priority.

Waiting lists are like bank accounts … they show the cumulative sum of the difference between inflow and outflow.  That simple fact invalidates the use of the SPC chart.

Small shifts in demand, activity, income and expenditure can lead to big cumulative effects.
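To see why a cumulative-sum metric such as a waiting list misbehaves on a conventional SPC chart, here is a minimal sketch that compares a stable, random weekly inflow-minus-outflow difference with the waiting list it generates.  The numbers are invented for illustration.

```python
# Minimal sketch: a waiting list is the cumulative sum of (inflow - outflow).
# Even when the weekly difference is stable random noise around zero,
# the cumulative sum wanders like a random walk, so fixed SPC-style limits
# calculated from it are not meaningful.  Numbers are invented for illustration.
import random

rng = random.Random(1)
weeks = 104
weekly_difference = [rng.gauss(0, 20) for _ in range(weeks)]   # stable, zero-mean noise

waiting_list = []
total = 1000                      # assumed starting waiting list size
for d in weekly_difference:
    total += d
    waiting_list.append(total)

print(f"Weekly difference: mean {sum(weekly_difference) / weeks:.1f}, "
      f"range {min(weekly_difference):.0f} to {max(weekly_difference):.0f}")
print(f"Waiting list drifted from {waiting_list[0]:.0f} to {waiting_list[-1]:.0f}, "
      f"range {min(waiting_list):.0f} to {max(waiting_list):.0f}")
```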

So if we abandon our RAG charts and we replace them with SPC charts … then we climb out of the RAG frying pan and fall into the SPC fire.

Oops!  No wonder the operational managers and financial controllers have not embraced SPC.


So is there an alternative that works better?  A more reliable EWS that busy operational managers and financial controllers can use?

Yes, there is, and here is a clue …

… but tread carefully …

… building one of these Flow-Productivity Early Warning Systems is not as obvious as it might first appear.  There are counter-intuitive traps for the unwary and the untrained.

You may need the assistance of a health care systems engineer (HCSE).

CapstanA capstan is a simple machine for combining the effort of many people and enabling them to achieve more than any of them could do alone.

The word appears to have come into English from the Portuguese and Spanish sailors at around the time of the Crusades.

Each sailor works independently of the others. There is no requirement for them to be equally strong because the capstan will combine their efforts.  And the capstan also serves as a feedback loop because everyone can sense when someone else pushes harder or slackens off.  It is an example of simple, efficient, effective, elegant design.


In the world of improvement we also need simple, efficient, effective and elegant ways to combine the efforts of many in achieving a common purpose.  Such as raising the standards of excellence and weighing the anchors of resistance.

In health care improvement we have many simultaneous constraints and we have many stakeholders with specific perspectives and special expertise.

And if we are not careful they will tend to pull only in their preferred direction … like a multi-way tug-o-war.  The result?  No progress and exhausted protagonists.

There are those focused on improving productivity – Team Finance.

There are those focused on improving delivery – Team Operations.

There are those focused on improving safety – Team Governance.

And we are all tasked with improving quality – Team Everyone.

So we need a synergy machine that works like a capstan-of-old, and here is one design.

It has four poles and it always turns in a clockwise direction, so the direction of push is clear.

And when all the protagonists push in the same direction, they will get their own ‘win’ and also assist the others to make progress.

This is how the sails of success are hoisted to catch the wind of change; and how the anchors of anxiety are heaved free of the rocks of fear; and how the bureaucratic bilge is pumped overboard to lighten our load and improve our speed and agility.

And the more hands on the capstan the quicker we will achieve our common goal.

Collective excellence.

Last week I shared a link to Dr Don Berwick’s thought-provoking presentation at the Healthcare Safety Congress in Sweden.

Near the end of the talk Don recommended six books, and I was reassured that I already had read three of them. Naturally, I was curious to read the other three.

One of the unfamiliar books was “Overcoming Organizational Defenses” by the late Chris Argyris, a professor at Harvard.  I confess that I have tried to read some of his books before, but found them rather difficult to understand.  So I was intrigued that Don was recommending it as an ‘easy read’.  Maybe I am more of a dimwit than I previously believed!  So fear of failure took over my inner-chimp and I prevaricated. I flipped into denial. Who would willingly want to discover the true depth of their dimwittedness!


Later in the week, I was forwarded a copy of a recently published paper that was on a topic closely related to a key thread in Dr Don’s presentation:

understanding variation.

The paper was by researchers who had looked at the Board reports of 30 randomly selected NHS Trusts to examine how information on safety and quality was being shared and used.  They were looking for evidence that the Trust Boards understood the importance of variation and the need to separate ‘signal’ from ‘noise’ before making decisions on actions to improve safety and quality performance.  This was a point Don had stressed too, so there was a link.

The randomly selected Trust Board reports contained 1488 charts, of which only 88 demonstrated the contribution of chance effects (i.e. noise). Of these, 72 showed the Shewhart-style control charts that Don demonstrated. And of these, only 8 stated how the control limits were constructed (which is an essential requirement for the chart to be meaningful and useful).

That is a validity yield of 8 out of 1488, or 0.54%, which is for all practical purposes zero. Oh dear!


This chance combination of apparently independent events got me thinking.

Q1: What is the reason that NHS Trust Boards do not use these signal-and-noise separation techniques when it has been demonstrated, for at least 12 years to my knowledge, that they are very effective for facilitating improvement in healthcare? (e.g. Improving Healthcare with Control Charts by Raymond G. Carey was published in 2003).

Q2: Is there some form of “organizational defense” system in place that prevents NHS Trust Boards from learning useful ‘new’ knowledge?


So I surfed the Web to learn more about Chris Argyris and to explore in greater depth his concept of Single Loop and Double Loop learning.  I was feeling like a dimwit again because, to me, it is not a very descriptive title!  I suspect it is not to many others either.

I sensed that I needed to translate the concept into the language of healthcare and this is what emerged.

Single Loop learning is like treating the symptoms and ignoring the disease.

Double Loop learning is diagnosing the underlying disease and treating that.


So what are the symptoms?
The pain of NHS Trust  failure on all dimensions – safety, delivery, quality and productivity (i.e. affordability for a not-for-profit enterprise).

And what are the signs?
The tell-tale sign is more subtle. It’s what is not present that is important. A serious omission. The missing bits are valid time-series charts in the Trust Board reports that show clearly what is signal and what is noise. This diagnosis is critical because the strategies for addressing them are quite different – as Julian Simcox eloquently describes in his latest essay.  If we get this wrong and we act on our unwise decision, then we stand a very high chance of making the problem worse, and demoralizing ourselves and our whole workforce in the process! Does that sound familiar?

And what is the disease?
Undiscussables.  Emotive subjects that are too taboo to table in the Board Room.  And the issue of what is discussable is one of the undiscussables so we have a self-sustaining system.  Anyone who attempts to discuss an undiscussable is breaking an unspoken social code.  Another undiscussable is behaviour, and our social code is that we must not upset anyone so we cannot discuss ‘difficult’ issues.  But by avoiding the issue (the undiscussable disease) we fail to address the root cause and end up upsetting everyone.  We achieve exactly what we are striving to avoid, which is the technical definition of incompetence.  And Chris Argyris labelled this as ‘skilled incompetence’.


Does an apparent lack of awareness of what is already possible fully explain why NHS Trust Boards do not use the tried-and-tested tool called a system behaviour chart to help them diagnose, design and deliver effective improvements in safety, flow, quality and productivity?

Or are there other forces at play as well?

Some deeper undiscussables perhaps?

The Harvard Business Review is worth reading because many of its articles challenge deeply held assumptions, and then back up the challenge with the pragmatic experience of those who have succeeded to overcome the limiting beliefs.

So the heading on the April 2016 copy that awaited me on my return from an Easter break caught my eye: YOU CAN’T FIX CULTURE.


 


The successful leaders of major corporate transformations are agreed … the cultural change follows the technical change … and then the emergent culture sustains the improvement.

The examples presented include the Ford Motor Company, Delta Airlines, Novartis – so these are not corporate small fry!

The evidence suggests that the belief of “we cannot improve until the culture changes” is the mantra of failure of both leadership and management.


A health care system is characterised by a culture of risk avoidance. And for good reason. It is all too easy to harm while trying to heal!  Primum non nocere is a core tenet – first do no harm.

But, change and improvement implies taking risks – and those leaders of successful transformation know that the bigger risk by far is to become paralysed by fear and to do nothing.  Continual learning from many small successes and many small failures is preferable to crisis learning after a catastrophic failure!

The UK healthcare system is in a state of chronic chaos.  The evidence is there for anyone willing to look.  And waiting for the NHS culture to change, or pushing for culture change first appears to be a guaranteed recipe for further failure.

The HBR article suggests that it is better to stay focussed; to work within our circles of control and influence; to learn from others where knowledge is known, and where it is not – to use small, controlled experiments to explore new ground.


And I know this works because I have done it and I have seen it work.  Just by focussing on what is important to every member on the team; focussing on fixing what we could fix; not expecting or waiting for outside help; gathering and sharing the feedback from patients on a continuous basis; and maintaining patient and team safety while learning and experimenting … we have created a micro-culture of high safety, high efficiency, high trust and high productivity.  And we have shared the evidence via JOIS.

The micro-culture required to maintain the safety, flow, quality and productivity improvements emerged and evolved along with the improvements.

It was part of the effect, not the cause.


So the concept of ‘fix the system design flaws and the continual improvement culture will emerge’ seems to work at macro-system and at micro-system levels.

We just need to learn how to diagnose and treat healthcare system design flaws. And that is known knowledge.

So what is the next excuse?  Too busy?

The word pearl is a metaphor for something rare, beautiful, and valuable.

Pearls are formed inside the shell of certain mollusks as a defense mechanism against a potentially threatening irritant.

The mollusk creates a pearl sac to seal off the irritation.


And so it is with change and improvement.  The growth of precious pearls of improvement wisdom – the ones that develop slowly over time – is triggered by an irritant.

Someone asking an uncomfortable question perhaps, or presenting some information that implies that an uncomfortable question needs to be asked.


About seven years ago a question was asked “Would improving healthcare flow and quality result in lower costs?”

It is a good question because some believe that it would and some believe that it would not.  So an experiment to test the hypothesis was needed.

The Health Foundation stepped up to the challenge and funded a three year project to find the answer. The design of the experiment was simple. Take two oysters and introduce an irritant into them and see if pearls of wisdom appeared.

The two ‘oysters’ were Sheffield Hospital and Warwick Hospital and the irritant was Dr Kate Silvester who is a doctor and manufacturing system engineer and who has a bit-of-a-reputation for asking uncomfortable questions and backing them up with irrefutable information.


Two rare and precious pearls did indeed grow.

In Sheffield, it was proved that by improving the design of their elderly care process they improved the outcome for their frail, elderly patients.  More went back to their own homes and fewer left via the mortuary.  That was the quality and safety improvement. They also showed a shorter length of stay and a reduction in the number of beds needed to store the work in progress.  That was the flow and productivity improvement.

What was interesting to observe was how difficult it was to get these profoundly important findings published.  It appeared that a further irritant had been created for the academic peer review oyster!

The case study was eventually published in Age and Ageing 2014; 43: 472-77.

The pearl that grew around this seed is the Sheffield Microsystems Academy.


In Warwick, it was proved that the A&E 4 hour performance could be improved by focussing on improving the design of the processes within the hospital, downstream of A&E.  For example, a redesign of the phlebotomy and laboratory process to ensure that clinical decisions on a ward round are based on today’s blood results.

This specific case study was eventually published as well, but by a different path – one specifically designed for sharing improvement case studies – JOIS 2015; 22:1-30

And the pearls of wisdom that developed as a result of irritating many oysters in the Warwick bed are clearly described by Glen Burley, CEO of Warwick Hospital NHS Trust in this recent video.


Getting the results of all these oyster bed experiments published required irritating the Health Foundation oyster … but a pearl grew there too and emerged as the full Health Foundation report which can be downloaded here.


So if you want to grow a fistful of improvement and a bagful of pearls of wisdom … then you will need to introduce a bit of irritation … and Dr Kate Silvester is a proven source of grit for your oyster!

This week I conducted an experiment – on myself.

I set myself the challenge of measuring the cost of chaos, and it was tougher than I anticipated it would be.

It is easy enough to grasp the concept that fire-fighting to maintain patient safety amidst the chaos of healthcare would cost more in terms of tears and time …

… but it is tricky to translate that concept into hard numbers; i.e. cash.


Chaos is an emergent property of a system.  Safety, delivery, quality and cost are also emergent properties of a system. We can measure cost, our finance departments are very good at that. We can measure quality – we just ask “How did your experience match your expectation”.  We can measure delivery – we have created a whole industry of access target monitoring.  And we can measure safety by checking for things we do not want – near misses and never events.

But while we can feel the chaos we do not have an easy way to measure it. And it is hard to improve something that we cannot measure.


So the experiment was to see if I could create some chaos, then if I could calm it, and then if I could measure the cost of the two designs – the chaotic one and the calm one.  The difference, I reasoned, would be the cost of the chaos.

And to do that I needed a typical chunk of a healthcare system: like an A&E department where the relationship between safety, flow, quality and productivity is rather important (and has been a hot topic for a long time).

But I could not experiment on a real A&E department … so I experimented on a simplified but realistic model of one. A simulation.

What I discovered came as a BIG surprise, or more accurately a sequence of big surprises!

  1. First I discovered that it is rather easy to create a design that generates chaos and danger.  All I needed to do was to assume I understood how the system worked and then use some averaged historical data to configure my model.  I could do this on paper or I could use a spreadsheet to do the sums for me.
  2. Then I discovered that I could calm the chaos by reactively adding lots of extra capacity in terms of time (i.e. more staff) and space (i.e. more cubicles).  The downside of this approach was that my costs sky-rocketed; but at least I had restored safety and calm and I had eliminated the fire-fighting.  Everyone was happy … except the people expected to foot the bill. The finance director, the commissioners, the government and the tax-payer.
  3. Then I got a really big surprise!  My safe-but-expensive design was horribly inefficient.  All my expensive resources were now running at rather low utilisation.  Was that the cost of the chaos I was seeing? But when I trimmed the capacity and costs the chaos and danger reappeared.  So was I stuck between a rock and a hard place?
  4. Then I got a really, really big surprise!!  I hypothesised that the root cause might be the fact that the parts of my system were designed to work independently, and I was curious to see what happened when they worked interdependently. In synergy. And when I changed my design to work that way the chaos and danger did not reappear and the efficiency improved. A lot.
  5. And the biggest surprise of all was how difficult this was to do in my head; and how easy it was to do when I used the theory, techniques and tools of Improvement-by-Design.
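
And for the curious, here is a minimal sketch of the capacity arithmetic behind the first two surprises. It is not the simulation I used in the experiment, and the arrival rate, treatment time and cubicle numbers are invented; it simply shows that a design which balances average demand against average capacity looks fine on paper and then generates queues in practice, and that buying your way out with extra capacity works but is expensive.

```python
import random

def simulate_ed(hours=1000, arrivals_per_hour=10.0, mean_treatment_hours=0.5, cubicles=5, seed=1):
    """Crude single-queue A&E sketch: returns (average wait, maximum wait) in hours."""
    random.seed(seed)
    free_at = [0.0] * cubicles                              # when each cubicle next becomes free
    t, waits = 0.0, []
    for _ in range(int(hours * arrivals_per_hour)):
        t += random.expovariate(arrivals_per_hour)          # next patient arrives
        c = min(range(cubicles), key=lambda i: free_at[i])  # first cubicle to become free
        start = max(t, free_at[c])                          # wait if it is still occupied
        waits.append(start - t)
        free_at[c] = start + random.expovariate(1.0 / mean_treatment_hours)
    return sum(waits) / len(waits), max(waits)

# On paper: 10 patients/hr x 0.5 hr each = 5 cubicle-hours of work per hour, so 5 cubicles "should" cope.
print(simulate_ed(cubicles=5))   # in practice the queue grows and grows ... chaos
print(simulate_ed(cubicles=7))   # a generous capacity buffer absorbs the variation ... calm but costly
```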

So if you are curious to learn more … I have written up the full account of the experiment with rationale, methods, results, conclusions and references and I have published it here.

Evolution is an amazing process.

Using the same building blocks that have been around for a long time, it cooks up innovative permutations and combinations that reveal new and ever more useful properties.

Very often a breakthrough in understanding comes from a simplification, not from making it more complicated.

Knowledge evolves in just the same way.

Sometimes a well understood simplification in one branch of science is used to solve an ‘impossible’ problem in another.

Cross-fertilisation of learning is a healthy part of the evolution process.


Improvement implies evolution of knowledge and understanding, and then application of that insight in the process of designing innovative ways of doing things better.


And so it is in healthcare.  For many years the emphasis on healthcare improvement has been the Safety-and-Quality dimension, and for very good reasons.  We need to avoid harm and we want to achieve happiness; for everyone.

But many of the issues that plague healthcare systems are not primarily SQ issues … they are flow and productivity issues. FP. The safety and quality problems are secondary – so only focussing on them is treating the symptoms and not the cause.  We need to balance the wheel … we need flow science.


Fortunately the science of flow is well understood … outside healthcare … but apparently not so well understood inside healthcare … given the queues, delays and chaos that seem to have become the expected norm.  So there is a big opportunity for cross fertilisation here.  If we choose to make it happen.


For example, from computer science we can borrow the knowledge of how to schedule tasks to make best use of our finite resources and at the same time avoid excessive waiting.

It is a very well understood science. There is comprehensive theory, a host of techniques, and fit-for-purpose tools that we can pick off the shelf and use. Today, if we choose to.
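
As a tiny, hedged illustration of that borrowed knowledge, here is one classic scheduling result – that shortest-job-first minimises the average wait on a single shared resource – sketched with made-up task times:

```python
def average_wait(durations):
    """Average waiting time when tasks run back-to-back in the given order."""
    waits, elapsed = [], 0.0
    for d in durations:
        waits.append(elapsed)   # each task waits for everything scheduled before it
        elapsed += d
    return sum(waits) / len(waits)

tasks = [60, 10, 30, 5, 20]              # task durations in minutes, in arrival order
print(average_wait(tasks))               # first-come-first-served: 67 minutes average wait
print(average_wait(sorted(tasks)))       # shortest-job-first: 24 minutes average wait
```

Same tasks, same resource, same total work – the only change is the order, and the average wait falls by nearly two-thirds.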

So what are the reasons we do not?

Is it because healthcare is quite introspective?

Is it because we believe that there is something ‘special’ about healthcare?

Is it because there is no evidence … no hard proof … no controlled trials?

Is it because we assume that queues are always caused by lack of resources?

Is it because we do not like change?

Is it because we do not like to admit that we do not know stuff?

Is it because we fear loss of face?


Whatever the reasons, the evidence and experience show that most (if not all) of the queues, delays and chaos in healthcare systems are iatrogenic.

This means that they are self-generated. And that implies we can un-self-generate them … at little or no cost … if only we knew how.

The only cost is to our egos: accepting that there is knowledge out there that we could use to move us in the direction of excellence.

New meat for our old bones?

 

[Figure: Research – Audit – Improvement Science diagram]

A question that is often asked by doctors in particular is “What is the difference between Research, Audit and Improvement Science?”.

It is a very good question and the diagram captures the essence of the answer.

Improvement science is like a bridge between research and audit.

To understand why that is we first need to ask a different question “What are the purposes of research, improvement science and audit? What do they do?”

In a nutshell:

Research provides us with new knowledge and tells us what the right stuff is.
Improvement Science provides us with a way to design our system to do the right stuff.
Audit provides us with feedback and tells us if we are doing the right stuff right.


Research requires a suggestion and an experiment to test it.   A suggestion might be “Drug X is better than drug Y at treating disease Z”, and the experiment might be a randomised controlled trial (RCT).  The way this is done is that subjects with disease Z are randomly allocated to two groups, the control group and the study group.  A measure of ‘better’ is devised and used in both groups. Then the study group is given drug X and the control group is given drug Y and the outcomes are compared.  The randomisation is needed because there are always many sources of variation that we cannot control, and it also almost guarantees that there will be some difference between our two groups. So then we have to use sophisticated statistical data analysis to answer the question “Is there a statistically significant difference between the two groups? Is drug X actually better than drug Y?”
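
As a hedged sketch of that final comparison step – the numbers below are entirely made up, and a real trial would also need a proper sample-size calculation, a randomisation procedure and confidence intervals – the sums might look something like this:

```python
from statistics import mean, stdev
from math import sqrt

study   = [4.1, 3.8, 5.0, 4.6, 4.9, 4.4, 5.2, 4.7]   # made-up outcome scores on drug X
control = [3.9, 3.5, 4.2, 4.0, 3.7, 4.3, 3.6, 4.1]   # made-up outcome scores on drug Y

def welch_t(a, b):
    """Welch's t statistic: how big is the difference in means relative to the noise?"""
    va, vb = stdev(a) ** 2 / len(a), stdev(b) ** 2 / len(b)
    return (mean(a) - mean(b)) / sqrt(va + vb)

print(welch_t(study, control))   # a large value suggests a real difference rather than chance
```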

And research is often a complicated and expensive process because to do it well requires careful study design, a lot of discipline, and usually large study and control groups. It is an effective way to help us to know what the right stuff is but only in a generic sense.


Audit requires a standard to compare against, so that we know whether what we are doing is acceptable or not. There is no randomisation between groups but we still need a metric and we still need to measure what is happening in our local reality.  We then compare our local experience with the global standard and, because variation is inevitable, we have to use statistical tools to help us perform that comparison.

And very often audit focuses on avoiding failure; in other words the standard is a ‘minimum acceptable standard‘ and as long as we are not failing it then that is regarded as OK. If we are shown to be failing then we are in trouble!

And very often the most sophisticated statistical tool used for audit is called an average.  We measure our performance, we average it over a period of time (to remove the troublesome variation), and we compare our measured average with the minimum standard. And if it is below then we are in trouble and if it is above then we are not.  We have no idea how reliable that conclusion is though because we discounted any variation.
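
A made-up example shows why discounting the variation matters:

```python
# Invented daily figures: an average compared with a minimum standard can say "we are OK"
# while hiding the variation that patients actually experience.
daily_performance = [95, 96, 97, 94, 60, 55, 96, 95, 97, 94, 58, 52]   # % treated in time, by day
minimum_standard = 80

average = sum(daily_performance) / len(daily_performance)
print(f"average over the period = {average:.1f}%")      # 82.4% ... above standard, "no action needed"
print(f"worst day               = {min(daily_performance)}%")   # 52% ... chaos on one day in three
```

The averaged report passes the audit while one day in three was chaotic – and it is the chaotic days that the patients remember.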


A perfect example of this target-driven audit approach is the A&E 95% 4-hour performance target.

The 4-hours defines the metric we are using; the time interval between a patient arriving in A&E and them leaving. It is called a lead time metric. And it is easy to measure.

The 95% defines the minimum acceptable proportion of people who are in A&E for less than 4-hours, and it is usually aggregated over three months. And it is easy to measure.

So, if about 200 people arrive in a hospital A&E each day and we aggregate for 90 days, that is about 18,000 people in total; so the 95% 4-hour A&E target implies that we accept it as OK for about 900 of them to be there for more than 4-hours.

Do the 900 agree? Do the other 17,100?  Has anyone actually asked the patients what they would like?


The problem with this “avoiding failure” mindset is that it can never lead to excellence. It can only deliver just above the minimum acceptable. That is called mediocrity.  It is perfectly possible for a hospital to deliver 100% on its A&E 4 hour target by designing its process to ensure every one of the 18,000 patients is there for exactly 3 hours and 59 minutes. It is called a time-trap design.

We can hit the target and miss the point.

And what is more the “4-hours” and the “95%” are completely arbitrary numbers … there is not a shred of research evidence to support them.

So just this one example illustrates the many problems created by having a gap between research and audit.


And that is why we need Improvement Science to help us to link them together.

We need improvement science to translate the global knowledge and apply it to deliver local improvement in whatever metrics we feel are most important. Safety metrics, flow metrics, quality metrics and productivity metrics. Simultaneously. To achieve system-wide excellence. For everyone, everywhere.

When we learn Improvement Science we learn to measure how well we are doing … we learn the power of measurement of success … and we learn to avoid averaging because we want to see the variation. And we still need a minimum acceptable standard because we want to exceed it 100% of the time. And we want continuous feedback on just how far above the minimum acceptable standard we are. We want to see how excellent we are, and we want to share that evidence and our confidence with our patients.

We want to agree a realistic expectation rather than paint a picture of the worst case scenario.

And when we learn Improvement Science we will see very clearly where to focus our improvement efforts.


Improvement Science is the bit in the middle.


Stop Press:  There is currently an offer of free on-line foundation training in improvement science for up to 1000 doctors-in-training … here  … and do not dally because places are being snapped up fast!

[Figure: the Nerve Curve – emotion versus time]

The emotional journey of change feels like a roller-coaster ride, and if we draw it as an emotion versus time chart it looks like the diagram above.

The toughest part is getting past the low point called the Well of Despair and doing that requires a combination of inner strength and external support.

The external support comes from an experienced practitioner who has been through it … and survived … and has the benefit of experience and hindsight.

The Improvement Science coach.


What happens as we  apply the IS principles, techniques and tools that we have diligently practiced and rehearsed? We discover that … they work!  And all the fence-sitters and the skeptics see it too.

We start to turn the corner and what we feel next is that the back pressure of resistance falls a bit. It does not go away, it just gets less.

And that means that the next test of change is a bit easier and we start to add more evidence that the science of improvement does indeed work and moreover it is a skill we can learn, demonstrate and teach.

We have now turned the corner of disbelief and have started the long, slow, tough climb through mediocrity to excellence.


This is also a time of risks and there are several to be aware of:

  1. The objective evidence that dramatic improvements in safety, flow, quality and productivity are indeed possible and that the skills can be learned will trigger those most threatened by the change to fight harder to defend their disproved rhetoric. And do not underestimate how angry and nasty they can get!
  2. We can too easily become complacent and believe that the rest will follow easily. It doesn’t.  We may have nailed some of the easier niggles to be sure … but there are much more challenging ones ahead.  The climb to excellence is a steep learning curve … all the way. But the rewards get bigger and bigger as we progress so it is worth it.
  3. We risk over-estimating our capability and then attempting to take on the tougher improvement assignments without the necessary training, practice, rehearsal and support. If we do that we will crash and burn.  It is like a game of snakes and ladders.  Our IS coach is there to help us up the ladders and to point out where the slippery snakes are lurking.

So before embarking on this journey be sure to find a competent IS coach.

They are easy to identify because they will have a portfolio of case studies that they have done themselves. They have the evidence of successful outcomes and they can walk-the-talk.

And avoid anyone who talks-the-walk but does not have a portfolio of evidence of their own competence. Their Siren song will lure you towards the submerged Rocks of Disappointment and they will disappear like morning mist when you need them most – when it comes to the toughest part – turning the corner. You will be abandoned and fall into the Well of Despair.

So ask your IS coach for credentials, case studies and testimonials and check them out.

Dr Bob runs a Clinic for Sick Systems and is sharing the Case of St Elsewhere’s® Hospital which is suffering from chronic pain in their A&E department.

The story so far: The history and examination of St.Elsewhere’s® Emergency Flow System have revealed that the underlying disease includes carveoutosis multiforme.  StE has consented to a knowledge transplant but is suffering symptoms of disbelief – the emotional rejection of the new reality. Dr Bob prescribed some loosening up exercises using the Carveoutosis Game.  This is the appointment to review the progress.


<Dr Bob> Hello again. I hope you have done the exercises as we agreed.

<StE> Indeed we have.  Many times in fact because at first we could not believe what we were seeing. We even modified the game to explore the ramifications.  And we have an apology to make. We discounted what you said last week but you were absolutely correct.

<Dr Bob> I am delighted to hear that you have explored further and I applaud you for the curiosity and courage in doing that.  There is no need to apologize. If this flow science was intuitively obvious then we would not be having this conversation. So, how have you used the new understanding?

<StE> Before we tell the story of what happened next we are curious to know where you learned about this?

<Dr Bob> The pathogenesis of carveoutosis spatialis has been known for about 100 years, but in a different context.  The story goes back to the 1870s when Alexander Graham Bell invented the telephone.  He was not an engineer or mathematician by background; he was interested in phonetics, and he was a pragmatist who experimented by making things. From his invention the Bell Telephone Co. was born.  This innovation spread like wildfire, as you can imagine, and by the early 1900s there were many telephone companies all over the world.  At that time the connections were made manually by telephone operators using patch boards, and the growing demand created a new problem: how many lines and operators were needed to provide a high quality service to bill-paying customers? In other words … to achieve an acceptably low chance of hearing the reply “I’m sorry but all lines are busy, please try again later”.  Adding new lines and more operators was a slow and expensive business, so they needed a way to predict how many would be needed – and how to do that was not obvious!  In 1917 a Danish mathematician, statistician and engineer called Agner Krarup Erlang published a paper with the solution: a formula that described the relationship and allowed telephone exchanges to be designed, built and staffed to provide a high quality service at an acceptably low cost.  Mass real-time voice communication by telephone became affordable and has transformed the world.

<StE> Fascinating! We sort of sense there is a link here and certainly the “high quality and low cost” message resonates for us. But how does designing telephone exchanges relate to hospital beds?

<Dr Bob> If we equate an emergency admission needing a bed to a customer making a phone call, and we equate the number of telephone lines to the number of beds, then the two systems are the same from the flow physics perspective. Erlang’s scary-looking equation can be used to estimate the number of beds needed to achieve any specified level of admission service quality if you know the average rate of demand and the average length of stay.  That is how I made the estimate last week. It is this predictable-within-limits behaviour that you demonstrated to yourself with the Carveoutosis Game.

<StE> And this has been known for nearly 100 years but we have only just learned about it!

<Dr Bob> Yes. That is a bit annoying isn’t it?

<StE> And that explains why when we ‘ring-fence’ our fixed stock of beds the 4-hour performance falls!

<Dr Bob> Yes, that is a valid assertion. By doing that you are reducing your space-capacity resilience and the resulting danger, chaos, disappointment and escalating cost is completely predictable.

<StE> So our pain is iatrogenic as you said! We have unwittingly caused this. That is uncomfortable news to hear.

<Dr Bob> The root cause is actually not what you have done wrong, it is what you have not done right. Yours is an error of omission. You have not learned to listen to what your system is telling you. You have not learned how that can help you to deepen your understanding of how your system works. It is that wisdom that you need to design a safer, calmer, higher quality and more affordable healthcare system.

<StE> And now we can see our omission … before it was like a blind spot … and now we can see the fallacy of our previously deeply held belief: that it was impossible to solve this without more beds, more staff and more money.  The gap is now obvious where before it was invisible. It is like a light has been turned on.  Now we know what to do and we are on the road to recovery. We need to learn how to do this ourselves … but not by meddling … we need to learn to diagnose and then to design and then to deliver safety, flow, quality and productivity.  All at the same time.

<Dr Bob> Welcome to the exciting world of Improvement Science. And here I must sound a note of caution … there is a lot more to it than just blindly applying Erlang’s scary equation. That will get us into the right ball-park, which is a big leap forward, but real systems are not just simple, passive games of chance; they are complicated, active and adaptive.  Applying the principles of flow design in that context requires more than just mathematics, statistics and computer models.  But that know-how is available and accessible too … and waiting for when you are ready to take that leap of learning.

So I do not think you require any more help from me at this stage or follow up appointments. You have what you need and I wish you well.  And please let me know the outcome.

<StE> Thank you and rest assured we will. We have already started writing our story … and we wanted to share that with you today … but with this new insight we will need to write a few more chapters first.  This is really exciting … thank you so much.
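
For readers who want to see the ‘scary equation’ in action, here is a minimal sketch – with assumed figures, not St.Elsewhere’s® actual data, and using the Erlang B ‘loss’ form of the formula – of the kind of bed estimate Dr Bob describes: treat an emergency admission as a telephone call, a bed as a line, and ask how often a new arrival would find every bed occupied.

```python
def erlang_b(offered_load, servers):
    """Erlang B: probability that an arrival finds all servers (beds) occupied."""
    b = 1.0
    for n in range(1, servers + 1):
        b = (offered_load * b) / (n + offered_load * b)
    return b

admissions_per_day = 30                               # assumed average demand
mean_stay_days = 5.0                                  # assumed average length of stay
offered_load = admissions_per_day * mean_stay_days    # average number of occupied beds (150)

for beds in range(150, 201, 10):
    print(beds, "beds ->", f"{erlang_b(offered_load, beds):.3f}", "chance of no free bed")
```

And, as Dr Bob cautions, this only gets us into the right ball-park; real systems are complicated, active and adaptive, so the formula is a starting point and not a design.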


St.Elsewhere’s® is a registered trademark of Kate Silvester Ltd,  and to read more real cases of 4-hour A&E pain download Kate’s: The Christmas Crisis



There is a big bun-fight kicking off on the topic of 7-day working in the NHS.

The evidence is that there is a statistical association between mortality in hospital of emergency admissions and day of the week: and weekends are more dangerous.

There are fewer staff working at weekends in hospitals than during the week … and delays and avoidable errors increase … so risk of harm increases.

The evidence also shows that significantly fewer patients are discharged at weekends.


So the ‘obvious’ solution is to have more staff on duty at weekends … which will cost more money.


Simple, obvious, linear and wrong.  Our intuition has tricked us … again!


Let us unravel this Gordian Knot with a bit of flow science and a thought experiment.

1. The evidence shows that there are fewer discharges at weekends … and so demonstrates lack of discharge flow-capacity. A discharge process is not a single step, there are many things that must flow in sync for a discharge to happen … and if any one of them is missing or delayed then the discharge does not happen or is delayed.  The weakest link effect.

2. The evidence shows that the number of unplanned admissions varies rather less across the week; which makes sense because they are unplanned.

3. So add those two together and at weekends we see hospitals filling up with unplanned admissions – not because the sick ones are arriving faster – but because the well ones are leaving slower.

4. The effect of this is that at weekends the queue of people in beds gets bigger … and they need looking after … which requires people and time and money.

5. So the number of staffed beds in a hospital must be enough to hold the biggest queue – not the average or some fudged version of the average like a 95th percentile.

6. So a hospital running a 5-day model needs more beds because there will be more variation in bed use and we do not want to run out of beds and delay the admission of the newest and sickest patients. The ones at most risk.

7. People do not get sicker just because healthcare services are more available at weekends – but saying we need to add extra unplanned-care flow-capacity at weekends implies that they do.  What is actually required is that the same amount of flow-resource that is currently available Mon-Fri is spread out Mon-Sun. The flow-capacity is designed to match the customer demand – not the convenience of the supplier.  And that means for all parts of the system required for unplanned patients to flow.  What, where and when. It costs the same.

8. Then what happens is that the variation in the maximum size of the queue of patients in the hospital will fall and empty beds will appear – as if by magic.  Empty beds that ensure there is always one for a new, sick, unplanned admission on any day of the week.

9. And empty beds that are never used … do not need to be staffed … so there is a quick way to reduce expensive agency staff costs.

So with a comprehensive 7-day flow-capacity model the system actually gets safer, less chaotic, higher quality and less expensive. All at the same time. Safety-Flow-Quality-Productivity.
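
Here is a toy, day-by-day bed model with invented numbers that illustrates points 1 to 6.  The only difference between the two runs is whether a patient who becomes ready to leave on a Saturday or Sunday can actually go home that day, or must wait until Monday.

```python
import random

def beds_needed(weekend_discharges, days=364, admissions_per_day=28, mean_stay=4.0, seed=7):
    """Count occupied beds day by day and return the peak, i.e. the bed stock required."""
    random.seed(seed)
    departures = [0] * (days + 200)       # how many patients are due to leave on each future day
    occupied, peak = 0, 0
    for today in range(days):
        for _ in range(admissions_per_day):               # unplanned admissions arrive every day
            ready = today + max(1, round(random.expovariate(1.0 / mean_stay)))
            if not weekend_discharges:
                while ready % 7 in (5, 6):                # ready on a Sat/Sun? then wait until Monday
                    ready += 1
            departures[ready] += 1
        occupied += admissions_per_day - departures[today]
        peak = max(peak, occupied)
    return peak

print(beds_needed(weekend_discharges=True))    # 7-day discharge design
print(beds_needed(weekend_discharges=False))   # 5-day design: same demand, same clinical stay, bigger peak
```

Same demand and the same clinical length of stay, yet the 5-day discharge design needs a noticeably bigger (and therefore more expensive) bed stock, because the hospital fills up every weekend.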

by Julian Simcox & Terry Weight

Ben Goldacre has spent several years popularizing the idea that we all ought to be more interested in science.

Every day he writes and tweets examples of “bad science”, and about getting politicians and civil servants to be more evidence-based; about how governmental interventions should be more thoroughly tested before being rolled-out to the hapless citizen; about how the development and testing of new drugs should be more transparent to ensure the public get drugs that actually make a difference rather than risk harm; and about bad statistics – the kind that “make clever people do stupid things”(8).

Like Ben we would like to point the public sector, in particular the healthcare sector and its professionals, toward practical ways of doing more of the good kind of science, but just what is GOOD science?

In collaboration with the Cabinet Office’s Behavioural Insights Team, Ben has recently published a polemic (9) advocating evidence-based government policy. For us this too is commendable, yet there is a potentially grave error of omission in their paper, which seems to fixate upon just a single method of research and risks setting up the unsuspecting healthcare professional for failure and disappointment – as Abraham Maslow once famously said:

“… it is tempting, if the only tool you have is a hammer, to treat everything as if it were a nail” (17)

We question the need for the new Test, Learn and Adapt (TLA) model he offers because the NHS already possesses such a model – one which in our experience is more complete and often simpler to follow – it is called the “Improvement Model”(15) – and via its P-D-S-A mnemonic (Plan-Do-Study-Act) embodies the scientific method.

Moreover there is a preexisting wealth of experience on how best to embed this thinking within organisations – from top-to-bottom and importantly from bottom-to-top; experience that has been accumulating for fully nine decades – and though originally established in industrial settings has long since spread to services.

We are this week publishing two papers, one longer and one shorter, in which we start by defining science, ruing the dismal way in which it is perennially conveyed to children and students, the majority of whom leave formal education without understanding the power of discovery or gaining any first hand experience of the scientific method.

View Shorter Version Abstract

We argue that if science were to be defined around discovery, and learning cycles, and built upon observation, measurement and the accumulation of evidence – then good science could vitally be viewed as a process rather than merely as an externalized entity. These things comprise the very essence of what Don Berwick refers to as Improvement Science (2) as embodied by the Institute of Healthcare Improvement (IHI) and in the NHS’s Model for Improvement.

We also aim to bring an evolutionary perspective to the whole idea of science, arguing that its time has been coming for five centuries, yet is only now more fully arriving. We suggest that in a world where many at school have been turned-off science, the propensity to be scientific in our daily lives – and at work – makes a vast difference to the way people think about outcomes and their achievement. This is especially so if those who take a perverse pride in saying they avoided science at school, or who freely admit they do not do numbers, can get switched on to it.

The NHS Model for Improvement has a pedigree originating with Walter Shewhart in the 1920’s, then being famously applied by Deming and Juran after WWII. Deming in particular encapsulates the scientific method in his P-D-C-A model (three decades later he revised it to P-D-S-A in order to emphasize that the Check stage must not be short-changed) – his pragmatic way of enabling a learning/improvement to evolve bottom-up in organisations.

After the 1980’s, Dr Don Berwick, standing on these shoulders, applied the same thinking to the world of healthcare – initially in his native America. Berwick’s approach is to encourage people to ask questions such as: What works? … and How would we know? His method is founded upon a culture of evidence-based learning, providing a local context for systemic improvement efforts. A new organisational culture, one rooted in the science of improvement, if properly nurtured, may then emerge.

Yet, such a culture may initially jar with the everyday life of a conventional organisation, and the individuals within it. One of several reasons, according to Yuval Harari (21), is that for hundreds of generations our species has evolved such that imagined reality has been lorded over objective reality. Only relatively recently in our evolution has the advance of science been leveling up this imbalance, and in our papers we argue that a method is now needed that enables these two realities to more easily coexist.

We suggest that a method that enables data-rich evidence-based storytelling – by those who most know about the context and intend growing their collective knowledge – will provide the basis for an approach whereby the two realities may do just that.

In people’s working lives, a vital enabler is the 3-paradigm “Accountability/Improvement/Research” measurement model (AIRmm), reflecting the three archetypal ways in which people observe and measure things. It was created by healthcare professionals (23) to help their colleagues and policy-makers to unravel a commonly prevailing confusion, and to help people make better sense of the different approaches they may adopt when needing to evidence what they’re doing – depending on the specific purpose. An amended version of this model is already widely quoted inside the NHS, though this is not to imply that it is yet as widely understood or applied as it needs to be.

[Figure: the 3-paradigm “Accountability/Improvement/Research” measurement model (AIRmm)]

This 3-paradigm A-I-R measurement model underpins the way that science can be applied by, and has practical appeal for, the stretched healthcare professional, managerial leader, civil servant.

Indeed for anyone who intuitively suspects there has to be a better way to combine goals that currently feel disconnected or even in conflict: empowerment and accountability; safety and productivity; assurance and improvement; compliance and change; extrinsic and intrinsic motivation; evidence and action; facts and ideas; logic and values; etc.

Indeed for anyone who is searching for ways to unify their actions with the system-based implementation of those actions as systemic interventions. Though widely quoted in other guises, we are returning to the original model (23) because we feel it better connects to the primary aim of supporting healthcare professionals make best sense of their measurement options.

In particular the model makes it immediately plain that a way out of the apparent Research/Accountability dichotomy is readily available to anyone willing to “Learn, master and apply the modern methods of quality control, quality improvement and quality planning” – the recommendation made for all staff in the Berwick Report (3).

In many organisations, and not just in healthcare, the column 1 paradigm is the only game in town. Column 3 may feel attractive as a way-out, but it also feels inaccessible unless there is a graduate statistician on hand. Moreover, the mainstay of the Column 3 worldview: the Randomized Controlled Trial (RCT) can feel altogether overblown and lacking in immediacy. It can feel like reaching for a spanner and finding a lump hammer in your hand – as Berwick says “Fans of traditional research methods view RCTs as the gold standard, but RCTs do not work well in many healthcare contexts” (2).

Like us, Ben is frustrated by the ways that healthcare organisations conduct themselves – not just the drug companies that commercialize science and publish only the studies likely to enhance sales, but governments too who commonly implement politically expedient policies only to then have to subsequently invent evidence to support them.

Policy-based evidence rather than evidence-based policy.

Ben’s recommended Column 3-style T-L-A approach is often more likely to make day-to-day sense to people and teams on the ground if complemented by Column 2-style improvement science.

One reason why Improvement Science can sometimes fail to dent established cultures is that it gets corralled by organisational “experts” – some of whom then use what little knowledge they have gathered merely to make themselves indispensable, not realising the extent to which everyone else as a consequence gets dis-empowered.

In our papers we take the opportunity to outline the philosophical underpinnings, and to do this we have borrowed the 7-point framework from a recent paper by Perla et al (35) who suggest that Improvement Science:

1. Is grounded in testing and learning cycles – the aim is collective knowledge and understanding about cause & effect over time. Some scientific method is needed, together with a way to make the necessary inquiry a collaborative one. Shewhart realised this and so invented the concept “continual improvement”.

2. Embraces a combination of psychology and logic – systemic learning requires that we balance myth and received wisdom with logic and the conclusions we derive from rational inquiry. This balance is approximated by the Sensing-Intuiting continuum in the Jungian-based MBTI model (12) reminding us that constructing a valid story requires bandwidth.

3. Has a philosophical foundation of conceptualistic pragmatism (16) – it cannot be expected that two scientists when observing, experiencing, or experimenting will make the same theory-neutral observations about the same event – even if there is prior agreement about methods of inference and interpretation. The normative nature of reality therefore has to be accommodated. Whereas positivism ultimately reduces the relation between meaning and experience to a matter of logical form, pragmatism allows us to ground meaning in conceived experience.

4. Employs Shewhart’s “theory of cause systems” – Walter Shewhart created the Control Chart for tuning-in to systemic behaviour that would otherwise remain unnoticed. It is a diagnostic tool, but by flagging potential trouble also aids real time prognosis. It might have been called a “self-control chart” for he was especially interested in supporting people working in and on their system being more considered (less reactive) when taking action to enhance it without over-reacting – avoiding what Deming later referred to as “Tampering” (4).

5. Requires the use of Operational Definitions – Deming warned that some of the most important aspects of a system cannot be expressed numerically, and those that can require care because “there is no true value of anything measured or observed” (5). When it comes to metric selection therefore it is essential to understand the measurement process itself, as well as the “operational definition” that each metric depends upon – the aim being to reduce ambiguity to zero.

6. Considers the contexts of both justification and discovery – Science can be defined as a process of discovery – testing and learning cycles built upon observation, measurement and accumulating evidence or experience – shared for example via a Flow Chart or a Gantt chart in order to justify a belief in the truth of an assertion. To be worthy of the term “science” therefore, a method or procedure is needed that is characterised by collaborative inquiry.

7. Is informed by Systems Theory – Systems Theory is the study of systems, any system: as small as a quark or as large as the universe. It aims to uncover archetypal behaviours and the principles by which systems hang together – behaviours that can be applied across all disciplines and all fields of research. There are several types of systems thinking, but Jay Forrester’s “System Dynamics” has most pertinence to Improvement Science because of its focus on flows and relationships – recognising that the behaviour of the whole may not be explained by the behaviour of the parts.

In the papers, we say more about this philosophical framing, and we also refer to the four elements in Deming’s “System of Profound Knowledge”(5). We especially want to underscore that the overall aim of any scientific method we employ is contextualised knowledge – which is all the more powerful if continually generated in context-specific experimental cycles. Deming showed that good science requires a theory of knowledge based upon ever-better questions and hypotheses. We two aim now to develop methods for building knowledge-full narratives that can work well in healthcare settings.

We wholeheartedly agree with Ben that for the public sector – not just in healthcare – policy-making needs to become more evidence-based.

In a poignant recent blog (24), the Health Foundation’s (HF) Richard Taunt describes attending two conferences on the same day. At the first one, policymakers from 25 countries had assembled to discuss how national policy can best enhance the quality of health care. When collectively asked which policies they would retain and repeat, their list included: use of data, building quality improvement capability, ensuring senior management are aware of improvement approaches, and supporting and spreading innovations.

In a different part of London, UK health politicians happened also to be debating Health and Care in order to establish the policy areas they would focus on if forming the next government. This second discussion brought out a completely different set of areas: the role of competition, workforce numbers, funding, and devolution of commissioning. These two discussions were supposedly about the same topic, but a Venn diagram would have contained next to no overlap.

Clare Allcock, also from the HF, then blogged to comment that “in England, we may think we are fairly advanced in terms of policy levers, but (unlike, for example in Scotland or the USA) we don’t even have a strategy for implementing health system quality.” She points in particular to Denmark who recently have announced they are phasing out their hospital accreditation scheme in favour of an approach strongly focused around quality improvement methodology and person-centred care. The Danes are in effect taking the 3-paradigm model and creating space for Column 2: improvement thinking.

The UK needs to take a leaf out of their book, for without changing fundamentally the way the NHS (and the public sector as a whole) thinks about accountability, any attempt to make column 2 the dominant paradigm is destined to be stillborn.

It is worth noting that in large part the AIRmm Column 2 paradigm was actually central to the 2012 White Paper’s values, and with it the subsequent Outcomes Framework consultation – both of which repeatedly used the phrase “bottom-up” to refer to how the new system of accountability would need to work, but somehow this seems to have become lost in legislative procedures that history will come to regard as having been overly ambitious. The need for a new paradigm of accountability however remains – and without it health workers and clinicians – and the managers who support them – will continue to view metrics more as something intrusive than as something that can support them in delivering enhancements in sustained outcomes. In our view the Stevens’ Five Year Forward View makes this new kind of accountability an imperative.

“Society, in general, and leaders and opinion formers, in particular, (including national and local media, national and local politicians of all parties, and commentators) have a crucial role to play in shaping a positive culture that, building on these strengths, can realise the full potential of the NHS.
When people find themselves working in a culture that avoids a predisposition to blame, eschews naïve or mechanistic targets, and appreciates the pressures that can accumulate under resource constraints, they can avoid the fear, opacity, and denial that will almost inevitably lead to harm.”
Berwick Report (3)

Changing cultures means changing our habits – it starts with us. It won’t be easy because people default to the familiar, to more of the same. Hospitals are easier to build than relationships; operations are easier to measure than knowledge, skills and confidence; and prescribing is easier than enabling. The two of us do not of course possess a monopoly on all possible solutions, but our experience tells us that now is the time for: evidence-rich storytelling by front line teams; by pharmaceutical development teams; by patients and carers conversing jointly with their physicians.

We know that measurement is not a magic bullet, but what frightens us is that the majority of people seem content to avoid it altogether. As Oliver Moody recently noted in The Times ..

Call it innumeracy, magical thinking or intrinsic mental laziness, but even intelligent members of the public struggle, through no fault of their own, to deal with statistics and probability. This is a problem. People put inordinate amounts of trust in politicians, chief executives, football managers and pundits whose judgment is often little better than that of a psychic octopus. Short of making all schoolchildren study applied mathematics to A level, the only thing scientists can do about this is stick to their results and tell more persuasive stories about them.

Too often, Disraeli’s infamous words: “Lies, damned lies, and statistics” are used as the refuge of busy professionals looking for an excuse to avoid numbers.

If Improvement Science is to become a shared language, Berwick’s recommendation that all NHS staff “Learn, master and apply the modern methods of quality control, quality improvement and quality planning” has to be taken seriously.

As a first step we recommend enabling teams to access good data in as near to real time as possible, data that indicates the impact that one’s intervention is having – this alone can prompt a dramatic shift in the type of conversation that people working in and on their system may have. Often this can be initiated simply by converting existing KPI data into System Behaviour Chart form which, using a tool like BaseLine® takes only a few mouse clicks.
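
For those without such a tool to hand, the arithmetic behind an XmR-style System Behaviour Chart is not complicated.  Here is a minimal sketch with made-up KPI values – an illustration of the general method, not of BaseLine® itself:

```python
weekly_kpi = [82, 85, 79, 88, 84, 81, 90, 77, 83, 86, 80, 85]   # made-up weekly KPI values

centre = sum(weekly_kpi) / len(weekly_kpi)
moving_ranges = [abs(a - b) for a, b in zip(weekly_kpi[1:], weekly_kpi)]
mean_mr = sum(moving_ranges) / len(moving_ranges)

upper = centre + 2.66 * mean_mr      # 2.66 is the standard XmR chart constant
lower = centre - 2.66 * mean_mr
print(f"centre line {centre:.1f}, natural process limits {lower:.1f} to {upper:.1f}")
for x in weekly_kpi:
    print(x, "<-- signal" if (x > upper or x < lower) else "")
```

Points outside the natural process limits are signals worth a conversation; points inside them are the noise of the system behaving as it is currently designed to behave.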

In our longer paper we offer three examples of Improvement Science in action – combining to illustrate how data may be used to evidence both sustained systemic enhancement, and to generate engagement by the people most directly connected to what in real time is systemically occurring.

1. A surgical team using existing knowledge established by column 3-type research as a platform for column 2-type analytic study – to radically reduce post-operative surgical site infection (SSI).

2. 25 GP practices are required to collect data via the Friends & Family Test (FFT) and decide to experiment with being more than merely compliant. In two practices they collectively pilot a system run by their PPG (patient participation group) to study the FFT score – patient by patient – as they arrive each day. They use IS principles to separate signal from noise in a way that prompts the most useful response to the feedback in near to real time. Separately they summarise all the comments as a whole and feed their analysis into the bi-monthly PPG meeting. The aim is to address both “special cause” feedback and “common cause” feedback in a way that, in what most feel is an over-loaded system, can prompt sensibly prioritised improvement activity.

3. A patient is diagnosed with NAFLD and receives advice from their doctor to get more exercise e.g. by walking more. The patient uses the principles of IS to monitor what happens – using the data not just to show how they are complying with their doctor’s advice, but to understand what drives their personal mind/body system. The patient hopes that this knowledge can lead them to better decision-making and sustained motivation.

The landscape of NHS improvement and innovation support is fragmented, cluttered, and currently pretty confusing. Since May 2013 Academic Health Science Networks (AHSNs) funded by NHS England (NHSE) have been created with the aim of bringing together health services, and academic and industry members. Their stated purpose is to improve patient outcomes and generate economic benefits for the UK by promoting and encouraging the adoption of innovation in healthcare. They have a 5 year remit and have spent the first 2 years establishing their structures and recruiting; it is not yet clear if they will be able to deliver what’s really needed.

Patient Safety Collaboratives linked with AHSN areas have also been established to improve the safety of patients and ensure continual patient safety learning. The programme, coordinated by NHSE and NHSIQ will provide safety improvements across a range of healthcare settings by tackling the leading causes of avoidable harm to patients. The intention is to empower local patients and healthcare staff to work together to identify safety priorities and develop solutions – implemented and tested within local healthcare organisations, then later shared nationally.

We hope our papers will significantly influence the discussions about how improvement and innovation can assist with these initiatives. In the shorter paper, to echo Deming, we even include our own 14 points for how healthcare organisations need to evolve. We will know that we have succeeded if the papers are widely read; if we enlist activists like Ben to the definition of science embodied by Improvement Science; and if we see a tidal wave of improvement science methods being applied across the NHS.

As patient volunteers, we each intend to find ways of contributing in any way that appears genuinely helpful. It is our hope that Improvement Science enables the cultural transformation we have envisioned in our papers and with our case studies. This is what we feel most equipped to help with. When in your sixties it is easy to feel that time is short, but maybe people of every age should feel this way? In the words of Francis Bacon, the father of the scientific method:

[Image: quotation from Francis Bacon]

Download Long Version

References

[Image: full reference list]

It was the time for Bob and Leslie’s regular coaching session. Bob was already online when Leslie dialed in to the teleconference.

<Leslie> Hi Bob, sorry I am a bit late.

<Bob> No problem Leslie. What aspect of improvement science shall we explore today?

<Leslie> Well, I’ve been working through the Safety-Flow-Quality-Productivity cycle in my project and everything is going really well.  The team are really starting to put the bits of the jigsaw together and can see how the synergy works.

<Bob> Excellent. And I assume they can see the sources of antagonism too.

<Leslie> Yes, indeed! I am now up to the point of considering productivity and I know it was introduced at the end of the Foundation course but only very briefly.

<Bob> Yes, productivity was described as a system metric. A ratio of a stream metric and a stage metric … what we get out of the streams divided by what we put into the stages.  That is a very generic definition.

<Leslie> Yes, and that I think is my problem. It is too generic and I get it confused with concepts like efficiency.  Are they the same thing?

<Bob> A very good question and the short answer is “No”, but we need to explore that in more depth.  Many people confuse efficiency and productivity and I believe that is because we learn the meaning of words from the context that we see them used in. If others use the words imprecisely then it generates discussion, antagonism and confusion, and we are left with the impression that it is a ‘difficult’ subject.  The reality is that it is not difficult when we use the words in a valid way.

<Leslie> OK. That reassures me a bit … so what is the definition of efficiency?

<Bob> Efficiency is a stream metric – it is the ratio of the minimum cost of the resources required to complete one task divided by the actual cost of the resources used to complete one task.

<Leslie> Um.  OK … so how does time come into that?

<Bob> Cost is a generic concept … it can refer to time, money and lots of other things.  If we stick to time and money then we know that if we have to employ ‘people’ then time will cost money, because people need money to buy the essential stuff they need for survival. Water, food, clothes, shelter and so on.

<Leslie> So we could use efficiency in terms of resource-time required to complete a task?

<Bob> Yes. That is a very useful way of looking at it.

<Leslie> So how is productivity different? Completed tasks out divided by cash in to pay for resource time would be a productivity metric. It looks the same.

<Bob> Does it?  The definition of efficiency is the minimum possible cost divided by the actual cost. That is not the same as our definition of system productivity.

<Leslie> Ah yes, I see. So do others define productivity the same way?

<Bob> Try looking it up on Wikipedia …

<Leslie> OK … here we go …

“Productivity is an average measure of the efficiency of production. It can be expressed as the ratio of output to inputs used in the production process, i.e. output per unit of input”.

Now that is really confusing!  It looks like efficiency and productivity are the same. Let me see what the Wikipedia definition of efficiency is …

“Efficiency is the (often measurable) ability to avoid wasting materials, energy, efforts, money, and time in doing something or in producing a desired result”.

But that is closer to your definition of efficiency – the actual cost is the minimum cost plus the cost of waste.

<Bob> Yes.  I think you are starting to see where the confusion arises.  And this is because there is a critical piece of the jigsaw missing.

<Leslie> Oh …. and what is that?

<Bob> Worth.

<Leslie> Eh?

<Bob> Efficiency has nothing to do with whether the output of the stream has any worth.  I can produce a worthless product with low waste … in other words very efficiently.  And what if we have the situation where the output of my process is actually harmful?  The more efficiently I use my resources the more harm I will cause from a fixed amount of resource … and in that situation it is actually safer to have an inefficient process!

<Leslie> Wow!  That really hits the nail on the head … and the implications are … profound.  Efficiency is objective and relates only to flow … and between flow and productivity we have to cross the Safety-Quality line. Productivity also includes the subjective concept of worth or value. That all makes complete sense now. A productive system is a subjectively and objectively win-win-win design.

<Bob> Yup.  Get the safety, flow and quality perspectives of the design in synergy and productivity will sky-rocket. It is called a Fit-4-Purpose design.
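
A made-up worked example, using the definitions Bob and Leslie have just agreed, makes the distinction concrete:

```python
# Efficiency is a stream metric: minimum possible cost / actual cost.
# Productivity is a system metric: worth of what flows out / cost of what is put in.
tasks_completed   = 100
minimum_cost_each = 40.0    # £ of resource-time per task if there were zero waste
actual_cost_each  = 50.0    # £ actually spent per task, waste included
worth_each        = 45.0    # £ of value each completed task delivers (could be zero!)

efficiency   = minimum_cost_each / actual_cost_each
productivity = (tasks_completed * worth_each) / (tasks_completed * actual_cost_each)

print(f"efficiency   = {efficiency:.2f}")    # 0.80
print(f"productivity = {productivity:.2f}")  # 0.90
```

Notice that if the worth of each completed task were zero the process could still be 80% ‘efficient’ – which is exactly why efficiency alone can never tell us whether a system is productive.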

Improvement implies learning.  And to learn we need feedback from reality because without it we will continue to believe our own rhetoric.

So reality feedback requires both sensation and consideration.

There are many things we might sense, measure and study … so we need to be selective … we need to choose those things that will help us to make the wise decisions.


Wise decisions lead to effective actions which lead to intended outcomes.


Many measures generate objective data that we can plot and share as time-series charts.  Pictures that tell an evolving story.

There are some measures that matter – our intended outcomes for example. Our safety, flow, quality and productivity charts.

There are some measures that do not matter – the measures of compliance for example – the back-covering blame-avoiding management-by-fear bureaucracy.


And there are some things that matter but are hard to measure … objectively at least.

We can sense them subjectively though.  We can feel them. If we choose to.

And to do that we only need to go to where the people are and the action happens and just watch, listen, feel and learn.  We do not need to do or say anything else.

And it is amazing what we learn in a very short period of time. If we choose to.


If we enter a place where a team is working well we will see smiles and hear laughs. It feels magical.  They will be busy and focused and they will show synergism. The team will be efficient, effective and productive.

If we enter a place where a team is not working well we will see grimaces and hear gripes. It feels miserable. They will be busy and focused but they will display antagonism. The team will be inefficient, ineffective and unproductive.


So what makes the difference between magical and miserable?

The difference is the assumptions, attitudes, prejudices, beliefs and behaviours of those that they report to. Their leaders and managers.

If the culture is management-by-fear (a.k.a bullying) then the outcome is unproductive and miserable.

If the culture is management-by-fearlessness (a.k.a. inspiring) then the outcome is productive and magical.

It really is that simple.

There is a condition called SFQPosis which is an infection that is transmitted by a vector called an ISP.

The primary symptom of SFQPosis is sudden clarity of vision and a new understanding of how safety, flow, quality and productivity improvements can happen at the same time …

… when they are seen as partners on the same journey.


There are two sorts of ISP … Solitary and Social.

Solitary ISPs infect one person at a time … often without them knowing.  And there is often a long lag time between the infection and the appearance of symptoms. Sometimes years – and often triggered by an apparently unconnected event.

In contrast the Social ISPs will tend to congregate together and spend their time foraging for improvement pollen and nectar and bringing it back to their ‘hive’ to convert into delicious ‘improvement honey’ which once tasted is never forgotten.


It appears that Jeremy Hunt, the Secretary of State for Health, has recently been bitten by an ISP and is now exhibiting the classic symptoms of SFQPosis.

Here is the video of Jeremy describing his symptoms at the recent NHS Confederation Conference. The talk starts at about 4 minutes.

His account suggests that he was bitten while visiting the Virginia Mason Hospital in the USA and, on returning home, discovered some Improvement hives in the UK … and some of the Solitary ISPs that live in England.

Warwick and Sheffield NHS Trusts are buzzing with ISPs … and the original ISP that infected them was one Kate Silvester.

The repeated message in Jeremy’s speech is that improved safety, quality and productivity can happen at the same time and are within our gift to change – and the essence of achieving that is to focus on flow.

The sequence is safety first (eliminate the causes of avoidable harm), then flow second (eliminate the causes of avoidable chaos), then quality (measure both expectation and experience) and then productivity will soar.

And everyone will benefit.

This is not a zero-sum win-lose game.


So listen for the buzz of the ISPs … follow it and ask them to show you how … ask them to inoculate you with SFQPosis.


And here is a recent video of Dr Steve Allder, a consultant neurologist and another ISP that Kate infected with SFQPosis a few years ago.  Steve is describing his own experience of learning how to do Improvement-by-Design.

Resistance-to-change is an oft-quoted excuse for improvement torpor. The implied sub-message is more like “We would love to change but They are resisting“.

Notice the Us-and-Them language.  This is the observable evidence of a “We’re OK and They’re Not OK” belief.  And in reality it is this unstated belief and the resulting self-justifying behaviour that is an effective barrier to systemic improvement.

This Us-and-Them language generates cultural friction, erodes trust and erects silos that are effective barriers to the flow of information, of innovation and of learning.  And the inevitable reactive solutions to this Us-versus-Them friction create self-amplifying positive feedback loops that ensure the counter-productive behaviour is sustained.

One tangible manifestation is DRATs: Delusional Ratios and Arbitrary Targets.


So when a plausible, rational and well-evidenced candidate for an alternative approach is discovered then it is a reasonable reaction to grab it and to desperately spray the ‘magic pixie dust’ at everything.

This is a recipe for disappointment, because there is no such thing as ‘improvement magic pixie dust’.

The more uncomfortable reality is that the ‘magic’ is the result of a long period of investment in learning and the associated hard work in practising and polishing the techniques and tools.

It may look like magic but it isn’t. That is an illusion.

And some self-styled ‘magicians’ choose to keep their hard-won skills secret … because they know that by sharing them they will lose their ‘magic powers’ in a flash of ‘blindingly obvious in hindsight’.

And so the chronic cycle of despair-hope-anger-and-disappointment continues.


System-wide improvement in safety, flow, quality and productivity requires that the benefits of synergism overcome the benefits of antagonism.  This requires two changes to the current hope-and-despair paradigm.  Both are necessary and neither is sufficient alone.

1) The ‘wizards’ (i.e. magic folk) share their secrets.
2) The ‘muggles’ (i.e. non-magic folk) invest the time and effort in learning ‘how-to-do-it’.


The transition to this awareness is uncomfortable so it needs to be managed pro-actively … by being open about the risk … and how to mitigate it.

That is what experienced Practitioners of Improvement Science (ISPs) will do. Be open about the challenges ahead.

And those who desperately want the significant and sustained SFQP improvements; and an end to the chronic chaos; and an end to the gaming; and an end to the hope-and-despair cycle … just need to choose. Choose to invest and learn the ‘how to’ and be part of the future … or choose to be part of the past.


Improvement science is simple … but it is not intuitively obvious … and so it is not easy to learn.

If it were, we would all be doing it.

And it is the behaviour of a wise leader of change to set realistic and mature expectations of the challenges that come with a transition to system-wide improvement.

That is demonstrating the OK-OK behaviour needed for synergy to grow.

For a system to be both effective and efficient the parts need to work in synergy. This requires both alignment and collaboration.

Systems that involve people and processes can exhibit complex behaviour. The rules of engagement also change as individuals learn and evolve their beliefs and their behaviours.

The values and the vision should be more fixed. If the goalposts are obscure or oscillate then confusion and chaos are inevitable.


So why is collaborative alignment so difficult to achieve?

One factor has been mentioned. Lack of a common vision and a constant purpose.

Another factor is distrust of others. Our fear of exploitation, bullying, blame, and ridicule.

Distrust is a learned behaviour. Our natural inclination is trust. We have to learn distrust. We do this by copying trust-eroding behaviours that are displayed by our role models. So when leaders display these behaviours then we assume it is OK to behave that way too.  And we dutifully emulate.

The most common trust-eroding behaviour is called discounting.  It is a passive-aggressive habit characterised by repeated acts of omission, such as not replying to emails, not sharing information, not offering constructive feedback, not asking for other perspectives, and not challenging disrespectful behaviour.


There are many causal factors that lead to distrust … so there is no one-size-fits-all solution to dissolving it.

One factor is ineptitude.

This is the unwillingness to learn and to use available knowledge for improvement.

It is one of the many manifestations of incompetence.  And it is an error of omission.


Whenever we are unable to solve a problem then we must always consider the possibility that we are inept.  We do not tend to do that.  Instead we prefer to jump to the conclusion that there is no solution or that the solution requires someone else doing something different. Not us.

The impossibility hypothesis is easy to disprove.  If anyone has solved the problem, or a very similar one, and if they can provide evidence of what and how then the problem cannot be impossible to solve.

The someone-else’s-fault hypothesis is trickier because proving it requires us to influence others effectively.  And that is not easy.  So we tend to resort to easier but less effective methods … manipulation, blame, bullying and so on.


A useful way to view this dynamic is as a set of four concentric circles – with us at the centre.

The outermost circle is called the ‘Circle of Ignorance‘. The collection of all the things that we do not know we do not know.

Just inside that is the ‘Circle of Concern‘.  These are things we know about but feel completely powerless to change. Such as the fact that the world turns and the sun rises and sets with predictable regularity.

Inside that is the ‘Circle of Influence‘ and it is a broad and continuous band – the further away the less influence we have; the nearer in the more we can do. This is the zone where most of the conflict and chaos arises.

The innermost is the ‘Circle of Control‘.  This is where we can make changes if we so choose. And this is where change starts and from where it spreads.


So if we want system-level improvements in safety, flow, quality and productivity (or cost) then we need to align these four circles. Or rather the gaps in them.

We start with the gaps in our circle of control. The things that we believe we cannot do … but when we try … we discover that we can (and always could).

With this new foundation of conscious competence we can start to build new relationships, develop trust and to better influence others in a win-win-win conversation.

And then we can collaborate to address our common concerns – the ones that require coherent effort. We can agree and achieve our common purpose, vision and goals.

And from there we will be able to explore the unknown opportunities that lie beyond. The ones we cannot see yet.

System-wide, significant, and sustained improvement implies system-wide change.

And system-wide change implies more than 20% of the people commit to action. This is the cultural tipping point.

These critical 20% have a badge … they call themselves rebels … and they are perceived as troublemakers by those who profit most from the status quo.

But troublemakers and rebels are radically different … as shown in the summary by Lois Kelly.


Rebels share a common, future-focussed purpose.  A mission.  They are passionate, optimistic and creative.  They understand synergy and how to release and align the stored emotional energy of both themselves and others.  And most importantly they are value-led and that makes them attractive.  Values such as honesty, integrity and industry are what make leaders together-effective.

And as we speak there is a school for rebels in healthcare gaining momentum … and their programme is current, open to all and free to access. And the change agent development materials are excellent!

Click here to download their study guide.


Converting possibilities into realities is the essence of design … so our merry band of rebels will also need to learn how to convert their positive rhetoric into practical reality. And that is more physics than psychology.

Streams flow because of physics, not because of passion.

And this is why the science of improvement is important because it is the synthesis of the people dimension and the process dimension – into a system that delivers significant and sustained improvement.

On all dimensions. Safety, Flow, Quality and Productivity.

The lighthouse is our purpose; the whale represents the magnitude of our challenge; the blue sky is the creative thinking we need … to avoid trying to boil the ocean.

And the noisy, greedy, s****y seagulls are the troublemakers who always will plague us.

[Image by Malaika Art].