Dr Foster Hospital Guide 2009
The Dr Foster Hospital Guide 2009 was released last week. This release was notable because it followed the censure of the Basildon and Thurrock University Hospitals NHS Foundation Trust by the regulator Monitor. The Dr Foster report was released soon after Monitor intervened in those two hospitals and lurid reports appeared in the press.
Unfortunately the newspapers seem to have simply taken headline figures from the Dr Foster report and then extrapolated to suggest alarming figures of unexpected deaths at our hospitals. The press justify their actions by pointing out that the Dr Foster Hospital Report listed Basildon and Thurrock at the bottom of its league table. They say that this proves the "accuracy" of the report, and then make exaggerated claims about the other 145 trusts in the list.
The problem is that the Dr Foster report is mostly a relative report: it shows how hospitals compare with each other. Since hospital standards have increased considerably over the last few years, a poor Dr Foster result is, for many hospitals, not an indicator of poor treatment. Indeed, as explained below, a hospital can improve its standards of care over the year and yet be rewarded with a worse score. Furthermore, the press treat the Dr Foster report as being definitive. This is far from the truth: for example, CHKS Ltd produce a hospital report using the same data, but with very different results.
Dr Foster Patient Safety League Table
Information about how the league table is compiled will be given later, but first let us examine the published results. The Dr Foster Hospital Guide league table purports to list all the hospital trusts in England in order of "patient safety": the most safe hospital has a score of 100.00 and the least safe has a score of 0.00. There are 146 trusts listed in five bands, the "most safe" are in band 5 and the "least safe" in band 1. The following table is a summary of the league table bands.
|Band||Score Range||Number of Trusts|
|5||100.00 - 91.78||14|
|4||90.41 - 56.85||48|
|3||54.79 - 41.78||25|
|2||37.67 - 8.90||47|
|1||6.85 - 0.00||12|
The first point to make is that no absolute meaning can be placed on any score. For example, University College London Hospitals has a score of 100.00, but this does not indicate that it is ten times safer than Peterborough and Stamford Hospitals with a score of 10.27. The scores exist solely to place the hospitals in order and they give little information about actual safety within the hospital. UCL Hospitals could be a thousand times more safe, or just 0.1% more safe than Peterborough and Stamford Hospitals and they will still have the same scores.
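The point can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that a score depends only on a trust's position in the ordering. The function `rank_scores`, the trust names and the "safety" figures are all invented; this is not Dr Foster's method or data.

```python
def rank_scores(safety):
    """Map each trust to a 0-100 score using only its position in the ordering."""
    ordered = sorted(safety, key=safety.get)  # least safe first
    top = len(ordered) - 1
    return {name: round(100 * i / top, 2) for i, name in enumerate(ordered)}

# Two invented scenarios: a huge gap and a tiny gap between the same three trusts.
wide_gap = {"Trust A": 1000.0, "Trust B": 500.0, "Trust C": 1.0}
narrow_gap = {"Trust A": 1.001, "Trust B": 1.0005, "Trust C": 1.0}

print(rank_scores(wide_gap))    # {'Trust C': 0.0, 'Trust B': 50.0, 'Trust A': 100.0}
print(rank_scores(narrow_gap))  # identical scores, despite the very different gaps
```

Whether Trust A is a thousand times safer than Trust C or a tenth of a percent safer, the scores that come out are the same.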
Next, look at these three extracts from the table:
|1||University College London Hospitals NHS Foundation Trust||100.00||5|
|9||Mid Staffordshire NHS Foundation Trust||93.84||5|
|146||Basildon and Thurrock University Hospitals NHS Foundation Trust||0.00||1|
At the top of the league table is UCL Hospitals with a score of 100.00, and this is placed in band 5. But notice that at position number 9, and within band 5, is Mid Staffordshire NHS Foundation Trust. You may recognise this trust from the flurry of news reports at the beginning of 2009 when Monitor issued an intervention notice. The Healthcare Commission had received indications that there were issues at Mid Staffs many months before and had conducted an investigation from April 2008 lasting until the end of the year. When the Healthcare Commission published the investigation it was clear that there were serious issues and so Monitor, the regulator of Foundation Trusts, issued an intervention notice so that the hospital governance could be changed. The Healthcare Commission noted in 2008 that there were "serious concerns about the A&E department at Stafford, including low staffing levels, poor leadership and the structure of the department". These concerns were raised during the period when the basic information was collated to generate the Dr Foster league table.
If the Healthcare Commission were identifying problems then Dr Foster have clearly got the ranking of Mid Staffordshire wrong. At the very bottom of the list is Basildon and Thurrock with a score of zero. Basildon and Thurrock, of course, was also subject to intervention by Monitor after a Care Quality Commission inspection showed poor cleanliness standards. While the Dr Foster league table correctly highlights Basildon and Thurrock as having poor care, it ranks Mid Staffordshire as one of the best hospital trusts in the country. This means that the hospitals ranked below this hospital trust are worse, and hence it would imply that interventions should be in order at almost all trusts in England. This is not the case, so clearly Mid Staffordshire was misplaced in the league table, but if this is the case then can we be confident in the ranking?
All the values in the league table are quoted to two decimal places, that is, they are accurate to within one in ten thousand. In all areas of science the number of significant figures in a quoted figure is determined by the accuracy of the data used to make up that figure. An accuracy of one in ten thousand for the ranking score means that the least accurate datum used in the calculation of the ranking must be accurate to one in ten thousand. This is unlikely to be the case with the data Dr Foster uses. Indeed, the Dr Foster safety report for each trust quotes safe levels to be within a range of values, clearly showing that these criteria have a wide variation and hence the ranking score cannot be quoted as a figure with four significant figures.
Further, such an accuracy for 146 data points would normally indicate little chance of there being points with the same value. But this is not the case with Dr Foster's table. In fact, rather tellingly, there are only 99 unique values in the score column. For example, the following four trusts all have the same score:
|68||Harrogate and District NHS Foundation Trust||52.74||3|
|69||Royal United Hospital Bath NHS Trust||52.74||3|
|70||University Hospitals of Morecambe Bay NHS Trust||52.74||3|
|71||Lancashire Teaching Hospitals NHS Foundation Trust||52.74||3|
Which is the "safest" of these four hospitals? You don't know. You cannot tell because they have the same score. Dr Foster have clearly decided that there is some other factor that should be used to rank these hospitals, because they have not taken the obvious route of listing them in alphabetical order. What is the chance that all four hospital trusts will get exactly the same score? Not very high. It is as if the statistical process used to rank the hospitals assumes that there are just 99 positions available and assigns these to the hospitals, forcing some hospitals to share a position.
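Counting shared scores is straightforward to reproduce. The sketch below uses only the four rows quoted above; the helper name `tied_scores` is my own. Run against the full table of 146 trusts, the same approach would show how many distinct scores there are (99, by the count given above).

```python
from collections import Counter

# Rows copied from the extract above: (position, trust, score).
rows = [
    (68, "Harrogate and District NHS Foundation Trust", 52.74),
    (69, "Royal United Hospital Bath NHS Trust", 52.74),
    (70, "University Hospitals of Morecambe Bay NHS Trust", 52.74),
    (71, "Lancashire Teaching Hospitals NHS Foundation Trust", 52.74),
]

def tied_scores(rows):
    """Return {score: number of trusts} for every score shared by two or more trusts."""
    counts = Counter(score for _, _, score in rows)
    return {score: n for score, n in counts.items() if n > 1}

unique_scores = {score for _, _, score in rows}

print(tied_scores(rows))   # {52.74: 4}
print(len(unique_scores))  # 1 distinct score across four positions
```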
As mentioned in the introduction, CHKS Ltd also produce a hospital report and their Top Hospitals 2009 report lists what they regard as being the top 40 hospital trusts in the country (they include Wales and Northern Ireland, Dr Foster just covers England). CHKS Ltd do not produce a league table, they simply give an alphabetic list (clearly an indication of how difficult it is to produce a league table). Only eleven of the top forty of the Dr Foster Patient Safety league table appear in the CHKS top forty list. On initial sight this could be explained away because CHKS have more trusts to choose from. But more interesting is that out of the bottom forty in the Dr Foster league table, nine of them appear in the CHKS top forty list. This would indicate that either they are wrongly listed near the bottom in the Dr Foster list, or they are wrongly listed in the CHKS Top Hospitals list. Or indeed, it could indicate that it is not possible to produce any ranking of hospitals.
All of these points indicate that the Dr Foster Patient Safety league table has problems and that patients (and crucially, journalists) must be wary about making conclusions from the Dr Foster results.
Mid Staffordshire NHS Foundation Trust
The last section mentioned that Mid Staffordshire was ranked as one of the top ten trusts in England in the Patient Safety league table. In this section more information is given so that you can see why the placing of Mid Staffordshire should be considered an odd thing.
The Dr Foster Hospital Guide is dated 2009, but the methodology page on the Dr Foster website says:
Where the HSMR is calculated over one year, the time frame is April 2008 - March 2009.
(HSMR is the Hospital Standardised Mortality Ratio, a measure used by Dr Foster to indicate death rates in hospitals.) This time scale is significant when one considers the Dr Foster Quality account and Patient Safety score for this trust. This period includes the time when Mid Staffordshire was under investigation by the Healthcare Commission. During this time the Healthcare Commission reports that "When we visited the A&E department in May 2008, the initial evidence raised serious concerns. ... As late as September 2008, we found unacceptable examples of assessment and management of patients. The trust was poor at identifying and investigating such incidents." Further, the Commission reports "It was reported that 86% of patients requiring surgery [for fractured hip] were operated on within 24 hours. This issue had not been resolved during our visits in the summer of 2008 and we highlighted our concerns to the trust."
In addition, the Healthcare Commission notes that for the period April 2008 to November 2008 there were four mortality outlier alerts (two by Dr Foster and two by the Healthcare Commission). A mortality outlier alert is an indication that there is a cause for an investigation. Of these four, the Healthcare Commission found that in the case of two, the "quality of care was not a concern"; in the case of one of the remainder, the trust's action plan to improve quality of care was found to be acceptable; and for the fourth, the case was escalated. That is, half of the alerts were justified and a quarter were so serious that the alert had to be escalated.
All of this information from the Healthcare Commission's report for Mid Staffordshire indicates that there were serious problems at the trust during the period April 2008 to the end of the year, which was the time period when Dr Foster gathered the patient safety data which resulted in Mid Staffordshire being given a Patient Safety score of 93.84. It has to be pointed out that HSMR values for Mid Staffordshire during the time of the investigation by the Healthcare Commission dropped dramatically from 127 (2007) to 92.1 for 2008/9. Dr Foster gave Mid Staffordshire a high Patient Safety score for a period when the trust had safety issues but a low HSMR. This suggests that HSMR is the most influential value in determining the Patient Safety score.
South Warwickshire General Hospitals Trust Scores
The Dr Foster Hospital Guide lists South Warwickshire General Hospitals Trust in three places:
- Patient Safety league table: 17.81, band 2
- Percentage of suspected stroke patients given a CT scan within 24 hours: 36%
- Hospital Standardised Mortality Ratio: 116
The last section speculates that the Dr Foster Patient Safety score is heavily influenced by the HSMR values for a trust. Since South Warwickshire has a high HSMR it can be concluded that, regardless of any of the other values used to determine the Patient Safety score, it is not possible for the Trust to gain a good score.
National guidelines say that hospitals must carry out a computerised tomography (CT) scan within 24 hours, to see where exactly the stroke took place. This enables clinicians to determine whether the stroke was a clot or a bleed and hence determine the best treatment. Clearly the sooner the CT scan is performed the better, and Dr Foster lists the ten "worst" hospitals, which scan fewer than 40% of suspected stroke patients. South Warwickshire scores 36%. This appears clear cut, but it is not. The value of 36% is from March 2008, and the report, of course, is dated 2009. The latest scores are available from the hospital: at the hospital board meeting on 2 December 2009 the Standards and Targets Report was presented and this shows that the actual value for the year to date is 79.1% (this is in the Patient Outcomes section on the KPIs - Internal sheet). The reason why this is not 100% is that the trust does not have a CT scanner, so arrangements have to be made for patients to be scanned at a neighbouring trust. Clearly the current value is more than twice the value quoted by Dr Foster and was obtained more than 18 months after the value that Dr Foster uses. The Dr Foster value is out of date, and if the current value were used then South Warwickshire would not appear in the table. [Note: from April 2011 the trust has had its own CT scanner]
South Warwickshire has high values for HSMR. In response to these scores the Trust says:
The Trust has not received any mortality alerts during the year from quality regulators which are the formal channels to flag any concerns. We therefore asked Dr Foster to help us to understand their model and to seek intelligence on areas where we can make improvements. This work has shown no significant issues and has led us to conclude that crude mortality is a more reliable measure.
It is frustrating that our HSMR is worse than last year despite mortality reducing, however all other indicators are at odds with this and there are no clear defined reasons for the score.
Note the comments made above about mortality outlier alerts: these are generated independently of the hospital trust when there is a situation that requires immediate attention. The generation of outlier alerts does not necessarily mean that there are problems at the trust since such systems are configured to err on the side of caution. However, this extra sensitivity does give more confidence in a trust that has no outlier alerts. This is the case with South Warwickshire: there have been no outlier alerts, yet the HSMR figures are still high.
Dr Foster Quality Account
The Dr Foster website gives more detailed information in what they call the Quality Account for each hospital, and this account has the "other indicators" as mentioned in the statement above. The values for South Warwickshire General Hospitals Trust are given in this section. The quality account lists 13 safety measures. In all areas but two the Trust scores "in line with expected". The two remaining scores are HSMR (all admissions) and HSMR (non-elective) and these are marked as "below expected".
|Marker||Result||Expected Value||Dr Foster Score|
|What is the hospital's overall death rate?||HSMR all admissions 116.15||100|
|What is the hospital's death rate for emergency admissions?||HSMR non-elective 116.48||100|
|What is the death rate for stroke patients?||SMR Stroke 103.48||100|
|What is the death rate for heart attack patients?||SMR AMI 140.54||100|
|What is the death rate for patients admitted with a broken hip?||SMR FNOF 118.60||100|
|What is the death rate for patients admitted for low-risk procedures?||Low mortality CCS groups: 0.0017||0.0016|
|Is the hospital fully compliant with National Patient Safety guidelines?||NPSA alert compliant|
|How consistently are Patient Safety Incidents reported to the NRLS?||NRLS alerts: 6.00|
|How quickly are Patient Safety Incidents reported?||NRLS alerts: 54.00|
|How many Patient Safety Incidents were reported in the first half of last year?||NRLS alerts: 3.56|
|What is the ratio of hospital staff to bed?||Staff to bed ratio: 1.71|
|How well does the hospital control infection?||Infection control (composite) 100.00||100|
|How committed is the trust to patient safety?||Trust commitment to patient safety (composite) 90.00||100|
Expected values are actually a range of values. So, for example, the expected value for SMR AMI is 100, but since there are many factors involved in calculating this value any trust value within the range 46 to 183 (estimated from the graphic on the Dr Foster website) is treated as being "in line with expected". A value for SMR AMI of 140.54 is well within this range even though it is higher than the median (expected) value of 100.
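The banding logic amounts to a simple range check. The sketch below uses the SMR AMI range estimated above (46 to 183). The function name `classify` is my own, and the label for a better-than-expected result is an assumption, since only "in line with expected" and "below expected" are quoted here.

```python
def classify(value, low, high):
    """Label a mortality ratio against its expected range (lower is better)."""
    if value > high:
        return "below expected"        # worse than the expected range
    if value < low:
        return "better than expected"  # assumed label, not taken from Dr Foster
    return "in line with expected"

# SMR AMI for South Warwickshire, with the range estimated from the graphic.
print(classify(140.54, low=46, high=183))  # in line with expected
```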
Clearly, no trust wants to get a score that is "below expected", but scores that are deemed as "in line with expected" show that the trust is performing well. This is the case with South Warwickshire.
HSMR Analysis by South Warwickshire General Hospitals Trust
In response to the abnormally high HSMR values calculated by Dr Foster, the Chief Executive of South Warwickshire presented a report on this measurement to the Board of Directors on 2 December 2009. The report is a public document available for download.
The first point to make is that Hospital Standardised Mortality Ratio (HSMR) is a statistical method created by Dr Foster, a commercial company. There are other statistical methods available, for example the Care Quality Commission use their own measure called Standardised Mortality Ratio (SMR), and CHKS Ltd use Risk Adjusted Mortality Index. Which of these values gives the most accurate results? All of them, and none of them: statistics just gives an indication, not an absolute value.
The first graph in the South Warwickshire Mortality Report shows the "crude mortality", that is, the actual percentage of deaths compared to the actual number of patients discharged. Examining the graph from Winter 2007/8 to the present shows two peaks, in January 2008 and January 2009, and a dip in March 2009. These three are clearly exceptional. If you examine the graph with these values excluded, the month by month values show a clear trend of reducing mortality from February 2008 to the present (red line).
Different hospitals treat different populations and have different specialities. Therefore, it is not possible to compare the crude mortality ratios for different hospitals. The "standardisation" attempts to correct these differences taking into account things like the age of the patient and severity of their illness and the level of deprivation in the local area.
Each year, when mortality values are gathered, Dr Foster "rebases" its figures. Rebasing is needed because the HSMR figure is a comparison with the expected value, and the expected value is calculated from actual mortality figures from all hospitals and normalised to a value of 100. As standards improve actual mortality rates will decrease, but the expected value will remain at 100, and so standardised mortality ratios are adjusted in relation. A hospital trust witnessing an increase in standardised mortality ratio over several years does not necessarily mean that the raw death rate has increased. For example, consider the case that a trust's mortality rate remains the same but the number of deaths nationally has decreased, this means that trust's standardised mortality ratio will increase, but this does not indicate a lowering of safety.
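The effect of rebasing can be shown with a deliberately simplified Python sketch. It assumes that expected deaths are just the national death rate applied to the trust's discharges, ignoring the adjustments for age, severity and deprivation that the real standardisation makes. All figures are invented.

```python
def standardised_ratio(trust_deaths, trust_discharges, national_rate):
    """Observed deaths divided by expected deaths, scaled so the national average is 100."""
    expected_deaths = trust_discharges * national_rate
    return 100 * trust_deaths / expected_deaths

# Invented figures: the trust's crude mortality stays at 2.0% in both years...
trust_deaths, trust_discharges = 200, 10_000

# ...while the national rate used as the baseline falls from 2.0% to 1.8%.
before = standardised_ratio(trust_deaths, trust_discharges, national_rate=0.020)
after = standardised_ratio(trust_deaths, trust_discharges, national_rate=0.018)

print(round(before, 1), round(after, 1))  # 100.0 111.1
```

Nothing has changed at the trust, yet its ratio rises by about eleven points simply because the baseline has moved.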
The South Warwickshire report indicates that rebasing over the last few years has increased HSMR values for the Trust by just a few points, but the rebasing in October 2009 has had a significant effect. This large increase (around 9 points, from 107 to 116) in light of the overall decreasing crude mortality of the Trust casts some doubt on the HSMR value.
Regardless of any doubts in the statistical method of calculating HSMR, the Trust is serious about improving care quality to improve HSMR values. The report identifies three areas which may be significant.
The first was the effect of end of life provision on HSMR. Since the Trust has a significantly higher proportion of patients ending their life in a hospital setting (due to the elderly population of the area) the Trust found that if this end of life provision was standardised in the HSMR calculation then the Trust's HSMR would drop by two points. In addition, the Trust is taking action to improve end of life provision including appointing a palliative care team and greater cooperation with a local hospice.
The next area concerns coding. Clinical coding is the mechanism of converting a diagnosis from a clinician into a code that can be used for statistical processes like HSMR as well as to report to the funding bodies. The Trust points out that external audits show that the Trust is one of the better performing trusts at accurately coding for the purpose of funding, but Dr Foster's analysis indicates that the Trust under-codes. (It must be noted that over-coding - ascribing more diagnosed conditions to each patient for the reason for their death - usually has the effect of reducing HSMR.) Taking this into account the Trust believes that if it coded like other trusts HSMR could be reduced by up to 10 points. The Trust is taking steps to improve clinical coding.
Finally, the Trust recognises that the mortality rate has a link to bed occupancy and to the performance of the A&E department. As a result the Trust has increased the number of beds by opening a new ward in December 2009 (with another ward opening in January 2010) and has made efforts to improve A&E performance.
Patient Safety League Table: A Closer Inspection
The Dr Foster Hospital Report is available from their website. This website also lists the methodology that has been used to create the Patient Safety league table, which shows that there are 16 indicators (the Quality Account for each hospital lists 13 indicators). The Dr Foster page gives some details about how these values are combined to create an aggregate score, but more information can be found in the document Following up mortality 'outliers' on the Care Quality Commission website.
The first step is to calculate the average (mean) of the scores for the indicator across all trusts, together with the standard deviation. The standard deviation is a measure of the spread of the values either side of the mean, and statistical theory says that for a normal distribution, 99.7% of values will be within ± three standard deviations of the mean. So if the mean is 100 and the standard deviation is 10, then 99.7% of values should be between 70 and 130.
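The 99.7% figure can be checked with Python's standard library, using the mean of 100 and standard deviation of 10 from the example above. This only verifies the property of the normal distribution; it says nothing about whether the real indicator values are normally distributed.

```python
from statistics import NormalDist

mean, sd = 100, 10  # the figures used in the example above
low, high = mean - 3 * sd, mean + 3 * sd

dist = NormalDist(mu=mean, sigma=sd)
share_inside = dist.cdf(high) - dist.cdf(low)

print(low, high)                     # 70 130
print(round(100 * share_inside, 1))  # 99.7
```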
In the Dr Foster analysis, the mean is taken as the expected value, so the next step is to take the value of the indicator for a trust, subtract the expected value and divide by the standard deviation. This gives a value that is positive if the trust's value is greater than the expected value and negative if it is less than the expected value. However, some indicators are 'good' when they are less than the expected value and others are 'bad' when they are less than the expected value, so at this point the sign is adjusted so that all indicators are in the form that positive is 'bad'. Dividing by the standard deviation scales the indicator so that it is possible to crudely compare different indicators. As mentioned above, 99.7% of samples will be within ± three standard deviations, and since the indicator has been divided by one standard deviation, 99.7% of the adjusted values should lie between -3 and +3. There is a chance that the indicator will have a value outside of this range (0.3% of values will be outside the range), but to make the figures more manageable Dr Foster applies a cap of ±3: if the value is greater than +3 it is replaced with +3, and if the value is less than -3 it is replaced with -3. Dr Foster calls this final result a z-score.
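The steps just described (subtract the expected value, divide by the standard deviation, flip the sign where needed, cap at ±3) can be sketched in a few lines of Python. The function name, parameter names and standard deviations used here are illustrative assumptions, not values published by Dr Foster.

```python
def capped_z(value, expected, sd, higher_is_bad=True, cap=3.0):
    """Standardise an indicator so that positive means 'bad', then cap at +/- cap."""
    z = (value - expected) / sd
    if not higher_is_bad:
        z = -z  # flip the sign for indicators where a high value is good
    return max(-cap, min(cap, z))

# Illustrative values only; the standard deviations are invented.
print(capped_z(116, expected=100, sd=10))                     # 1.6
print(capped_z(145, expected=100, sd=10))                     # 3.0 (capped from 4.5)
print(capped_z(90, expected=100, sd=5, higher_is_bad=False))  # 2.0
```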
Now there is a problem. 16 different z-scores can only be accurately represented by 16 league tables. Combining 16 z-scores together into a single value for each trust means that there must be weighting so that the more important z-scores contribute more. The weighting for each z-score is difficult to determine, and this weighting process appears to be a commercial secret known only to Dr Foster (CHKS Ltd also have their own weighting process, which is also a commercial secret). The narrative above suggests that HSMR has a significant effect on the final score, so the HSMR z-score must have a greater weighting than other z-scores. Dr Foster provides a vague description of their process by throwing in terms like Bayesian ranking and Monte Carlo procedures, but these are generic terms which say little more than that Dr Foster uses statistical methods to try to produce an ordering.
Take for example the term Bayesian ranking: where else have you seen this term used? If you have an email spam checking tool this will use Bayesian ranking. The tool will take each email and from its contents generate certain spam indicators: for example, the presence of words that might indicate that the email is spam, the source of the email (does it purport to come from someone you already know or from a stranger, or someone you know generates spam), or conditions like whether the actual source of the email can be verified, and so on. Some of these indicators are more important than others and each one gives a probability that the email is spam. Bayesian theory combines these indicators into an aggregated probability that indicates the chance that the email is spam. Typically, such tools want to reduce the number of emails falsely indicated as spam, so that genuine emails are not deleted, and this often allows spam emails through. It may be irritating to you that you have to manually mark false positives as being spam, but it protects you from an over aggressive tool deleting a non-spam email. I hope you will see from this discussion that Bayesian ranking is not an exact science, and does not produce absolute values. Indeed, Bayesian ranking is often regarded as a form of fuzzy logic.
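To show how much the choice of weighting matters, here is a hedged sketch that combines z-scores with a plain weighted average. This is a stand-in chosen for simplicity, not Dr Foster's procedure. The indicator names, z-scores and weights are all invented, since the real weights are not published.

```python
def combined_score(z_scores, weights):
    """Weighted average of z-scores, where positive means 'worse'."""
    total_weight = sum(weights[name] for name in z_scores)
    return sum(z_scores[name] * weights[name] for name in z_scores) / total_weight

# Invented z-scores: poor on HSMR, slightly better than average elsewhere.
z = {"hsmr": 1.6, "infection_control": -0.5, "staff_to_bed": -0.5}

equal = {"hsmr": 1, "infection_control": 1, "staff_to_bed": 1}
hsmr_heavy = {"hsmr": 8, "infection_control": 1, "staff_to_bed": 1}

print(round(combined_score(z, equal), 2))       # 0.2
print(round(combined_score(z, hsmr_heavy), 2))  # 1.18
```

With equal weights the trust looks close to average. With HSMR weighted heavily, the same three z-scores produce a clearly poor result.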
(This is a real story. A few years back a friend sent me an email asking if I was a specialist in a particular area - as it happened I was - because there was an employment opportunity in this area that he thought I might be interested in. As a freelancer such emails are important to me. But I didn't receive this email. I contacted my ISP to report that this innocent email had been wrongly identified as spam. My ISP investigated and found that their spam checker had aggressively flagged up the word "specialist" because it thought that this contained a generic term often present in spam email. As a result, my ISP then made an intelligent change to their spam checking routine to allow such innocent emails through. It helped that my ISP is another friend of mine and could make the changes quickly.)
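A minimal sketch of how a spam filter of this kind might combine its indicators is given below. It uses the textbook naive Bayes combination, which assumes the indicators are independent and that spam and genuine email are equally likely beforehand. The probabilities and the 0.95 threshold are invented, and real filters are considerably more elaborate.

```python
from math import prod

def combine(probabilities):
    """Combine per-indicator spam probabilities into one overall probability."""
    spam = prod(probabilities)
    not_spam = prod(1 - p for p in probabilities)
    return spam / (spam + not_spam)

# Invented indicator probabilities for one email: suspicious word,
# unknown sender, unverifiable source.
indicators = [0.9, 0.7, 0.4]
p_spam = combine(indicators)

# A cautious filter only discards email above a high threshold.
THRESHOLD = 0.95
flagged = p_spam > THRESHOLD

print(round(p_spam, 3), flagged)  # 0.933 False
```

The result is a probability, not a verdict, and where the threshold is set decides whether this email is kept or lost.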
Bayesian ranking is an important statistical tool, but like all statistics it can only produce a result based on probability. An indication of the confidence that you can have with such probabilities can be seen by the comparison of the Dr Foster Patient Safety league table with the CHKS Ltd Top Hospitals list given above. This shows that just eleven of the top forty hospitals in the Dr Foster list are in the CHKS top forty list, whereas nine of the lowest forty hospitals in the Dr Foster list also appear in the CHKS top forty list. This disagreement between the two lists of hospitals does not mean that one list is better than the other, but it does show that you should take league tables of good hospitals and bad hospitals with a fair amount of scepticism.
Even if it is possible to list hospitals in a single league table, this raises the question of what the ranking means. For example, does the ranking mean that the hospital at the top is so good that no one dies, and the bottom hospital is where most people die? Or does it mean that there are a few percentage point differences between the mortality rate of the top and bottom hospital? In other words, unless you are informed of the spread of the actual quality then it is difficult to make an informed opinion.
For example, imagine that there is a long distance running race. If entrance is open to everyone then there will be a wide range of abilities and the times of completing the race will be spread over a wide range. On the other hand, if the entrance is restricted to club runners the spread will be far narrower because runners will be of closer ability. The Dr Foster Patient Safety score appears to cover a wide range (0.00 to 100.00) but this is a consequence of the Bayesian ranking and is merely generated to produce an ordering. There is no indication in this scoring of the range of safety to expect from our hospitals. Such an indication would be useful to patients who have a right to choose the provider of their healthcare, but such information is not available.
All hospitals want to produce high quality, safe care, but there are so many factors that determine whether care is high quality and safe that often it is difficult to identify which factor to target. Reports like Dr Foster could provide valuable information to help hospitals if they flag the factors that are most influential in causing a low safety value. An improvement in the Dr Foster table would be an additional column indicating which indicator contributed the most positively and the most negatively to the score. Another improvement would be a table that gives the relative weighting applied to each z-score to produce the patient safety value. The absence of this additional information means that the Dr Foster Patient Safety league table is less useful than it could be to hospitals.
South Warwickshire is ranked 121st out of 146 trusts in the Dr Foster Patient Safety league table. It has been shown that the score is most likely due to the high HSMR values given to the Trust, and that the HSMR score calculated by Dr Foster has inexplicably risen over the last year despite an actual fall in deaths at the hospital over this period.
7 December 2009