Sunday, 17 January 2021

Z - scores? Schmed scores!

 A response to the Covid Sceptic position of the Redline Blog

There is a memorable Sufi parable about the spiritual importance of mortality, and calmness in the face of death. From memory, it goes something like this: A ragged, humble Sufi mystic boards a ship alongside many other passengers. Curious what words of wisdom he has for them, the passengers ask him how he follows the Way. The Sufi replies ‘I think constantly of death’. Uninspired by this grim and negative sounding advice, the passengers go about their normal business as they travel on the ship. A few days later, a terrible storm hits the ship. Far from land, the ship starts to sink and the passengers run about in a mad panic, terrified of their imminent demise. They look in wonder at the Sufi, who sits calmly on his own, totally at peace with the knowledge that his life will end in the next few minutes.

 

A somewhat grotesque 21st century caricature of this wisdom tale would replace the Sufi mystic with the Covid 19 sceptic: “Yes,” he might say in a calm and lofty tone, “Covid 19 does exist and kills a few people here and there. But the numbers are tiny and you shouldn’t worry about them too much. The people who do panic and worry are ignorant and have been led astray by Bad Science. Here, look at my Graphs and learn!”

 

The latest iteration I have come across is an article written by Malcolm Kendrick, re-published on the local NZ Marxist blog ‘Redline’. After presenting a series of impressive looking graphs, Kendrick pours scorn on the apparently irrational and hysterical reaction of people who take the Covid threat seriously:

 

Hopefully, in time, we will learn something. Which is that we should not, ever, run about panicking, following the madly waved banners… ever again. However, I suspect that we will. This pandemic is going to be a model for all mass panicking stupidity in the future. Because to do otherwise, would be to admit that we made a pig’s ear of it this time. Far too many powerful reputations at stake to allow that.

 

The Redline blog published its first piece on the Covid topic on April 3rd 2020, just over a week after the start of New Zealand’s lockdown. Hyperbolically titled ‘Corona fevers and the madness of models’, the article conjures up and deploys the same rhetorical framing as Kendrick’s recent piece: lockdown proponents are hysterical and irrational, and sober science does not back up their claims. Daphna Whitmore confidently claimed that lockdowns are ‘destructive’ and unfeasible:

 

[…]Nor is there evidence that a lockdown of an entire country is effective. It has never been done before, let alone encompassing one-third of the world. Why was this extremely destructive action taken? There was abundant evidence that some countries were not being overwhelmed by the virus. Countries such as South Korea, Singapore, Taiwan and Vietnam have not shut down their economies. Their eateries are open and they carry on with a mostly normal life.

 

New Zealand’s aim of eradication is probably unachievable. It would take a closed border and strict quarantine for any arrivals indefinitely. It would also rely on a highly effective vaccine being made which would then have to be made mandatory. None of these are realistic options.

 

Given the outstanding success of the New Zealand response, it appears remarkable that the Redline blog has not offered any sort of retraction or mea culpa for this erroneous prediction. With the publication of Kendrick’s piece, it seems that the focus has shifted away from New Zealand, and the details of the ‘Covid Sceptic’ position have changed in response to our latest knowledge about the virus and its impact.

 

Before I plunge into investigating the statistical claims made by Kendrick, another guiding existential metaphor: just as the ship in the parable above was battered and pelted by winds and rain, leaving the passengers breathless and not sure what to hold on to, we too are battered by Science. Wall to wall media coverage of the Covid pandemic produces anxiety and then fatigue, and even people with statistical training will find themselves wearied and frustrated by the sheer enormity and endlessness of the graphs, predictions, models and commentaries which constitute the Covid Discourse. In what follows I have stepped gingerly into the statistical vortex, doing my best to look for the simplest and most solid handholds with which to pin down and judge some of the claims made by Kendrick. Some of his comments and claims are worth investigating, whereas some are clearly not; I’ll focus mostly on those which deserve scrutiny.

 

On the question of the origin of the virus, Kendrick states:

 

So, what do I know? I know that COVID19 exists – or I am as certain of this as I can be. Was it a natural mutation from a bat, or was it created in a laboratory? Well, I suppose it doesn’t really matter. It’s here, and there is no chance that any Government, anywhere, would ever admit responsibility for creating the damned thing. So, we will never know. If you asked me to bet, I would say it was created in a lab, then escaped by accident.

 

I’ll leave it up to the reader to judge his guess that it probably came out of a lab ‘by accident’. Notice what he is doing here though: by speculating that it was ‘accidental’, he avoids looking like a conspiracy nut. The sceptical take “Well we have all this information but we just don’t know for sure …” is applied to just about every aspect of the Covid phenomenon, and the fact that he is a doctor is referenced several times to give his views more authority and importance. The glib pronouncement about the origins question, ‘Well, I suppose it doesn’t really matter’, is quite incredible, especially for readers of a Marxist blog. Facts like these really do matter for how Covid is interpreted politically!

 

Anyway, moving on: Kendrick goes on to claim that because people who supposedly die of Covid often have many co-morbidities, and because determining the cause of death is not an exact science, it is likely that Covid 19 deaths are over-reported:

 

There are so many cases where – even if the COVID19 test was accurate – COVID19 would have had nothing whatsoever to do with the death. Another thing known, or at least we probably know, is that the vast majority of people who die had many other things wrong with them.

 

One way of testing this claim is to compare reported Covid deaths with excess mortality statistics from 2020. Kendrick himself goes on at great length about the importance and robustness of excess mortality data: given that there is doubt and uncertainty about the integrity of worldwide Covid death data (which could in theory be either over-reported or under-reported), comparing total 2020 deaths with average figures from previous years and looking at the difference gives us a more comprehensive and ‘solid’ view of the impact of Covid 19. Of course there are a host of interpretative issues which complicate the ‘solidity’ of the excess mortality statistic: population size and demographics, whether the population has changed drastically over the past five years (the period usually used to work out the baseline) and the question of whether there are significant numbers of deaths resulting from other causes. But if we look at a comparison of reported Covid deaths with excess mortality figures for a whole lot of different countries, many of these issues are (indirectly) addressed and we can see a clear pattern. Here is a screenshot from a recent article in the Economist, which uses the same EuroMOMO data set referred to by Kendrick:

 



 

In just about all of the countries listed, the excess mortality figure is larger than the reported Covid death figure. In some countries the difference is massive (e.g. Russia and the US). I have highlighted Britain, one of the few countries where the reported Covid death figure is larger (by a very tiny amount!) than the excess mortality figure. These comparisons clearly show that if we are going to worry about the accuracy of Covid 19 death data, the worry should be about under-reporting, not over-reporting.

 

It is also interesting to note here the views of Christopher J Snowdon, writing for Quillette. Snowdon is a libertarian, and has also been a critic of lockdowns. Politically, he is worlds away from anything resembling Marxism, but he appears keen to avoid the disingenuous and illogical pitfalls of Covid scepticism:

 

 

A rise in the number of excess deaths would be compelling evidence that the people dying “with COVID” had died of COVID and would not have died of anything else that year. The ONS has recorded excess mortality every week since mid-October, with the north-west hardest hit at first followed by London and the south-east more recently. In total, there were 71,731 excess deaths in England last year and 76,610 people had COVID-19 mentioned on their death certificate. Coincidence? Why yes, say the sceptics. They claim that the excess deaths were not caused by COVID-19, but by the lockdowns themselves. In any case, they say, the rate of excess mortality is lower than it was in the spring and the current rate is not without historical precedent. Any suggestion that there would have been even more deaths without lockdowns is dismissed as impossible because “lockdowns don’t work.”
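
Snowdon’s two figures are close enough that the comparison can be checked in a couple of lines. Here is a minimal Python sketch that uses only the two England numbers quoted above; the function name is my own and purely illustrative.

```python
def compare_reported_with_excess(reported_covid_deaths, excess_deaths):
    """Return the gap and the ratio between reported Covid deaths and excess deaths."""
    gap = reported_covid_deaths - excess_deaths
    ratio = reported_covid_deaths / excess_deaths
    return gap, ratio

# Figures for England in 2020, as quoted from Snowdon's article.
covid_on_certificate = 76_610
excess_2020 = 71_731

gap, ratio = compare_reported_with_excess(covid_on_certificate, excess_2020)
print(f"Gap: {gap:,} deaths; ratio of reported to excess: {ratio:.3f}")
```

A ratio close to 1 means the deaths attributed to Covid and the deaths above the normal level are roughly the same size, which is hard to square with the idea that most ‘Covid deaths’ would have happened anyway.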

 

Kendrick attempts to impress his readers by referencing the EuroMOMO data on excess mortality, and taking us through a whirlwind tour of various graphs copied from this site. He quickly and confidently pronounces ‘Look, nothing to see here! Hardly any statistical significance!’ Things get tricky and sophisticated when he uses z scores, and denies any sort of correlation between lockdown measures in specific countries and their success in reducing excess mortality. If you actually go to the trouble of looking carefully at sites such as EuroMOMO or Our World in Data, it is not that hard to spot where Kendrick goes wrong. His analysis is fast and superficial: he cherry-picks misleading graphs and he fails to interpret statistics correctly. I will start with the graph of raw Europe-wide excess mortality data Kendrick leads with:

 



 

Kendrick acknowledges the big Covid spike in early 2020, but then compares the ‘winter spike’ of 2020 with similar winter spikes in 2018 and 2019. Not that different at all, right? All that fuss and bother for what, just a few thousand lives out of populations of hundreds of millions? Extremely hysterical and irrational, surely? Well, here are a couple of other graphs to consider, taken from exactly the same dataset:

 

A.   Cumulative view:



B.   Weekly view:



The difference between 2020 and the other years is huge and noticeable. The margins are in the hundreds of thousands, and it is really clear that the second wave cannot be explained by a regular pattern of increased deaths over the European winter. The Economist article referenced above uses the same dataset to provide yet another graph, emphasising the gravity of the European situation:

 

“The chart below uses data from EuroMOMO, a network of epidemiologists who collect weekly reports on deaths from all causes in 24 European countries, covering 290m people. These figures show that, compared with a historical baseline of 2009-19, Europe has suffered some deadly flu seasons since 2016—but that the death toll this year from covid-19 is far greater. Overall, the number of excess deaths across the continent since March is about 170,000. Though most of those victims have been older than 65, the number of deaths among Europeans aged 45-64 was 40% higher than usual in early April.”



 

Next, Kendrick goes into graphs showing excess mortality for individual European countries. He swiftly explains his use of ‘z scores’ instead of raw numbers, and then shows the graph for England:

 

It is a thing called the Z-score. Which means standard deviation from the mean. Sorry, maths. If the Z-score goes above five, this means something significant is happening. The red, upper, dotted line is Z > 5. As you can see, despite the howls of anguish from England about COVID19 overwhelming the country, we are really not seeing much at all.



 

For anyone out there interested in the math, the most useful explainer I could find was on the ‘Our World in Data’ site. There is a very clear and readable article which goes through the methodologies used by EuroMOMO and other dataset providers, explaining how measures such as the z score are calculated and the strengths and weaknesses of different statistics. For the (more likely) people out there who won’t read such boring articles, or who struggle to get their heads around abstract representations of variation in time series data sets, it’s probably useful to slow down and back up to the actual numbers themselves. The famous line popularised by Mark Twain, ‘Lies, damned lies, and statistics’, gets its justly deserved status from the immense pliability and manipulability of statistics. Motivated reasoners, if they are clever enough, can tell just about any story they want if they use the “right” statistic. Returning to the raw data to check conclusions reached by using complex mathematical formulae is a useful heuristic. If your fancy-pants statistic tells you a different story from the raw data, then it’s quite likely that you are using the fancy-pants statistic in an inappropriate manner. Here is the raw excess death data for England, taken this time from the ‘Our World in Data’ site:

 

 


 


 

Again, the graph shows a clear increase from the average figures starting around November 2020. If you examine the graph closely with the data embedded, the difference between the 2020 deaths and the average deaths from the 2015-2019 baseline is around 1,500 to 2,000 for every week from mid-November to January 3rd. If we knew for sure that, had Covid 19 not happened, 2020 would have been very similar to the ‘Average 2015-2019’ line, then we could conclude with confidence that some 10,000 to 12,000 people died in an eight-week period in England who would have been alive had Covid not happened. We don’t know this for sure, partly because of the variability of the baseline data and partly because we can never be confident about counterfactuals. But that sure is a big gap between the red line and the other lines. And if we notice similar (or bigger!) gaps in most other countries affected by Covid (as we do), then the case for interpreting the increase as caused by Covid, not just an effect of variability, becomes stronger.
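
As a rough check on that order of magnitude, the weekly gap can simply be multiplied out. The Python sketch below uses only the approximate figures read off the graph (1,500 to 2,000 extra deaths per week) and assumes that the mid-November to January 3rd window covers seven to eight weeks; the function name is illustrative.

```python
def cumulative_excess(weekly_gap, weeks):
    """Total excess deaths if the same weekly gap persists for `weeks` weeks."""
    return weekly_gap * weeks

# Low end: smaller weekly gap over the shorter window (assumed 7 weeks).
low = cumulative_excess(1_500, 7)
# High end: larger weekly gap over the longer window (assumed 8 weeks).
high = cumulative_excess(2_000, 8)

print(f"Rough range: {low:,} to {high:,} excess deaths")
```

On these assumptions the total lands somewhere between about 10,500 and 16,000, so a figure of 10,000 to 12,000 is, if anything, on the conservative side.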

 

Anyway, back to the z scores: these are designed specifically to deal with variability, but there are difficulties. Another measure, favoured over z scores by the Our World in Data site, is the P score. The P score is just the excess deaths for each week (observed deaths minus the average baseline) expressed as a percentage of that baseline. I’m going to finish up by quoting from the discussion article referenced above and showing the P score graph for England as a comparison:

 

EuroMOMO’s measures of weekly excess mortality in Europe show the mortality patterns between different time-periods, across countries, and by age-groups. The Z-scores standardise data on excess deaths by scaling by the standard deviation of deaths. EuroMOMO are currently not permitted to publish actual excess death figures by country and do not publish the standard deviations used in their calculations. However, they graph the Z-scores and the estimated confidence intervals back to 2015 providing a visual guide to their variability. In contrast to the P-scores, the Z-scores are a measure that is less easily interpretable. Moreover, if the natural variability of the weekly data is lower in one country compared to another, then the Z-scores could lead to exaggeration of excess mortality compared to the P-scores. Strictly, the Z-scores are not comparable across countries, though see the caveats in section 4.1. [….]

Another major defect of Z-scores, compared to P-scores and per capita excess death measures, is that their cumulation over multiple pandemic weeks is problematic. While excess deaths can be cumulated, the standard deviation of normal deaths cannot, and, in any case, EuroMOMO do not report either excess deaths or these standard deviations. This makes it hard to obtain a comprehensive summary of the pandemic’s impact from the Z-scores.

 

 


Clearly, there are some big differences between the ‘pictures’ we get through the lens of z scores compared to P scores. Indeed for all of the European countries Kendrick breezes through in his lightning-fast survey, the ‘Our World in Data’ graphs using P scores tell a very different story.
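
To see why the two measures can diverge, here is a minimal Python sketch. It uses the textbook z score (excess deaths divided by the standard deviation of the baseline weeks), which is a simplification and not EuroMOMO’s exact method, and the P score as the percentage excess over the baseline. The two ‘countries’ and all of their numbers are made up for illustration.

```python
from statistics import mean, stdev

def z_score(observed, baseline_weeks):
    """Excess deaths measured in standard deviations of the baseline (textbook form)."""
    return (observed - mean(baseline_weeks)) / stdev(baseline_weeks)

def p_score(observed, baseline_weeks):
    """Excess deaths as a percentage of the baseline average."""
    baseline = mean(baseline_weeks)
    return 100.0 * (observed - baseline) / baseline

# Hypothetical deaths for the same calendar week in five earlier years.
# Both baselines average 10,000; only their variability differs.
steady_country = [9_900, 10_000, 10_100, 10_000, 10_000]
volatile_country = [9_000, 11_000, 10_000, 9_500, 10_500]

# Hypothetical pandemic week: 11,500 deaths in both countries.
observed = 11_500

for name, baseline in [("steady", steady_country), ("volatile", volatile_country)]:
    print(f"{name}: P score = {p_score(observed, baseline):.1f}%, "
          f"z score = {z_score(observed, baseline):.1f}")
```

Both hypothetical countries have the same 15% jump in deaths, yet only the one with the steadier baseline clears the Z > 5 threshold that Kendrick treats as the mark of something significant.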

 

A very recent article in the online Guardian reports that “University lecturers will not resume ‘unsafe’ face-to-face teaching this academic year, and any attempt by the government or vice-chancellors to reopen campuses in February will fail, the UK’s largest academic union has warned.” Maybe these ignorant lecturers need a good dose of Sufi mysticism to calm their illogical fears.