Let’s have another go, shall we?
Last December we wrote about a paper published in Occupational Medicine, in which the following information was presented in a table:

The study concerned a group of patients who were scrutinised at two time-points, firstly at “baseline”, and secondly at “follow-up”. That is basically all you need to know here.
The table quite clearly shows that 167 patients were “working at baseline and follow-up.” The table also clearly shows that a further 18 patients “dropped out of work at follow-up,” meaning that they were working at baseline but had ceased employment before the follow-up point had been reached. Adding these two numbers together gives us a total of 185 patients who were working at the start of the study, all of which is pretty simple arithmetic.
But as simple as it is, whenever they tried to explain what they (thought they had) found, the authors of the paper somehow managed to repeatedly garble their own statistics.
For example, the authors started by saying this:
“Patients were followed up…and over this period 53% of patients who were working remained in employment.”
Except, actually, 185 patients “were working,” of whom 167 “remained in employment.” That proportion is a lot more than 53%. The correct percentage is 90.27%. The authors’ statement was just wrong.
They made this same basic error several more times throughout the paper.
For example, have another look at the table:

You will see that 18 patients had “dropped out of work” at follow-up. Bearing in mind that 185 patients were working at baseline, this means that the percentage of patients who “dropped out of work” before follow-up was 9.73% (i.e., 18 as a percentage of 185).
But this is what the authors said:
“However, of those working at baseline, 6% were unable to continue to work at follow-up.”
Again, this statement is plainly wrong.
Here is an ever-so-slightly trickier example. Take one more look at that table:

You can see that 104 patients were described as “not working at baseline and follow-up,” which means that they were not working at either time-point. You can also see that 27 were described as having “returned to work at follow-up,” which means that they were not working at the start of the study (but were able to return to work before the end of the study). Putting these two figures together gives us a total of 131 patients who were not working at baseline.
Here is what the authors said:
“Of the patients who were not working at baseline, 9% had returned to work at follow-up.”
Again, this is just arithmetically incorrect. A total of 131 patients were “not working at baseline” whereas 27 patients had “returned to work at follow-up.” This means that the correct proportion who returned to work at follow-up was 20.61% and not, as the authors claimed, 9%.
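For anyone who wants to check the arithmetic, here is a minimal sketch in Python that reproduces the corrected percentages from the four frequencies quoted above. The variable names are mine, not the paper’s.

    # Frequencies quoted from the paper’s table, as discussed above
    working_both = 167       # working at baseline and at follow-up
    dropped_out = 18         # working at baseline, dropped out of work by follow-up
    returned = 27            # not working at baseline, returned to work by follow-up
    not_working_both = 104   # not working at baseline or at follow-up

    working_at_baseline = working_both + dropped_out        # 185
    not_working_at_baseline = not_working_both + returned   # 131

    print(f"Remained in work: {100 * working_both / working_at_baseline:.2f}%")    # 90.27%
    print(f"Dropped out:      {100 * dropped_out / working_at_baseline:.2f}%")     # 9.73%
    print(f"Returned to work: {100 * returned / not_working_at_baseline:.2f}%")    # 20.61%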
It is clear that the authors of this paper made a recurring error when attempting to describe their own statistical findings in plain language. That’s okay. We all make mistakes.
* * *
In December, David Tuller and I wrote to the editor of the journal concerned and described these errors to him. Given the clear and obvious evidence that the journal had published a paper that was replete with statistically erroneous assertions, we called on him to retract it.
However, instead of engaging with us on the substance of the problem, the editor insisted on allowing the authors to compose a rebuttal to our explanation. He then undertook to publish both our statement and whatever the authors would come up with by way of reply.
We thought it strange that the editor would envisage a scenario where statistical errors would remain on the record courtesy of his journal. Nonetheless, we acceded to the request in anticipation that — surely — the authors would realise their mistakes and retract the paper themselves.
Our paper and the authors’ reply were published online today.
* * *
Cutting a long story short — the authors did not seek to have their own paper retracted. Oh well. I guess that is not entirely surprising.
Here is what they said about their claim that 53% of patients “remained in employment” when in fact a whopping 90.27% had done so:
“…we agree that the wording used in the abstract was less precise than it could have been.”
Less precise than it could have been? I suppose this is kind of true, in that it was not precise in any way whatsoever. It was utterly imprecise. It was completely wrong.
The authors argue that it is they who are right — and thus we who are wrong — because they “made it clear” in a sentence in their results section that they “only included the 316 patients in our analyses for whom we have baseline and follow-up data.” As such, they claim it should be clear to readers that all the percentages they mentioned throughout the paper were based on this denominator.
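As it happens, the arithmetic is at least consistent with that account. If you divide the same four frequencies by the full analysed sample of 316, rather than by the relevant subgroup, you recover the figures the authors published. A quick sketch, again using only the numbers quoted above:

    total_analysed = 167 + 18 + 27 + 104   # 316 patients with baseline and follow-up data

    print(f"{100 * 167 / total_analysed:.0f}%")   # 53%: their “remained in employment” figure
    print(f"{100 * 18 / total_analysed:.0f}%")    # 6%: their “unable to continue to work” figure
    print(f"{100 * 27 / total_analysed:.0f}%")    # 9%: their “returned to work” figure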
However, this would of course have been very far from clear. Their defence is difficult if not impossible to sustain when you consider the phrases they actually used.
In each of the statistical statements we cited, the authors identified exactly what they were computing percentages of. They talked about percentages “of those working at baseline” or “of those who were not working at baseline.” They said things like “53% of patients who were working…” and “of those working at baseline, 6% were unable to continue.” It is very difficult to imagine how readers were supposed to understand that these statements were alluding to percentages of all patients in the dataset, rather than to percentages of those subsets DESCRIBED BY THE WORDS THAT THEY USED.
The authors say that they will correct the “53%” statement for the record. We understand this will be by way of a corrigendum. Presumably they will correct all the other mistakes too. But this will require a complete re-writing of their results and conclusions sections. The errors they made were not merely statistical; they were material to their paper and to its alleged findings.
The authors state that they “stand by the results described in our paper.” This is despite the fact that they admit that those descriptions were stated imprecisely. But what does their defiance mean? Do they “stand by” the claim that 53% of patients working at baseline were still working at follow-up? When the real figure was 90.27%? Do they “stand by” the result that claimed that just 9% of patients not working at baseline returned to work? When the real proportion was 20.61%?
It is quite clear that the authors just made a mistake and repeated it without realising. They should admit it, own up, retract this one, and start a fresh new paper. Heck, I’ll even help them to write it if they want me to.
The statistical inaccuracies render the paper useless, and it is silly that it remains on the record. Perhaps the refusal to retract is our fault for not having explained the problems clearly enough. The original paper was pretty long and the statistics at issue were presented in an unorthodox way: frequency data that would normally be cross-tabulated were instead presented as a series of numbers (Ns) listed horizontally across four table-columns. But nonetheless, we thought the errors were pretty obvious once pointed out.
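To illustrate the point, here is my own rearrangement of those four Ns (not the layout used in the paper) into the two-by-two cross-tabulation they imply. The relevant denominators then appear as the row totals:

    # A 2x2 rearrangement of the four Ns
    # (rows: status at baseline; columns: status at follow-up)
    table = {
        "Working at baseline":     (167, 18),    # (working at follow-up, not working at follow-up)
        "Not working at baseline": (27, 104),
    }

    print(f"{'':26s}{'Working':>10s}{'Not working':>14s}{'Row total':>12s}")
    for baseline_status, (working, not_working) in table.items():
        print(f"{baseline_status:26s}{working:>10d}{not_working:>14d}{working + not_working:>12d}")

Laid out like this, the denominators of 185 and 131 are hard to miss.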
I guess the authors — and the editor — think differently.
Or perhaps they just don’t want to get it.
Cognitive dissonance, sunk costs, and the Dunning-Kruger effect are powerful forces indeed…

Brian Hughes is an academic psychologist and university professor in Galway, Ireland, specialising in stress, health, and the application of psychology to social issues. He writes widely on the psychology of empiricism and of empirically disputable claims, especially as they pertain to science, health, medicine, and politics.