Relatively speaking, it isn’t absolutely true

Sara Gorman, PhD :
Tell me you can reduce my risk of something awful happening to me by 100% compared to what I am currently doing. Show me a study that says a new approach or treatment improves the chances of escaping disease by 50% compared to a standard intervention and I am impressed. Fifty and 100 are big numbers and I cannot be faulted for wanting whatever is being offered.
But in both of the assertions offered above there are two important words that need to give us pause: “compared to.” It is very common for both academic health center press releases and media headlines to declare “New treatment reduces risk of X by 50%.” We are sucked in, and only by reading the study on which the declaration is based do we learn how much improvement we can actually expect.
Let’s look at two recent such reports, one in which an intervention is said to reduce the risk of suicide by 50% and another in which a genetic test supposedly can increase the likelihood a person with depression will respond to a specific antidepressant medication by 30%. Both happen to involve mental health issues, but the problem under discussion here is widespread throughout all of health news reporting.
Preventing suicide is very much on everyone’s minds lately, so it is no surprise that a study of a suicide prevention intervention showing positive results captured a great deal of media attention. In the study, led by a distinguished suicide researcher at Columbia University, people who visited a Veterans Affairs hospital emergency department (ED) with “suicidal concerns” received either an intervention called Safety Planning Intervention (SPI) plus a telephone follow-up or treatment as usual. SPI is a brief intervention that focuses on strategies for dealing with suicidal thoughts. Of the 1186 patients who received SPI in the ED plus telephone follow-up, 36 went on to exhibit “suicidal behavior” over the subsequent six months of follow-up, compared to 24 of 454 patients in the comparison group. The results of the study are stated clearly in the paper as follows: “The SPI+ [SPI plus telephone follow-up] was associated with 45% fewer suicidal behaviors in the 6-month period following the ED visit compared with usual care.” SPI+ patients also had twice the rate of engaging in mental health treatment after the ED visit.
We do not in any way mean to criticize the media for emphasizing the importance of suicide prevention. Headlines stating that the Safety Planning Intervention reduced suicidal behavior by nearly half properly received attention. But let’s look a little more closely at the actual data.
Of the patients in the SPI+ intervention, 3.03% went on to exhibit some form of suicidal behavior. That rate was 5.29% among the comparison group patients. So the relative difference in the raw rates is a bit over 40% (the paper reports 45%, which headlines rendered as “nearly half”). But the absolute difference is only about 2.25 percentage points, meaning that only slightly more than 2% of patients who report to an ED in a suicidal crisis will actually benefit from SPI.
Another way to look at this is to calculate something called the “number needed to treat” (NNT), which tells us how many patients have to be treated with the intervention under study for one additional person to benefit. It is the reciprocal of the absolute risk difference, and for the SPI study it is 44.43. That means that almost 45 patients with suicidal thoughts or behaviors who come to the ED would need to receive SPI+ for one of them to avoid further suicidal behavior that would have occurred under usual care. Usually, an NNT greater than 10 is considered of limited or no clinical significance.
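For readers who want to check the arithmetic, here is a minimal Python sketch that works directly from the raw counts quoted above (36 of 1186 with SPI+, 24 of 454 with usual care). The function name and the dictionary keys are our own labels, not anything taken from the paper, and the calculation is unadjusted, so it will not necessarily reproduce a figure that came out of the paper's own statistical analysis.

```python
def risk_summary(events_treated, n_treated, events_control, n_control):
    """Compare two groups using unadjusted event counts.

    Returns each group's risk, the absolute risk reduction (ARR),
    the relative risk reduction (RRR), and the number needed to treat (NNT).
    """
    risk_treated = events_treated / n_treated
    risk_control = events_control / n_control
    arr = risk_control - risk_treated   # absolute difference between the two risks
    rrr = arr / risk_control            # the same difference, as a share of the control risk
    nnt = 1 / arr                       # patients treated per one additional patient helped
    return {
        "risk_treated": risk_treated,
        "risk_control": risk_control,
        "absolute_risk_reduction": arr,
        "relative_risk_reduction": rrr,
        "number_needed_to_treat": nnt,
    }


# Counts as quoted in the text: SPI+ group versus usual care.
spi = risk_summary(36, 1186, 24, 454)
print(f"Absolute risk reduction: {spi['absolute_risk_reduction'] * 100:.2f} percentage points")
print(f"Relative risk reduction: {spi['relative_risk_reduction']:.0%}")
print(f"Number needed to treat:  {spi['number_needed_to_treat']:.2f}")
```

Run on these counts, the sketch gives an absolute reduction of about 2.25 percentage points, a relative reduction of roughly 43% (a little under the 45% the paper reports, which may reflect the paper's own analysis), and an NNT of 44.43. The same small absolute difference produces both the impressive-sounding relative figure and the unimpressive NNT.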
There are several other details about the study that warrant consideration. This was not a randomized clinical trial; patients in one set of EDs received the SPI and in another set received usual care. If there are any systematic differences between the patients who visit those EDs, they might influence the results. Only a randomized trial could sort that issue out. Also, the paper does not say whether any of the patients in the study actually went on to die by suicide. Finally, exactly what is meant by “suicidal behavior” is not clearly explained, so it is possible that a range of severity was involved. We do not know, therefore, if SPI actually prevents seriously ill people from killing themselves.
We’ve focused on these study limitations, but it is important to emphasize that the study itself is of great interest and importance to the suicide research field. As we pointed out in a post last month, it is extremely difficult to predict who will attempt suicide, making it very hard to prevent individuals from trying to kill themselves. We have very few tools to reduce the risk of suicide, and anything that might be better than usual care is promising and needs to be developed further. And we are not criticizing the study’s investigators either; they were quite transparent in the paper about the study’s results and limitations.
Our concern with this study is the way it was reported. As noted in Health News Review, news outlets like NPR emphasized the roughly 50% relative risk reduction, making the intervention seem much more effective than it actually is.
We can see the same problem with another widely reported-on study, this one involving psychopharmacology. In this case, another distinguished researcher, John Greden of the University of Michigan, presented an abstract of a study at the annual meeting of the American Psychiatric Association that tested the potential benefits of using a genetic test to predict to which antidepressant medication a person with depression might best respond. Doctors treating a set of 560 patients were randomized to use results from the genetic test to guide antidepressant medication choice, while clinicians treating a set of 607 patients were randomized to treatment-as-usual and did not receive genetic test results.
Using a standard rating scale that quantifies depression severity, the study results showed that compared to treatment-as-usual, use of the genetic test to guide antidepressant medication choice resulted in a 30% improvement in medication response and a 50% improvement in complete remission from depression after eight weeks of treatment. The differences between groups were still apparent at 24 weeks.
In a press release, the company that manufactures the genetic test called the study a “landmark,” and explained that the test “can help a clinician understand the way a patient’s unique genomic makeup may affect certain psychiatric drugs.”
This is potentially important information because depending on which study you look at, as many as 70% of people with depression do not respond to the first antidepressant medication they try. These patients often are then switched to other drugs and it can take many months before an effective treatment is finally found. There is great interest throughout medicine right now in determining if an individual’s unique genetic make-up might influence what specific drugs will work for that person. Hence, using a genetic test that increases the chance of becoming depression-free by 50% could save patients months of anguish.
But the actual study results are not as glowing as the company’s press release, the study author’s statements, or the media coverage suggest. Let’s look at the 8-week results from which the 50% improvement in remission statistic is derived. In the group randomized to use the genetic test, 15.3% of patients achieved full remission. In the group randomized to treatment-as-usual, 10.1% achieved full remission. Yes, that’s a 50% relative difference, but it is also only a 5.2 percentage point absolute difference. For response, which means getting better but not completely depression-free, the relative difference was 30% and the absolute difference was 6.1 percentage points.
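The same check can be run on the remission and response figures. The Python sketch below starts from the numbers as quoted: the 10.1% remission rate under treatment-as-usual, the 5.2-point absolute difference in remission, and the 6.1-point absolute difference in response. The function name is our own, and the number needed to treat is our back-of-the-envelope calculation, not a figure reported in the abstract.

```python
def benefit_summary(control_rate_pct, absolute_gain_points):
    """Turn a control-group rate and an absolute gain (both in percent)
    into a relative gain and a number needed to treat (NNT)."""
    relative_gain = absolute_gain_points / control_rate_pct   # gain as a share of the control rate
    number_needed_to_treat = 100 / absolute_gain_points        # reciprocal of the absolute gain
    return relative_gain, number_needed_to_treat


# Remission at eight weeks: 10.1% with treatment-as-usual, 5.2 points higher with the test.
remission_relative, remission_nnt = benefit_summary(10.1, 5.2)
print(f"Remission: relative gain {remission_relative:.0%}, NNT {remission_nnt:.1f}")

# Response at eight weeks: only the absolute difference is needed for the NNT.
response_nnt = 100 / 6.1
print(f"Response: NNT {response_nnt:.1f}")
```

On these inputs, roughly 19 patients would need genetically guided prescribing for one additional remission, and roughly 16 for one additional response. By the rule of thumb mentioned earlier (an NNT above 10 suggests limited clinical significance), both fall on the unimpressive side despite the 50% and 30% relative figures.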
Both the remission and response differences between groups are statistically significant, but are they clinically meaningful? It turns out that after eight weeks of taking an antidepressant medication, very few patients actually responded or achieved full remission in either group and those whose doctors used the genetic test had only a very small benefit. And on a third measure, the improvement in scores on the depression rating scale between baseline and eight weeks of treatment, the genetic test and treatment-as-usual groups showed no statistically significant difference.
Once again, we are indebted to Health News Review, which in a story about the genetic test notes that “Although harms from a cheek swab [used for the genetic test] are unlikely, there is a potential harm in framing the genetic testing results as guiding the choice of one antidepressant over another. Patients may focus solely on pharmaceutical options, to the exclusion of non-pharmaceutical ones.” Several evidence-based psychotherapies, for example, have been shown to be as effective for treating depression as antidepressant medication. Telling people that a “landmark study” now can inform their doctors which antidepressant will work specifically for them makes it sound as if antidepressant medication therapy is routinely more successful than it really is.
Once again, we are not upset that this study was done, or the results presented at a scientific meeting. Rather, we are concerned about the way industry, scientists, and the media present the results of scientific studies. It takes a lot of reading through the “fine print” to conclude that while the genetic test may be of immense scientific interest, it isn’t going to help very many patients with depression at this point and may not be worth the cost.
Neither the brief emergency department intervention nor the genetic test is likely to have much impact on the health and well-being of most people suffering with suicidal thoughts and depression in the near term. But if people with these conditions and their families are taken in by exaggerated reports of what was actually found in studies about them, there will inevitably be disappointment and anger. Reporting only on relative risks and benefits exaggerates the benefits of many outcomes and can ultimately lead to people becoming mistrustful of what scientists claim they are learning.
Headlines that state “Brief Intervention May Have a Small Impact on Suicidal Behaviors” or “A Genetic Test May Point to Some Small Clues about Which Antidepressant to Try” do not sound very dramatic, but they are more accurate than what is out there. Relative risks and benefits, then, do not tell us the magnitude of a finding or its clinical significance. We always need to know what the absolute risks and benefits of any new intervention or test are. And that means that “relatively speaking” does not give us absolute truth.
(Sara Gorman, PhD, MPH, is a public health specialist, and Jack M. Gorman, MD, is a psychiatrist).