How statistics can be misleading – Mark Liddell

Statistics are persuasive. So much so that people, organizations, and whole countries base some of their most important decisions on organized data. But there’s a problem with that. Any set of statistics might have something lurking inside it, something that can turn the results completely upside down.

For example, imagine you need to choose between two hospitals for an elderly relative’s surgery. Out of each hospital’s last 1000 patients, 900 survived at Hospital A, while only 800 survived at Hospital B. So it looks like Hospital A is the better choice.

But before you make your decision, remember that not all patients arrive at the hospital with the same level of health. And if we divide each hospital’s last 1000 patients into those who arrived in good health and those who arrived in poor health, the picture starts to look very different. Hospital A had only 100 patients who arrived in poor health, of whom 30 survived. But Hospital B had 400, and they were able to save 210. So Hospital B is the better choice for patients who arrive at hospital in poor health, with a survival rate of 52.5%.

And what if your relative’s health is good when she arrives at the hospital? Strangely enough, Hospital B is still the better choice, with a survival rate of over 98%.

So how can Hospital A have a better overall survival rate if Hospital B has better survival rates for patients in each of the two groups? What we’ve stumbled upon is a case of Simpson’s paradox, where the same set of data can appear to show opposite trends depending on how it’s grouped. This often occurs when aggregated data hides a conditional variable, sometimes known as a lurking variable, which is a hidden additional factor that significantly influences results. Here, the hidden factor is the relative proportion of patients who arrive in good or poor health.

Simpson’s paradox isn’t just a hypothetical scenario. It pops up from time to time in the real world, sometimes in important contexts. One study in the UK appeared to show that smokers had a higher survival rate than nonsmokers over a twenty-year time period. That is, until dividing the participants by age group showed that the nonsmokers were significantly older on average, and thus, more likely to die during the trial period, precisely because they were living longer in general. Here, the age groups are the lurking variable, and are vital to correctly interpret the data.

In another example, an analysis of Florida’s death penalty cases seemed to reveal no racial disparity in sentencing between black and white defendants convicted of murder. But dividing the cases by the race of the victim told a different story. In either situation, black defendants were more likely to be sentenced to death. The slightly higher overall sentencing rate for white defendants was due to the fact that cases with white victims were more likely to elicit a death sentence than cases where the victim was black, and most murders occurred between people of the same race.

So how do we avoid falling for the paradox? Unfortunately, there’s no one-size-fits-all answer. Data can be grouped and divided in any number of ways, and overall numbers may sometimes give a more accurate picture than data divided into misleading or arbitrary categories. All we can do is carefully study the actual situations the statistics describe and consider whether lurking variables may be present. Otherwise, we leave ourselves vulnerable to those who would use data to manipulate others and promote their own agendas.
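
The hospital example can be checked with a few lines of arithmetic. The sketch below is only an illustration, not part of the talk. It uses the counts quoted in the transcript and assumes that every patient not counted as arriving in poor health arrived in good health, so the good-health figures are derived by subtraction. The names `hospitals`, `survival_rate`, and `overall_rate` are made up for this example.

```python
# Survival counts from the hospital example: (survived, admitted).
# Assumption: good-health counts are derived by subtracting the
# poor-health figures from each hospital's totals (900/1000 and 800/1000);
# the transcript does not state them directly.
hospitals = {
    "A": {"good": (900 - 30, 1000 - 100), "poor": (30, 100)},
    "B": {"good": (800 - 210, 1000 - 400), "poor": (210, 400)},
}


def survival_rate(survived, admitted):
    """Fraction of admitted patients who survived."""
    return survived / admitted


def overall_rate(groups):
    """Pool all health groups of one hospital, then compute the rate."""
    survived = sum(s for s, _ in groups.values())
    admitted = sum(n for _, n in groups.values())
    return survival_rate(survived, admitted)


for name, groups in hospitals.items():
    by_group = ", ".join(
        f"{health} {survival_rate(*counts):.1%}"
        for health, counts in groups.items()
    )
    print(f"Hospital {name}: overall {overall_rate(groups):.1%} ({by_group})")
```

Run as written, this should show Hospital A ahead overall and Hospital B ahead within both the good-health and the poor-health group. The reversal comes from the mix of patients: Hospital B admitted four times as many poor-health patients, and that group has a much lower survival rate at either hospital.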

100 thoughts on “How statistics can be misleading – Mark Liddell”

  1. Simpson's paradox is given cover story treatment in the current New Scientist, "A New Kind of Logic", 27 Feb 2016, p 36. Please check their version out, and more importantly, the conclusions they draw. 180 out of 300 men (60%) recover with a particular drug while 70 out of 100 men (70%) recover without it. 20 out of 100 women (20%) recover with the drug, 90 out of 300 women (30%) recover without it. So whichever sex you are, it looks like you're better off not taking it. However, the overall numbers for both sexes seem to tell the opposite story. For both sexes taking the drug we have 180+20 over 300+100 (50% recovery), but for going without we get 70+90 over 100+300 (40% recovery). Just like going to the left-hand hospital in the vid, using the drug looks OK after all. But the really intriguing thing is that New Scientist takes the opposite tack. It's these overall figures that are the misleading ones, that "masked the drug's negative effect", which is only revealed by the separate gender numbers. So who on earth's right, if either, New Scientist or TED?

  2. He meant like confounding variables. No studies are flawless, yet people still need statistics to get substantial evidence.

  3. Always something to keep in mind as a scientist, however just from reading the first page of comments I think an emphasis on how statistics are still valuable would have been beneficial. The title states "can be misleading," not "usually misleading."

  4. 79% of stair accidents happen on stairs, and cause head injuries. The rest do damage to the groin, especially on stairs with balls at the end of handrails.

  5. I am very interested in finding the link to the UK study which claimed a higher survival rate for smokers. Can anybody help point it out? Thanks.

  6. I used this paradox in a lesson I gave to high schoolers as part of my final-year exam. So thank you for introducing me to it. It was a great finishing touch that got the kids really excited. And my lesson was about philosophy of science and the role of common sense.

  7. I would like to congratulate the animators of this TED-Ed! The guest artists usually do a good job in the videos, but I presume this topic was really hard, and the animation not only richly illustrated it but also made it fun to watch. Thank you very much. I hope to see more collaborations between this company and TED.

  8. Hospital A, 55% of patients in good health recover. Hospital B, 45% of patients in good health recover. Hospital A, 45% of patients in poor health recover, Hospital B 55% of patients in poor health recover. Exactly equal sample sizes in each of the four categories. Conclusion: go to A if in good health, B if in poor. Disregard the aggregated data that 50% of patients in both A and B recover. Hospital A 55% of patients admitted last Thursday recovered, B it was 45%. Hospital A 45% of patients admitted last Friday recovered, B it was 55%. Conclusion go to A last Thursday, and to B last Friday.

  9. I think it's important to understand that there is still a purpose for statistics. Just be selective with your sources, and consider whether what you are reading is a biased source such as left- or right-wing media. What does the writer want you to feel: angry? shocked? What is their objective? Will they be able to give you a balanced view? If you can't find an unbiased resource, read opposing biased sources to get a better idea of reality. Always read critically.

  10. There's a nice saying I wanna share: statistics are like a bikini. What they reveal is suggestive, but what's hidden is vital 🙂

  11. I very much appreciate the author's point that the issue of lurking variables isn't easily resolved. I would add: it's good to have the same phenomenon, even the same data sets, examined by statistics with multiple perspectives.

  12. I think about some other random thing for one second and then we start talking about a paradox,
    THANKS A LOT ATTENTION SPAN

  13. An easy example of how statistics can be tricky:
    We have 2 men and 2 loaves of bread, and one of them eats both loaves, but statistically each man eats 1 loaf.

  14. Using this video and the graph video, I saw that one TV channel's claim to have the most overall views was not consistent with its views by age group.

  15. I'm advised to ask more questions; the more questions a person has, the more he can achieve. Since I can't come up with questions, I can't ask… then I find out the reason: I'm not exposed to as many contexts as those who have achieved more. To spot misleading statistics, you need to be very knowledgeable in that specific area. Okay, in the end, the only option we the viewers, or let's say, the commoners, have is to respect authorities and be misled. The only thing I hope for from the science community is that it does not take sponsorships from environment-destroying companies, so that it remains unbiased and eventually allows the earth, our home, to heal.

  16. Great video. But it's not statistics to blame here because without a method to organize any of our data we would be in a greater disorder today.
    It is true that statistics have deliberately lied and data has been manipulated. But most of the time (am I saying this statistically?), it's the loopholes in statistical graphs that people (I said people, which includes politicians and companies) exploit to fulfill their agendas. I mean how can a SINGLE, error-filled, filtered, non-contextual graph prove a theory, claim or anything like that? Representation is key, especially when dealing with unobservant audiences.

  17. Persuasive stats are used by big pharmas. It’s not even a conspiracy theory, it’s a fact hidden for moral reasons.

  18. Coming from a channel whose speakers use manipulative statistics all the time. Let's be real, most studies done in social and political science have a desired result before the study even gets started. And most people will buy them. Why? Because most people are too lazy to look at how the data is gathered.

  19. As a statistician, I must agree with you, but there are more useful things in statistics. AI, machine learning, and data mining are some examples. Without them, there would be no search engines like Google.
    Do not let politicians make you skeptical about statistics.

  20. He left out the skill level of the hospital staffers and doctors. Just because they went to medical school doesn’t mean they have the skill or talent to be good doctors.

  21. 3:10 Are "social justice" and "political correctness" lurking in ALL TED-Ed videos? The key words are "slightly higher": how SLIGHTLY, the narrator doesn't say, so as to avoid the obvious follow-up question: What is the MARGIN of ERROR induced by OTHER LURKING factors in data from Florida death row?…

  22. How do Statisticians go hunting??

    Three go out with two rifles. One fires to the left of a deer, another to the right. The third, in the middle, throws up his arms and shouts, "Yay! We got it!"

  23. This video missed a couple other important points.
    Controlled variables, data artifacts, and statistical p-values should also be checked to make sure the data is actually good to use.
    Even if the results show something, if your data was in no way significant, then the results are almost meaningless.

  24. Whoever came here from El Daheeh, subscribe to my channel 😎😎

    Thanks for the info.

  25. To compare two systems, you must apply the same treatment (in this case, the same set of patients) to both systems.
