I've spent the past year writing about the many errors, lies, and distortions of the fact checking industry, and noticed a number of occasions where different fact checkers would examine the same claim and come to different conclusions. As it turns out, there is a study documenting the prevalence of this - and when fact checkers do examine the same claim, they agree with one another at a rate low enough to be deemed incompetent.
Back in 2017, Chloe Lim, then a PhD student at Stanford University, obtained 1,178 fact checks from Politifact and 325 from the Washington Post's fact checker. Of the statements evaluated, only 77 overlapped between Politifact and the Washington Post, indicating that the two fact checkers use different selection criteria when deciding what to examine.
Of the 77 overlap statements, the fact checkers completely agreed on 49 of them, or roughly 64% of the time. Among the 28 cases where they disagreed, most of the disagreements were minor - the equivalent of one scale point (the difference between "true" and "mostly true," for example). But on six of them, or roughly 8% of the overlapping statements, the fact checkers either mostly or entirely disagreed with one another. That the fact checkers reach different conclusions nearly one out of ten times when examining the same statement doesn't bode well for the reliability of the rest of their work. Even a 1% margin of error wouldn't be acceptable for organizations that have now become social media's de facto arbiters of truth.
In one such example, the Washington Post's fact checker gave Hillary Clinton's claim during a Democrat primary debate that she had never run a negative ad about Bernie Sanders "1 Pinocchio" (the equivalent of "Mostly True" on PolitiFact's scale), while Politifact rated Clinton's statement false.
Lim notes that the statements the fact checkers tend to do best on are outright falsehoods or obvious truths, but the agreement rate is much lower "in the more ambiguous scoring range." Lim found that "in many cases, discrepancy in ratings often stemmed from differences in fact-checkers' subjective judgment of the significance of inaccuracies in a statement, rather than their disagreement on whether the statement itself was true or not."
As Lim notes, claims rated "half true" or "mostly false" are often subtle claims that politicians use to deliberately deceive - presenting a true fact stripped of context, making straw man arguments, or cherry picking. This is where the public would be best served by fact checkers, but it's also where they perform the worst, because it requires their subjective judgment (and they're terrible at their jobs).
Lim points to two examples of this:
Both fact-checkers evaluated Jeb Bush's claim that "Florida led the nation in job creation." The two fact-checkers provided identical sets of rationales for why Bush's claim may be misleading: (a) Bush relied on raw job totals; (b) the year 1999 was omitted; (c) much of Florida's job gains were due to an increase in low-paying jobs. Fact Checker decided that Bush deserved "4 Pinocchios" while Politifact concluded that these exact same fallacies were not nearly as egregious, rating the claim "Half True."
Another source of discrepancies in ratings is differences in the number of counter-examples or evidence each fact-checker uses in support of, or against, the statement in question. For instance, Fact Checker gave "3 Pinocchios" (roughly equivalent to "Mostly False") to Rick Perry's claim that "In the last seven years of [his] tenure, Texas created 1.5 million new jobs," while Politifact rated the claim "Mostly True." Upon carefully analyzing the fact-checkers' explanation, it seems that Fact Checker gave a higher dishonesty rating because Fact Checker found an additional fault in Perry's statement. In addition to offering the same set of evidence presented by Politifact (i.e., cherry-picked data sources), Fact Checker also pointed out that he had aggregated unemployment numbers in an incorrect manner.
The fact checkers had the least agreement on overlap statements pertaining to social policy, Hillary Clinton, and healthcare.
In addition to overlapping statements, Lim also evaluated statements she categorized as "murky." For example, if one fact checker evaluated the claim that "We have the highest murder rate in this country in 45 years" while another examined the statement that "We have an increase in murder within our cities, the biggest in 45 years," the pair would be put in the "murky" category, as the claims are similar but distinct (the first concerns the murder rate, while the latter concerns the size of an increase in murders).
Among murky statements, the fact checkers tended to agree on immigration, security, and campaign issues, while on social policy, healthcare, Hillary Clinton, and the economy they agreed at rates "much lower" than what is acceptable for scientific coding - low enough to render them unreliable.
As of writing, no fact checkers have attempted to refute this study (and would probably disagree with each other if they did).
Matt Palumbo is the author of The Man Behind the Curtain: Inside the Secret Network of George Soros
Don't miss The Dan Bongino Show