Three articles I read recently contain such valuable lessons for critical thinking that, for my end-of-month analysis, I want to summarize them and extract the general lessons they teach, so you can apply those lessons to every question in your life (whether political, philosophical, professional, or personal).
Election Polling
I’ll go straight to the spoiler and explain the one about election polling. Here the article to read is Ed Kilgore’s analysis in “Polling Errors Won’t Necessarily Help Trump This Time” for the Intelligencer.
Everyone is freaking out over how close the election polls are, and even over fractions of a percent or a single percentage-point lead for one candidate or another. But the fact is: no poll has shown any difference between the candidates that exceeds its margin of error. Which means the polls are actually giving us no data about who will win (the Polymarket thing, meanwhile, is being rigged by shadow money and is thus useless as data). And this is because polls are fuzzy instruments—they have wide margins of error because they do not have reliable access to the data, and have to use complicated tricks to build guesses at what the data really is. It’s rather like when we thought there were canals on Mars because observers were pushing their telescopes past their resolving power.
Stated margins are calculated expectations of this error; but a poll’s real margin of error can exceed what is calculated, because of factors they aren’t even including in the math (or aren’t including correctly). We have seen this increasingly in election polls over the last few cycles: polls proved to be way off in 2016 and 2020, when we got a complete look at the actual data (when we got to count the actual votes the pollsters were trying to get at). They were wrong by more than their combined margin of error could account for—but within or close to the margin for an average individual poll. So you need to take the margins of error seriously. If it’s close and they say it’s +/-3%, that means unless they show a candidate over six points ahead of another (a +/-3% error on each candidate’s share means roughly a +/-6% error on the gap between them), you should assume the poll got “no result,” and we simply do not know who will win.
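To make that arithmetic concrete, here is a minimal sketch in Python, using entirely made-up numbers for a hypothetical poll of 1,000 respondents, of why a lead has to exceed roughly twice the stated margin of error before it tells you anything:

```python
import math

def margin_of_error(p, n, z=1.96):
    """Textbook margin of error for a single reported share at ~95% confidence."""
    return z * math.sqrt(p * (1 - p) / n)

# Hypothetical poll (not any real one): 1,000 respondents, 48% vs. 46%.
n = 1000
p_a, p_b = 0.48, 0.46

moe_a = margin_of_error(p_a, n)   # ~3.1%
moe_b = margin_of_error(p_b, n)   # ~3.1%

# The two shares move in opposite directions, so the error on the *lead*
# is roughly the two margins combined (a conservative approximation).
lead = p_a - p_b
moe_lead = moe_a + moe_b

print(f"A: {p_a:.0%} +/- {moe_a:.1%}")
print(f"B: {p_b:.0%} +/- {moe_b:.1%}")
print(f"Lead: {lead:.0%} +/- {moe_lead:.1%}")   # 2% +/- ~6%
```

A 2-point lead with a roughly 6-point uncertainty on the gap is exactly the “no result” situation just described.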
But this isn’t even the problem at issue now. Well, it is. But it’s worse than that.
Because, in fact, polls are not just a count of who said what. Maybe most of my readers know this already. But just in case any don’t, I need to explain this: polls try to find a way to ask people who they will vote for by randomizing whom they ask. Traditionally this is done by random phone dialing. An internet poll, for example, is never a random sample, because of the biases of who is on the internet, who is looking at that specific poll, who is motivated to take the time to take the poll, and even who has been flying-monkeyed by an influencer to swamp the poll, and so on. But phone dialing is no longer reliable: too many people don’t answer their phone anymore. Many don’t talk to pollsters (Who has the time? Who trusts strangers pretending to be pollsters?). Many don’t even have phones. Many have multiple phone numbers. Many hide their numbers. Many numbers are fake. Etc. (see “The Problem with Polls” at PBS).
So pollsters have started having to build models of the polled population and fix their results to the model. So, for instance, if after they collect all their data (the people who answered), they find there are, let’s say, half as many black people as a genuinely random sample should have collected, they need to count each black respondent twice, to “guess” that the ones they didn’t reach will break the same way. Which is not a safe assumption. So the models get even more complex, using other data and mathematics to account for all the ways that assumption could be wrong. And so on. And they do this for every demographic they track. And all of this often isn’t accurate either (for instance, people will refuse to answer demographic questions, or lie about the answer or get it wrong). So the models get more and more complex. So when a poll says, say, 20% of registered black voters say they will vote for Candidate 1, that is not “for every 100 registered black voters we asked, 20 said that.” Rather, it is a guess at how many registered black voters would say that, if they had been able to ask them. And likewise all the way down the line (young voters, old voters, college educated, urban, rural; whatever the comparative data asked for). And notice: a poll that collects no demographic data (and just asks for the vote) will be the most inaccurate poll of all, because it is blind to all the ways it is failing to collect reliable data.
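Here is a toy sketch of that kind of demographic weighting (often called post-stratification), with every number and group name invented purely for illustration, not taken from any real poll:

```python
# Hypothetical targets: a group is 12% of the electorate but only 6% of
# the people who actually answered the poll.
population_share = {"group_a": 0.12, "group_b": 0.88}
sample_share     = {"group_a": 0.06, "group_b": 0.94}

# Each respondent is weighted so the sample matches the target population.
weights = {g: population_share[g] / sample_share[g] for g in population_share}
print(weights)   # group_a answers count ~2x, group_b answers ~0.94x

# If the two groups answer differently, the raw and weighted "results" diverge.
support  = {"group_a": 0.20, "group_b": 0.55}   # hypothetical answers
raw      = sum(sample_share[g] * support[g] for g in support)
weighted = sum(population_share[g] * support[g] for g in support)
print(f"raw: {raw:.1%}, weighted: {weighted:.1%}")   # ~52.9% vs. ~50.8%

# The weighted figure is a model-based guess that the unreached members of
# group_a would have answered the same way as the reached ones.
```

The reported number thus depends entirely on the assumption in that last comment, which is precisely the assumption just described as unsafe.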
And then, here’s the kicker. You know what pollsters also do to correct for likely error? Ah. Yeah. They build corrections into their models based on past election-cycle polling errors. That’s right. Because polls undercounted Trump voters in 2016 and 2020, the polls today are being adjusted up for Trump, on the assumption the same errors are in today’s data. Think about that for a moment until it hits you. You may have been worried because, “OMG, the polls undercounted Trump by two or even four percentage points before, so a dead heat today actually means Trump is ahead by two or even four points!” But that’s fallacious. Because the polls being reported to you today already made that adjustment (or whatever adjustment their model determined; it might not be a straight add like that).
It is therefore more likely that Trump’s vote is below what is being reported. And you can give many reasons why. The electorate and conditions have so changed since 2016 and 2020 that it is folly to assume the same error trend holds today as then. For example, this is the first election cycle in American history in which literally hundreds of Republican leaders are endorsing the Democrat to get rid of Trump, while rank-and-file Trump voters are no longer shy about admitting their support (reducing both dishonest-response and non-response biases), due to the campaign’s “Overton Window” swing towards open endorsement of rage, racism, sexism, and conspiracy theories (giving voters motive and permission to proudly say to pollsters whatever they want). So, one might argue, it is not as likely that we are undercounting Trump voters by much this cycle. And Kilgore reports that studies show “there are signs the Trump vote is now being captured fully.”
And there are lots of ways this could be showing up in the math. For example, Kilgore quotes Nate Silver pointing out that one technique for correcting for past polling error, called “weighting on recalled vote,” has a specific reliability problem. In this technique, the pollster asks who the polled voter voted for in the last election, which the statisticians can then compare against the actual vote count in that election, and thus have a check on whether their random sampling was off, and then correct for that deviation with more weighting (just as with my “undercounted black voters” example). But as Silver points out, we have data confirming that “people often misremember or misstate whom they voted for and are more likely to say they voted for the winner,” which would be Biden in this case. “That could plausibly bias the polls against Ms. Harris because people who say they voted for Mr. Biden but actually voted for Mr. Trump will get flagged as new Trump voters when they aren’t.” Ooops.
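To see how the bias Silver describes can creep in, here is a toy illustration with made-up numbers (nothing here is from an actual poll): a slice of actual 2020 Trump voters “recall” having voted for the winner, and weighting the sample back to the official 2020 result then over-weights everyone who admits to a Trump vote.

```python
# Hypothetical true 2020 shares and a hypothetical misrecall rate.
actual_2020    = {"biden": 0.52, "trump": 0.48}
misrecall_rate = 0.05   # share of Trump-2020 voters who now say they voted Biden

# What the sample's "recalled vote" looks like under that misrecall.
recalled = {
    "biden": actual_2020["biden"] + misrecall_rate * actual_2020["trump"],
    "trump": actual_2020["trump"] * (1 - misrecall_rate),
}
print(recalled)   # {'biden': 0.544, 'trump': 0.456}: looks too Biden-heavy

# Weighting back to the official result up-weights recalled-Trump respondents
# and down-weights recalled-Biden respondents, even though the sample never
# actually had too few Trump-2020 voters.
weights = {k: actual_2020[k] / recalled[k] for k in actual_2020}
print(weights)    # trump ~1.05, biden ~0.96: a built-in tilt toward Trump
```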
Another example noted is that Trump’s gains among Hispanic voters tend to be among young men—who are the least likely to actually vote. So the “numbers” suggest he’ll get a lot more Hispanic votes—but in reality, since past data show that a lot of young men who claim they will vote, won’t, that might not materialize, or not by as large a margin as pollsters counted (this is exactly what happened to Bernie Sanders). Pollsters know this, but there isn’t any reliable way to account for registered voters lying to them. They could try to build in previous rates of deviation between claims and actual votes, but which polls are doing that? And is the previously determined rate still the case? Things have changed so much, who knows? Younger voters have started rising in participation rates, thus defying expectations from the data. But is this increased participation skewed by race? By party? By issue? The point is, the instrument is blurry. So results within even the calculated margins of error are not results at all. You shouldn’t even report the numbers. You should just say “results inconclusive.”
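Here is what “building in previous rates of deviation” might look like as a toy likely-voter adjustment, again with every number and group name invented for illustration:

```python
from collections import defaultdict

# Hypothetical respondents: (stated choice, demographic group).
respondents = [
    ("candidate_1", "young_men"),
    ("candidate_1", "young_men"),
    ("candidate_1", "older_women"),
    ("candidate_2", "older_women"),
    ("candidate_2", "older_women"),
]
# Hypothetical past turnout rates for each group.
turnout_prob = {"young_men": 0.40, "older_women": 0.75}

raw, expected = defaultdict(float), defaultdict(float)
for choice, group in respondents:
    raw[choice] += 1                         # head count of stated intentions
    expected[choice] += turnout_prob[group]  # discounted by likelihood of voting

print(dict(raw))       # candidate_1 "leads" 3 to 2 on stated intentions
print(dict(expected))  # ~1.55 to 1.50 in expected votes: effectively a tie
```

And the turnout rates themselves are just guesses carried over from past cycles, which, as noted, may no longer hold.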
Knowing all this about polls will help you be a better critical thinker about this kind of mass public polling data. It is no longer ever a straight count of who said what. It is a generated guess as to who would say that, based on complex mathematical models, and which model any given poll is using may make a difference in any given case as to how reliable it even is on whatever you want to know. The models can be wrong. The models can overlook important factors. The models can create biases in a fallacious effort to reduce them. And so on. This is why it is always informative to look into how poll numbers are being generated and reported. You should be familiar not only with margins of error, but also with confidence level. Polls often don’t report the latter, but it’s the probability that the true result lies somewhere within those margins; and we cannot claim to know where within them it lies, so we should never say, for example, “52% +/-3%” but always “49% to 55%,” since the true value could as easily be 55 as 49. And if the confidence level is as low as 90%, then there is a 1 in 10 chance the true result falls outside even that range.
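And here is a minimal sketch of how the stated margin depends on the confidence level, using the textbook formula for a simple random sample and a hypothetical 52% result from 1,000 respondents (real polls use the more complicated models described above, so treat this only as the baseline case):

```python
from statistics import NormalDist

def moe(p, n, confidence):
    """Textbook margin of error for a simple random sample."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * (p * (1 - p) / n) ** 0.5

# Hypothetical poll result: 52% support among 1,000 respondents.
p, n = 0.52, 1000
for conf in (0.95, 0.90):
    m = moe(p, n, conf)
    print(f"{conf:.0%} confidence: {p:.0%} +/- {m:.1%} "
          f"(report it as {p - m:.1%} to {p + m:.1%}); "
          f"{1 - conf:.0%} chance the true value is outside even that range")
```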
Above all, knowing what they are adjusting (weighting) in their model can affect what you conclude. The big example here is: they are already adjusting for previously undercounting Trump voters. So they are more likely overcounting (or not undercounting) those voters today. So you shouldn’t assume what happened in 2016 or 2020 will happen next week. The polls you’re reading have already done that.
Generational Wealth
If you think that was enlightening, then I highly recommend you read “What’s Behind the Sudden Surge in Young Americans’ Wealth?” by Federica Cocco and Andrew Van Dam for the Washington Post. Because it is a sterling example of critical thinking in action. The authors start by debunking the meme (which even I used recently in Debunking John Davidson’s “Pagan” America) that “Me in my 30s” versus “my parents in their 30s” shows a massive decline in generational wealth (mainly illustrating the rise in wage stagnation and income disparity that is crushing the American middle class). But then they catch themselves mid-article, and follow the first rule of critical thinking: try to prove yourself wrong before concluding you’re right. They actively try to think of ways they could be wrong, and then go check and see. And lo. They were wrong. The last half of the article then explains why the meme is, in fact, an accurate capture of reality.
In the process you will learn a lot about how many different ways there are to measure economic success or status, and why they aren’t all equivalent. You will also learn some things about economics generally. But above all, you’ll learn how easy it is for a pundit or politician to invent any narrative “about the economy” by simply cherry-picking which fact they wish to call your attention to—and then hiding from you (or possibly even from themselves) all the other facts that entirely change that narrative. An accurate narrative thus requires multiple converging lines of evidence, not a single factoid that you then tell a just-so story about. So once you are armed against this, you can defend your mind from invasion and capture, by knowing you need to not only fact-check the cherry-picked claim (is it even true?), but also check for what other data might change the significance of that claim even when it is true. I have long noted that all apologetics, for all false beliefs (whether religion, flat-earthism, or Holocaust denial), operates on a single principle: omitting evidence that, when reintroduced, reverses the conclusion. My best demonstration of this is showing that all arguments for God, when the evidence left out is reintroduced, become arguments against God.
This technique of looking for the omitted evidence can be expanded to any issue or question, and you can learn from this example ways to figure out what evidence you should be looking for in any other case. And now you have a second, triangulating example of that here in Cocco and Van Dam’s article about generational wealth. What they do is start with just the federal measure of “wealth,” and see that, yes, before 2020, millennials were at the bottom of accumulated wealth (compared to X-ers, boomers, and silents at the same age), but by 2024, they were at the top, and by an even wider margin. Presto! Millennials are now the wealthiest generation in America! And no one has ever been so lucky!
But, no. “At least that was our first response,” Cocco and Van Dam admit. Then they thought about it. And checked some things.
The change has occurred entirely in home value. Meaning, the accelerating costs of housing in America have inflated their “on paper” wealth metric, as homes they bought a few years ago doubled in value practically overnight. And that makes this illusory, because “all flavors of assets aren’t created equal.” The “wealth created by rising home prices exists mostly on paper and is very hard to tap—which means it may not translate to a higher standard of living.” If you still can’t afford your mortgage and gas and utilities and food, the paper value of your house doesn’t help. You can borrow against it, but that just adds debt load, and thus diminishes your monthly spending power rather than increases it. You can sell it, but you still have to live somewhere—and every house’s value went up, so there is no way to translate your home’s value into cash. And even if you find some way to do that, that cash evaporates as you spend it. Your wealth then goes down. You could try to translate the difference into investments that pay out annually—so your wealth stays put while your income grows—but hardly anyone is positioned to do that (e.g. even moving to a cheaper market costs thousands of dollars, not just in moving expenses, but also a down payment and closing costs, and livable-space renovations). And even when you can pull it off, the gain is small (e.g. if you cash out a house into investments and then rent, your annual increase in rent will eat your annual investment income, and you’ve gotten nowhere).
This translates into very few people being able to benefit from this state of affairs—and most of them were already rich (because Luck Matters More Than Talent). The remaining state of affairs is that millennials have comparatively lower incomes with comparatively higher expenses. They remain screwed.
But wait. That’s not all. Yes, this dampens the claim that millennials aren’t screwed “because housing prices went up.” That’s a factor we can’t omit, and it does change the narrative when included. But Cocco and Van Dam’s fact-checking of themselves uncovered an even bigger problem: “Housing—and its investment returns” will still at least “build wealth” long-term (e.g. a fixed mortgage payment, unlike rent, doesn’t rise, and eventually ends altogether), “but you know what usually doesn’t? Living with [your] parents.” That’s right. The data showing large rises in millennials’ wealth didn’t include millennials living with their parents. Um. Ooops. The number of millennials who had to move back home due to relative poverty (and thus can’t buy into the exploding housing market at all) has almost doubled over their lifetime.
And this is where a problem arises in the reported data:
Due to the difficulties of disentangling a family’s holdings, the Fed combines the wealth of “financially interdependent” household units and effectively assigns it the demographics of the head of household (or more accurately, what the Fed calls the “economically dominant single individual or couple” in the home’s “primary economic unit”). So, when we say millennials have record wealth for their age, we’re really saying millennials who have become financially independent are doing well for their age.
That’s right. Data tracking generational wealth literally doesn’t even count the sixth of millennials living at home. The fact that more of them have to signals a decline in their economic prospects, not a rise. And if we could re-calculate the average wealth of all millennials, including the ones with neither an apartment nor a home of their own, their measured advantage would decline. Cocco and Van Dam don’t specifically mention it, but this should also be the case for millennials sharing rent—i.e. roommates and collectives. Since the Fed is only counting each of those as a single “household,” the fact that, say, Sally has to rent a single room off the books with split rent on a lease not in her name, and share the remaining common space of an apartment, is also a factor that needs to be counted. Have millennials outpaced other generations in having to resort to this? Oh. Yeah. They did. They are literally called The Roommate Generation. Roughly a quarter of them are in this predicament. So, what will millennial generational wealth add up to when we count these people, too, and not just the “live at homers”? Well, it will drop again. Probably by a lot again. So we should have asked: what is the federal wealth statistic actually measuring? Who is left out of that? And what other economic markers matter besides that? Do home values really translate into better quality of life or lower costs of living, the things we are supposed to be measuring?
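The arithmetic of that recalculation is simple, and a toy example (all numbers invented; the real shares and dollar figures would have to come from the Fed data) shows how much the exclusion can matter:

```python
# Hypothetical: share of the generation counted as independent households,
# versus the share excluded (living at home or renting off the books).
independent_share = 0.60
excluded_share    = 0.40

# Hypothetical average wealth of each group, in dollars.
avg_wealth_independent = 120_000
avg_wealth_excluded    = 10_000

reported = avg_wealth_independent   # what "millennial wealth" headlines report
true_avg = (independent_share * avg_wealth_independent
            + excluded_share * avg_wealth_excluded)

print(f"survivors only:           ${reported:,}")
print(f"whole-generation average: ${true_avg:,.0f}")   # far lower
```

Which is the survivorship bias in a nutshell: the reported figure describes only the people the statistic managed to see.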
As Cocco and Van Dam realize by the end of their analysis, they had to change their minds. “By largely focusing on the winners who have left their family’s financial support behind, we’re really only getting a picture of the most successful young Americans.” All the more so once we also focus only on the winners who have left the need for roommates behind. The overall generation is not so winning. It’s getting crushed. And in a fantastic finale, they point out that this is another example of what’s called “survivorship bias,” which is usually illustrated by the WW2 airplane armor paradox: when a survey was made of where all the bullet holes were in returning aircraft, to decide where to up-armor the planes, someone had the wisdom to conclude the answer was “where there weren’t any bullet holes.” I’ll let you reason out why.
So this whole case becomes another instance of “omitted evidence” skewing perceptions of reality. That’s why looking for what evidence might be omitted is a crucial critical thinking skill. If you haven’t done that, you shouldn’t be confident in any conclusion you reach, about anything. Of course, this then gets you to the next layer: learning to discern what actually counts as evidence. Cranksters and delusionoids will invent a lot of stuff they claim is evidence, but logically isn’t—because even when it’s true, it does not increase the probability of their conclusions. This is where vetting arguments for fallacies becomes important, as well as knowing what makes any fact actually evidence for something. I actually gave examples of this in my last article, on the Rapture, where Preterists like to claim a bunch of evidence that isn’t.
Remote Work
Finally, a third example I want to draw attention to is a recent debate over whether converting to remote work is good or bad labor policy. Here you should read the brilliant article by Advait Sarkar, “Evaluating The Economist’s Claim That Remote Work is Less Productive,” which abundantly illustrates this skill of looking for what’s being omitted (and what’s really being counted) in many different and creative ways, all of which can teach you how to think of what to check and what to look for in other cases. Here we are looking at Sarkar critiquing a seemingly fact-filled (yet anonymous) editorial at The Economist.
And this, in sum, is what we get (Sarkar provides many more details, analyses, and references):
- The Economist claimed one study showed remote work leads to a 4% decline in productivity. Sarkar found this to be false. They took that figure out of context and got it wrong. It was only a decline for call centers (not a representative industry). And the figure “refers to productivity decline relative to already-remote workers, not relative to on-site workers.” Which means the cause was not remoting the work, but something else. Sarkar finds it to be management bias: managers were inefficiently favoring in-office workers over remote workers. For which the solution is obviously: fix management.
- The Economist claimed another study showed remote work leads to an 18% decline in productivity. Sarkar found this to be false. “The same two criticisms apply: this is a study of data entry work, which is an extremely poor proxy for office work in general,” and the study itself said “that it is not remote work per se, but constraints on choosing their work location which are the biggest detractor of productivity,” in particular, “people with children, home care responsibilities, and poorer households” are less productive at home (because they can’t pull so much overtime, and are more exhausted, are multitasking, etc.).
- Sarkar notes the study The Economist cited for that datum also came to a different conclusion than it reports: that “rather than force people into the office, employers should seek to equalize people’s opportunities through policies such as universal child care.” I would also suggest changing the pay scheme: at-home data entry might increase in productivity if the employee is paid per gigabyte rather than by the hour—since “productivity” here only relates to company cost per gigabyte entered (as the study only measured typing speed per hour), it does not matter to the company whether an employee takes twice as long to do that, as long as they are paying for outcome and not clocktime. The point being, we should be looking at the causes of any decline, and addressing them, rather than concluding it is some sort of inviolable law of physics. This is also the case because the study averaged its results, which hides the fact that some people might not have been less productive (or some even more productive) at home—and if so, what is the difference? Can we reproduce or teach it? Can we measure who this is, and so distinguish them as good subjects for remote work? By not even asking these questions, we can tell The Economist was pushing a pre-desired narrative (selection bias), rather than identifying problems and their solutions (critical thinking).
- The Economist claimed another study showed remote work leads to a 19% decline in productivity. Sarkar found this to be false. This study showed “productivity in fact remained the same.” Only productivity per hour declined, and only “during the earliest and most uncertain period of the shift to remote work during the pandemic,” when people were on a new learning curve and faced with unusual challenges. Without reproducing the study after that, its result is useless. It is also moot (again) if the pay scheme changes to outcome rather than clocktime pay. Sarkar in particular notes “the researchers find that working hours mainly increased due to an increase in ‘coordination activities’ (i.e., meetings),” and as a result, “Employees with children suffered the greatest decline in productivity due to an increase in working hours.” In other words, the problem wasn’t remoting the work. It was meetings. This jibes with widespread findings that too many meetings, lasting too long, are what kill productivity at companies regardless of where labor works (per Forbes, the Harvard Business Review, and Inc.).
- The Economist cited a study showing chess players perform worse remotely than in person. Needless to say, this has little to no relevance to work. But as Sarkar also points out: “the authors attribute the initial decline in performance to the sudden transition induced by the pandemic, finding that these decreases get less and less pronounced as players adapted to the remote work setting.” So, the study basically said the opposite of what was claimed.
- Which reminds me to mention: always read the study. Not only is “the effect size decreases over time, suggesting an adaptation to the new remote setting” literally in the abstract, the study shows a result of “no effect” is within its margin of error on almost all trials. Which means the effect size was so small as to be, essentially, meaningless—a “no result,” just as with election polling. The study also didn’t measure what you might think—how often players won or lost—but a hyper-obscure and convoluted measure of “move quality” that is itself prone to being meaningless.
- The Economist referenced studies even less relevant than these, including one that showed a trivial difference in “how many” creative ideas a team could come up with “in five minutes” (seriously: virtual teams came up with 14.74 ideas; in person, M = 16.77, a difference of about two ideas out of seventeen), but when it came to “selecting which idea to pursue, we find no evidence that videoconferencing groups are less effective.” Ooops. Should have read the study, Economist. Trivial effect size. And no real effect found. (Sarkar also notes this study bears little relation to remote work anyway.)
- The Economist referenced a study that shows remote teams change certain social networking behaviors, but as Sarkar points out, the study itself admitted it found no measurable changes in productivity. What it called “more static and siloed” (emotionally charged words that deviate from objective science) really just meant people slightly reduced the number of people they communicated with; no one became “siloed” or “static” at all.
- The Economist referenced a study that supposedly said “virtual watercoolers” didn’t do much. Even if that were true, it’s hard to ascertain its logical relevance. “We tried a thing and it was useless so we got rid of it” bears no relation to “our productivity went down.” But Sarkar says this is even “a straight up misrepresentation” since it “cherry picked one observation out of many” in the study, while “ignoring” the study’s actual conclusion, which was “that virtual water coolers—or videoconference sessions for small groups of interns and a senior manager—may yield higher performance and career outcomes” under certain documented conditions. Which sounds like the opposite result to me. As Sarkar concludes, “like many other things to do with remote work, virtual watercoolers can work just fine when they are applied intentionally and with critical understanding of their uses and benefits.” And The Economist’s own referenced study proved that. Yet they chose to misrepresent what the study showed instead. You might be starting to see a trend.
- The Economist did this again, when it referenced a study as supposedly concluding that “feedback exchanged between colleagues dropped sharply after the move to remote work” (you might notice the curious absence of the word ‘productivity’ in that sentence), when in fact that study “actually found a 21% decrease in code output for on-site workers, versus remote workers.” In other words, it found a substantial increase in productivity for remote workers, completely nuking The Economist’s narrative. So this is starting to look like lying. Sarkar discusses the communications finding in the study and what a critical thinker would learn from it (rather than an apologist trying to sell a false narrative—remember my point about that in my discussion of the article on intergenerational wealth).
- The Economist then referenced a study that they admit “notes the negligible impact” of remote work “on productivity” but “workers put in longer days and wrote more code when in the office,” which are irrelevant metrics. If workers can do as much in fewer hours, that is increased labor productivity (and if they were paid hourly, it would be increased cost productivity as well). And the deceptively worded “wrote more code when in the office” means hybrid workers all wrote the same amount of code—they just shifted more of their coding to the office when deciding how to allocate their time between locations. In other words, a triviality that had no connection to productivity. And as Sarkar points out, “What they neglect to report is the main finding of the study, which is that hybrid working improved job satisfaction and reduced attrition by 33%.”
Oh. Right. Remember how there is more than one metric for economic success, and “how much your house is worth” doesn’t really capture the reality of spending power? Well, productivity is likewise not the only metric of labor utility. Job satisfaction and retention are benefits a business also profits from. Institutional knowledge is a dividend-paying capital asset, whereas an ever-growing training budget for continuous new hires is a drag on profit margins; so retention is a vital metric of business success. Just as job satisfaction improves a business’s competitiveness for quality labor. Both of which, you might suspect, will likely improve its productivity relative to competitors. Access to remote work is a benefit with which to hire labor; it also vastly increases your accessible labor pool (since geography and commute time no longer limit whom you can hire).
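A back-of-the-envelope sketch shows why retention is a real economic metric and not a soft one; every figure here is invented except the 33% attrition reduction cited from the study above:

```python
# Hypothetical workforce and hiring costs (made-up numbers).
headcount          = 300
replacement_cost   = 25_000    # recruiting + training one replacement hire
attrition_baseline = 0.18      # hypothetical annual attrition without hybrid work
attrition_hybrid   = attrition_baseline * (1 - 0.33)   # the 33% reduction cited

for label, rate in [("baseline", attrition_baseline), ("hybrid", attrition_hybrid)]:
    departures = headcount * rate
    print(f"{label}: {departures:.0f} departures/yr, "
          f"${departures * replacement_cost:,.0f} in replacement costs")
```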
Sarkar then concludes with some examples of yet more studies finding the opposite of what The Economist claims—the simplest possible example of “reintroducing the evidence they left out.” One takeaway is that Sarkar found such a large number of omissions, mistakes, and distortions in The Economist article, all skewed toward the same position (defending the status quo and the emotional feelings of managers and owners over objective reality), that we should doubt the reliability of that periodical altogether (or at least the authors of that one article, whoever they are; maybe other authors perform better there). It is giving business leaders literally bad advice. Certainly, it goes in the “side-eye” bucket for next time you read an article at The Economist. It must be put on intellectual probation: now anything The Economist says needs to be independently checked first. You cannot implicitly trust it. (On this procedure for calibrating the level of trust in a source, see my Primer on Media Literacy.)
But notice again, the trick was not just fact-checking (though that was clearly important here), but also remembering to look at what is not being said, what is not being included, what is not being measured, what is not being asked, what is not being checked—in other words, what is being omitted. If Critical Thinking Rule Number One is “always try to disprove a claim before believing it,” Critical Thinking Rule Number Two is “always look for what is being left out—and whether bringing it back in changes the narrative.”
All three of today’s examples illustrate this, and all the various different ways to carry these rules out.