I’ve started to accumulate a lot of evidence that consistently supports a singular hypothesis: only those who don’t really understand Bayesianism are against it. Already I’ve seen this of William Briggs, James McGrath, John Loftus and Richard Miller, Bart Ehrman, Patrick Mitchell, Tim Hendrix, Greg Mayer, Stephanie Fisher, Louise Antony, even Susan Haack and C. Behan McCullagh (Proving History, pp. 272-75; cf. pp. 100-03). And I must also include myself: I was entirely hostile to Bayesian epistemology, until the moment I first actually understood it during an extended private argument with a Bayesian epistemologist (and there is a difference: see Bayesian Statistics vs. Bayesian Epistemology). And to all this we now must add philosopher of science Peter Godfrey-Smith.
A while back Godfrey-Smith wrote a fairly good textbook called Theory and Reality: An Introduction to the Philosophy of Science (University of Chicago Press, 2003). In it, he has a section explaining Bayesian epistemology, and his reasons for rejecting it “for something else.” Something else…that’s actually just reworded Bayesianism. That he doesn’t notice this is yet more evidence that he only rejected Bayesianism because he didn’t understand it. And we can demonstrate that’s true by analyzing what he says about it.
Godfrey-Smith on Bayesianism
The relevant section is Ch. 14, “Bayesianism and Modern Theories of Evidence,” pp. 202-18. Godfrey-Smith leads that chapter by pointing out Bayesianism is actually “the most popular” epistemology now in the philosophy of science (p. 203). But he isn’t betting on it, for really just one reason, which he then attempts to explain.
Can We Avoid Prior Probabilities?
Godfrey-Smith’s principal objection is the problem of subjective priors. Which of course isn’t really a problem—or only is in the same way it’s a problem in every epistemology, which is no harder to “solve” in Bayesian epistemology than in any other (see my discussion in Proving History, pp. 81-85). But as Godfrey-Smith puts it, “The probabilities” in Bayes’ Theorem “that are more controversial are the prior probabilities of hypotheses, like P(h),” because, he asks, “What could this number possibly be measuring?” He complains that no one can “make sense of prior probabilities” (p. 205). That’s false; and it entails he does not understand Bayesian epistemology, and cannot have really read much of what Bayesians have already said about what priors measure and how they are determined.
All prior probabilities are just previously calculated posterior probabilities. We usually just skip the long arduous math of running calculations from raw, uninterpreted experience all the way to worldview conclusions (like that solid objects and other minds exist), and thence to background knowledge (like the laws of physics and human behavior), and thence to specific starting points in reasoning about the base rates of phenomena. We just skip straight to the base rates of phenomena, and construct all our priors from observed physical frequencies. In particular, they are the frequencies drawn from how things have usually turned out in the past. (As I’ve discussed many times before, especially in respect to Burden of Proof, Worldview Assumptions, Fabricating Crank Priors, and guides to How to Reason Like a Bayesian, and why No Other Epistemology can validly Avoid Prior Probabilities.)
The famous “Mammogram Example” illustrates the point: doctors were dangerously over-prescribing mammograms because they were ignoring the role of prior probabilities. They ignored the rate of false positives and its relationship to the base rate of even having cancer in the first place, recommending so many mammograms that false positives vastly outnumbered real cancers, which led to a rash of needlessly expensive and risky procedures and to (sometimes deadly) patient stress. Here the priors were constructed from hard data on base rates of cancers and false positives in mammograms. No one could be confused as to what the “prior probability could possibly be measuring” here. It’s measuring the base rate of the proposed phenomenon: having breast cancer (see Visualizing Bayes’ Theorem). And this is what priors always are: an estimate of the base rate of whatever phenomenon is being proposed. The only difference is sometimes we don’t have good data and thus our estimates are closer to even odds; or we have so much data showing only one result, it’s easier to just operate as if the contrary result has a prior of zero. But either way it’s still just an empirical estimate of a base rate, a frequency.
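For concreteness, here is a minimal sketch of that arithmetic (the 1% base rate, 80% sensitivity, and 10% false-positive rate are illustrative textbook-style assumptions, not figures from any particular study):

```python
# A minimal sketch of the mammogram arithmetic, with illustrative (not clinical) numbers.
prior = 0.01        # assumed base rate of breast cancer in the screened population
sensitivity = 0.80  # assumed P(positive test | cancer)
false_pos = 0.10    # assumed P(positive test | no cancer)

# Bayes' Theorem: P(cancer | positive) = P(positive | cancer) * P(cancer) / P(positive)
p_positive = sensitivity * prior + false_pos * (1 - prior)
posterior = sensitivity * prior / p_positive
print(f"P(cancer | positive test) = {posterior:.3f}")  # roughly 0.075
```

On those assumptions a positive result leaves only about a 7.5% chance of cancer, which is exactly why ignoring the base rate produces far more false positives than real detections.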
And that holds whatever the hypothesis h is, as distinct from ~h (which means the falsity of h and thus the reality of some other cause of all our pertinent observations). For example, h “Jesus rose from the dead” vs. ~h “false tales and beliefs arose of Jesus having risen from the dead” are competing hypotheses. Their prior probabilities will be the most analogous corresponding base rates for this kind of question: Whenever we’ve been able to find out in any other case, how often, usually, do we observe these kinds of miracle tales and beliefs turning out to be true, rather than false? The answer is “never h” and “always ~h to the tune of thousands of times.” Worse, all the entities and powers and motivations required of h themselves have absurdly low base rates in past observation: we’ve confirmed no gods, and no supernatural powers capable of such things, nor exhibiting any such motives. Hence, “What is the prior probability measuring?” Answer: “The previous frequency of prior like outcomes.” Always and forever.
There is nothing mysterious about this. And it is definitely false to say, as Godfrey-Smith does, that “no initial set of prior probabilities is better than another” and that “Bayesianism cannot criticize very strange initial assignments of probability” (p. 209). I have no idea where he got either notion. No Bayesian I have ever read has said any such thing. And he neither quotes nor cites any doing so. To the contrary, Bayesians have an obvious objection to “strange priors”: they do not follow from the background data. You don’t get to just “invent” data that doesn’t exist; nor, when the data exhibit a frequency, do you get to ignore that and declare a different frequency anyway. The prior probability in the mammogram case is simply the prior observed base rate of breast cancer. That’s a fact in b, the background knowledge every probability in Bayes’ Theorem is necessarily conditional on. If b contains data showing that frequency to be 1 in 1000, you cannot just “declare” it to be 999 in 1000. If the data show nothing other than 1 in 1000, it’s 1 in 1000. End of story. Anything else is a violation of evidence and logic.
Even frequencies for which we have less data to determine them do not escape this consequence. If, for instance, we have literally no evidence showing a higher or lower frequency for h-like hypotheses over ~h-like hypotheses, then there is no logical argument by which you can declare P(h) to be anything other than 0.5. Because doing so logically entails asserting there is evidence of a different frequency than that. Conversely, you cannot assert a different frequency unless you have evidence of one in b. Even when the frequency data are vague, and we have to use ranges of probability to capture the fact that we do not know the prior is more likely to be any one probability in that range than another, only that it is more likely to be one of those probabilities than any outside that range, we are still constrained in this by data. Even the total absence of data constrains us, forbidding bizarre priors by the very fact that the total lack of information renders them a priori improbable. No matter what information you have or lack, it always entails some base rate, or range of base rates, or indifference between base rates.
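A quick sketch of how that constraint operates in practice (the likelihoods here are purely illustrative assumptions): a range of admissible priors just propagates through Bayes’ Theorem to a range of posteriors, and total ignorance collapses to the indifference prior of 0.5.

```python
# Sketch: vague data constrain the prior to a range; the posterior is then a range too.
# All numbers are illustrative assumptions, not taken from the article.
def posterior(prior, p_e_given_h, p_e_given_not_h):
    """Bayes' Theorem for a binary hypothesis h vs. ~h."""
    return (p_e_given_h * prior) / (p_e_given_h * prior + p_e_given_not_h * (1 - prior))

p_e_h, p_e_not_h = 0.9, 0.2  # assumed likelihoods of the evidence on h and on ~h

# Total ignorance about the base rate: indifference gives a prior of 0.5.
print(f"{posterior(0.5, p_e_h, p_e_not_h):.2f}")  # ~0.82

# Vague data constraining the prior to somewhere between 0.2 and 0.4:
low, high = posterior(0.2, p_e_h, p_e_not_h), posterior(0.4, p_e_h, p_e_not_h)
print(f"posterior lies between {low:.2f} and {high:.2f}")  # ~0.53 to ~0.75
```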
Consider Godfrey-Smith’s own inept example: he says there is nothing to tell us that the prior probability of “emeralds are green” is higher than of “emeralds are grue” (p. 210). By “grue” he means (for those unfamiliar with the conceit) an imaginary color defined as “the property of being green before [some specific date, whether in the future or in the past], and blue afterwards” (or vice versa). Bayesians, he claims, cannot justify a low prior for this. Which is absurd; and tells me Godfrey-Smith either hasn’t thought this through, doesn’t know anything about physics, or doesn’t understand how priors are conditional on background knowledge, b.
The reason that “emeralds are grue” has a much lower prior than “emeralds are green” is that in all our background knowledge, (a) no like property to grue has ever been observed and (b) no physics have ever been discovered by which a grue-effect would even be possible—despite by now our extremely thorough exploration of fundamental physics, including the reaction properties of materials to photons of a given frequency. No such effect is plausible on known physics, even in individual objects (apart from a corresponding physical transformation of the material), much less simultaneously among all objects “across the whole universe.” That any object is grue is an excellent example of an extraordinary claim: a claim contrary to all known physics and observation. Which necessarily thereby entails an extraordinarily low prior. Not one equal to “emeralds are green.”
Even in neurophysics, a human brain spontaneously swapping out two completely different neural circuits for producing color experience likewise doesn’t just “magically happen,” although even if it did, that would be a change in our brains, not in emeralds. There actually may well be people who experience blue when seeing emeralds, owing to a birth defect in the wiring of the eyes and brain (color inversion is a known anatomical defect), and we may even someday be able to surgically cause this, but that’s not the idea meant by grue. Because it has, again, no relation to the properties of emeralds. And in any event, we have a pretty good database of knowledge by which to estimate the prior probability of these kinds of neural defects and changes, and it certainly doesn’t register as high. It’s just not the sort of thing that usually happens.
All of which also means there is no epistemology with any chance of reliably accessing reality that can ignore prior probabilities. The prior observed base rate of a proposed phenomenon as an explanation of any body of evidence is always relevant information, and always entails limits on what we can reliably deem likely or unlikely. Any epistemology that ignores this will fail. A lot. And indeed, our brain literally evolved to use this fact (albeit imperfectly) to build increasingly reliable models of reality. All of our beliefs derive from assumptions about base rates; and indeed many of our cognitive errors can be explained as failures to correctly estimate them (and thus can only be corrected by properly estimating priors).
Since we cannot avoid prior probability assumptions, the fact that we can only subjectively estimate them is not a valid objection. It is simply restating a universal problem we’ve discovered to be innate to all epistemologies: humans cannot access reality directly; all perception, all estimation, all model-building, all model-testing, all belief-formation is necessarily and unavoidably subjective. We can only access what our brains subjectively and intuitively construct. Our only useful goal is to find ways to get that subjective theatre to increasingly map correctly to the external world. Accordingly there can be better epistemologies and worse, as measured by the degree to which resulting predictions bear out (our discovered “failure rate”). And so we can be more or less objective. But there can never be a literally “objective” epistemology. And as this is true of all epistemologies, it cannot be an objection to any of them.
Not Understanding Subjective Probabilities
Godfrey-Smith not only fails to grasp what Bayesian prior probabilities are (and thus how they are constrained and in fact determined by data and thus not, as he seems to think, “mysterious” or “arbitrary”), he also doesn’t grasp what subjective probabilities are in general, the “epistemic” probabilities that Bayesian epistemology traffics in. When “subjectivists” talk about probability measuring “degrees of belief,” he, like all frequentists who screw this up, takes them to be saying probabilities aren’t frequencies.
Godfrey-Smith names two founders of subjective probability theory, Bruno de Finetti and Frank Ramsey. But then like pretty much every author I have ever seen arguing about “frequency” interpretations of probability as somehow standing against “subjectivist” interpretations of it, he clearly didn’t actually read either (nor evidently any probability subjectivist) on the matter of what the difference actually even is.
We can tell this by noting right away how all these critics, Godfrey-Smith included, incorrectly think the subjectivist interpretation is contrary to the frequentist one, that they are somehow fundamentally different approaches to probability. In fact, subjectivism is a sub-category of frequentism. Subjectivists are frequentists. And anyone who doesn’t understand that, doesn’t understand the concept of subjective probability. Subjective probabilities are simply estimated frequencies of error in belief-formation, which is still interpreting probability as fundamentally always a frequency of something.
The only actual difference between so-called subjectivists and so-called “frequentists” is that the latter think the only probabilities that we can talk about are “objective” ones, the actual frequencies of things in the world. When in fact we can never know those frequencies. Just as we can never know anything about the world except through our subjectively constructed models and perceptions of it. There is no direct access to reality. Just as there is no direct access to “real frequencies” in the outside world. Everything is mediated by subjective experience. Even God could not escape this; for even he can never know for certain his perceptions match reality apart from them. It’s a logical impossibility.
Once we abandon the impossible, all we have left is our subjective models and whether and how we can test them against the outside world to make them more accurate in matching it. This is true in every aspect of epistemology, including probability. We only ever know our subjective estimates of external real-world probabilities; and that those estimates can be increasingly more accurate, the more information we have pertaining to them. But we can never reach any point where we are 100% certain our estimates of frequency exactly match reality.
The most formally important of Godfrey-Smith’s cited sources is Bruno de Finetti, whose “Foresight: Its Logical Laws, Its Subjective Sources” was published in Annales de l’Institut Henri Poincaré 7 (1937). On page 101 therein, de Finetti spells it out:
It is a question simply of making mathematically precise the trivial and obvious idea that the degree of probability attributed by an individual to a given event is revealed by the conditions under which he would be disposed to bet on that event
He describes what he means on the next page: “Let us suppose that an individual is obliged to evaluate the rate p at which he would be ready to exchange the possession of an arbitrary sum S” upon “the occurrence of a given event E,” then “we will say by definition that this number p is the measure of the degree of probability attributed by the individual considered to the event E” and hence “p is the probability of E” according to that individual. Note his words here: “the rate.” That means frequency. So even de Finetti, the famous founder of probability subjectivism, defines probability as a frequency. He only differs from other frequentists in realizing that the frequency we are talking about is actually only ever the frequency of our being right or wrong about what’s in the world; not the actual frequency of things in the world.
Then de Finetti shows you can build this out to satisfy all the formal requirements of a coherent theory of probability. And he’s right. He eventually goes on to ask if “among the infinity of evaluations that are perfectly admissible in themselves,” meaning every possible subjective assessment that is internally coherent (as every individual observer may have their own estimates), is there “one particular evaluation which we can qualify…as objectively correct?” Or at least, “Can we ask if a given evaluation is better than another?” (p. 111).
He then shows that all this subjective estimating is really just an attempt to use limited and imperfect information to approximate the real objective probabilities of events. For instance, when the data strongly favor a certain objective frequency of “betting correctly,” subjective frequency estimates become essentially identical to the objective frequency supported by observations. And this is all the more the case, the more information one has regarding those real probabilities.
This is exactly what I argue and demonstrate in chapter six of Proving History (esp. pp. 265-80, “Bayesianism Is Epistemic Frequentism”; which benefits from having read the preceding analysis of objective probability, pp. 257-65). There actually is no difference between “frequentist” and “Bayesian” interpretations of probability. The claim of there being a difference is a semantic error, resulting from failing to analyze the actual meaning of a “subjective probability” in the empirical sense, that is, what it is people actually mean in practice: how they actually use it, and what they are actually doing when they use it.
An epistemic, or subjective, probability is as de Finetti describes: an individual’s empirical estimate of the frequency with which they will be right to affirm some hypothesis h as true, given a certain measure of evidence. That’s a frequency. Thus all “degrees of belief” are still just frequencies: the frequency of being right or wrong about a thing given a certain weight of information. And as such, probabilities stated as “degrees of belief” always converge on “actual objective” probabilities as the weight of available information increases. The only difference is that “degrees of belief” formulations admit what is undeniable: our information is never perfect, and consequently we actually never can know the “objective” probability of a thing. We can only get closer and closer to it; with an ever shrinking probability of being wrong about it, but which probability never reaches zero. (Except for a very limited set of Cartesian knowledge: raw, uninterpreted, present experiences, which alone have a zero probability of not existing when they exist for an observer, since “they exist for an observer” is what a raw, uninterpreted, present experience is.)
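To make that convergence concrete, here is a minimal simulation (the “true” frequency of 0.3 and the use of Laplace’s rule of succession are my own illustrative choices): the epistemic estimate homes in on the objective frequency as observations accumulate, without ever reaching certainty.

```python
# Sketch: a "degree of belief" tracking an objective frequency as evidence accumulates.
import random
random.seed(1)

true_freq = 0.3            # the "objective" frequency, unknown to the observer
successes = trials = 0
for checkpoint in (10, 100, 1_000, 10_000):
    while trials < checkpoint:
        successes += random.random() < true_freq  # observe one more outcome
        trials += 1
    estimate = (successes + 1) / (trials + 2)     # Laplace's rule of succession
    print(f"after {trials:>6} observations, the epistemic estimate is {estimate:.3f}")
```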
This is obvious in practice (as I also show with many examples in Proving History). When a gambler has a high confidence in the honesty of a certain game, their subjective estimates of the probability of winning a bet are essentially synonymous with what they plainly observe to be the objective probability of winning that bet. They are never exactly identical, because no matter how high the gambler’s confidence, there is always some nonzero probability they are missing some crucial piece of information—such as would indicate the game is in fact rigged, or that the physical facts affecting the real frequency of outcomes are different from what has been observed (e.g. a slight defect in a die or deck unknown to the casino).
And the same reality iterates to every other case in life, which really does reduce, as de Finetti says, to yet another gambling scenario. We are betting on conclusions. We aren’t necessarily wagering money, but confidence, trust, and so on; yet it’s still just placing bets on outcomes. And our estimates of the likelihood of winning or losing those bets are still synonymous with what we mean by subjective probability. So subjective probability is just another frequency. Frequentists and subjectivists are talking past each other, not realizing they are actually just talking about different frequencies: the “frequentist” is focused on the real, actual frequency of a thing; the “subjectivist” is focused on the frequency of being right or wrong about that thing. But the subjectivist is correct that we never actually know “the real, actual frequency” of a thing; only ever (at best) the frequency of being right or wrong about it.
There are some “real, actual frequencies” we come close to having actual knowledge of, but even those we don’t really know, owing to small probabilities of our information being wrong. Frequentists become seduced by this into forgetting that every probability they are sure they “know” always might be incorrect. The subjectivist is thus using probability in the only sense actually coherently applicable to human knowledge. The frequentist is living in a fantasy world; one that often coincides with the real world just enough to trick them into thinking they are the same.
Why “Frequentism” Is Defective as a Methodology
As for example when the frequentist insists only “frequentist statistics” is valid, oblivious to the fact that no frequentist methodology accounts for even known (much less inevitably unknown) rates of fraud and error in those very methods. All frequentist methodology can ever determine (if it can even determine this at all) is the probability of random chance producing the same data, the so-called null hypothesis. Determining whether you can exclude the null hypothesis or not only tells you the probability that such data could be caused by chance. But that can never tell you the hypothesis you are testing is probable—because there are many hypotheses that could explain those same data besides random chance.
The frequentist method can’t even determine that the data was produced by chance; all it can do is tell you whether the data could have been produced by chance. But on the reverse side, neither can it tell you any tested hypothesis is probable, because it does not even account for competing hypotheses. Including fraud, experimental error, or unknown or unconsidered causal factors and models. There is no way to account for these other possibilities except by Bayes’ Theorem (or methods reductively identical to it). Because there is no way to affirm any hypothesis is probable, without affirming a probability for all these other causes. And only some form of Bayesian reasoning can validly get you there.
This is because it is logically necessarily the case that the probability of any hypothesis h can never be other than the complement of the sum of the probabilities of all other possible causes of the same observations (including, incidentally, chance). So it is logically impossible to “know” the probability of h without thereby claiming to “know” the probability of every alternative within ~h. And those latter probabilities can only be known to an approximation. Which means, ultimately, subjectively.
Thus frequentism as a methodological ideology is logically defective and can never produce human knowledge of almost any useful kind. Whereas the only useful features of frequentism—the formal logic of probability and its associated mathematical tools, and the frequentist interpretation of probability—are already fully subsumed and entailed by Bayesianism. Frequentist methodologies are still great for determining the probability that a data set could be produced by chance “assuming nothing else is operating” (no fraud, no experimental error, no sampling error, and so on—which can never be known to 100% certainty). But they have literally no other use than that. And human knowledge cannot be built on such limited information. “This was probably not caused by chance” does not get you to “This was therefore caused by h.”
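A minimal sketch of that last point, with illustrative assumptions about the base rate of true hypotheses, test power, and significance threshold: even a “statistically significant” result can leave the tested hypothesis far from probable once the prior is accounted for, and that is before fraud or experimental error are even considered.

```python
# Sketch: why "we rejected the null at p < 0.05" is not "h is probably true."
# The base rate, power, and alpha below are illustrative assumptions only.
base_rate = 0.1   # assumed prior: fraction of tested hypotheses that are actually true
power     = 0.8   # P(significant result | h true)
alpha     = 0.05  # P(significant result | h false), i.e. by chance alone

p_significant = power * base_rate + alpha * (1 - base_rate)
p_h_given_significant = power * base_rate / p_significant
print(f"P(h | significant result) = {p_h_given_significant:.2f}")  # ~0.64, not 0.95
```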
What frequentists don’t realize (and this isn’t the only thing they don’t realize) is that when they infer from their results about null hypotheses that some specific causal factor or model is operating—in other words, that we can claim to know some specific hypothesis is true (even just that it isn’t “random chance”; or indeed even that it is)—they are covertly engaging in Bayesian reasoning. They assume, for example, that the prior probability of fraud or experimental error is low, low enough to disregard as a trivial probability of being wrong. They assume, for example, that the likelihood ratio would, if someone were to complete it, strongly favor the hypothesis they are now claiming is true, rather than still leave a problematically high probability that some other hypothesis is causing the evidence instead. These are all Bayesian assumptions.
And the only logically valid path from “we got a frequentist result x for h” to “hypothesis h is probably true” is Bayes’ Theorem. It’s just all going on subconsciously in their heads; so they don’t realize this is what their brain is doing; and they aren’t taking steps to analyze and unpack that inference in their head to test whether it is even operating correctly. Which means Bayesians are actually less subjective than frequentists; for Bayesians at least admit we need to account for, and objectively analyze, how we are getting from A to B. Frequentists just leave that to unanalyzed, unconscious, and thus entirely subjective inference-making.
We get the same outcome when we go back and look at Godfrey-Smith’s other named source for the theory of subjective probabilities, Frank Ramsey, who published three seminal papers on the subject, starting with “Truth and Probability” in 1926, which was published in the 1931 collection The Foundations of Mathematics and other Logical Essays (all three papers are available in combined form online). There he does, less formally, the same thing de Finetti did, equating subjective “degrees of belief” with bet-making—hence reducing degrees of belief, yet again, to frequencies. Only, again, frequencies of being correct or mistaken about a thing, rather than the actual frequency of that thing. Because the latter is always inaccessible to us, and only capable of increasingly accurate approximation.
Indeed, as Ramsey says (on p. 174 of Foundations): “This can also be taken as a definition of the degree of belief,” that an individual’s “degree of belief in p is m/n” whereby “his action is such as he would choose it to be if he had to repeat it exactly n times, in m of which p was true, and in the others false.” This is literally a frequency definition of subjective probability. So any frequentist who claims subjectivism isn’t just another form of frequentism (and indeed one that more accurately describes human knowledge) literally doesn’t know what they are talking about, and thus cannot have any valid criticism of subjectivism, certainly not on the grounds that probability is always a statement of frequency. The subjectivist already fully agrees it is. And what criticism then remains? That we have direct access to objective probabilities and therefore can dispense with subjective estimates of objective probabilities? In no possible world is that true.
It also follows that no amount of complaining about “but different people might come to different estimates or start with different assumptions” can be an objection to Bayesianism either. Because the same complaint applies equally to all epistemologies whatever. The solution is the same in all of them: disagreements must be justified. They who cannot justify them are not warranted in maintaining them. And that means justifications that are logically valid, from premises that are known to a very high probability to be true (and that includes premises about facts we are less certain of: e.g. that we are reasonably certain we are uncertain of some fact x is itself data, which also logically constrains what we can assume or conclude). Everything else is bogus. In Bayesian as much as in any other epistemology. (I thus discuss how epistemic agreement is approached among disagreeing parties in Proving History, e.g. pp. 88-93.)
Godfrey-Smith’s “Fix”
Admittedly, when he gets around to articulating what he would replace Bayesianism with (pp. 210-18), Godfrey-Smith concedes “it is not clear which of these ideas are really in competition with Bayesianism, as opposed to complementing it” (pp. 210-11). Indeed. Not a single idea he proposes is anything contradictory or complementary to Bayesianism: his fixes are Bayesian! That he doesn’t know this confirms my hypothesis: he only objects to Bayesianism because he doesn’t understand it.
Godfrey-Smith first argues correct epistemology must operate by “eliminative inference,” i.e. ruling out alternatives (pp. 212-13). But here he even admits John Earman has provided a Bayesian framework for that, and further admits that framework might make the most sense of the logic of eliminative inference. Indeed I think it is the only framework that does, as in, the only way to construct a logically valid argument to the conclusion that one hypothesis is more likely, when all known competing hypotheses are shown to be less likely. That’s indeed one of the most important lessons of Bayes’ Theorem: that theories can only be validated by trying really hard to falsify them, and failing.
Verification doesn’t work; except when it consists of failed falsification. Because the only way to increase the probability that h is true above all known competing hypotheses is for the probability of the available evidence to be greater on h than on any known competing hypothesis. And that’s simply the likelihood ratio in Bayes’ Theorem. Likewise that evidence cannot merely be more likely; it has to be more likely on h, by as much as the prior probability of ~h exceeds that of h—and yet even more, if you want confident results and not merely a balance of probability. But even to just get to “probable,” when h starts with a low prior, you need evidence that is even more improbable on ~h. As demonstrated with the mammogram case: you need exceedingly good tests for cancer, to overcome the very low chance of even having cancer, otherwise you end up with far more false positives than true results. And that’s as true in that case as for any other epistemic problem: if your methods are going to give you more false positives than correct results, your methods suck. Hence there is also no way of escaping the need to account for base rates, and thus prior probabilities. It’s Bayes’ Theorem all the way down.
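In odds form this is easy to see. A quick sketch, assuming a base rate of 1 in 1000: the likelihood ratio has to exceed the prior odds against h just to reach even odds, and has to exceed them by far more to warrant confidence.

```python
# Sketch of the odds form of Bayes' Theorem: posterior odds = prior odds x likelihood ratio.
# The 1-in-1000 base rate and the likelihood ratios are illustrative assumptions.
prior_odds = 1 / 999  # prior odds of h from a base rate of 1 in 1000

for likelihood_ratio in (10, 999, 10_000):  # P(e|h) / P(e|~h)
    posterior_odds = prior_odds * likelihood_ratio
    posterior = posterior_odds / (1 + posterior_odds)
    print(f"LR = {likelihood_ratio:>6}: P(h|e) = {posterior:.3f}")
# LR = 10 leaves h improbable (~0.010); LR = 999 only gets you to 0.500;
# only evidence far more unexpected on ~h than that yields a confident posterior.
```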
The important point though, is that the only way to “rule out” a hypothesis is to reduce its epistemic probability. Because you can never get that probability to zero. There is no such thing as deductive elimination. There is always some nonzero probability, for example, that even when you see an elephant, “eliminating” the theory that there is none, you are wrong—you saw an illusion or hallucination instead, and the hypothesis that there is no elephant hasn’t been eliminated after all. And this is true for every possible observation you think eliminates some theory. So it’s always, rather, a question of how unlikely such “theory rescuing” conditions are, given the information you have (e.g. the base rate of such illusions or hallucinations, given all your background information about your situation and condition and so on). Hence, still Bayes.
Godfrey-Smith next argues that good methods must be correlated with reliable procedures, what he calls “procedural naturalism.” This is also just a restatement of a Bayesian principle. For a “reliable procedure” is simply any procedure that produces evidence unlikely on the disconfirmed theory and likely on the confirmed one—that is literally what a reliable procedure is, by definition. And the more reliable, the more it diverges those two probabilities. For instance, procedures that reduce the prior probability (the base rate) of fraud to a very low level are more reliable than those that don’t, precisely because they reduce the probability that the evidence is being observed owing to fraud and not the tested hypothesis. Otherwise, the evidence is too likely to be the result of fraud to trust it as evidence for the hypothesis. And so on, for every other source of observational error.
This is also a really important lesson we learn from Bayes’ Theorem: weak tests do not make for strong conclusions. There is a tendency for people who want to go on believing false things to “test” them by using weak methods of falsification rather than strong ones, so they can claim their belief “survived a falsification test.” But a falsification test only makes your belief likely if that test has a really high probability of exposing your belief as false if it is false. If it is likely to fail to detect a false belief even when that belief is false, then it is not a reliable procedure. And as it leaves the evidence likely even on ~h, surviving the procedure is no longer evidence against ~h.
So all questions about “procedure” simply reduce to the effect a procedure has on the Bayesian priors or likelihood ratio.
Conclusion
None of Godfrey-Smith’s objections to Bayesianism reflect a correct understanding of it. He falsely believes prior probabilities are not constrained by background information—even though that’s what the “b” means in P(h|b): the probability of h given b. Not “given any willy-nilly assumptions you want to make.” He also falsely believes subjective probabilities (“degrees of belief”) are not frequencies. And he doesn’t know what they are frequencies of; or how this demonstrates the supremacy of Bayesianism over all other constructions of probability in respect to actual human knowledge (which can only ever achieve knowledge of subjective probability). And his “solution” is to propose principles already entailed by Bayesian logic: that evidence has to make alternative hypotheses less likely, that the only way to do this is to collect evidence that’s unlikely to be observed on alternative hypotheses, and that reliable procedures are by definition those capable of doing that.
Consequently, Godfrey-Smith would be a Bayesian, if ever he correctly understood what Bayesianism was and entailed. And as I have found this to be the case a dozen times over, in fact in every case I have ever examined, I am starting to suspect this is the only reason anyone ever isn’t a Bayesian.
I disagree, I didn’t understand a bloody word, but I know you do; and given the coherence and transparency of your other arguments, I trust your judgement and don’t think you are pulling the wool over our eyes.
But then I have no axe to grind.
PS: I lied. I did understand this… Not a single idea he proposes is anything contradictory or complementary to Bayesianism: his fixes are Bayesian!
Oh, indeed, that’s not even a disagreement. That “if p, then q” does not entail that “if q, then p,” as that would be the Fallacy of Affirming the Consequent.
Ergo, that everyone who is against Bayesianism does so because they don’t understand it, does not entail that everyone who does not understand it is against Bayesianism.
And that doesn’t even require understanding Bayesianism. Just standard logic.
Brilliant LOL!
Another truly great blog by Dr. Carrier!
About 25 years ago, one of my jobs was working in a book store, which included trying to sell membership cards to customers. One time I said to the other employees, “May heaven strike me dead, if I don’t sell membership cards to the next four customers in a row!” This was a very bold statement, because I was betting my life on the subjective probability that God doesn’t exist or, at least, doesn’t strike people dead. And that subjective probability was based partly on the balance of evidence against God’s existence and the absence of any evidence making it probable that God strikes people dead. The situation made my bet even bolder, because employees previously very seldom sold as many as two memberships in a row, and never three or more in a row. The surprising thing about this situation is that I actually sold memberships to the next 4 customers in a row!
In section Can We Avoid Prior Probabilities? paragraph 6, how do sentences 2 and 6 cohere with each other?
With admiration for Dr. Carrier's work,
Barry Rucker
If I’ve counted right, you are asking about these sentences:
“If, for instance, we have literally no evidence showing a higher or lower frequency for h-like hypotheses over ~h-like hypotheses, then there is no logical argument by which you can declare P(h) to be anything other than 0.5.”
And
“Even the total absence of data constrains us, forbidding bizarre priors by the very fact that the total lack of information renders them a priori improbable.”
A bizarre prior would be at an extreme, e.g. 0.01 or 0.99. That would definitely violate the rule that in the total absence of data (hence lacking any data making h more or less likely than ~h), the prior must be 0.5 (which includes ranges balanced around 0.5). Because 0.5 is not a bizarre prior; especially in the absence of data, when indeed it is the only logically defensible prior.
But let me know if I’ve missed what you are asking about.
The idea that frequencies lead directly to degrees of belief in some straightforward way (note that these are two different things, so equating their values needs an argument) has been criticized, notably by David Lewis if I recall correctly.
The “mammogram example” deals with items which have a frequency, and hence, by random sampling, a way to get a probability. But note there the probability is a propensity (of the test instrument, etc.) not a frequency, never mind a degree of belief without further ado.
So take a hypothesis like: “hydrogen and oxygen react to form water”. What frequencies apply? What should be my degree of belief if no frequencies apply? Bayesian epistemologists have answers, but they vary. My colleague from CMU Teddy Seidenfeld and his “school” were investigating imprecise probabilities precisely because not “everything” has a relevant frequency. Note: non-Bayesians like me and Bayesians like Seidenfeld are not denying the background knowledge here; we just dispute how it should be used.
“No Bayesian I have ever read has said any such thing.” – Yes, they have. Richard Jeffrey, Peter Gardenfors, etc. have written this at one time or another. (I am not sure if the latter still does; Jeffrey is dead, IIRC.) This is where people appeal to appropriate convergence theorems.
” He only differs from other frequentists in realizing that the frequency we are talking about is actually only ever the frequency of our being right or wrong about what’s in the world; not the actual frequency of things in the world.”
And we know now this runs into severe problems. In fact in 1939 Jean Ville’s book shows that probabilities are not frequencies (one needs other ingredients, effectively). The notion of the “collective” is mathematically ill defined, to cut to the chase.
“Verification doesn’t work; except when it consists of failed falsification. Because the only way to increase the probability that h is true above all known competing hypotheses,”
Only on a model where a single probability attributed to hypotheses applies. One cannot assume this.
If you are interested in the critique of Bayesian epistemology and contemporary theories of belief dynamics, etc. from someone who studied it in graduate school, you can let me know. There’s much more than I really can do here. (I hate to toot my own horn, but …)
The most important thing to my mind is the background knowledge – I agree vehemently that this is to be represented somehow when evaluating new claims, but how to do it is the question.
All this is addressed in Ch. 6 of Proving History. So you may need to consult that.
But in respect to the specific points you bring up:
“The idea that frequencies lead directly to degrees of belief in some straightforward way…”
No one said it was straightforward. You might be confusing semantics with process here. That epistemic probabilities are semantically just assertions of (one’s estimate of a) frequency is straightforward. How you derive and justify those assertions is not. I show the complex relationship at the end of Ch. 6.
“So take a hypothesis like: “hydrogen and oxygen react to form water”. What frequencies apply?”
If you mean “will do so in this particular instance I’m concerned with” (e.g. that at the next launch of an H-O powered rocket that combustion outcome will occur as predicted) that would be different from “will always do so in every instance,” but since the observed frequency of failing to do so is zero, they are effectively the same probability at present (though not literally).
Hence the question reduces to:
In b are billions of observations of h and zero observations of ~h. The prior probability of ~h (absent interfering circumstances, e.g. the presence of a sufficient quantity of halon) is therefore effectively zero. Even before we get to the background theoretic support for h, which only further lowers the prior for ~h. So we disregard it. Although if someone wanted to seriously challenge it, one can estimate a Laplacean bound for it (just as Laplace did for sunsets). See PH, index.
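For readers unfamiliar with it, the Laplacean bound mentioned here is just Laplace’s rule of succession; a minimal sketch, with an illustrative observation count:

```python
# Sketch of a Laplacean bound: after n uniform observations of h and none of ~h,
# estimate P(~h on the next observation) as 1 / (n + 2). The n below is illustrative.
def laplace_bound(n_observations_of_h):
    return 1 / (n_observations_of_h + 2)

print(laplace_bound(1_000_000))  # ~1e-06: not zero, but small enough to disregard
```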
But note that the inventor of halon did not need to overcome such a prior; he already had a background-theoretic reason for a higher prior of ceasing combustion. And even if someone were to discover something like halon by accident, the number of observations of its effect that would be made before scientists agreed it was actually producing ~h would itself yield a likelihood ratio favoring ~h large enough to overcome any plausible Laplacean bound for P(~h).
Particularly as ~h is predicated on “absent interfering circumstances,” and the introduction of an interfering substance alters the prior probability, which now must account for “how often interfering substances are found to change chemical reactions,” which has a much higher prior on any informed b. Whereas if someone were claiming to do it with their mind, this is no longer the case. It would take a very great deal for someone to prove their demonstrations of such a power were not fraudulent. But if it were indeed real, eventually the probability they were fraudulent would fall below the Laplacean boundary and we’d have enough evidence to believe it.
But yes, when there are disputes as to what ranges of frequency fit the data (which requires resolving what, semantically, is being asserted about the data), they must be resolved with recourse to justifications, same as any other claim, in which uncertainties must also be acknowledged and quantified. This is the same in all epistemologies. So there is no way of escaping it. It therefore cannot be an objection to Bayesianism. That which is true of all epistemologies, argues against none.
“Richard Jeffrey, Peter Gardenfors, etc. have written this at one time or another.”
No, they didn’t. So I am pretty sure you are not reading what they said correctly. And we can test this: give me a quote (with citation so we can examine the context) of either man saying that we can assign literally any prior we want to any h without constraint. See what happens.
“Jean Ville’s book shows that probabilities are not frequencies (one needs other ingredients, effectively).”
Probabilities are semantically just an assertion of a rate. Which is just a frequency. Perhaps you are confusing the semantic definition of a rate “as such,” with what resources are needed to determine a given rate?
“Only on a model where a single probability attributed to hypotheses applies.”
False. I show in Proving History that we routinely operate with ranges of probability (which, when we want to, is formalized into a confidence interval for every given confidence level), and single probabilities are just a convenience of notation. No one actually relies on a single probability; they just assume the range around the probability they are working with is too small to matter (which will vary by how much is required in the circumstances “to matter”; which means, when the variances are below our margin of error, semantically they cease to matter, and pragmatically they cease to matter in any particular case when the margins of error cease to make any difference we care about in that case). This poses no problem mathematically or logically. It’s dealt with all the time.
“If you are interested in the critique of Bayesian epistemology…”
Please recommend the top three critiques (books, chapters, sections, or articles) in your best judgment. But please, only examples written by authors who actually understand Bayesianism and thus don’t make mistakes like Godfrey-Smith. I need steel men. Not self-made straw men.
“That epistemic probabilities are semantically just assertions of (one’s estimate of a) frequency is straightforward.” Ville shows this cannot be correct, because probabilities cannot be frequencies. The reference here is to Ville’s whole book. David Lewis’ critique of the Principal Principle is also relevant here; for that see the SEP entry on Lewis: https://plato.stanford.edu/entries/david-lewis/. Also, Kyburg, Smullyan and the so-called lottery paradox applies here too. For that one, people have even proposed a non-adjunctive logic to get out of the mess: my late teacher Horacio Arlo-Costa has written about that. (For example: “Non-Adjunctive Inference and Classical Modalities”, The Journal of Philosophical Logic, 34, 581–605.)
As for the water example, sorry to do the Socrates thing, but I was deliberately elliptical. Water and oxygen do not appreciably react at standard pressure and temperature, so the frequency is not of that event: it isn’t even well defined what the event is. (Here one needs a metaphysics of events and event types, arguably.) The aside aside, what is one counting? One has to state that, and then one runs into the problem with the appropriate collective again (or reference class, sometimes). Does one count molecularly or macroscopically? I bet if you took a match and lit hydrogen and inspected over a short period there would be billions of unreacted molecules of hydrogen and oxygen. But that’s still not a frequency. One can, maybe, turn this into a rate (that is, introduce additional hypotheses to calculate one), but no chemist does this. Why?
“This is the same in all epistemologies. So there is no way of escaping it. It therefore cannot be an objection to Bayesianism. That which is true of all epistemologies, argues against none.”
And this is where you make your mistake, as far as I am concerned: as I said, I have no problem with representing the background knowledge and understanding that it is a big part of the puzzle of belief and knowledge. But that’s not Bayesianism. As you say yourself, that’s epistemology’s puzzle generally. But one needs to argue that B. is the way to go – and worse still, which Bayesianism. I actually was tempted to write a follow-up paper (to my critique of certain Bayesian epistemologies) about the problem of how to apply Bayesianism to itself.
“Probabilities are semantically just an assertion of a rate.”
Perhaps that is correct in ordinary language. (It wasn’t at the time of Locke, who seems to think that “probability” is a predicate of arguments – see the Essay.) But an appeal to ordinary language doesn’t cut it with me, at least, especially with Ville, Popper, Bunge and others providing counterarguments.
“which, when we want to, is formalized into a confidence interval for every given confidence level” – not last I checked in on Seidenfeld’s (and others’) work on imprecise probabilities. The wikipedia article does cover a bunch of alternatives: https://en.wikipedia.org/wiki/Imprecise_probability – note especially that some combine propensities (or measured frequencies as a proxy for this) with degrees of belief explicitly.
As for Jeffrey and Gardenfors: Jeffrey (and I disagree with his arguments, but he does again address the frequency thing) https://www.princeton.edu/~bayesway/Book*.pdf seems to cover it in the very first exercises – I have other references if necessary – I’ll get those later.
Gardenfors: Knowledge in Flux covers this (and also other problems that I have not mentioned, like epistemic probability without Bayes’ rule as the update “mechanism” [!]). See the introduction (all that is needed here), as the specific target is belief change, not initial views. Horacio Arlo-Costa (see above) also always made that clear: we start from where we are, and our revisions are what matter rationally. (I am not sure if he ever argued that in print; it was one of our interesting disagreements.)
Seriously?
Cannot be. Do you really mean to use that word? Because this is starting to sound like crankery.
Obviously probabilities can be defined as rates (m per n), which is a frequency. It is rather harder to show they can be anything else. But to suggest they cannot be that, is demonstrably absurd.
Either Ville is a crank, or you have not correctly stated his assertions.
The other stuff you cite, at a glance, does not appear to be relevant here. So you are going to have to try harder to explain what it’s supposed to show that has anything to do with the semantics of real-world human discourse on probability.
Of course it is. We have vast quantities of water next to oxygen; here on earth, and all through the cosmos observed by astrophysicists. We’ve even put pure water in pure oxygen in lab conditions. The “number of sunsets” here in Laplacean terms is astronomical. The thing being counted is well defined. And the count is so vast it’s beyond our means to even tally. Which means the probability that oxygen reacts with water is vanishingly small, small enough to ignore it.
You seem to be incapable of comprehending how we even built the Periodic Table and formed the foundations of modern chemistry.
Qua Feynman, per molecule.
Which we can proxy by mass.
I don’t know what you are even talking about here. You’ve switched now from water to hydrogen. Chemists certainly can calculate how many molecules of hydrogen will react to how many molecules of oxygen under a given energy input (e.g. a lit match). They do this all the time. Practically the entire field of industrial chemistry is about doing this!
We also of course now know why oxygen never reacts with water. So the probability is even lower than mere observations would leave it.
Bayesianism is the realization of what we are doing when we do this. Indeed, it is the realization of the only thing we can be doing when we do this.
See Ch. 4 of PH.
Those alternatives either do not describe real world human knowledge, or reduce to frequency definitions. Everyone in history who has said otherwise is simply wrong. If you want to maintain they are not wrong, you have to do more than cite their existence.
Oh? Where? Citing the whole book does not validate your claim as to what’s in it. Your claim as to what’s in it is false. If you wish to prove otherwise, you’ll have to find the quote and page number. Or admit that you can’t and that what you claimed about it is false.
Ditto.
I just wonder why people would have objections against Bayesianism? As Dr. Carrier proved in many of his writings and beyond any reasonable doubt, Bayes’s Theorem is just the formal description of a logical process. You can perhaps have criticism of where and when to apply it (just as you could in the case of frequency analysis), but criticizing the method is like criticizing calculus or logic itself. You can only do that using illogical (and thus fraudulent) arguments!
It would be interesting to do a proper study finding out. But informally, I have so far found:
The most common reason is fear of math. Historians and philosophers don’t want to learn math and so they have to deny anything they are saying is analytically mathematical. (Like Bart Ehrman.)
The next most common reason is not understanding math. Confronted with a Bayesian analysis, they interpret what is happening incorrectly, correctly find fault with their incorrect model, and then reject what they just misunderstood. (Like here with Godfrey-Smith.)
And of course, those two reasons really always collapse to the second. And all other reasons (below) also rely on that second reason to resist the conclusion.
The next most common reason is Status Quo Fallacy. People are disinclined to admit their thinking on a subject was incorrect and they need to change. Conservative thinkers (even centrist conservatives) most of all. Resisting change, they rationalize any reason to reject Bayesian epistemology because “their” epistemology cannot have been wrong.
After that is a basket of random weird reasons. Like John Loftus, who thinks Bayesianism entails “conceding” so much to theists that it will doom the world and we’ll be overrun by theists and atheists will be crushed under the hammer of their own making and hell will reign on earth for a thousand years. I’m exaggerating. But that is essentially what he thinks. Which is just another iteration of the second reason: he doesn’t understand what he’s talking about, and refuses to (I suspect, really, for reason three; although he often states reason one: he “doesn’t want to do math,” while ignoring my repeatedly telling him Bayesianism doesn’t even require “doing math” in the sense he means).
Great article! Incidentally, a couple of years ago Loftus blocked me on FB for arguing against one of his anti-Bayesian posts. It was especially strange because I wasn’t being rude or insulting. It was purely a dispute over the issues, with no invective. I was kinda shocked he felt the need to block me over it. Seemed like he was on an emotionally charged jihad against it. I also had a similar problem with Richard Miller, although he didn’t block me. Peculiar.
I wrote to you earlier, on behalf of our Lancaster, PA atheist group, about Freke & Gandy’s book “The Jesus Mysteries”, and we all appreciated your reply. Now we want to ask about the book “Deciphering The Gospels” by R.G. Price (Foreword by Robert M. Price). The idea that the first gospel was originally written as fiction, with many allusions to the OT, was new to us–what do you think? Will you be publishing your ‘easier to read’ work on the origin of the gospels soon? We are awaiting it. Thanks! Elaine Olson, RN, MA
My briefer book on historicity should be out by the end of this year.
R.G. Price’s theories are worth considering (I even mention him in OHJ), but I don’t think he has an adequate theory of the origins of Christianity, only perhaps of some of the symbolic intent of the Gospels (and much of that more speculative than provable). But the Gospels were created a lifetime after the religion began. And their pertinent content isn’t found in earlier Christian literature (the letters of Paul, Hebrews, 1 Clement, etc.). But I have not had time to fully examine his case so my opinion is only preliminary, not definitive.
Thank you for this help, I’ll share it tomorrow evening. We are all looking forward to your book. Elaine Olson
In my case it is because I am not convinced that a probability calculus is the correct way of representing knowledge nor am I sure of which one of very many different Bayesianisms to use, which is itself an argument. (See elsewhere on this thread.)
That there are “very many different Bayesianisms” is no different from the fact that there are “very many different logics.” It would be irrational to conclude that therefore “logic does not provide a correct description of knowledge.” Ditto Bayes’ Theorem. The more so when most of those “different” Bayesianisms are really just incorrect; and that there are nevertheless still many valid ways to operate it is the same as the fact that there are still many valid logics and mathematical axiom systems we can use to describe the same things.
Meanwhile, that Bayes’ Theorem has to be the correct way of representing knowledge can be shown deductively:
(A1) All knowledge consists of assertions regarding the probability of a proposition’s truth.
(A2) All assertions regarding probability must obey the logic of probability.
(A3) All knowledge must obey the logic of probability.
(B1) Bayes’ Theorem is entailed by the logic of probability.
(B2) That which must obey the logic of probability cannot disobey that which is entailed by it.
(B3) Therefore, knowledge cannot disobey Bayes’ Theorem.
(C1) Assertions regarding the probability of a proposition’s truth reduce to no other assertions than those contained in Bayes’ Theorem.
(C2) That which reduces to no other assertions than those contained in Bayes’ Theorem will be fully described by it.
(C3) Knowledge is fully described by Bayes’ Theorem.
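For reference, premise B1 is just the standard textbook derivation of the theorem from the definition of conditional probability and the law of total probability (every term is also conditional on b, suppressed here for readability):

```latex
P(h \mid e) = \frac{P(h \wedge e)}{P(e)}
            = \frac{P(e \mid h)\,P(h)}{P(e)}
            = \frac{P(e \mid h)\,P(h)}{P(e \mid h)\,P(h) + P(e \mid \lnot h)\,P(\lnot h)}
```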
I have a more formal syllogism demonstrating premise C1 above in Ch. 4 of Proving History, which is even more inescapable and leads to the conclusion with insurmountable force.
(The water + oxygen example above should have been hydrogen + oxygen everywhere.)
Meanwhile, as to (A1): non-subjectivists dispute this. In fact, if one wants to appeal to language, one can see this in the terminology used uncritically in many didactic remarks about probability theory: “events” are the domain of the probability function.
As for non-classical logics: our debate is analogous to the debate over certain classes of them. Meaning: even if one accepts that propositions have a probability, one still doesn’t get Bayesian notions out, if only because there are several notions of probability applied to propositions. This is why I referred to Gardenfors and that tradition: there is a whole range of other possibilities for updating.
The “imaging” (and other variants) folks do not deny the theorem itself, but its application. For example, in the theory of projectiles one can calculate that the landing time of a projectile is negative. This is a theorem; however, one denies that it applies to the world in the way the positive root of the equation does. One needs an argument for conditionalization as the rule.
As for calling Ville a crank, well, read the arguments: a (limit) frequency needs a collective. Collectives are not well defined, so probability is not a (limit of) frequency.
(One needs limits to make sense of probabilities like 1/√2, which are vital in statistical and quantum mechanics and elsewhere, because no finite collective works.)
So-called frequentist statistics do not require this assumption, as far as I can tell.
The mistake most make (perhaps you too – I don’t know) is something like: “we can estimate probabilities through a measurement of frequencies, therefore that’s what they are” – i.e., operationalism. But this is a mistake – it is somewhat like saying an electric current is an angle because that’s what an old school ammeter uses to show current intensity.
I don’t know what you mean by “Non-subjectivists.” All knowledge is subjective; there is no objective access to the real world outside the theatre of the mind. So I don’t know who you could be referring to. Perhaps this explains your whole problem: you seem not to understand what an epistemic probability is.
You also don’t seem to have a coherent point. There is no such thing as “events” that are not themselves the collection of billions of other events (or facts, as not all frequencies relate to time). What we choose to put in a set and call an “event” is an arbitrary human decision. So it is logically impossible that frequencies don’t apply to all sets however they are chosen. As soon as you select a set, that collection will have a frequency in background data.
Read Chapter 6 of Proving History to understand the nature of reference classing.
And I’m telling you all interpretations of probability either have no applicable use in human epistemology or reduce to the same definition applied in Bayesian probability theory: frequency. You have yet to present any evidence to the contrary, or even to address my deductive proof of this in Chapter 4 of Proving History.
I see no relevance of this to anything we are discussing.
If you are confusing idealized models (which are only useful fictions) with the complexity of real systems, you need to catch up.
That is not a logically valid argument. You just stated a non sequitur.
Language requires words. Words are not well defined, so a language is not words.
If you think that’s a valid conclusion, you need a refresher on how logic works.
What sets we choose to demarcate and observe the frequencies of is arbitrary. That in no way affects the conclusion that probability is a frequency. This is to confuse semantics with application, and imprecision with error. So if this is what Ville is doing, he is an A-1 crank.
I see no relevance of this to Bayesian epistemology.
No human knowledge exists with an error margin smaller than an infinitesimal. So infinitesimal variances in probability are irrelevant to human epistemology.
If scientists want to get at such things by various devices (like limit theory), they are welcome to. But it has no bearing on epistemic probability. You thus seem to be confusing epistemic probability here, with objective frequencies in constructed models. Which gets us back to the start: you don’t know what epistemic probability is.
Cite me an example of a peer-reviewed frequentist work that derives an epistemic probability for a hypothesis from a probability calculation involving 1/√2.
That’s a nonsense analogy.
You are confusing semantics again with applied model-making. Semantically, probability is synonymous with the affirmation of a frequency. Always and forever. In any context relevant to human epistemology (and all other contexts are irrelevant here, as we are discussing epistemology, not physical systems independent of it).
You might be uncomfortable with complex frequencies like 1/√2; most people are uncomfortable with infinitesimals. But that doesn’t make them “not” a frequency. These kinds of frequencies are of course already irrelevant to human epistemologies, which never deal in infinitesimal distinctions between probabilities, because no human precision comes anywhere near such a thing. But even if we wanted to explore imaginary situations, like some sort of god-being who can actually be infinitesimally certain of a frequency, all we are doing is dealing in transfinite frequencies. A frequency is a ratio, a rate: m to n. In no way does that require the ratio to be finite or rational. Any rate, like 1 black floor tile per every “square root of 1 square yard” of floor tiles, is still a frequency. How we would work with those awkward frequencies mathematically may involve certain limits and tricks or even be out of our means. But that doesn’t change semantically what that is: one tile per a given area; one photon per a given area; a.k.a. a frequency.
Even time ratios can be irrational, yet still a rate, a frequency: if we have a detector receiving one photon per second, and another detector beside it receiving one photon per square root of a second, we still have a ratio between them that is a frequency. But even the second rate is a frequency all by itself. That that second rate of photons doesn’t ever land “exactly” on a second mark on an arbitrary human clock is irrelevant. That human mathematical tools have a hard time running calculations with a frequency that off-kilter is only a limitation of our tools; it has no effect on what is being measured semantically, which is still a frequency.
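A toy illustration of that last point, purely my own sketch (and reading “one photon per square root of a second” as one count every √2 seconds, just for the sake of the example): two idealized detectors with those rates still stand in a definite ratio to each other, even though that ratio is irrational, and counting over a long enough interval recovers it to any precision you like.

import math

def counts(rate_per_second, seconds):
    # Idealized detector: whole number of counts registered over `seconds`.
    return math.floor(rate_per_second * seconds)

T = 10_000_000                    # seconds observed
a = counts(1.0, T)                # one count per second
b = counts(1 / math.sqrt(2), T)   # one count every sqrt(2) seconds

print(a / b)          # ~1.414..., approaching sqrt(2)
print(math.sqrt(2))   # 1.4142135623730951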
You probably covered this somewhere. When I read John, I have the distinct impression that John is making a historicist argument. If so, that would entail that he’s countering either Docetist or mythicist arguments. Yet Christian apologists claim that Docetism came later and that nobody in early times ever claimed Jesus did not exist. What position do you think John is arguing against?
The edition of John in the canon is definitely pushing historicism, but it’s something like the third edit of the original, and the earlier drafts appear allegorical like Mark (we don’t have those editions, but we can tell from redaction analysis). I discuss all of this in my section on John in Ch. 10 of OHJ.
As for Docetism, see the index of OHJ. I think the reality is way more complicated than the usual line. Like Gnosticism, which has now been rejected as a modern fabrication (no actual such movement existed; what is loosely called “Gnostic” refers to elements present in various quantities or degrees in all Christian sects including so-called orthodoxy, or to polemical inventions of heresy hunters), IMO what is usually called Docetism today also did not even exist in the ancient world; and what is mistakenly called Docetism now is actually a slew of unrelated unorthodox sectarian teachings, some of which may in fact be mythicist. The anti-heretical writings that modern scholars “construct” Docetism from also only address heresies of the late second and early third century; they have no evident knowledge of any of the first-century sects (e.g. those Paul was calling anathema, and so on), or sects such as those attacked by Ignatius or 2 Peter, which actually do not align with any of the sects addressed by the heresy hunters (or else their polemics disguise the real teachings of those sects).
So, in short, no “date” for Docetism is valid. We cannot ascertain when any of the doctrines variously now called Docetic began, or what they looked like when they began. We can barely accurately reconstruct what they looked like by the time we have any writings against them. And those anti-heretical writings are too late to inform us of early heresies at all, much less “accurately.”
Hey there Dr. Carrier.
I have read “On the Historicity of Jesus” and I find it very informative. I often share some questions that you raised in the book on a Facebook group that we have here in Poland. We discuss everything related to Christianity. We have folks who are simple believing Christians, atheists, historians, philologists. Having discussed various topics from the book, some folks got interested in buying it, so it became a hotter topic. We were discussing elements 13 and 14 from chapter 5, “Elements of Christian Origin.” You quote Origen extensively to show that the gospels should be taken allegorically, not literally; they are only meant to be taken literally by the “simpleton,” because they are not educated enough to understand the true meaning. Since you quote Origen, Clement of Alexandria, and Eusebius to prove that the gospels are allegorical, one of the historians in the group accused you of ignoring “the school of Antioch.” Intellectuals of “the school of Alexandria” indeed emphasized the allegorical reading of Scripture, while intellectuals from “the school of Antioch” held to a more literal reading.
So the people that you cite, Clement of Alexandria, Origen, Eusebius, all come from the School of Alexandria. His argument:
The tradition starts with a Hellenistic Jew, Philo of Alexandria, who tried to explain the Jewish faith using Greek philosophy, mostly Plato.
This tradition matured over the first three centuries of Christianity to the point that man was considered a soul who descended to Earth and was temporarily associated with matter. Just as the human soul descended from above, so too was Scripture handed down from above; and in this tradition they likewise understood the (prior) descent of Jesus to Earth for the purpose of redemption and his subsequent return (ascension). For this reason this tradition is called “descending.” Antiochene theology, however, created mainly by Jewish converts forming a diaspora in that city, was referred to as “ascending,” because the more important thing was the moment of Jesus’ ascension, recalling the ascensions of Elijah, Enoch, Isaiah (the “Ascension of Isaiah”), and even Melchizedek. For them it wasn’t so important that Jesus was the son of God; it was important that he ascended to heaven, just like the previously mentioned prophets. The school of Antioch was more specific and less philosophical, and the Alexandrian school was just the opposite. It is important not to forget the context Origen is coming from. Even the quoted Eusebius was influenced by the school of Alexandria; he was a disciple of Pamphilus of Caesarea, the defender of Origen’s work. Eusebius absorbed a positive attitude toward Origen’s ideas with his mother’s milk. From this tradition, which is also under the influence of Gnosticism, probably comes the Gospel of John, which as we see is the most theological of all the gospels. In this case, Carrier forms his arguments from just one current.
It would be great if you could comment on that and maybe elaborate a bit more about the school of Antioch. Maybe a good idea for a blog post?
I do not know what you are talking about. We have no adequate evidence of any such school. Mainstream scholarship has now even abandoned Gnosticism as a valid term. And you are describing strange ideas. I don’t know what your question is, or what you are basing any of this on.
“School of Antioch” is the term referring to the style of theology formed in Syrian Antioch. Here’s what Wikipedia says (source: Cross, F. L., ed., The Oxford Dictionary of the Christian Church, New York: Oxford University Press, 2005, article “Nestorius”):
“The School of Antioch was one of the two major centers of the study of biblical exegesis and theology during Late Antiquity; the other was the Catechetical School of Alexandria. This group was known by this name because the advocates of this tradition were based in the city of Antioch, one of the major cities of the ancient Roman Empire.
While the Christian intellectuals of Alexandria emphasized the allegorical interpretation of Scriptures and tended toward a Christology that emphasized the union of the human and the divine, those in Antioch held to a more literal and occasionally typological exegesis and a Christology that emphasized the distinction between the human and the divine in the person of Jesus Christ. The school in general tended to what might be called, in a rather loose sense, an Adoptionist Christology.[1] Nestorius, before becoming Patriarch of Constantinople, had been a monk at Antioch and had there become imbued with the principles of the Antiochene theological school.[2]”
“The Catechetical School of Alexandria was a school of Christian theologians and priests in Alexandria.[1] The teachers and students of the school (also known as the Didascalium) were influential in many of the early theological controversies of the Christian church. It was one of the two major centers of the study of biblical exegesis and theology during Late Antiquity, the other being the School of Antioch.”
In OHJ, when you argue that the elite thought scripture should be taken allegorically, not literally, you quote Origen, Eusebius, and Clement of Alexandria. But apparently these folks represent only “the school of Alexandria.” There were also other elites, other church fathers, who taught just the opposite; they insisted that scripture should be taken literally. When arguing that Christianity started as a Judeo-Hellenistic mystery cult, you don’t talk about both competing theologies. You leave readers with the impression that what you argue is what the elite taught and that it was rooted in the origins of Christianity, but it seems like it was nothing more than two different competing theologies?
Also, what makes you think that Gnosticism is abandoned? Many scholars still talk about it, including Ehrman. Is there an article you can recommend?
I am not aware of any such texts before the 4th century.
Can you identify a text where “literalism only” is defended?
It is now a defunct concept. This report explains why.
Thanks to Eusebius, we know about Papias, who did interpret Scripture literally.
Also: Tertullian, Paul of Samosata (bishop of Antioch); and we have the Testaments of the Twelve Patriarchs, the Ascension of Isaiah, II Enoch. Literal interpretation was common; Origen and others who represent Alexandrian theology are possibly late to the game with their interpretations.
They may not say “literalism only,” because why would they? Literal interpretation is the natural way of reading a text. It’s only later that Scripture was taken allegorically.
Is the Westar Institute made up of the leading experts in Gnosticism? Correct me if I’m wrong, but I don’t see how their decision made “mainstream scholarship abandon Gnosticism.” I’m pretty sure many mainstream scholars still hold to the idea of Gnosticism, especially experts in this specific field.
The Westar Institute is the largest organization of mainstream scholars in the world voting on consensus resolutions. Their experts on this issue are some of the world’s leading experts on the subject of Gnosticism. And scores of voting fellows, representing hundreds of scholars, all agree they are right on this. So do I. I had come to the same conclusion independently years before, during my research for OHJ. And as this ruling just came down, more and more scholars who examine the case for its conclusions are coming over to them.
We have none of the works of Papias. And we have no quotes from Papias saying scripture had to always be interpreted literally (even by the elite, remember—he appears to have been writing principally for the non-elite). And nothing linking Papias to Antioch, either.
Tertullian is also not Antiochene. He isn’t even from the same continent. And he also never says scripture had to always be interpreted literally even by the elite. He is perhaps the most literalist reader of the Bible prior to the 4th century, but even he admitted allegorical readings were intended and should be used when appropriate. He only argued against heretical allegory; and did deploy appeals to selective literalism against those; but he was still not a literalist in the modern sense. And was also very conscious of what could be admitted publicly lest it turn the masses away from salvation.
We have no texts from Paul of Samosata. And no accounts of him peg him as exclusively literalist.
The Testaments of the Twelve Patriarchs, the Ascension of Isaiah, and II Enoch do not discuss any matters pertaining to Biblical interpretation, much less Gospel interpretation, that would indicate for or against an exclusive elite literalism.
And it simply isn’t true “It’s later when Scripture was taken allegorically.” Paul is already discussing scripture allegorically in his epistles, and indeed it was already standard in Judaism by then. Mark explicitly has Jesus recommend it. Revelation is explicitly written in an overtly allegorical style. And many early commentators on the Gospel (even Tertullian) accept allegorical readings when they deem them not dangerous to dogma or as necessary to its public defense.
Scripture had been taken by Jewish thinkers as both literal and allegorical, in whatever way suited the interpreter, since before Christians inherited the same principles; and those are the same principles espoused by Origen centuries later, who is far closer to Philo and Paul in this respect than to the later literalists of the 18th century (the first exclusive literalists I know of on record).
Thanks for the reply. I did some reading, the argument seems to make sense. If you don’t mind, there is one more thing that bothers me.
I was extremely surprised when you said that Inanna was crucified:
“For her, a clear-cut death-and-resurrection tale exists on clay tablets inscribed in Sumeria over a thousand years before Christianity, plainly describing her humiliation, trial, execution, and crucifixion, and her resurrection three days later”
Nowhere else is anyone saying that she was crucified. It seems like she was hanged, not crucified?
I’m not sure if you are aware of it, but there’s an article attacking your position regarding her crucifixion. I’m not asking you to respond to the whole article, but can you explain your interpretation of the evidence, since you are apparently alone on this issue?
http://www.biblicalcatholic.com/apologetics/JesusEvidenceCarrier.htm
The words for “crucified” in antiquity always included nailing up of corpses for display, and she is explicitly said to be nailed up for display (I quote the text verbatim in OHJ). The notion that such words only meant nailing up while alive is false. No such assumption ever attached to any ancient word. See my citation of abundant scholarship on this very point in OHJ, pp. 61-62.
I have heard you say that the probability statements (i.e. the priors & consequents) in a Bayesian equation are like premises of an argument. That made me think of the whole thing in a different way. Can we consider Bayes’ Theorem to be the deductive structure underneath inductive/abductive reasoning? I know that deduction and induction are usually juxtaposed categorically, but Bayes’ seems to suggest a different relationship is possible. A ‘valid’ deductive argument is one in which the premises guarantee the conclusion. In a Bayesian formulation, this would just mean that the posterior probability is valid so long as the math is correct. This doesn’t mean that the probability assignments in the premises are justifiable or warranted on the evidence and background knowledge. A valid Bayesian argument would only entail that the posterior probability was correctly derived from the calculation of the given premises. A ‘sound’ deductive argument is one that is valid and has true premises, and therefore a true conclusion. Perhaps we could say a ‘sound’ Bayesian argument is one that is valid (i.e. has a mathematically correct calculation of the posterior probability), and has priors and consequents that are warranted on the background knowledge. This seems to be a point of intersection between deductive and inductive/abductive reasoning. I was wondering if this makes sense. Is it accurate or appropriate to think of it this way? Just curious. Thanks!
Yes. That’s spot on, IMO.
The form is mathematical and not syllogistic (although one could, if one wanted, awkwardly construct a syllogistic form for it, that would be grossly inefficient and unnecessary). But it is still formally deductive (conclusions necessarily follow from the premises).
Analytically, BT explains what we are doing when we do inductive reasoning. And it explains why the conclusions of inductive reasoning are valid, when they are (and to also be sound, as usual, the premises also have to be known to a reasonably high probability).
I show this in Ch. 4 of PH where I take several common models of inductive reasoning (like Inference to the Best Explanation and the Hypothetico-Deductive Method) and demonstrate they are actually reductively Bayesian.
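To make that valid/sound distinction concrete, here is a minimal sketch in Python (my own illustration, not anything from PH): the function below always derives the posterior “validly” from whatever premises you hand it; whether the result is also “sound” depends entirely on whether the prior and the two likelihoods are themselves warranted by the background knowledge.

def posterior(prior, likelihood_h, likelihood_not_h):
    # Bayes' Theorem in long form:
    #   P(h|e) = P(e|h)P(h) / [ P(e|h)P(h) + P(e|~h)P(~h) ]
    # The arithmetic is deductive ("valid"); soundness depends on whether
    # the three inputs are warranted by the background knowledge.
    numerator = likelihood_h * prior
    return numerator / (numerator + likelihood_not_h * (1 - prior))

# Hypothetical inputs, for illustration only:
print(posterior(prior=0.5, likelihood_h=0.9, likelihood_not_h=0.3))  # 0.75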
You need to respond to this egregious new article:
https://digitalcommons.georgefox.edu/cgi/viewcontent.cgi?article=1285&context=ccs
Technically I already did (its argument isn’t actually all that new; see e.g. OHJ, p. 506 and the example I cover in my Carrier-Evans debate analysis). But I have also had this on a list of possible blog subjects for a while now. I may eventually do something on it. But it’s not a big priority. No mainstream scholar buys arguments like this. It’s pretty much just an apologetics thing.
Richard, I have a new book you will greatly appreciate. Here is a summary of the book’s chapters by the author.
https://errorstatistics.com/2019/03/05/blurbs-of-16-tours-statistical-inference-as-severe-testing-how-to-get-beyond-the-statistics-wars-sist/
Also worth checking is Nature’s latest article: Scientists Rise Up Against Statistical Significance.
Another one: “800 Scientists Say It’s Time to Abandon ‘Statistical Significance’,” at Vox.
FYI, some reviews of Mayo’s book.
https://statmodeling.stat.columbia.edu/2019/04/12/several-reviews-of-deborah-mayos-new-book-statistical-inference-as-severe-testing-how-to-get-beyond-the-statistics-wars/#.XLElsVXf2Z8.wordpress
Hi Richard, just a question for clarification / reference sake. In this article, your mention of John Loftus and Richard Miller links to an article titled “How Not to Be a Doofus about Bayes’ Theorem”. I was interested to see some examples — specifically from Loftus (not knowing Miller, though I’d be interested in those too) — but when I searched for their names in that article I couldn’t find any mention. Perhaps they got polished away during some editing or something?
I noticed a couple of comments in this article that mention Loftus and Miller, but unfortunately they don’t supply very much context and I wasn’t around during the time of the disagreement(s), so I can’t seem to figure out what they might have said. (E.g. I can’t find any quotes or links to their statements/articles. The one quote of the word “conceding” is again in reference to the “Not Be a Doofus” article, but I couldn’t find “conceding” at that article either.)
So, I’m just wondering if you could clarify or reference what the context of Loftus’ (and Miller’s I suppose 🙂 ) statements/arguments/articles/whatevers were? Maybe a quote, link, or even just an article title and/or date?
Oh, and in case reference to them was accidentally polished out of the “Doofus” article, I thought you might want to know about that.
It’s curious. A while back (several years now) I recall Loftus starting to take probability seriously, so I was kinda surprised to read that he seems to have thrown Bayes overboard. Would be interesting for me to get some perspective on how/where he went off track.
Cheers!
It isn’t relevant. They can either accept the conclusions here argued and thus renounce what they previously argued, or reject them and be a Doofus.
So if you want to know which now it is, and for some reason still want to know, you should ask them.
Their Doofus statements I lift as examples—but don’t credit to them, since I am making a generic and not a personal argument—appeared on Facebook principally and are scattered and buried in complex threads in numerous places. It isn’t even worth your bother digging through all that. It’s easier to just ask them point blank: what do they think of the positions here argued and why. Then you’ll know straightaway whether they are still a Doofus or not.
In short, this advice on how to detect a Doofus is for you, the reader. Not for them. They had their chance to learn all this already. I cannot tell you whether they learned that lesson by now.
Hi Richard,
I was going to email you this question but maybe it’s a good one to ask on the forum so others can chip in.
Am I correctly using the Bayesian calculator on CHRESTUS?
Here are my inputs: what I think are the strongest bits of evidence FOR Jesus.
Prior probabilities that Jesus existed:
worst case 10% best case 99%
Evidence 1: Brother of the Lord.
Worst case 95% best case 99%
(I wasn’t sure how to integrate my two assumptions here. I think it’s about 95% likely that Paul meant a fraternal brother, but also 5% likely that the Gal 2 passage is an interpolation.)
Evidence 2: Tacitus – which makes it into Dawkins’s new book as the best evidence FOR.
Worst case 20% best case 80%
(Considering possible interpolation vs Christians who knew the truth telling Tacitus that Jesus was historical)
Evidence 3: Silence of Paul/Epistles/Church fathers
Worst case 5% best case 50%
Why so low on both sides?
It’s not just Paul who seems to know little to nothing about Earthly Jesus. Even Irenaeus reports hearing Polycarp talking about seeing the Lord in an oddly detached manner:
“I can speak even of the place in which the blessed Polycarp sat and disputed, how he came in and went out, the character of his life, the appearance of his body, the discourses which he made to people, how he reported his intercourse with John and with the others who had seen the Lord, how he remembered their words, and what were the things concerning the Lord which he had heard from them, and about their miracles, and about their teaching, and how Polycarp had received them from the eyewitnesses of the word of life, and reported all things in agreement with the Scriptures.”
‘Seen the Lord’ is followed by stories only of THEIR miracles and THEIR teaching, but nothing about the miracles and teachings of Jesus. Odd!
The next 2 pieces are included because I asked the Prof of the History of Christianity at Oxford what he thought was the best evidence FOR historicity.
Evidence 4: The Christian Mission
Would there have been a Christian mission without a physical Jesus?
Worst case 33% Best case 95%
The worst case is based on the fact that if Jesus was a celestial being that makes THREE religions founded by angels, so Christianity would have a 1/3 chance of being one. (is this circular? It feels like maybe…)
I’m not sure what I’m doing with the best case number. I guess that reflects that angel-inspired missions don’t seem to be much barrier to a religion.
Evidence 5: Paul disagrees with Jesus about divorce.
My professor friend reckons Paul’s disagreement corroborates the teaching re divorce (NONE!) that is related in Matthew, which is nowhere in Scriptures, so would be new and therefore most likely from a real person.
Worst case 80% Best case 99%
I tried to make this one heavily in favour.
According to the calculator on CHRESTUS:
Overall: Worst case 0% Best case 27%
That’s still pretty low…I was surprised.
Could you please let me know if I have done something wrong?
Thanks!
Another round, this time really trying to make the strongest case, leaving out things like the silence of Paul/epistles:
Priors: worst 50% best 99%
Brother of the Lord
worst 95% best 99%
Mission:
worst 95% best 99%
Tacitus:
worst 80% best 95%
Overall – worst 26% best 48%
Wow, it’s really hard to get over 50%!
Can anyone who understands Bayesian reasoning explain why this happens?
I mistakenly thought the app was broken, but it isn’t. What I need to know, rather, is what odds you are entering, not percentages. The calculator doesn’t use percentages. It uses odds.
For example, you can do it all by hand. Odds Form is easiest, rather than percentages, which is why the calculator is built that way. Basic formula is Final Odds (on Historicity) = Prior Odds x Likelihood Ratio. Prior Odds for you on the low end will be 1/10; high end, 99/1. For “Brother of the Lord” you then have to ask: how much more or less likely is that piece of evidence on historicity than on non-historicity, and enter the resulting odds.
For instance, in OHJ I put odds on this on the low end of 1/2, meaning this evidence is twice as likely on nonhistoricity than on historicity (given the peculiarities I note that are usually overlooked); but on the high end I put 2/1, meaning this evidence is twice as likely if Jesus existed than if he didn’t (and is therefore evidence for historicity), again for the reasons I articulate in the text.
So when you say the probability of the Brothers of the Lord data is 95-99%, what do you mean? If that’s the probability of that data on historicity? Or on nonhistoricity? The odds will be the ratio. So if you think that evidence is more likely on historicity (and thus is evidence for historicity), you might enter 99/95, but that would mean it is extremely weak evidence, making almost no difference to the question. More likely you want to say something like 3/1 or 2/1 or 10/1 or something (though whatever you choose, you have to be able to defend why you think that and not something else). On the high end. And on the low end maybe 1/1 or 2/1 or even 1/2 or 1/5 or whatever (ditto).
The calculator will give you results as percentages rather than final odds (it does the conversion for you). But if you do it by hand, you will get odds (converting that then to percentages is a task). So if you set the low end at 1/10 prior and 1/2 Brothers, your final odds will be 1/20, or 20 to 1 odds against historicity. Or if you set the low end at 1/10 prior and 2/1 Brothers, your final odds will be 2/10 = 1/5, or 5 to 1 odds against historicity. And if you set the high end at 99/1 prior and the high end for Brothers at 2/1, your final odds will be 198/1, or almost 200 to 1 odds in favor of Jesus existing.
Etc.
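For anyone who wants to check numbers like these by hand, here is a minimal sketch in Python (my own illustration; the function names are hypothetical and have nothing to do with how CHRESTUS is coded) that just multiplies the prior odds by each likelihood ratio and converts the final odds into a probability:

from functools import reduce

def final_odds(prior_odds, likelihood_ratios):
    # Odds Form of Bayes' Theorem: final odds = prior odds x likelihood ratios.
    return reduce(lambda odds, ratio: odds * ratio, likelihood_ratios, prior_odds)

def odds_to_probability(odds):
    # e.g. odds of 1/20 correspond to a probability of about 4.8%.
    return odds / (1 + odds)

low = final_odds(1 / 10, [1 / 2])    # = 1/20, i.e. 20 to 1 against historicity
high = final_odds(99 / 1, [2 / 1])   # = 198/1, i.e. almost 200 to 1 in favor

print(low, odds_to_probability(low))     # 0.05   ~0.048
print(high, odds_to_probability(high))   # 198.0  ~0.995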
First, the Bayesian Calculator in CHRESTUS is Odds Form, so I’m not sure you are inputting correctly. When you say here “10%” as the prior for historicity, do you mean you entered odds of historicity of “1 in 10”? And when you say “99%,” that you entered “99 to 1”?
Second, when you say “Evidence 1: Brother of the Lord” etc., do you mean on historicity or on non-historicity? You have to enter two sets of values, one set of “best” and “worst” for each. I just looked in the app, and it’s missing a whole section (the second one, i.e. the best and worst odds on non-historicity, necessary for the calculator to work). So something is broken. I’ll get the tech on it ASAP. We’ll need to find the problem, fix the problem, then run updates on both platforms; all that can take a few weeks, alas.
Wait, no, ignore that second point. The app is working fine. It just requires Odds Form entries. See my comment on your second query.