Tim Hendrix wrote a critical analysis of my book Proving History two years ago, and recently made it available online. Coincidentally I also just discovered a review of the book in College & Research Libraries Reviews, which had been published in June of 2012 (pp. 368-69). That was only one long paragraph, but I was surprised it understood the book and took a positive angle on it, concluding:
The use of a mathematical theorem to establish reliable historical criteria can sound both threatening and misguided. However, Carrier describes and defends the theorem in layman’s terms, demonstrates that historians actually think in terms of probabilities while rarely quantifying them, shows how all other axioms and rules in historical methodology are compatible with the theorem, and then gives it a practical workout on recent studies on the historicity of Jesus … [in which] Carrier shows how the criteria for judging whether or not Jesus was a historical figure (coherence, embarrassment, multiple attestation, contextual plausibility, etc.) are replaceable by Bayes’s Theorem, which “if used correctly and honestly . . . won’t let you prove whatever you want, but only what the facts warrant.”
Hendrix (who has a Ph.D. relating to Bayesian studies) gives the book a much closer look at its technical aspects, in particular its application of Bayes’ Theorem. There are some issues of grammar suggesting English might not be Hendrix’s first language (he also uses British spelling conventions), but his writing is good enough to work around that (most of the time).
Overall, Hendrix concurs with a lot. On taking a Bayesian approach to the historicity of Jesus, his conclusion is that, “I think this is an interesting idea, and while I am uncertain we will find Jesus (or not) at the end, I am sure Dr. Carrier can get something interesting out of the endeavour” and “the sections of the book which discuss history are both entertaining and informative.” He also approves of my defeat of certain approaches to history in Jesus studies, such as over-reliance on the Criterion of Embarrassment. But the bulk of his analysis is critical, though only of a few select points. All of which he bizarrely misunderstood. To those I now turn.
Does All Historical Reasoning Reduce to Bayes’ Theorem?
Hendrix starts with a descriptive introduction, both of the book and of Bayesian reasoning. Then he analyzes my formal demonstration that all historical reasoning must reduce to Bayes’ Theorem. The first issue he raises is that by this I implicitly mean only the probability that factual hypotheses are true or false (and given whatever starting assumptions we put in, which is a separate issue). But what about, he asks, other kinds of statements?
For instance, suppose we define Jesus as a “highly inspirational prophet”, a great many in my field would say the modifier “highly” is not well analysed in terms of probabilities but requires other tools. More generally, it goes without saying we do not have a general theory for cognition, and I would be very surprised if that theory turned out to reduce to probability theory in the case of history.
I’ll just say, if you can’t define it, then you can’t answer it. So these kinds of unanswerable questions are moot. But even if we do define the terms usefully in a question like this and what we end up with is not a factual statement but an evaluative statement, then we are no longer making a claim about history. We are making a claim about what value people should assign to something. And that’s a different field of inquiry than history. And that is why Proving History does not address those questions.
Meanwhile, alternative interpretations of a question like that are straightforward historical claims. For example, “the teachings of Jesus were widely valued in historical period P” is a hypothesis that will have a probability value derivable from Bayes’ Theorem based on the likelihoods when we collect evidence of people in P saying they value those teachings, and/or acting on that value, and checking for how widespread these things were. This might of course end in valid nuanced outcomes like “we can prove lip service was near universal, but actually following the teachings of Jesus was virtually nonexistent.” That statement is true to a certain probability, and that probability would derive from the two consequent probabilities in the theorem: the probability of the evidence we gathered on that statement being true, and the probability of that same evidence on that statement being false. Prior probabilities might factor in as well, depending on how you model the problem, but the output will be the same (whether evidence affecting the probabilities goes in b or e). Likewise for any other statement like “the figure of Jesus was highly affecting of the culture in period P” [using “highly” in the sense of “widely” or “non-trivially,” for example], which is another historical claim whose probabilities again derive from the evidence.
So I don’t see any actual problem here. And Hendrix does admit his concern is minor.
His greater concern is that though Bayes’ Theorem does decide outcomes from inputs (so all historical methods do reduce to it at that level of analysis), it doesn’t help us decide the inputs. That’s not entirely true (see the index of Proving History, “iteration, method of”), but it is relevantly true, in that, as you walk the math back, eventually you leave the realm of history and enter the realm of physics and philosophy (with all its Cartesian Demons and Hilbert’s Hotels and Holographic Cows), but more importantly, unlike in, say, particle physics, in history we can’t do the math precisely in the first place. We can only at best reach a fortiori estimates (see index, “a fortiori”). Because historians simply have to “guess” at what the inputs are. Just as intelligence analysts must do when they use Bayes’ Theorem to anticipate the behavior of foreign nations and hostile parties.
This I fully acknowledge in Proving History, and I provide tools to work with it: in fact, the tools historians already routinely use; they just don’t realize they are using them. Because this problem exists regardless of what methods historians use. What BT does is force us to admit it, and to spell out where and when we are doing it, so that our inputs can be identified, analyzed, and critiqued. That is an enormous advantage over every other method historians have attempted to model their craft with (as I demonstrate in Chapter 4). The tools that help us maintain validity for history in the face of the subjective estimating that experts must perform include the method of a fortiori reasoning (pp. 85-88; plus, index), avoiding false precision and instead mathematically accounting for uncertainty and margins of error (pp. 66-67), and not confusing subjective estimates with arbitrary estimates (pp. 81-85), but instead making debates about the inputs a fundamental part of history as a field, where subjective estimates have to be justified by data and validated by peers (pp. 88-93, 208-14).
In other words, historians can’t get away with saying “x is probable” without explaining what they mean by “probable” (55%? 95%? 99.99%? What?) and why they think it’s that probable and not some other probability—or why they think it’s probable at all, a question that ultimately can’t be answered by any historian without sound Bayesian reasoning (whether they are consciously aware of it or not). The crucial function BT serves here is to settle what inputs we are supposed to be looking for in the first place. Historians can talk probabilities all day long, but often have no idea what probabilities they should be looking for or asking about. BT shows us the role of priors (historians routinely rely on prior probabilities without even being aware of it, and often can’t even tell the difference between a prior probability and a consequent probability) as well as the role of the likelihood ratio (that we must estimate the probability of e on h, and the probability of e on ~h, and the ratio between them is determinative of the output, a complexity most historians are completely oblivious to, as I explain is a problem, for example, in On the Historicity of Jesus, pp. 512-14).
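To make that structure concrete, here is a minimal sketch with purely hypothetical numbers (my illustration, not a real historical estimate): in odds form, Bayes’ Theorem says the posterior odds are simply the prior odds multiplied by the likelihood ratio.

```python
# Odds form of Bayes' Theorem: posterior odds = prior odds * likelihood ratio.
# All numbers below are purely illustrative, not real historical estimates.

def posterior(prior, p_e_given_h, p_e_given_not_h):
    """Posterior probability of h, given evidence e."""
    prior_odds = prior / (1 - prior)
    likelihood_ratio = p_e_given_h / p_e_given_not_h
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)

# A 25% prior, with evidence twice as likely on h as on ~h,
# yields a 40% posterior.
print(round(posterior(0.25, 0.8, 0.4), 2))  # 0.4
```

Note that halving both consequent probabilities (to 0.4 and 0.2) leaves the output unchanged: it is the ratio that is determinative, which is exactly the complexity most historians overlook.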
This is why Hendrix’s concern here is misplaced. He thinks it’s obvious that the posterior probability is entailed by the combining of a prior and a likelihood ratio and that it is obvious that the likelihood ratio consists of relating the probability of e on h against the probability of e on ~h. Therefore, he thinks it’s vacuous to say that all historical methods are Bayesian, because that’s already obvious, so what use is proving it?
Well, guess what. None of this is in fact obvious. In fact, too many historians screw the pooch on every step of this reasoning: they don’t know they are relying on priors, and don’t know how the priors they are relying on should be affecting their conclusions (they also don’t know what a prior actually is or how to validly derive it from data: e.g. see Proving History, pp. 229-56; or how doing so requires demarcating the contents of b from e: index, “demarcation”); they also don’t know that the probability of h is determined by the probability of e if h is true, and are even less aware that it is also determined by the probability of e if h is false. In fact, failing to properly test their theories against alternatives has been commonly pointed out as an error historians are prone to, and many historians who even know they are supposed to do it, don’t know how to.
In other words, Proving History is about explaining to historians exactly what Hendrix is saying: that BT determines the outputs from your inputs. So throw away all other methods of generating an output from your inputs that you’ve been using, and learn this mechanism instead, because it is the only one that is valid. And once you know how it works, you will finally know how to validly derive an output from your inputs, and what inputs you are supposed to be looking for and guessing at and arguing over in the first place.
Historians don’t know these things. Consequently, they don’t believe these things. Some even adamantly deny it. Thus necessitating that I provide a formal proof, one that they can’t weasel out of. And so I did. Hendrix might find it frustrating that we have to do this. I share his pain. But alas. Hendrix agrees with me: knowing whether a claim about history is true requires Bayesian reasoning. He seems only to be annoyed that I had to prove that.
Does Proving History Teach Us How to Apply Bayes’ Theorem?
Hendrix is concerned that I don’t prove any new facts about history by applying the theorem. In fact that wasn’t the function of Proving History. A test application on a serious problem is in the sequel, On the Historicity of Jesus, as is repeatedly stated in PH (although when he wrote this review, the latter had not yet been published, so he couldn’t evaluate it).
What I do in PH is show that all the methods historians already use reduce to BT (Chapter 4) and that when they realize this, they can better understand and apply those methods, and avoid mistakes in using them. And I then use BT (in Chapter 5) to show that the methods used in Jesus studies either violate BT (and thus must be abandoned as illogical) or fail to get the results claimed for them (if you apply them in agreement with BT). I then provide tools for how to build BT arguments and avoid mistakes in doing so (in Chapters 3 and 6).
Throughout, what Proving History is about is not how to do math, but how to understand the logical structure of historical reasoning. Which structure happens to be described by Bayes’ Theorem. But the aim is not to build and run differential equations on plotted graphs. The aim is to understand the structure, and thus understand the logic, and thus understand what probabilities you are supposed to be estimating, and what is then entailed once you’ve estimated them. One doesn’t even have to do math to do that and apply it soundly (PH, pp. 286-89), but even insofar as one uses math, it need only be as crude as sixth grade arithmetic, and that with wide margins of error. Precision is not required. Complex calculations are not required. Historians simply need to learn how to interrogate their own statements like “x is very probable” or “x is somewhat likely” and then understand, once they’ve explained to themselves what they are even saying with such words, what then necessarily follows by the logic of probability. Proving History equips them to do that. They also need to know how such input statements can be justified by the evidence, or how they can be debated within the field once exposed, and PH gives some guidance on that, too.
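For instance (a sketch with made-up numbers, not an argument from the book), arguing a fortiori means plugging in estimates deliberately biased against your own conclusion; if the conclusion still follows, no greater precision is needed:

```python
# A fortiori reasoning with crude arithmetic and made-up numbers:
# concede the least favorable defensible value for every input.

def bayes(prior, p_e_h, p_e_nh):
    """Posterior probability of h by the standard short-form theorem."""
    return (prior * p_e_h) / (prior * p_e_h + (1 - prior) * p_e_nh)

# Believe the prior is ~0.5? Concede as little as 0.3. Believe P(e|h) is
# ~0.9? Concede 0.7. Grant a generous P(e|~h) of 0.2. If h still comes out
# more probable than not, it does so a fortiori.
worst_case = bayes(0.3, 0.7, 0.2)
print(round(worst_case, 2))  # 0.6
```

Nothing here requires more than grade-school arithmetic, and the wide margins absorb the imprecision of the inputs.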
So I think Hendrix’s concern here is misplaced as well. Proving History does what it aims to do. Nothing more.
It is notable that Hendrix agrees with my applications of BT in these respects. He concurs with how BT, and probability theory generally, collapse applications of the Criterion of Embarrassment by Jesus scholars. Indeed, Hendrix does an excellent job of re-demonstrating one of my points about this with a full application of Bayes’ Theorem, which in the book I kept much simpler (as a discourse about ratios) to illustrate every step of reasoning and not overload the reader with unnecessary modeling. Proving History was written for humanities majors, whose eyes would have completely glazed over at Hendrix’s recasting of the argument into BT form; they would not have understood it at all. I think his revision is excellent, and a good addition to the point. It just wouldn’t have worked well in PH, given its actual target audience.
One thing I do think he does wrong, though, is make the problems of history far more complicated than they need to be. He says at one point that we need “10-20-(100?)” variables in any equation. That simply isn’t true. You can bypass all of that with broader definitions and allowing minor concerns to be washed out by a fortiori reasoning (PH, p. 85).
For example, Hendrix thinks it matters to the probability of preservation whether a Jesus-friendly preserver of an embarrassing story knew that story was true, but that’s not the case. As I wrote (emphasis now added):
[A]ll false stories created by friendly sources have motives sufficient to preserve them (since that same motive is what created them in the first place), whereas this is not the case for true stories that are embarrassing, for few such stories so conveniently come with sufficient motives to preserve them (as the entire logic of the EC argument requires).
So the probability that a Jesus-friendly preserver of an embarrassing story would preserve that story is entirely a function of whether that Jesus-friendly preserver saw enough value in the story to preserve it. It totally wouldn’t matter whether it was actually true or false for it to have that value. It also wouldn’t matter to the math whether the reason the Jesus-friendly preserver valued it was that they believed it was true. Because the reasons don’t matter at all. That it had enough value to them to preserve it is the only fact we need measure, not why. We don’t need to know why—only if we could prove the “why” was “that it was actually in fact true” would it matter, but (a) we can’t in any of these cases and (b) their believing it was true wouldn’t tell us that, either, even if we could prove they believed it was true (or even cared whether it was true), which (c) we also can’t do. Hendrix is thus needlessly over-complicating the math. Our objective rather should always be to simplify the math, as much as possible that still gives us a logically sound conclusion. A fortiori reasoning, and careful defining of measured terms, accomplishes that.
The rest of Hendrix’s critique consists of insisting historians need vastly greater precision and vastly more complex models of history to say anything about history at all. That doesn’t make any sense. To the contrary, they can’t and never have and never will have the kind of precision Hendrix wants. That sucks. But welcome to history. Moreover, more complex models are almost always useless. When historians reason about history—and I mean all claims about history, made by all historians, in all works of history in the last sixty years—they have not used, nor needed, any of the complex models Hendrix wants. The role of understanding BT is not to make history needlessly more complicated. The role of understanding BT is to look under the hood of arguments historians are already making, and using BT to model those arguments, and thus understand what their inputs actually are, and what that historian is basing them on (and thus whether they should be replaced), and whether the output they are getting is consistent with those inputs (in other words, consistent with BT).
This does not require increased complexity. Unless you can demonstrate an argument is invalid or unsound by virtue of its excess simplicity. But in that case you should focus solely on the one point you are making. And in doing so you’d be doing something useful, and thus applying BT to improve historical reasoning. Most cases won’t suffer that problem. Because most complexities can be dissolved within broader terms (e.g. an h that is inclusive of 100 h’s; carefully constructed binary definitions of h and ~h; etc.) or ignored because their effect is already smaller than the margin of error (the function of a fortiori reasoning). Indeed, we need to be looking for all the ways of doing this: making these complexities become irrelevant, so we can make useful and clear statements about history with the limited data available, which can be analyzed and vetted and productively debated.
Likewise, if (as Hendrix rightly proposes could happen) you think someone’s Bayesian model is wrong, showing that is indeed what is useful about reducing historical arguments to their Bayesian form. Because the wrong model will then have existed in their argument even if they didn’t articulate their argument in Bayesian form. It will exist even if they have no idea what Bayes’ Theorem is! So avoiding BT does not get you out of the problem of incorrectly modeling a historical question. To the contrary, avoiding BT only makes that mistake invisible. That’s worse. Once we compel historians to build Bayesian models of their arguments, then we can more easily see if they are faulty, and then critique and correct them. Progress in historical knowledge is the only result.
So these are not valid criticisms from Hendrix. These are actually agreements with the very things I already say in the book.
Does Proving History Get Wrong How Probability Works?
I argue in PH (pp. 265-82, although crucially building on pp. 257-65) that one major part of the Bayesian-Frequentist dispute dissolves upon analysis, once you correctly model what Bayesians and Frequentists are actually saying about probability as a matter of epistemology. In short, a Bayesian “degree of belief” is in fact an estimated frequency: the frequency with which claims based on such a quality of evidence will turn out to be true; and that frequency reduces to an estimate of, and is thus derivable from and limited by, the same physical frequencies of entities in the world (actual and hypothetical) that Frequentists insist all probabilities must be built on.
I confess I found little on point in what Hendrix attempts to say about this. He goes weird right away by saying that the demarcation of physical and epistemic probabilities is circular because they both contain the word probability. That makes no sense (“mammalian cats” and “robotic cats” is a valid distinction; it does not become circular because the word “cat” is used in both terms). But more importantly, it seems to be ignorant of the fact that I did not invent this demarcation. It has been a standard and famous one in philosophy for over a century, and is fundamental to the field of epistemology. So I have no idea what he is talking about here. Maybe he needs to read up on the subject in the Stanford Encyclopedia of Philosophy.
Hendrix goes on to essentially restate everything I say about probability in PH, only often in a manner far too advanced for the intended readers of PH. But he continues to make confusing statements of disapproval of things that are actually established facts in the philosophy of probability. The most I can fathom is that he thinks that Chapter 6 is badly written. Which is a complaint I am sympathetic to. I’m not satisfied with it myself and already knew it needs improvement. But his arguments against it all simply restate exactly what Chapter 6 argues, so he seems to have confused himself into thinking Chapter 6 says something different. Which I suppose goes in the evidence box for it being badly written. Although then the evidence that he is not entirely fluent in the English language might come to be relevant. In any event, he does not offer any useful way to improve this defect. He does exactly the worst thing instead and makes the discussion far too complicated for a humanities readership. What we need is a better written Chapter 6 that will be easily understood by a humanities readership. I welcome anyone producing such a thing!
But some of Hendrix’s complaints miss the point. For example, he objects to my saying that when we roll a die the probability of it rolling, say, a 1, will either be the actual frequency (rolling the actual die a limited number of times and counting them up) or the hypothetical frequency (what we can predict will happen, from the structure of the die and the laws of physics, if the die were rolled forever and counted up). Why does he object to so obviously correct a statement? Because the die might not be perfect (its unknown imperfections will affect its rolls). But that is already covered by my model: those imperfections are part of the physical model of the die, and thus will be included in the hypothetical extension of its results.
What he seems to mean is that there is a third probability to account for: a hypothetical infinite series of rolls of a die whose precise physical structure is not known to us. That is, in fact, what we mean by an epistemic probability. Which I cover later. He is thus ignoring the fact that I do indeed agree with exactly his point, and add it in later as an extension of the subject. Where I discuss the “actual vs. hypothetical” frequency question, I am explicitly discussing physical probability, not epistemic probability. Again, a distinction that is standard and universal in philosophy, and which again he claims to think is circular (even though countless published philosophers have not).
So Hendrix’s complaint here is baffling to me. I get to explaining why epistemic probability will vary from the physical probability (including a hypothetical physical probability) subsequently in the book (I have a whole section on it, pp. 265-80, exactly following the section he is talking about, pp. 257-65). And my explanation is basically the same as his. So in claiming to critique my book, he actually ends up just repeating what it says.
It gets worse when Hendrix even more bafflingly fails to get the entire point of those two closing sections. Even though I carefully explain that an epistemic probability of any s is the probability that any s would be true given the kind and scale of evidence we have, and that as the evidence increases the epistemic probability converges on the true frequency, he confusingly says, “But what is the true frequency of the 8th digit in pi being a 9? Why should we think there is such a thing? How would we set out to prove it exists? What is the true value of the true frequency?” This is just a really strange thing to say. He is asking about a statement of (I presume epistemic) probability, that he believes it is 80% likely that the 8th digit of pi is a nine.
Okay. Let’s walk him through it. What does he mean by “it is 80% likely that the 8th digit of pi is a nine”? He must mean that given the data available to him, he is fully confident (I suppose to near a 99% certainty) that there is an 80% chance of his being right. To be so uncertain that you know you have only a dismal 80% chance of being right about this, I can only imagine some scenario whereby he doesn’t know how to calculate that result, and thus is reliant on, let’s say, a textbook, and apparently this is a post-apocalyptic world where it’s the only surviving textbook that mentions pi, and the text in the textbook is damaged at that point, damaged in such a way that there is an 80% chance the smudged or worm-eaten symbol on the page is a 9 and a 20% chance it’s a 6.
In that scenario, his 80% would have to be his estimate of the frequency with which characters damaged in just such a way will originally have been a 9 instead of a 6. But “the true frequency of the 8th digit in pi being a 9” would then be the actual (or hypothetical) frequency with which pre-apocalyptic textbooks read a “9” in that position instead of (the correct) 6. And he is badly mis-estimating that true frequency because of the damage to his evidence. True, there is also a “true frequency” of a character damaged in just such a way having originally been a 9 and not something else, and his 80% is an approximation to that, such that he could be wrong even about that, but that is supposed to be accounted for by his confidence level and margins of error. What he is actually trying to get at with the 80% is the frequency with which textbooks actually read as such, and not differently.
Alternatively, maybe he doesn’t care what textbooks said, and wants to know what the actual probability is of the 8th digit of pi being a 9 as a matter of mathematical fact. How then could he think that probability is 80%? I guess, perhaps again we are in a post-apocalyptic world, where no knowledge at all has survived, and he is trying to freshly determine this question with some sort of mathematical device, a device he knows from prior use gets the correct answer 80% of the time (even though I can’t imagine what sort of thing that would be), and this device gives him a result of 9 for the answer. In this case the true frequency is simply 0%, but his mathematical device sucks so badly it has fooled him into thinking it’s 80%. Well, yeah. That can happen. Welcome to the conundrums of epistemic probability.
If that is what he meant, though, then Hendrix has chosen a bad example, because he is ignoring the fact that historical questions are not at all like the question of what the eighth digit of pi is (as I explain in PH, pp. 23-26), whose answer does not have a nonzero probability of being false, unlike all claims about history, which do. The epistemic probability that that digit is 9 is rarely these days going to come out 80%. It’s going to come out near zero. Because we have really damned good evidence for this, and therefore our epistemic probability will converge very closely to the true probability, which is zero. But our epistemic probability will still not be zero. Because there is a nonzero probability everyone in earth history to now has done the math wrong (see not only p. 25 with n. 5 on p. 297, but also my epistemological remarks in general in The God Impossible).
It’s all the more baffling then that Hendrix reduces his complaint to “the notion of ‘true frequency’ in” such cases as “the probability Caesar crossed the Rubicon, or a miracle was made up and attributed to a first-century miracle worker” becomes “very hard to define” whereas “if we accept [that this] probability simply refers to our degree of belief,” then “there is no need for such thought experiments.” Uh, yeah. That’s the entire point of my two sections on this! We never do know the true frequencies. So at no point do I ever say we need them to do history. But we do need to be able to get close to them if we are ever to legitimately say anything is likely or unlikely. Knowledge is impossible, if we can never know when our epistemic probability is probably close to the true probability.
Thus all we can do is do our best to get close to the true frequencies. And what epistemic probability is all about, and what the function of evidence is, is to get as close to that truth as we are capable of, given that evidence. And that is what Hendrix is doing with his degrees of belief, which are his stated measures of how likely he thinks it is that he is right. Which is a statement of how frequently he is sure he will be right, on all matters comparably evidenced. That’s simply the fact of the matter. And it doesn’t seem at any point that he understands this.
In other words, I am explaining in PH why epistemic probability (what he calls “degree of belief”) never exactly equals physical probability (the “true” probability) but why we can sometimes trust that it gets close to it, and when. All knowledge consists of nothing more than this: not knowing the true probability of anything (which no one can ever know; I explain this several times in the book), but instead knowing to a high confidence level that it lies between some x and y (our confidence interval). And we get there by accumulating evidence such that it becomes highly improbable that we are wrong about that (but never impossible—not even in the case of the digits of pi).
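That convergence can even be simulated (a toy illustration of the epistemology, nothing more): an observer who never learns the “true” physical frequency of an event nevertheless watches their epistemic estimate close in on it as evidence accumulates.

```python
import random

# Toy simulation: an epistemic estimate converging on an unknown
# physical frequency as evidence accumulates. Illustrative only.
random.seed(0)
TRUE_FREQ = 0.3  # the physical probability, unknown to the observer

successes = 0
TRIALS = 10_000
for _ in range(TRIALS):
    if random.random() < TRUE_FREQ:
        successes += 1

# Laplace's rule of succession as a simple epistemic estimate.
estimate = (successes + 1) / (TRIALS + 2)
print(abs(estimate - TRUE_FREQ) < 0.03)  # True: the estimate lands close
```

The observer’s estimate is never exactly the true frequency, but with enough evidence it becomes highly improbable that it lies far from it.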
At no point does Hendrix ever appear to understand this. And not understanding it, his objections to it either make no sense, or actually affirm exactly what my book says.
Finally, Hendrix spends a lot of words trying to deny that when you say you are 80% sure of something, you are saying 1 out of 5 times you will be wrong. But that is literally what you are saying. At no point does Hendrix appear to understand this. At all. And none of his attempts to deny it make any mathematical sense. In fact, Hendrix doesn’t even seem to grasp at any point what it is he is denying. This I can only count as an epic fail in the domain of semantics. In any event, he does not confront any of my explanations or demonstrations of the fact. He instead just confuses physical with epistemic probabilities again. So there is nothing further to discuss.
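The claim is easy to check numerically (my own toy simulation, not anything from Hendrix): a perfectly calibrated reasoner who asserts a large number of independent claims at 80% confidence will be wrong on very nearly 1 in 5 of them.

```python
import random

# Calibration as frequency: assert 100,000 independent claims, each true
# with probability 0.8, and count how often the asserter is wrong.
random.seed(1)
N = 100_000
wrong = sum(random.random() > 0.8 for _ in range(N))
print(abs(wrong / N - 0.2) < 0.01)  # True: wrong about 1 time in 5
```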
Nor will I bother with his silly attempt to insist we need to account for infinities and irrational fractions in probability theory. Nope. A fortiori reasoning does away with any such need. And his discussion of my libraries example is too unintelligible to even engage with. As best I can tell, he seems to not be aware that that is an exercise in a fortiori estimation of the margins of error. He seems to think it’s some sort of attempt to do particle physics with uncertain data, which indeed would warrant his complaints. But since it’s not, it doesn’t.
Conclusion
In the end Hendrix thinks the subjectivity of the inputs will make progress in Jesus studies impossible. I disagree. If the method I propose is followed, all disputes will be analyzable in a productive way. Even disputes about input. And once we bracket away Christian apologists whose opinions are of no merit in this matter owing to their insurmountable bias (a bias that is to them literally existential), secular scholars can then have productive debates that will end on a common range of conclusions that everyone in that group will agree is most likely ballpark correct (a scarce few fringe nuts aside). They just have to actually do this. Right now they are just all publishing disparate armchair opinions based on unanalyzable intuitions whose soundness or even logical structure they have no idea of and thus cannot even in principle validate.
Hendrix also thinks we need hyper-granularity of language and hyper-complex models. This has never been true in history. And yet all history reduces to BT already. So admitting what has always been true is not going to suddenly make all of historical reasoning vastly more complex. To the contrary, it will allow us to explain why broad and simple models have always worked, and how to keep doing that even better than we already have been. And that will be owing to the tools of careful definition and a fortiori reasoning. Most complexities simply don’t matter. They either are too trivial to have any visible impact on our math at the resolution we are actually working at, or they are too irrelevant to prevent their being subsumed and thus dissolved under more broadly defined hypotheses and descriptions of evidence.
Hendrix also claims that “to convincingly make [a] case [that] Bayes theorem can advance history one needs lots and lots of worked-out examples.” That is simply not true. Indeed, it was already proved untrue before me, by Aviezer Tucker in 2009. As both he and I show, in different ways converging on the same fact, all historical reasoning is already being advanced by Bayes’ Theorem, and has been for half a century at least. Historians just didn’t know that. And consequently, they haven’t been able to tell when it’s being done well or poorly. Proving History gives us a lot of tools for finally doing that. Thus the argument of PH is that historical methods currently being used are already Bayesian (Chapter 4) and are only valid when they are (Chapter 5). And that we can tell the difference between valid and invalid applications of a method by understanding how it operates on Bayesian logic (Chapters 4 and 5).
Hendrix also thinks historians can’t use Bayes’ Theorem unless they can do transfinite mathematics or solve irrational fractions, but that’s not only false, it’s silly. It requires no further comment. Likewise his unrealistic demand that the book be twice its current length, thoroughly explaining fundamental phrases and terms that a reader who doesn’t already know them can ascertain through a judicious use of Google.
Finally, when Hendrix says “the proof [in Chapter 4 that] historical methods reduce to the application of Bayes theorem is either false or not demonstrating anything which one would not already accept as true if a Bayesian view of probabilities is accepted as true” he isn’t saying anything useful about the book. What he means by “is either false” is simply that the book does not address how to answer evaluative claims about history, but since the book isn’t about how to make value decisions but how to determine what we should think the probability is of “what happened and why” (Chapters 2 and 3), he isn’t saying anything relevant to the book’s function. Meanwhile his disjunctive alternative, that Proving History does not demonstrate “anything which one would not already accept as true if a Bayesian view of probabilities is accepted as true,” is wholly circular: that historians who already accept that their conclusions should follow a Bayesian model do not need it proved to them. That’s true…as a conditional statement floating around in Plato’s realm of ideas. But it’s irrelevant. Because historians have yet to be convinced that their conclusions should follow a Bayesian model. So they do need it to be proved to them. That’s why I wrote the book!
So in all, Hendrix doesn’t have any relevant criticisms of Proving History. By not understanding the points he aims to rebut, his rebuttals either don’t respond to anything the book actually argues, or end up verifying as correct what the book actually argues.
You don’t have to publish this comment, but you may want to rewrite your analysis of his “8th digit of PI challenge”.
I haven’t seen the text you are referring to, but I am guessing that what he might have meant was: “in the (never known to be repeating) sequence of decimal digits of the irrational number pi, what is the probability that every 8th digit you encounter is the number x?”
Hm. I don’t think that’s what he meant.
True, though, that would be yet another and different claim to be making (that “pi has a non-randomly repeating effect every eighth digit such that instead of a 9 appearing every 1 in 10 times, a 9 appears every 4 out of 5 times”). But I don’t think that’s what he means. He is trying to critique an epistemic probability statement by asking how it attempts to approach the true frequency of something.
Hi Simen,
I have tried to clarify the proposition I consider in my response below. It is simply:
H : “The 8th digit of pi is 9”
and then p(H). Unless you know what the 8th digit is, you too (according to the Bayesian view) would have a degree of belief different from 1 or 0 in the proposition H. I used the example in the review because I find it difficult to connect to a frequentist view (it is a textbook example).
Degree of belief is a statement of how frequently you would expect to be right about what you are asserting. You derive that expectation from how close you have grounds to believe you are to the physical frequency you are asserting. So if you say you believe there is an 80% chance the 8th digit of pi is 9, you are saying the evidence available to you entails there is a 4 in 5 chance you are right that it is necessarily true that the 8th digit of pi is 9. And necessarily true means 100% probability, as that is the frequency with which necessary truths in mathematics are actually true; albeit not the frequency with which mathematicians are right about that, but again, that’s what the epistemic probability is measuring. That the truth is that it is necessarily true that it is not a nine (and therefore the true frequency is 0%) remains the case, no matter how high (and thus how wrong) your assigned epistemic probability is.
Thus, if you believed that it was logically necessarily the case that that digit was a six, you would not assert, indeed could not assert, that you believed it had an 80% chance it was a 9. To the contrary, you would adjust your epistemic probability toward the true frequency of that (as for all logically necessary truths), which is zero. Your epistemic probability would continue to deviate from zero only by as much as measures your uncertainty whether you are right that it was logically necessarily the case that that digit was a six. Thus, epistemic probabilities are attempts to estimate actual frequencies. And the former are always adjusted toward the latter as more evidence directs. That’s what epistemic probability is for. If it didn’t do that, it would be useless.
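The adjustment described here can be made concrete with a small Bayesian update. A minimal sketch, assuming the digit is checked against independent sources that each have a 1% chance of a typo (an illustrative rate, not a figure from either party), and counting the leading 3 so that the true 8th digit of pi is 6:

```python
# Bayesian update of the epistemic probability that the 8th digit of pi
# (counting the leading 3) is each value 0-9, given independent sources
# that each report "6". The 1% per-source error rate is an assumption.
ERROR = 0.01  # assumed chance a source misreports the digit
posterior = [1 / 10] * 10  # uniform prior: no knowledge of the digit

def update(posterior, reported_digit):
    """Multiply in the likelihood of one source's report and renormalize."""
    likelihoods = [
        (1 - ERROR) if d == reported_digit else ERROR / 9
        for d in range(10)
    ]
    unnormalized = [p * l for p, l in zip(posterior, likelihoods)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]

for _ in range(2):              # two independent sources both report "6"
    posterior = update(posterior, 6)

print(round(posterior[6], 6))   # ~0.999989: belief converges on the truth
print(posterior[9])             # belief in "9" has collapsed toward zero
```

The epistemic probability starts at 1/10 and is driven toward the true frequency (here, 1 for the digit 6 and 0 for every other digit) as more evidence comes in, which is the convergence the paragraph above describes.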
That sounds to me like he is a bit confused about how the word (or concept of) “frequency” makes any sense in a situation where there is only ever going to be one thing, not a set that can be sampled.
Usually when we say “frequency” it’s related to a data set, like pulling marbles out of a jar full of marbles, looking at the frequency of red marbles vs. green marbles so far, and using that to predict the color of the next marble we draw. But there aren’t other occurrences of an actual pi we can draw from. We can’t reach into a jar full of pis (the plural of pi. I was hoping it would be pii or something…) and find data about the last 100 pis that we can use to predict the properties of the next pi we pull out of the jar. Because only one pi exists. Maybe he thinks that this presents some kind of problem.
Also he’s specifically concerned about the “true frequency”, maybe he thinks that compounds the problem.
There could also be confusion with the fact that (granting determinism) the “true probability” of every fact of history is 100%, in roughly the same sense as for the digits of pi. But that would be making a more fundamental mistake (the “fallacy of foregone probability,” error 2 in the link) that I don’t think a Ph.D.’d Bayesian would make.
[BTW, in that same link I also show (in “concluding with an example”) that Bayesians who think “degrees of belief” are disconnected from physical probabilities can’t possibly have observed actual applications of Bayesian reasoning in the real world (such as in evaluating medical test results).]
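The medical-test case works as a concrete illustration of that connection. A minimal sketch with illustrative numbers (prevalence, sensitivity, and false-positive rate are all assumptions, not figures from the book or the linked post):

```python
# Posterior probability of disease given a positive test, via Bayes' Theorem.
# All three numbers below are illustrative assumptions.
prevalence = 0.01        # P(disease): base rate in the tested population
sensitivity = 0.99       # P(positive | disease)
false_positive = 0.05    # P(positive | no disease)

p_positive = sensitivity * prevalence + false_positive * (1 - prevalence)
p_disease_given_positive = sensitivity * prevalence / p_positive

print(round(p_disease_given_positive, 3))  # ~0.167
```

The posterior here is not a free-floating degree of belief: it is the physical frequency with which people who test positive under these rates actually have the disease, which is exactly the connection between credence and real-world probability at issue.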
Hi Richard,
That’s quite a long response which raises many points I am not sure I can cover here, at least without a lot of repetition from the review. I do think many of the things you write amount to a misunderstanding of what I wrote, or at least intended to say. If there are particular points in my review you feel I did not address and would like my response to, please just raise them as questions.
Re. “does all historical reasoning reduce to Bayes’ theorem”
I raised a few points regarding the claim that all of historical reasoning reduces to Bayes’ theorem.
Firstly, probability theory is one way to handle uncertainty or vagueness (for simplicity I will use these terms interchangeably, even if some may not consider this completely accurate), and there exist other valid theories for uncertainty (the term is understood broadly), for instance Dempster-Shafer and multi-valued logics (Petr Hajek, for instance, has written extensively on this subject). As I believe these have a place when analyzing uncertainty, it raises the question of why we can distinctly rule them out, or some other notion we might come up with in the future, as *not* being relevant for history (I will accept I would not know how to apply them practically). You responded:
However, even if we could not point to any other method for handling uncertainty, that would not mean our current method is the only one, and in fact we have several. Broadly speaking, do you accept we have several mathematical theories for handling uncertainty/vagueness?
You continue:
The issue is that this type of proposition would be a textbook example of a proposition with graded truth value (because of the term “widely valued”) and would, strictly speaking, fall outside the scope of the propositions probability theory analyses. See for instance the first two chapters of Jaynes 2003 (which is referenced in PH), where he limits himself to classical (Boolean) truth values, or see:
https://en.wikipedia.org/wiki/Fuzzy_logic
for further information.
Do you accept this type of proposition could plausibly (by proponents of fuzzy logic) be said to have a graded truth value?
Secondly, as I understand this comment (from your conclusion):
Now, I suspect this conclusion might rest upon a misunderstanding and I agree that your summary of my argument is circular. What I considered in my review was this alternative statement: Bayesian inference describes the relationship between probabilities of various propositions (c.f. Jaynes, 2003). In particular it applies when the propositions are related to historical events.
To take the first part of this statement, I take it you accept Bayes’ theorem is not in dispute for a historian; that is, we can grant a historian accepts the Bayesian account of probabilities. If he does not, would the proof not first have to set out to establish that Bayes’ theorem is true?
However, if that is the case, my argument is very simple: Bayes’ theorem also applies when the propositions are historical events. Can you point out which parts of my argument are circular, and where your argument departs from mine?
I agree it is worthwhile (as a matter of instruction) to point out to historians how historical arguments can be mapped to a probabilistic form (Bayes’ theorem, if you like), and it would follow that the various elements of a historical argument then find probabilistic counterparts as priors/likelihoods; however, I see this more as a matter of explanation than a formal proof.
Re. the scope of PH and the applicability of Bayes theorem to history
My concern is more that I would have hoped to see more practical applications of Bayes’ theorem to existing historical questions, to see how it is imagined to apply in practice. It is of course your choice what to include in the book; however, there were many issues with applying Bayes’ theorem in a setting such as history that I was, and am, not sure how to address even after reading the book (I give some examples). I think these issues are best discussed with a practical example in mind.
One thing I would like to point out on this note, regarding my analysis of the criterion of embarrassment, is that as far as I understood your analysis, it had no way to specify whether a text was actually embarrassing to the person doing the preservation, and as far as I could tell it pointed to the conclusion that the text would always have a probability less than 0.5 of being true. I tried to redo the analysis allowing embarrassment to enter as a variable and got a (qualitatively) different result than yours. My main point was that, as best I could tell, our results were in qualitative disagreement, so which of us is correct? What do we do from here in practical terms?
Also, I admit I had a very hard time telling how you defined the various Boolean propositions that were used in your analysis. It would be good to have them specified here explicitly for further reference.
The unification of Bayesian/frequentist view of probabilities
I understand the section to provide an account of how probabilities should be defined which is different from what e.g. Jaynes or Cox do.
It should be stressed that the question of how probabilities should be “defined,” and its relationship to (and the meaning of) subjectivity, has different answers in the literature which I cannot survey here (for instance an axiomatic approach such as Jaynes’s and Cox’s vs. the rational-choice approach of e.g. de Finetti). I have taken Jaynes (2003) to be the standard reference because of the way it is cited in PH and because it gives a fairly accurate description of my own view.
However, this sometimes makes it difficult to understand the definition of terms. For instance, I tried to provide the reader with a definition of probability (is the “degree of belief” in your response your definition of probabilities?), and I think at least some of the discussion may be circular, to which you reply:
The quote from PH I was referring to was this:
It was the first phrase which puzzled me: “by probability here I mean epistemic probability, which is the probability that we are correct when affirming a claim is true.”
Consider the phrase: “By Bar I mean the Foo-Bar, which is the Bar that…”. I think this phrase may actually be circular since Bar is Foo-Bar which is Bar.
As you mentioned, the chapter is quite difficult to follow, and I would rather not spend too much time discussing issues which relate to a misunderstanding on my part rather than to the proposed unification of Bayesian and frequentist probabilities. However, I would like to address some items from your response:
I can accept that my complaints might miss a point; however, it is important to be very precise when we offer a definition of these terms. When I consider the above statement I think, regarding the first situation, that I can imagine rolling it (say) 100 times and it coming up 1, for instance, 14 times, so that “the probability of it rolling 1” is the actual frequency, or 14/100.
Then we can consider the second situation. In this we consider what we predict will happen; let’s just assume I know nothing about the die (which is the case) and conclude the probability is 1/6.
What confuses me about this sentence is that the probability is now at least two things (if the sentence is taken at face value). I say at least because in the first definition it was not stated what “a limited number of times” was, and I could readily roll the die again to get (for instance) 18/100 (did you mean the limiting frequency?).
So what is “the probability” (singular)? Do you believe probability refers to different things and has a situation-dependent definition?
It is very important to keep these terms absolutely precise for us to understand each other, and *I still do not understand what probability truly refers to*.
To take another case:
To be clear about the terminology, I am considering the probability of the proposition
And the information stated in my review amounts to p(H) = 0.8. My question is simply how we define this probability if we equate probability with “true frequency” (as the first half of your above quote does). You continue:
What you are suggesting amounts to considering a different proposition:
For which we can say: p(H1) = 0.8. However, notice my question related to how we define the probability *of the proposition I proposed*. Is the proposal for defining probability then to switch the proposition? Do we agree that your proposition (which I write as H1) and mine (H) are quite different?
I want to stress that on a Bayesian view, where probabilities are subjective and refer to a lack of knowledge, the example is easy to treat without hypothetical situations. If I considered it right now I would say p(H) = 1/10, because I cannot remember the 8th digit of pi at all. This has nothing to do with frequencies or any other beliefs I may hold, and can in fact be derived from a symmetry argument (see Jaynes). The important thing is that on this account probabilities are subjective and *not* as such *rooted* in what will actually turn out to be true or false in the real world. This is the key point in Jaynes.
Now, returning to the more substantial points which can hopefully clarify how probability should be defined. In PH as quoted in my review there is the statement:
Try to walk me through this example: Suppose I believe 4 things with probability 0.512. Then can we agree that either 0, 1, 2, 3 or 4 of these will be correct, limiting the frequency at which I am correct to five values? My point is simply: how do you define probability, for the probability of 0.512, in this particular situation? Do you envision some sort of limiting procedure? Do you define the word frequency differently than Jaynes or I do (see the review for a definition)?
Then, as an introductory point, I pointed out that a fraction-based definition of probabilities would have a problem representing probabilities like 1/sqrt(2), which cannot be written as fractions, and it is very easy to imagine situations where these arise (I gave one example in the review), to which you respond:
I am only trying to explore the definition you provide, and I am sorry you will not take the example seriously. I take your response to agree with my point that there are a great many probabilities which cannot be represented using your definitions (in fact, those which can be represented have measure 0). Probability theory already accounts for these probabilities very well no matter which definition one subscribes to; do you feel the work on probability theory which relies on probabilities being a subset of the reals (this would be any textbook I have seen) can be dismissed?
Thanks. This is useful. I will break this up into separate parts and open new threads below for each.
Hi Brian,
Well, on the standard Bayesian account of probabilities (you can find my definitions of the terms probability and frequency in the review), I think the situation is very simple to analyse, but I am having some difficulty following it with regard to the proposal in chapter 6 of PH. If there are mistakes in what I have written, or points you would like clarified, I would be happy to address them.
Re. Richard,
Just to clarify, I do not disagree at all that you can make probabilistic statements about frequencies, or that frequencies can inform probabilities (technically, we can infer for instance the probability* a coin comes up heads from a sequence of flips and prior information). I gave an instance of the former type of inference in my review, and examples of the latter can be found in any book on Bayesian data analysis (this is one of the few places where I do not think Jaynes is the apt reference for more advanced aspects); it is in fact what I do every day.
But I think it is very convincingly argued in the literature that this is the more accurate relationship between probabilities and frequencies, especially when considering one-time events like history.
* technically a probability density.
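The coin inference mentioned above (a posterior over the heads probability, technically a density) has a standard conjugate form. A minimal sketch, where the prior and the observed counts are made up for illustration:

```python
# Inferring the probability a coin comes up heads from a sequence of flips,
# using the standard Beta-Binomial conjugate update. Counts are illustrative.
prior_a, prior_b = 1, 1      # Beta(1,1): uniform prior over the coin's bias
heads, flips = 7, 10         # assumed observed data

# Conjugacy: observing h heads in n flips turns Beta(a, b)
# into Beta(a + h, b + n - h).
post_a = prior_a + heads
post_b = prior_b + (flips - heads)

posterior_mean = post_a / (post_a + post_b)
print(post_a, post_b)        # the posterior density is Beta(8, 4)
print(posterior_mean)        # posterior mean 2/3
```

This is the textbook case both parties seem to accept: a subjective prior combined with observed frequencies yields a posterior density over the physical frequency.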
All probabilities are frequencies. The question is only, frequency of what?
Degrees of belief are statements of frequency. Frequency of what? The frequency of being right about the assertion x (when presented with evidence of such a kind as warrants that degree of belief about x).
This is a matter of semantic necessity. If degrees of belief don’t mean that, then they are meaningless, and thus useless. They would then assert nothing useful about you or the world.
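Under the assumption of calibration, that claim is checkable by simulation: assertions each held with a degree of belief of 0.8 should turn out true about 80% of the time. A minimal sketch (the credence value and trial count are illustrative):

```python
# If a degree of belief of 0.8 means anything observable, it is that
# assertions held at credence 0.8 turn out true roughly 80% of the time.
# Simulate a well-calibrated believer making many such assertions.
import random

random.seed(0)
credence = 0.8
trials = 100_000

# Each asserted proposition is true with the stated frequency.
correct = sum(random.random() < credence for _ in range(trials))
print(correct / trials)  # close to 0.8
```

The long-run fraction of correct assertions converges on the stated credence, which is the "frequency of being right" reading given above.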
“We can’t reach into a jar full of pis”
In a way we can. There is only one true pi, but we don’t have direct access to it. I can look up pi in any number of books, but there is a non-zero chance of a typo in any of them. I can calculate (and have calculated) pi myself using any number of different methods, on a computer and by hand, and there is a non-zero chance of a mistake in those calculations. Thus, we can say we have a frequency of 8th digits of pi, none of which is *guaranteed* to be the true digit.
Tim Hendrix: …there exist other valid theories for uncertainty (the term is understood broadly), for instance Dempster-Shafer and multi-valued logics…
Richard Carrier: That we can model uncertainty in different ways does not answer the demonstration in Chapter 4 that all historical arguments nevertheless still reduce to Bayes’ Theorem. As I note of Dempster-Shafer in PH (p. 303, n. 19), it’s simply far too complicated to be of any use to historians. Likewise other modes. I think you would catch on to this if you treated the syllogism as a syllogism and sought out a premise in it that you could challenge. Then we’d be talking about the soundness of that premise. Not the irrelevant question of alternative modeling.
Because I nowhere argue these things can’t also be done. They just aren’t what I’m talking about and recommending in PH. Just because x reduces to y, does not mean x does not reduce to z. Just because English can be translated into German does not mean it can’t be translated into French. And the reason we need to know the reduction to y is that historians can make use of that fact. Whereas that it reduces to z is largely useless to humanities majors.
Treating it as a syllogism, from a technical standpoint I think we should first clarify what we mean by “a historical method.” Very generally speaking, I take it the argument assumes a historical method is something which allows us to reason in situations where we cannot be certain (specifically, given partial information and about statements which cannot be known to be true or false). There are different theories for reasoning in situations where we cannot be certain, and these operate under very different assumptions, and sometimes on different types of propositions. Then consider the conclusion of the argument, “not C,” where
however from this it does not follow:
because such a method would not necessarily be expressed in the semantics of (epistemic) probabilities (for instance a fuzzy logic). In addition, one could consider hybrid approaches and a range of other things. I accept the claim (as stated in my review) if by historical method you limit yourself to a consistency requirement for probabilities, but then I am not sure I see why we should need the argument.
Keep in mind I am not claiming you should apply fuzzy logic to history; do what gets the job done! I just don’t think it has been, or can be, proven that Bayes is the unique historical method.
You are at too high a level of abstraction.
A historical method is something which allows us to ascertain how likely a given claim about history is to be true. That is, first, a subset of “situations where we cannot be certain,” and, second, relating to a different goal. Though uncertainty is a factor in knowing the probability that a claim about history is true, “uncertainty” is not synonymous with the probability that a claim about history is true.
This comment is unintelligible to me. Analogy: It does not matter whether it’s in German. If you can write it in German, you can write it in English. So saying it can be written in German does not mean it can’t be written in English. You seem to be saying that if something can be written in German, then I cannot say it can be written in English. That makes no sense.
Secondly, fuzzy logic is just another way of manipulating confidence levels and intervals. One that adds levels of complexity wholly useless to historians, who are working with margins of error too large to need such precision of calculus in mapping the options.
So, (a) that all valid historical methods reduce to BT does not entail that they can’t also be reduced to equations in fuzzy logic and (b) there is no use to historians to reduce historical methods to protocols in fuzzy logic.
Hi Richard,
But that’s just it. You assume a probabilistic semantic (how likely a claim is). I could say:
and to the extent either of these two quotes says something definite, I am simply asserting a historical method is about the (degree of) truth. This does not amount to a proof of anything, just an expression of personal opinion.
Yes, but your analogy is false since different notions of uncertainty/vagueness are not equivalent.
I simply have to disagree. From the Stanford Encyclopedia:
http://plato.stanford.edu/entries/logic-fuzzy/
If you operate under the premise probability theory and fuzzy logic express the same thing, well, I think we will simply have to agree to disagree.
That the truth of a historical claim is a matter of how probable it is that it occurred is not a personal opinion, it’s an objective and indisputable fact.
And your quote about fuzzy logic isn’t talking about what I am. Fuzzy logic allows multiple truths, essentially creating something analogous to a quantum field of truths, like the positions of an electron in a field. That does not change anything when you start asking what the probability is that the electron (or analogously, the truth) is at position x.
This is why fuzzy logic is generally useless for history: it doesn’t even ask the questions historians want to answer, much less answer them.
Tim Hendrix: The issue is that this type of proposition would be a textbook example of a proposition with graded truth value (because of the term “widely valued”) and would, strictly speaking, fall outside the scope of the propositions probability theory analyses.
Richard Carrier: Untrue. That you can gradate things does not mean you have to.
You just define the boundary of “widely” (e.g. “with more than trivial frequency,” “found in more than half of materials in each genre,” etc.) and then it’s a straightforward binary question of whether the evidence matches that boundary or does not, hence either h or ~h. No further gradation is required. Each historian can look at the data and see whether it matches the defined boundary condition or not.
If historians want to debate just “how” widely, then they can define the terms of their debate and see whether their demarcation is even discernible in the evidence. And often it will not be, in which case there is no knowable truth of the matter whether it was x widely or y widely, only that it was, for example, somewhere in that general range. But even when it will be discernible, that becomes another binary assignment: z widely or not z widely. No other gradation is needed.
Note this is how historians have always done history. So I am not proposing anything new here. When a historian says Jesus was a widely influential character in the Middle Ages, they don’t care about such questions as precisely how widely. They only care about it being widely enough to warrant the use of the word ‘widely’. Which is a binary question. Not a graded one. And again, this has always been the case. BT does not change that. To the contrary, it explains it.
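The demarcation move described above can be stated operationally. A minimal sketch, where the threshold and document counts are made up for illustration:

```python
# Reducing a graded term ("widely") to a binary proposition h / ~h by
# defining its boundary first. Threshold and data are illustrative.
THRESHOLD = 0.5  # "widely" := valued in more than half the documents

def widely_valued(valuing_docs: int, total_docs: int) -> bool:
    """Binary test: does the evidence cross the defined boundary?"""
    return valuing_docs / total_docs > THRESHOLD

# Each historian checks the same boundary condition against the data:
print(widely_valued(61, 100))  # True: h
print(widely_valued(34, 100))  # False: ~h
```

Once the boundary is fixed, the only question left is the probability that the evidence crosses it, which is a binary matter of h versus ~h.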
The statement we are discussing is:
The problem is that when you introduce a boundary for “widely,” to say for instance:
then H is not H1 (an even better example of constructing a binary proposition from a fuzzy one: H2 : “Celsus said H”; it is still the case that H is not H2). I am very well aware the distinction between probability and (graded) truth appears superficial and pedantic when one first encounters it, but it has very deep formal roots. You can check out the Wikipedia page or the Stanford Encyclopedia, which you yourself cited previously, for more information. I stress I am not claiming you are doing something *wrong* by focusing on probabilities; that’s what I do as well. However, I want to highlight that this is not the only notion of vagueness, and in fact for statements such as H it is most likely not the appropriate one. Anyway, this is a digression.
This doesn’t make any sense again.
I can only suppose you meant to say something like:
H1 : “if the teachings of Jesus were said to be valued by more than 50% of historical documents in the historical period P, then the teachings of Jesus were ‘widely valued’ in the sense intended by historian Z”
Because only that would be a correct analysis.
But once you correctly state the matter, your objection vanishes.
So I don’t know why you are convoluting the matter beyond common sense here. Maybe you didn’t realize that the H1 I just reconstructed is what it means to reduce the problem to a binary question of probability?
Well, the point still remains that my original proposition
H : “the teachings of Jesus were widely valued in historical period P”
is not the same as the proposition you introduced:
H1 : “if the teachings of Jesus were said to be valued by more than 50% of historical documents in the historical period P, then the teachings of Jesus were ‘widely valued’ in the sense intended by historian Z”
(actually it is difficult to see why H1 is a binary proposition at all, rather than a definition, but I digress).
Once again, what the example illustrates is that we have two different types of systems for handling uncertainty, and they operate on different types of propositions. That you can take a proposition expressed in one system and point to an expression in another system which expresses something you feel is reasonably similar (but not equivalent) does not bring us closer to anything which would amount to a proof that only one system is needed. It is simply not a proof. However, once again I think we have arrived at a point where we might disagree about the fundamental premises of the argument (what amounts to a proof, and whether multi-valued logic and probability theory are equivalent), and I am not sure how I can proceed at this point.
I am telling you the “new” proposition I introduced is exactly what I was talking about that you were responding to. So you are just affirming here, again, that you don’t understand what I’m talking about.
And my syllogism says nothing about “only one system is needed.” So that is not a relevant rebuttal to it. It still proves what it actually proves: that all historical methods actually used reduce to Bayes’ Theorem.
If someone thinks some other system is needed for something historians actually do or want to know, some system that accomplishes something Bayesian reasoning can’t, then they can write their own book proving that. But I won’t hold my breath. Because I’m a historian. So I know the only thing historians want to know is how likely it is that any given h is true. No other information is of any use to them.
Tim Hendrix: To take the first part of this statement, I take it you accept Bayes’ theorem is not in dispute for a historian; that is, we can grant a historian accepts the Bayesian account of probabilities. If he does not, would the proof not first have to set out to establish that Bayes’ theorem is true?
Richard Carrier: Most historians aren’t even qualified to doubt whether BT is true; competent historians would defer to the field that has verified it. So all I need do is what I did: cite mathematics scholarship establishing that it is accepted as proven by the relevant experts.
What historians doubt is whether BT applies to them. Or indeed, that it describes everything they do. Granted, it should be obvious that (a) all their discourse is a discourse about probabilities, (b) all theorems demonstrated to be true about discourses about probabilities therefore apply to their discourse, and therefore (c) BT applies to their discourse. But nope.
Humanities majors get lost in that line of argument, and find it dubious (like someone is trying to pull a fast one) unless demonstrated in the way they have any familiarity with (syllogistic demonstrations in English, being the basic foundation of philosophy, the department of the humanities that bridges the humanities to the sciences). Hence my doing so. It has the added advantage that they can walk through the steps and see how BT actually does describe what they are doing, and there is in fact no way to avoid it—no way, in fact, to do what they do, and not be doing Bayesian reasoning.
And that forces them to face the fact that, since they are already doing it, and can’t ever avoid doing it, they had better understand that fact, so they can finally better understand what they are doing (and have been doing all along). The goal in Chapter 4 is opening eyes to that fact. I first do it by doing it for their favorite methods. Then I show they can’t weasel out of the conclusion by claiming they are using some other method instead. Because whatever method they try to duck into…guess what, it’s Bayesian. Showing the futility of their trying to avoid this conclusion is the function of the general syllogism in Chapter 4.
So…
Tim Hendrix: I agree it is worthwhile (as a matter of instruction) to point out to historians how historical arguments can be mapped to a probabilistic form (Bayes theorem if you like), and it would follow the various elements of a historical argument would then find a probabilistic counterpart as priors/likelihoods, however I see this more as a matter of explanation than a formal proof.
Richard Carrier: Hence my pointing out that this is not a relevant criticism of PH. You agree with what PH is arguing. You just are bewildered that it has to be proved. Hence I pointed out (and as I’ve elaborated just above), yes, it has to be proved. And yes, that is annoying. Like I said, I share your pain. It would be so much easier if humanities majors saw this as being as obvious as you and I do. But alas, they just don’t.
Okay, I think we can close this discussion. It seems we agree the arguments in PH have as a premise that BT is true. That is, my alternative statement:
holds, and what the proof in PH sets out to establish is the second sentence. I don't really want to argue whether this is easy to see or not for someone in the humanities.
I know what you mean. Trust me, I’m as exasperated as you are by this. I tried getting historians to grasp what you and I see as obvious on this point for years before I published PH. And the arguments in PH were built out of my attempts to do that and the lengths I had to go to.
Tim Hendrix: My concern is more that I would have hoped to see more practical applications of Bayes' theorem to existing historical questions, to see how Bayes' theorem is imagined to apply in practice.
Richard Carrier: I think there are enough of those to help historians venture out and do their own: e.g. showing why our judgment about the sun vanishing is sound; showing why historians’ judgment about the murder of William II is sound; showing why our judgments about miracles and arguments from silence and literary emulation are sound; and then likewise all the reasons the Criteria fail by failing to conform to BT (and that when conformed, they can’t be applied to the scant and problematic materials we have for Jesus), etc.
Although I wish I had known of Tucker's book when I published PH. I'd have referenced it as a good collection of further examples. I did, though, reference materials exhibiting application examples from the intelligence and archaeology communities.
But overall, the goal is conceptual, not mathematical. If a historian can use the flowchart on p. 287, they are going to be miles ahead of where they were before they knew this was the engine of their reasoning all along. Being able to run some simple numbers (as with the canon: p. 286) to get a ballpark picture of what they are doing in cases too nuanced for the chart, even better.
But conceptual examples are littered throughout the book. I've considered publishing some papers showing the modeling of existing arguments (e.g. prior published papers by other historians). But I considered PH already over-long without adding such an appendix. Math is scary enough to humanities majors. A thicker book would have scared them thoroughly.
I think modelling existing arguments would be a really good idea. In particular it would be interesting if you could find someone with a different opinion about the conclusion who could provide different estimates of the probabilities.
Absolutely! We need a lot more of that in the field. I think the field (all branches of history) would advance considerably after, say, ten years of experimenting with that kind of thing in the way historians debate claims about history.
Tim Hendrix: …regarding my analysis of the criterion of embarrassment[:] as far as I understood your analysis, it did not have a way to specify whether a text was actually embarrassing to the person doing the preservation, and as far as I could tell your analysis pointed to the conclusion that the text would always have a probability of less than 0.5 of being true. I tried to re-do the analysis in a way that allowed embarrassment to enter as a variable and got a (qualitatively) different result than yours.
Richard Carrier: Which you then retracted by agreeing with me. So I didn’t understand your point there. As I implied in my post, the bottom line is we don’t get different results by adding such variables unless we are cheating (i.e. making their effect much larger than we have any warrant to claim). Otherwise their effect is too small to matter. It disappears under our margins of error.
In the specific case you mention: q does not measure whether a story is embarrassing or not; what it measures is the subset of all embarrassing stories that have sufficient motives to preserve them. Why would over half of all actual embarrassing events have sufficient motives for a friendly writer to preserve them? That would be a truly remarkable coincidence. In fact, that would contradict the basic premise of the Argument from Embarrassment: that it would be unusual (thus less than 50%) for an embarrassing story to have enough reason to be preserved (as opposed to being cut from the story or altered so as not to be embarrassing or simply not recalled or transmitted at all).
I defined all the terms on pp. 163-64. If you look at them carefully, you will see that it doesn’t matter at any point how many “actually” embarrassing stories there are. There will still always be more fake stories that appear embarrassing that are preserved, than true ones. Unless more embarrassing things actually happened to Jesus than anyone would ever have cause to invent (p. 165). I showed that even when those numbers are equal (there are as many useful embarrassing stories to invent as true ones that actually happened), most preserved embarrassing stories will be false. By a significant amount. Due to it being statistically impossible (indeed, contrary to the entire reasoning of people using the Argument from Embarrassment) that q is greater than 0.5. There isn’t any legitimate way to escape that conclusion by adding in more variables and faking numbers for them.
Meanwhile, as I wrote in PH: “one can fabricate countless embarrassing but useful myths, whereas there are only so many embarrassing things that can actually have happened to someone.” So the idea that the latter would outnumber the former is not only not demonstrable, it’s not believable. The conclusion follows.
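The shape of the ratios argument can be sketched numerically. The following is a toy model using simplified stand-in variables rather than the exact terms defined on pp. 163-64; the function name and all the rates are illustrative assumptions, chosen only to match the qualitative premises stated above:

```python
# Toy model of the ratios argument (simplified assumptions, not PH's
# exact terms): the pool of true embarrassing events and the pool of
# inventable fake-but-useful embarrassing stories are equal in size
# (the concession above), a fraction q of true ones had sufficient
# motive to be preserved, and fake ones, invented precisely because
# they are useful, are preserved at some rate r >= q.

def share_true(q, r, n_true=100, n_fake=100):
    """Fraction of preserved embarrassing stories that are true."""
    preserved_true = n_true * q
    preserved_fake = n_fake * r
    return preserved_true / (preserved_true + preserved_fake)

for q in (0.1, 0.3, 0.5):
    for r in (q, 0.8, 1.0):
        print(f"q={q:.1f}, r={r:.1f}: P(true | preserved) = {share_true(q, r):.2f}")
# Whenever r >= q, the share of true stories never exceeds 0.5; and
# with q < 0.5 and r near 1, it falls well below it.
```

This reproduces the qualitative conclusion: unless q is inflated past what the argument's own premise allows, the pool of preserved embarrassing stories is never majority-true.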
Finally we are getting to the interesting stuff…
Just to clarify, I have not retracted anything I have written and I am sorry if something I have written has given this impression (English is not my first language and I am partly dyslexic in my primary language).
To begin to clarify the argument in PH: you are carrying out a Bayesian analysis over Boolean propositions. I just wonder if you could write out what those Boolean propositions are? In particular, whether the word "embarrassing" refers to a Christian TODAY believing the story is embarrassing, or to the same story being embarrassing to someone writing in the first or second century.
Secondly, as I understand your position, sometimes the criterion of embarrassment does work, right? Isn't it more believable when someone tells us something private and embarrassing than when he tells us something which makes him look good? The point of my analysis was to allow for this situation.
Of course the argument works either way (in fact it gets worse for the historicity of stories if we use what moderns think are embarrassing).
But "the word 'embarrassing' means a Christian TODAY believes the story is embarrassing" is immediately eliminated as irrelevant. A historian cannot judge the past by the background knowledge of the present (I discuss this in Chapter 6 even, with the analogy of pies in windows being stolen by robots). So someone has to be able to show that a story was embarrassing then. And if they can't, then they cannot assign the attribute "embarrassing" to that story. Certainly, that nixes most Arguments from Embarrassment before we even get to any calculations. Specious anachronism kills a hypothesis before it even leaves the gate.
But that’s when addressing an actual specific story. The argument built on ratios does not refer to any specific story, but the numbers of stories. It thus is not about what moderns believe. It’s about what actually was the case: how many true embarrassing stories likely could there have been that would be so valued they would be preserved even by friendly tradents vs. how many fake embarrassing stories could have been imagined for Jesus that served a purpose so valued it would be preserved even by friendly tradents (like we see all over in ancient religion: Attis being castrated, Inanna being crucified, Jesus being called crazy, Romulus murdering his brother, Jupiter being a rapist, Hephaestus being ugly, Prometheus being eternally tortured for being good, etc.).
It should be obvious that we have already eliminated stories that weren’t even embarrassing at the time, before even getting to the final argument about what then remains true of the stories that are left, which is what I close with using the ratios argument. And I’m pretty sure I explicitly already say that in PH, that we have to eliminate from even consideration stories we cannot even say were embarrassing in the first place. Those are already gone, well before we get to asking what ratios then obtain with what is left.
Yes. Although I could not see how your analysis did that. My analysis does that: PH, pp. 158-62. The ratios argument then goes to show that we can't do it for Jesus (pp. 162-66), because we don't have the kinds of sources we need, or the kinds of claims we need (pp. 166-69). The Gospels are such that they would not contain "private and embarrassing" things. At all. They will only contain things that make the gospel look good. So the prior probability is already very high (as the ratios argument shows) that anything in the Gospels that looks embarrassing even by ancient standards is there because it makes the gospel look good, and not because the author was forced to admit it even though they would have liked not to. And anyone who wants to argue otherwise needs to overcome that prior with enough evidence to outweigh it. But we have no evidence for any example to do that with.
That’s the argument of that whole section, pp. 158-69 (in which the ratios argument is but one part).
This is a problem peculiar to documents like the Gospels, and because we have no other evidence to work with outside of them. Not to other genres and types of documentation or even other hagiographies written in the midst of other documentation we can use to evaluate them with.
I know this is taking it slightly out of context, but try to read the last sentences again :-). Here is what I am getting at (and please keep in mind I am not a historian). Consider one of the letters we both agree Paul wrote with near certainty. We can imagine the following process: Paul writes the letter, Paul gets old and dies, then the later church preserves, redacts, and mangles Paul's letters, often to fit its theology, and this is what we have today.
As you have pointed out in OHOJ (and I think very convincingly), Paul says things in his letters which are likely at odds with the later church (the first-century church). For example, in Romans 15:3-4 he says he learns everything from revelation, and as far as I understand it the writer of Acts (i.e. the later church) probably goes out of his way to contradict this. So we can imagine we define:
T : Paul wrote Romans 15:3-4
Tem : Romans 15:3-4 was truly embarrassing to the 1st-century church
Em : Romans 15:3-4 seems embarrassing to us today
Pres : Romans 15:3-4 was preserved
Then, in my understanding, it would be fair to say that the fact that Romans 15:3-4 seems embarrassing (Em) increases the chance that Paul actually wrote the passage (i.e. that it is not a later invention), T. Symbolically:
p(T | Em, Pres) > p(T | Pres)
The point is that in evaluating the above expression we have to use Tem (I show how in the review), because *that's what's relevant for the early church which did the redaction*, but we do not have access to Tem directly, only indirectly through Em (it appears embarrassing today). This is especially relevant for passages which appear embarrassing today but might have had a literary point in the first century (such as the gospel examples you provided). I included both variables because that appears to me to be needed to make the analysis work in both cases.
You are free to disagree with any of this, and I am sure you can reduce the expression I discussed to give qualitatively consistent results with yours. I simply wanted to make the point in the review that I did not feel there was a variable which actually expressed the same as Em, and that this had an effect on the result, as well as to discuss certain issues relating to introducing such a variable. But I suggest postponing this discussion until I have read OHOJ. If you wish to discuss the relationship between our two expressions further, I would still hope you could provide a list of the Boolean variables you make use of, such that the argument can be put in standard "P(X|Y) = …" form. Perhaps it is a language issue, but I am not sure I can do so correctly as it is (is my translation wrong in the review?)
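Hendrix's four Booleans can in fact be put in the standard "P(X|Y)" form with a small enumeration. In the sketch below, every number is an illustrative assumption (not a figure from PH or the review); the numbers only encode the qualitative story told above: the later church rarely invents material truly embarrassing to itself, "seems embarrassing today" (Em) is a noisy indicator of Tem, and truly embarrassing material is preserved less often:

```python
from itertools import product

# Toy Bayesian network over T, Tem, Em, Pres. All probabilities are
# assumed for illustration only.
p_T = 0.5                           # prior that Paul wrote the passage
p_Tem = {True: 0.3, False: 0.05}    # P(Tem | T): inventions rarely self-embarrassing
p_Em = {True: 0.8, False: 0.1}      # P(Em | Tem): Em is a noisy cue for Tem
p_Pres = {True: 0.4, False: 0.9}    # P(Pres | Tem): embarrassing text preserved less often

def joint(T, Tem, Em, Pres):
    """Joint probability of one assignment of the four Booleans."""
    p = p_T if T else 1 - p_T
    p *= p_Tem[T] if Tem else 1 - p_Tem[T]
    p *= p_Em[Tem] if Em else 1 - p_Em[Tem]
    p *= p_Pres[Tem] if Pres else 1 - p_Pres[Tem]
    return p

def posterior_T(**observed):
    """P(T = True | observed variables), by brute-force enumeration."""
    num = den = 0.0
    for T, Tem, Em, Pres in product([True, False], repeat=4):
        state = {"T": T, "Tem": Tem, "Em": Em, "Pres": Pres}
        if any(state[k] != v for k, v in observed.items()):
            continue
        p = joint(T, Tem, Em, Pres)
        den += p
        if T:
            num += p
    return num / den

print(round(posterior_T(Pres=True), 3))           # 0.462
print(round(posterior_T(Em=True, Pres=True), 3))  # 0.61
```

With these assumed numbers the inequality p(T | Em, Pres) > p(T | Pres) falls out, and the mechanism is visible in the code: Em bears on T only through Tem, which is exactly Hendrix's point about needing both variables.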
We are only interested in whether something was embarrassing to the original author. Because after that the chain of custody is broken. As a result, “statement against interest” reasoning no longer works.
So if indeed you can prove a statement Paul wrote was embarrassing to him (and not someone else) and thus something he was forced to admit or admitted against his own interests (and also prove he was in a position to even know the statement was true), then you have a valid argument. And we can show why it is valid with Bayes’ Theorem. And that’s indeed exactly what I say and explain in PH.
Tim Hendrix: The quote from PH I was referring to was this: “…by probability here I mean epistemic probability, which is the probability that we are correct when affirming a claim is true. Setting aside for now what this means or how they’re related, philosophers have recognized two different kinds of probabilities: physical and epistemic. A physical probability is the probability that an event x happened. An epistemic probability is the probability that our belief that x happened is true”
Richard Carrier: Translation:
“…by probability here I mean epistemic probability, which is the frequency of times that we are correct when affirming a claim is true. Setting aside for now what this means or how they’re related, philosophers have recognized two different kinds of probabilities: physical and epistemic. A physical probability is the frequency with which an event x would happen. An epistemic probability is the frequency with which our belief that x happened would be true”
I think you went wrong by thinking probability can ever mean anything other than a frequency. It cannot. The only question is what it is a frequency of. Epistemic probability asserts a frequency of being right. A physical probability is the frequency we are claiming to be right about.
I agree if you actually think there is an intelligible understanding of probability that is not a frequency, you would get confused here. But anyone who has that misunderstanding I correct in Chapter 6. There is no other intelligible understanding of probability. Degrees of belief, for example, are just assertions of the frequency of being correct.
Tim Hendrix: Then we can consider the second situation. In this we consider what we predict will happen, and let's just assume I know nothing about the die (which is the case) and conclude the probability is 1/6.
Richard Carrier: That’s a third situation, not the second. Once you grasp that (as I explained in the post), your criticism dissolves.
Tim Hendrix: Do you believe probability refers to different things and has a situation-dependent definition?
Richard Carrier: Of course. I explain this many times in the book. There is epistemic probability (our frequency of being right). There is actual physical probability (the frequency shown in an actually produced set of data). There is hypothetical physical probability (the frequency that would be shown in a hypothetically extended set of data). There may be even more.
All words differ in meaning by context. That's why it's important, when stating a measurement, that you state what it is a measurement of. This is as true of probability as of any other form of measurement.
Tim Hendrix: So what is “the probability” (singular)?
Richard Carrier: That is as meaningless a question as “what is ‘the height’ (singular)?” The height of what?
Height is a vertical linear distance. A vertical linear distance of what? Probability is a frequency. A frequency of what?
That height is a vertical linear distance (singular) is true. But not enough to know what you are talking about. Likewise that probability is a frequency (singular) is true. But not enough to know what you are talking about.
Thus…
Tim Hendrix: However, notice my question was related to how we define the probability *of the proposition I proposed*.
Richard Carrier: Which you didn’t explain. Hence I had to guess at two possibilities. There may even have been more.
You can’t just assert a probability without explaining what it is you are saying it is a probability of. Moreover, when you are saying you are guessing something about x, you have to say what it is that you are trying to guess about x. You didn’t say either. Forcing me to guess at an analysis. This may well stem from your inability (?) to understand that probability does not exist in isolation. There is no “what is the probability (singular).” There is only a probability of something.
If you are making an assertion about the frequency with which the 8th digit of pi will be 9, then the true frequency is simply zero, because that is the frequency with which logically impossible propositions are true. But that requires you to know (and thus also be asserting) that whether the 8th digit of pi will be 9 is a question of logical necessity, and therefore the true frequency can only ever be 1 or 0. You didn’t specify that as what you were stating by that proposition. But if you did, then by saying you believe there is an 80% chance it’s a 9, you are saying there is a 4 in 5 chance you are right and the true frequency of that is 1, and a 1 in 5 chance you are wrong and the true frequency of that is zero.
But this only falls out upon analysis if that is explained to be what you were asserting. Thus, we have to know what you are asking the probability of, before we can answer you.
Tim Hendrix: [Regarding your book’s argument on pp. 267-68]…
Try to walk me through this example: suppose I believe 4 things with probability 0.512. Then can we agree that either 0, 1, 2, 3, or 4 of these will be correct, limiting the frequency at which I am correct to five values? My point is simply: how do you define probability, for the probability of 0.512, in this particular situation? Do you envision some sort of limiting procedure? Do you define the word frequency differently than Jaynes or I do (see the review for a definition)?
Richard Carrier: As I noted before, this whole response is unintelligible.
You don’t believe 4 things. You have a vast quantity of beliefs. But more importantly, the number of beliefs you have is irrelevant. We are talking about conditional probability, not unconditional. Thus, the question is “if you had evidence of kind x, then how often would you be wrong about x?” which means “how often” in an extended hypothetical set of infinite runs. You could have an answer to that question without ever polling how many beliefs you have. Just as you can answer “how likely is it this die will roll a 1” without ever once rolling the die.
When you say “I am 75% sure I’ll win” you are simply saying “I have a very high confidence level in the claim that I will win 3 out of 4 times that I’m in a relevantly similar situation.” In other words, you are saying “there is a 1 in 4 chance I’m wrong about winning” which means “there is a 1 in 4 chance I would be wrong about winning when in similarly evidenced conditions.”
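This semantic claim has a direct operational reading that can be simulated. A minimal sketch, assuming a hypothetical agent who is perfectly calibrated, so that claims made in "relevantly similar situations" are in fact true at the asserted rate (the sample size and seed are arbitrary choices of the sketch):

```python
import random

def calibration_run(confidence, n, seed=42):
    """Frequency of being right for an agent who asserts `confidence`
    on n claims that are (by the calibration assumption) each true
    with exactly that probability."""
    rng = random.Random(seed)
    right = sum(rng.random() < confidence for _ in range(n))
    return right / n

for confidence in (0.6, 0.75, 0.9):
    freq = calibration_run(confidence, 100_000)
    print(f"asserted {confidence:.0%}; observed frequency right = {freq:.3f}")
```

On this reading, saying "I am 75% sure" is a claim that the observed frequency would come out near 0.75 over such a run; whether a real agent is actually so calibrated is the separate question of whether the asserted probability is warranted by the evidence.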
That’s just a semantically necessary fact. Your reply makes zero sense in light of it, and does not even begin to respond to it.
(Meanwhile, talk of irrational fractions is moot. We are working with margins of error wide enough that those dissolve within any a fortiori confidence interval. Thus, we never need concern ourselves with them.)
I will respond to the past 3 posts here.
Firstly, what is your definition of the word "frequency" as it is used for defining the terms relevant to probability? (You can compare it to my definition, which corresponds to that of Jaynes, in the review.)
To take my example:
to which you respond:
All probabilistic statements I am making are conditional (I follow the same convention as in PH). Now, as I understand the reply, you are saying that when we make sense of the statement p(H|x) = 0.512, for some hypothesis H and background evidence x, the actual number of beliefs I have is irrelevant (in your words). So according to this definition, we imagine an "extended hypothetical set of infinite runs" and (I assume) consider the limit of the ratio of true propositions to the total number of propositions in this series. As I mentioned in my review, I anticipated we would be talking about infinities at some point, and I guess this is now :-). The issue is that this view, that probabilities are defined from repeatable events, is just the frequentist view, which is what (e.g.) Jaynes argues against. If you have not had a chance to read his book, I would strongly suggest you do so. To reiterate some of those points, for this definition to make sense requires two things:
1) What is this series, exactly?
2) It must be defined (including the limiting procedure) without making reference to probabilities or probabilistic language (otherwise the definition would be circular). Notice this includes "random", "chance", etc.
It is in particular for question (2) that the issue of having a clear and unambiguous definition of "frequency" is important.
To take an example, consider:
H : ” The 8th digit of pi is 9″
What, then, is this "extended hypothetical set of infinite runs", exactly? Is it a possible-world scenario where in some worlds pi is different? Is it something else, for instance random flips of a biased coin? (In that case would we not have to say the coin is biased to produce heads with probability 0.521?)
On the standard Bayesian view there are no such problems. I simply have a degree of belief of 0.521 in H for the example of pi (or, more relevantly, 0.1, because I really don't know). I don't have to imagine anything else, like a hypothetical series of infinite runs (of what?), and moreover *I can actually use this definition to derive the rules of probability theory* (as Cox argues). This is a major point. The definition is not just a semantics we can place on top of an existing mathematical framework; it has a normative effect.
To reconsider the relationship between probabilities and frequencies from a Bayesian view: as I pointed out, if I believe m things "on the same strength of evidence", i.e. with the same probability, then I can compute the probability of the statement:
H_n : n of the propositions H_1, …, H_m are true.
and, e.g., the probability of n/m (the "frequency of being right", if you will). The formula is in the review. This requires *no* consideration of infinite runs or anything else, and follows completely naturally from the basic laws of probability theory (IIRC this is chapter 3 of Jaynes). I would argue this is far more natural.
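Hendrix's H_n can be made concrete with the binomial formula he alludes to. A sketch with m = 4 and p = 0.512, treating the four beliefs as independent (a simplifying assumption; the exact formula he uses is in his review):

```python
from math import comb

m, p = 4, 0.512  # four beliefs, each held with probability 0.512

# P(H_n): probability that exactly n of the m propositions are true,
# so the realizable "frequencies of being right" are 0, 1/4, 1/2, 3/4, 1.
pmf = [comb(m, n) * p**n * (1 - p) ** (m - n) for n in range(m + 1)]
for n, prob in enumerate(pmf):
    print(f"P(exactly {n} of {m} true) = {prob:.4f}  (frequency right = {n}/{m})")

# The binomial mean is m*p, so the expected frequency of being right is p.
expected = sum(n * prob for n, prob in enumerate(pmf))
print(f"expected frequency of being right = {expected / m:.3f}")  # 0.512
```

No single realizable frequency equals 0.512, which is Hendrix's point; but the expected frequency of being right is exactly 0.512, which is one way to see how the two readings connect without invoking any infinite run.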
On a final note, regarding the quote about confidence intervals: I wonder if you have read about the Bayesian procedure for obtaining the same result (i.e. inferring the probability density of, for instance, the probability that a coin comes up heads)? I did write a few pages about this in my review which I subsequently removed; if you are interested in how this is normally done, I could send it.
Rate of occurrence.
So, again, frequency of what? If you can’t answer that question, then you can’t even talk about a frequency of anything in the first place.
Epistemic probability is: “when I have z scale of evidence, then I will be right that x is true at frequency f.”
That is not a measure of past success/fails in being right/wrong with z scale evidence. That is a prospective measure of infinite future runs: every time you will ever have z scale of evidence, you will be right that x is true at frequency f. That is what you literally mean when you say “I believe there is an 80% chance x is true.” You certainly did not stop and spend a year mapping all past beliefs you had and their success/fail rates and evidence scales before you said that. So obviously that is not what you are basing that 80% on. So there is no point in asking about that. What you actually are basing that 80% on is your estimate that whenever you have z scale evidence for x being true, you will be right that x is true 4 out of 5 times. This is not calculation. This is semantics. This is what you literally mean by that. Whether you are correct is an entirely separate question, which depends on how you are getting that 80% from z.
Not really. Jaynes never says anything about a degree of belief being an estimate of his frequency of being right (when given a particular scale of evidence). Neither pro nor con. So he can’t have said anything against it.
You are again still stuck confusing epistemic with physical probabilities. The pi case has you tied in knots and not getting it. The true frequency of the 8th digit of pi being 9 is zero. Period. So insofar as your epistemic probability that the 8th digit of pi is 9 is not zero, by that much your epistemic probability is in error. Sometimes that error is recognized. For example, there is a vanishingly small probability that the 8th digit of pi actually is 9 (that its true frequency is 100% and not 0%), because of the possibility that everyone who has ever calculated it has been wrong, including all the computers ever. Sure, nearly zero chance of that. But it's not zero. And we recognize this by saying our epistemic probability includes this tiny chance of pervasive error, so our epistemic probability that the 8th digit of pi is 9 is near zero but not exactly zero.
If instead we have somehow found ourselves in a place where we actually have a legitimate epistemic probability of 80% (and not "near zero") that the 8th digit of pi is 9, even though in fact it isn't (and it is logically impossible for it to be), then we have to be in some extremely bizarre science fiction counterfactual scenario. I struggled hard to invent one as best I could. But the thought experiment entails that we would have to have evidence greatly misleading us as to what the 8th digit of pi is. Only then could our epistemic probability (the frequency of our being right) be so far from the truth of the matter. Although even then, the fact that we are wrong is still included in the converse 20%, so that the 80/20 is a measure of how far the evidence has misled us (unbeknownst to us). It is not a declaration that we can't be wrong.
You do not “simply have a degree of belief.” It comes from somewhere. It is not conjured by magic. Moreover, it is not only regulated by evidence, it is regulated specifically toward the true frequency of the thing being claimed. Thus, the more evidence we collect regarding pi, the more your epistemic probability (your “degree of belief”) that the 8th digit is 9 will drop from 0.521 to ~0.000. It drops toward the true frequency. Because it is bound by the true frequency. It is in fact an attempt to estimate the true frequency. And (on average) gets closer to it as more evidence accumulates. And thus the variance between the true and estimated frequency (your degree of belief) is a measure of your error.
Thus, in hindsight, you would agree, once you have adjusted your degree of belief to near zero on this, that your earlier estimate of 0.521 was wildly off and quite in error.
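The convergence being described can be sketched with a standard Beta-Bernoulli update. The uniform starting prior and the schedule of checks are assumptions of the sketch, not anything specified in the discussion above:

```python
# An agent starts from a uniform Beta(1, 1) prior (degree of belief 0.5)
# in a claim whose true frequency is 0, like "the 8th digit of pi is 9".
# Each check of the evidence disconfirms the claim; the posterior after
# n disconfirmations is Beta(1, 1 + n), with mean 1 / (2 + n).

alpha, beta = 1.0, 1.0  # Beta(1, 1): uniform prior, mean 0.5
for n in (0, 1, 10, 100, 10_000):
    belief = alpha / (alpha + beta + n)  # posterior mean after n failed checks
    print(f"after {n:>6} checks: degree of belief = {belief:.5f}")
# the belief drops toward, but never exactly reaches, the true frequency of 0
```

The residue that never vanishes is the "tiny chance of pervasive error" mentioned earlier: the estimate is bound by, and converges toward, the true frequency without ever being entitled to reach it.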
What I think you are failing to do is demarcate ontology from semantics. When I say "I believe there is an 80% chance that x" is translatable to "I will be wrong about things like x 1 in 5 times," I am semantically unpacking the meaning of the first sentence. This has nothing to do with whether the sentence is true. That's a completely separate question.
But you cannot even begin to test or know whether a statement is true until you know what it means. Thus, once we understand the meaning of “I believe there is an 80% chance that x” is in fact “I will be wrong about things like x 1 in 5 times,” then we can start asking why we believe we would be wrong about things like x “1 in 5 times.” Where does that “degree of belief” come from? And why is it warranted? Why couldn’t we just say 90% or 2%? What is causing us to reject those and pick 80%? What is constraining our choice?
That then gets us to ontology: the answer is the truth, mediated by evidence.
We do not have direct access to the truth. That’s why we can never say the probability that the 8th digit of pi is a 9 is exactly zero. It may be. It very probably is. But we can never know that. We can only know that “it very probably is.” We only have access to evidence. The more evidence (in quality and quantity) that we have, the closer our degree of belief will get to the truth (most of the time: that it can go the other way is accounted for by our converse probability, hence the frequency of our being wrong—that the frequency of our being wrong goes down, does not mean we can’t still be wrong).
That is (part of) the argument of the last section of Chapter 6.
Okay, I think we are finally getting to the heart of the matter.
Let's break this down. The first issue is the definition of "frequency" as "rate of occurrence". The issue is that the phrase "rate of occurrence" has a probabilistic connotation, especially when you take it in the context of a hypothetical "infinite run". For instance, the first technical reference on Google for "rate of occurrence" is:
http://www.jstor.org/stable/3215190?seq=1#page_scan_tab_contents
where you can tell rate of occurrence is defined using probabilities. This is a general feature: whenever you have words like "expected", "average", "random", "limit frequency", etc., it is very hard to give these a precise definition without resorting to probabilities; this is a very old problem, and you can find plenty of instances where people give circular definitions. I am sorry to ask the question again, but what *exactly* is the frequency/rate of occurrence defined as *in this exact situation*?
The second issue here is what a "scale of evidence" is. Presumably a "scale" is something measured with a single number, which sounds very much like a probability, but let's leave this issue aside for the moment.
These lead to the main difficulty here, which is what the above definition refers to exactly. I am sorry that I keep asking about this, but it is not clear at this point, especially how you consider the "infinite runs" and what the events are. When I attempt to consider the above definition from a formal perspective, I end up with something along the lines of:
However these events must from a formal perspective be *random variables*. And for the limit to converge to 0.521 they must each be true with *probability* 0.521 (for simplicity i have assumed independence). *what other definitions which are exact could be offered?* But in that case the statement is now circular — it is defining probability from a (true!) statement about probabilities. But this is not a definition any more than it is a definition of a number x to say it is equal to twice x divided by two (I discuss this in the review).
A way out is to define probabilities by making use of a physical process. That is, the probability a die comes up 3 is obtained by repeating a certain experiment, and thus it is a physical feature of the die. This definition is not circular, but problematic for other reasons Jayens discuss. At any rate *THAT* definition can only apply to repeatable events, because that’s how it avoids the circularity.
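As an aside for readers, that repeatable-experiment definition is easy to sketch in a few lines of Python (the die, the random seed, and the trial count below are all arbitrary choices of mine, purely illustrative):

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

# Frequentist "physical process" reading: the probability of rolling a 3
# is the long-run relative frequency over repeated rolls of the die.
rolls = 100_000
threes = sum(1 for _ in range(rolls) if random.randint(1, 6) == 3)
freq = threes / rolls
print(freq)  # settles near 1/6 as the number of rolls grows
```

Note the simulation itself presupposes a probability model for each roll, which is exactly the circularity being described above.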
The case of Pi is fairly interesting as I think it might highlight some of the differences in our way of thinking. I can tell you have cheated a bit and looked up Pi. Please try not to cheat and consider, as you are sitting in the chair right now, the following proposition:
H : “Digit 117 of Pi is 9”
What, then, is p(H | b), where b is all of the background evidence currently available to you?
In my view p(H | b) = 0.1. This is because *I do not know what the 117th digit of pi is*. More about this in a moment.
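For what it is worth, the “hypothetical frequency” reading of that 0.1 can be checked directly: among the decimal digits of pi, each digit 0–9 occurs roughly one time in ten. A minimal sketch using Python's standard decimal module and Machin's formula (the 1,000-digit cutoff is an arbitrary choice of mine):

```python
from decimal import Decimal, getcontext

def arctan_recip(x: int) -> Decimal:
    """arctan(1/x) by its Taylor series, at the current Decimal precision."""
    eps = Decimal(10) ** -getcontext().prec
    total = term = Decimal(1) / x
    n, sign, x2 = 3, -1, x * x
    while abs(term) > eps:
        term /= x2
        total += sign * term / n
        n += 2
        sign = -sign
    return total

def pi_decimals(k: int) -> str:
    """First k digits of pi after the decimal point, via Machin's formula:
    pi = 16*arctan(1/5) - 4*arctan(1/239)."""
    getcontext().prec = k + 10  # guard digits against rounding error
    pi = 16 * arctan_recip(5) - 4 * arctan_recip(239)
    return str(pi)[2:2 + k]  # skip the leading "3."

digits = pi_decimals(1000)
print(digits[:8])                       # 14159265
print(digits.count("9") / len(digits))  # roughly 0.1
```

Running this is, of course, exactly the “cheating” the discussion asks us to forgo; the point is only that the 0.1 assignment does correspond to a frequency one could in principle check.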
You wrote:
Should I take it from this quote that the only way to arrive at the answer p(H|b) = 0.1 rationally, on your view, is if we are in “some extremely bizarre science fiction counterfactual scenario”, and thus you do not think this is a rational assignment of probability for me to have? What is your assignment of probability? (Don’t use Google!)
I wrote:
To which you responded:
I agree of course. But I have set out very precisely where it comes from. If you look at the section “numerical values” in Jaynes, chapter 2, it sets out the idea behind the symmetry arguments which allow us to derive that the assignment of probability should be 0.1.
This is all off the rails. You just aren’t understanding me, and nothing you are saying here is responding to what I am saying.
We don’t need to have actual data sets to construct a hypothetical set from the information you have. Period. Until you understand this, you won’t be discussing what I am talking about in PH.
What you mean, semantically, is not the same thing as whether what you are saying is true or false. Period. Until you understand this, you won’t be discussing what I am talking about in PH.
That “I’m 80% confident that x is true” semantically means “I think I will be wrong about things like this 1 out of 5 times” is true. Period. Regardless of whether it’s true. Or even why. It’s simply what you are saying. Until you understand this, you won’t be discussing what I am talking about in PH.
Scale of evidence is shorthand for whether the evidence collected in whole increases the total likelihood ratio. We talk about quantity and quality of evidence in this regard. Quality is how large its likelihood ratio is. Quantity meanwhile increases the total likelihood ratio by cumulative multiplication of individual likelihood ratios. Until you understand this, you won’t be discussing what I am talking about in PH.
A likelihood ratio is a ratio between how likely the evidence e is (how much we expect it) if h is true, and how likely the evidence e is (how much we expect it) if h is false. This I’m sure you understand.
Once all the above is put together, you should see why you are not even talking about what I am. I am talking about those things. Not other things.
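For readers following along, the quantity/quality point can be sketched numerically. The prior and the likelihood ratios below are made-up numbers, purely for illustration:

```python
def posterior(prior: float, likelihood_ratios: list) -> float:
    """Odds form of Bayes' Theorem: convert the prior to odds, multiply in
    each independent likelihood ratio, convert back to a probability."""
    odds = prior / (1 - prior)
    for lr in likelihood_ratios:  # quantity: cumulative multiplication
        odds *= lr                # quality: the size of each ratio
    return odds / (1 + odds)

# One strong item of evidence (ratio 64) or three weaker ones (ratio 4 each)
# move a 50/50 prior to the same place: quality and quantity both matter.
p = posterior(0.5, [4, 4, 4])
print(p)  # 64/65, about 0.985
```

The same function with `posterior(0.5, [64])` lands on the same value, which is the cumulative-multiplication point in miniature.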
Meh, you are right and I am wrong. I should have bothered to click the link and read his review before making my suggestion above.
He clearly frames his challenge in a context, and he seems to be expressing a concern that this probability doesn’t map to any meaningful frequency (not even a hypothetical one) and that the only reason for assigning a probability is “a lack of knowledge” or degree of belief.
Now, I think the assumptions are that
a) The statement is made by a random person (x) who isn’t confidently aware of anything but perhaps the first 3-4 digits of pi (I’m guessing he picked digit no. 8 because it’s something most people don’t have memorized).
b) That person isn’t allowed to use a calculator, Google, or start working manually with a Gregory-Leibniz series (i.e. cheat).
c) When said person makes a statement of probability regarding the 8th digit, that statement will be made fully from a lack of knowledge and (according to him) not from any meaningful “hypothetical frequency” of 9s as digit 8 of pi.
So when such a person assigns any sort of probability (e.g. 0.8), then that is a reflection of this person’s “subjective belief” only, and in no way linked to any understanding of a hypothetical frequency, which according to him makes it “..non-Bayesian and … plagued by all the problems Bayesian has been raising for nearly a century.”
While I understand that mathematicians and philosophers feel inclined to challenge claims made in a book with the word “Proving…” in the title, I still wonder why the debate hasn’t been more focused on comparing your proposed Bayesian reasoning with the existing criteria of authenticity. I think that if you’re living in a (methodological) straw hut and you’re offered a free transition to a (methodological) concrete house, you shouldn’t be worrying too much about legal disclaimers regarding the concrete house’s capability to withstand earthquakes in excess of 8.0 on the Richter scale. Particularly if you aren’t building in a tectonic fracture zone.
That’s all correct. But not the issue. We all agree that it’s a measure of subjective belief. The question is what that measure of belief means. What exactly is being measured, and what does it mean to say “80%” vs. “20%”? That is, what’s the difference between those two measures of subjective belief?
I argue in PH that these are just statements of how frequently we expect to be wrong about the belief in question. More generally how frequently we expect to be wrong about any beliefs when presented with the same quality of evidence (hence why we assign 80% to some claims and not others relates to the quality of evidence being comparably good for those claims, which is why they get the same degree of belief assigned). But certainly specifically: to say “I am 80% sure that x” is literally to say “4 out of 5 times I’ll be right that x.” Then the question becomes, how do you know it’s 4 out of 5 and not 1 out of 5? In other words, what are you keying your confidence on? The answer of course has to be evidence. Thus the difference between 20% and 80% is better quality evidence. The rest follows.
But these estimates are bounded by the actual thing we are talking about. Once we know the actual probability that x is 0%, we can no longer say our subjective belief is that it’s 80% likely that x. We will adjust our estimate of the odds of being wrong to match the actual facts now known to us. We will adjust it from 80% to near 0%. Our subjective belief thus starts approaching the true frequency of the thing in question, the more information we have. And in practice this turns out to be what we are always doing when stating a subjective belief in the probability that x: attempting to get as close as the data allows us to the true frequency of x being true.
This just starts to look strange when we switch to logically necessary truths (propositions in mathematics, where the true frequencies are always 0 and 1) from historical contingencies (propositions that might have true frequencies quite far from 0 and 1, like for example the frequency with which religious officiants are women or the frequency with which generals win wars or politicians lie or kings’ deaths are murders).
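The frequency reading of a stated confidence can also be put in simulation form. A toy sketch, with all numbers mine: each trial stands for a claim the agent is “80% sure” of, and by construction the evidence behind such claims makes them true 4 times out of 5:

```python
import random

random.seed(1)  # reproducible toy run

# Each trial is a claim assigned 80% confidence; by assumption the quality
# of evidence behind such claims makes them true 4 times out of 5.
n = 100_000
hits = sum(random.random() < 0.8 for _ in range(n))
rate = hits / n
print(rate)  # close to 0.8: right about 4 times in 5
```

This does not prove the interpretation (the 0.8 is built in by assumption); it only shows what “I will be wrong about things like this 1 out of 5 times” cashes out to as a frequency.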
Simen:
I am sorry, but there is a misunderstanding. The Bayesian view (which I think is very persuasive and which has been the subject of all my research) states that probabilities should be seen as a measure of (subjective) degree of belief (see my review). This has been argued from de Finetti onward (de Finetti is a very famous mathematician and the de Finetti theorem is arguably the most influential single theorem in my field). The Bayesian view should be seen in opposition to the frequentist view, in which probabilities are thought to reflect limits of repeatable events, like rolls of a die (I am simplifying by a great deal here).
The 9th digit of pi is an important example, since this would normally fall outside the scope of ordinary frequentist theory, whereas it is easily analysed in Bayesian terms. I thus do not think this presents a problem on the Bayesian view. On the contrary! And when I state that my degree of belief (by which I mean probability; they are equivalent*) is 0.1 in the proposition “the 8th digit is 9”, this is not saying that I consider the example degenerate. I am saying the exact opposite: the example CAN be analysed in Bayesian terms, that analysis WILL yield the single objectively sound assignment of probability, and THAT assignment is 0.1. The important caveat is that the assignment is subjective to me; but all probabilities are subjective, and they have to be, since we all know different things. For instance, if you knew for a fact that the 9th digit was a 1, your probability assignment would be 0, and if you were a Christian living in the 40s CE you would likely have a very clear probability assignment of whether Jesus existed or not.
But do not take my word for it, just try to skim the introduction to Jaynes book:
http://bayes.wustl.edu/etj/prob/book.pdf
Cheers,
-t
Tim, Bayes’ Theorem is used with actual frequencies and not degrees of belief all the time. Surely you know the stock example of mammography cancer testing. At no point in the Bayesian analysis of how likely it is you have cancer if you test positive for cancer do “degrees of belief” ever enter in. At all.
It’s starting to sound like you don’t know this. And I don’t know how it’s possible you don’t know this.
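For reference, the stock example runs on numbers like the following. These are the common textbook figures (roughly a 1% base rate, 80% sensitivity, 9.6% false positives); treat them as illustrative, not clinical:

```python
# Stock mammography screening example with common textbook figures:
p_cancer = 0.01              # base rate (prior): 1% of women screened
p_pos_given_cancer = 0.80    # test sensitivity (true-positive rate)
p_pos_given_healthy = 0.096  # false-positive rate

# Total probability of a positive test, then Bayes' Theorem:
p_pos = (p_pos_given_cancer * p_cancer
         + p_pos_given_healthy * (1 - p_cancer))
p_cancer_given_pos = p_pos_given_cancer * p_cancer / p_pos
print(round(p_cancer_given_pos, 3))  # 0.078: under 8%, despite the positive test
```

Every input here is a measured frequency, which is the point being made above: the analysis runs entirely on frequencies.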
It seems to me that in order to be able to assign meaningful odds to past events, we should have a testable method which has been shown to produce accurate conclusions, with measurable odds of success. For example, suppose we take a large sample of events from recent history that we are certain of (e.g. Lincoln was fatally shot in Ford’s Theatre, or Neptune was discovered in 1846, etc.) and remove some of the information from our database, to simulate the often fragmentary ancient annals. (We may consider improving the simulation by adding contradictory accounts, or accounts that could be interpreted to be non-contemporary.) We could then apply various methods, to test the methods themselves in terms of how often they produce accurate conclusions. In this way we could have some confidence in the odds-values when these methods are applied to ancient history. The odds that a coin I dropped yesterday came up heads are 1 or 0, not 0.5. The odds that your estimate of it is correct are 0.5. I suggest that ancient history as a science needs to make use of mathematical methods and modern tools, especially computers.
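The proposal above could be prototyped in a few lines. A deliberately crude sketch, in which the “events”, the degradation rate, and the “method” are all invented placeholders:

```python
import random

random.seed(2)  # reproducible toy run

def degrade(record, keep=0.5):
    """Randomly drop evidence to mimic fragmentary ancient sources."""
    return [clue for clue in record if random.random() < keep]

def naive_method(clues):
    """Placeholder method: call the event real if any clue survives."""
    return len(clues) > 0

# Each list stands for a known-real modern event with some attestations.
known_real_events = [[1, 1, 1], [1, 1], [1, 1, 1, 1]] * 100
hits = sum(naive_method(degrade(event)) for event in known_real_events)
rate = hits / len(known_real_events)
print(rate)  # the method's measured success rate on known cases
```

A serious version would use real documentary data and real criteria of authenticity, but even this toy shows the shape of the test: methods get scored on cases where the truth is independently known.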
How does the fact that there are no independent contemporary accounts of Jesus and his doings appear in the form of Bayesian reasoning? I mean, if Jesus was that famous, we should have someone saying something about the things he did. But we don’t have that. How would you put these facts in the form of BT?
And what do you think these facts are evidence for? I think it would be harsh to say it’s evidence that Jesus didn’t exist. I would say it’s evidence that, if Jesus existed, he wasn’t as recognized as the Bible tells us.
(I of course assume here that historical Jesus existed)
You just answered your own question.
“If Jesus was that famous, then we should have evidence that we don’t.”
That is a statement in Bayes’ Theorem. You just made a Bayesian argument!
Just look under the hood:
P1 = “If Jesus was that famous, then [very probably] evidence would exist that doesn’t.”
“If h, then [very probably] ~e.”
=
“If h, then e is very improbable.”
=
P(e|h) = [very low]
Better Bayesian reasoning just requires you to be more honest about what you mean by “should have” in the first statement. It can’t mean “with absolute certainty,” because you’d be forced to admit it’s possible the expected evidence didn’t get produced or didn’t survive, so you have to allow some probability of that, and thus have to state how likely you think that is and why.
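To make that concrete, here is what stating how likely you think that is looks like when the argument above is put into the odds form of Bayes’ Theorem. Every number below is a placeholder chosen for illustration, not a figure from the book:

```python
# h = "Jesus was that famous"; e = the observed silence of contemporary records.
prior_odds = 1.0        # start even on h vs. ~h (placeholder)
p_e_given_h = 0.05      # a man that famous very probably leaves records (placeholder)
p_e_given_not_h = 0.90  # an unfamous man easily leaves none (placeholder)

# Odds form of Bayes' Theorem: posterior odds = prior odds * likelihood ratio.
posterior_odds = prior_odds * (p_e_given_h / p_e_given_not_h)
p_h_given_e = posterior_odds / (1 + posterior_odds)
print(round(p_h_given_e, 3))  # about 0.053: the famous-Jesus hypothesis collapses
```

Note this only drives down the famous-Jesus hypothesis; as explained below, it does not by itself reduce the probability that an unfamous Jesus existed.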
In practice, though, you are right that this is what happens:
P1 is true. And effectively refutes the existence of a Jesus that famous.
But P1 does not reduce the probability a Jesus existed who wasn’t that famous.
Therefore, Jesus wasn’t that famous.
This is actually a painful admission for bible scholars, especially fundamentalists. That Jesus was a total nobody!? Oh no!
But alas, what we learn is precisely that: if Jesus existed, then he must have been an unfamous nobody.
So I only test “nonhistoricity” against that thesis, of an unfamous Jesus.
I discuss what happens then (with respect to external evidence) in Chapter 8 of On the Historicity of Jesus.
One still then has to deal with the internal evidence (the New Testament), which is covered in Chapters 9, 10, and 11.
Chapter 2 specifies the minimal historicity hypothesis. Chapter 3 the best competing nonhistoricity hypothesis. Chapters 4 and 5 cover background evidence. Chapters 1 and 12 then provide the introduction and conclusion.