This history of modern mathematical statistics retraces its development from the “Laplacean revolution,” as the author so rightly calls it (though the beginnings are to be found in Bayes’ 1763 essay [1]), through the mid-twentieth century and Fisher’s major contribution. Up to the nineteenth century the book covers the same ground as Stigler’s history of statistics [2], though with notable differences (see infra). It then discusses developments through the first half of the twentieth century: Fisher’s synthesis but also the renewal of Bayesian methods, which implied a return to Laplace.
Part I offers an in-depth, chronological account of Laplace’s approach to probability, with all the mathematical detail and deductions he drew from it. It begins with his first innovative articles and concludes with his philosophical synthesis showing that all fields of human knowledge are connected to the theory of probabilities.
Here Gorroochurn raises a problem that Stigler does not, that of induction (pp. 102–113), a notion that gives us a better understanding of probability according to Laplace. The term induction has two meanings, the first put forward by Bacon [3] in 1620, the second by Hume [4] in 1748. Gorroochurn discusses only the second. For Bacon, induction meant discovering the principles of a system by studying its properties through observation and experimentation. For Hume, induction was mere enumeration and could not lead to certainty. Laplace followed Bacon: “The surest method which can guide us in the search for truth, consists in rising by induction from phenomena to laws and from laws to forces” [5]. To my knowledge, he never cited Hume, though Hume’s work had been translated into French by 1758. For Laplace, probability was a new way of reasoning, on the basis of partial knowledge of the phenomena under study. His “rise-of-the-sun” example should of course be understood in connection with the hypothesis that that phenomenon has only been observed for five thousand years. But as Laplace clearly indicates, knowledge of the regulating principle behind the phenomenon enables us to make a much more precise estimate. Moreover, the assumption here of a uniform a priori distribution is not a blind metaphysical assumption, as Gorroochurn seems to think, but always a reasonable one, and Laplace uses nonuniform a priori distributions in other examples (cf. Stigler 1986, pp. 135–136). Here, since there are only two possibilities (the sun will either rise tomorrow or it will not), the principle of indifference applies perfectly. None of the critics Gorroochurn cites seems to have understood this point; all seem to have accepted Hume’s understanding of induction.
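The arithmetic behind the rise-of-the-sun example can be made concrete with a minimal sketch. It assumes the uniform a priori distribution discussed above, under which the probability of one more success after s successes in n trials is (s + 1)/(n + 2), the result usually called the rule of succession. The function name `rule_of_succession` and the constant `DAYS_OBSERVED` are illustrative, and the day count is the figure usually quoted from Laplace's Essay for five thousand years of observation, taken here as an assumption rather than checked against the book under review.

```python
from fractions import Fraction


def rule_of_succession(successes: int, trials: int) -> Fraction:
    """Probability of a success on the next trial, assuming a uniform prior
    on the unknown success probability: (successes + 1) / (trials + 2)."""
    if not 0 <= successes <= trials:
        raise ValueError("successes must lie between 0 and trials")
    return Fraction(successes + 1, trials + 2)


# Assumed day count for five thousand years of recorded sunrises
# (the figure usually attributed to Laplace's Essay).
DAYS_OBSERVED = 1_826_213

p_sunrise = rule_of_succession(DAYS_OBSERVED, DAYS_OBSERVED)
print(p_sunrise)             # exact fraction
print(float(1 - p_sunrise))  # residual probability that the sun fails to rise
```

With no observations at all the same function returns 1/2, which is the principle of indifference applied to the two possibilities. The sketch covers only the enumeration-based estimate; it does not capture Laplace's further remark that knowledge of the regulating principle yields a far sharper one.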
Part II, “From Galton to Fisher”, focuses on how a fundamentally frequentist approach was adopted, an approach opposed to Laplace’s and based on Hume’s induction principle, though the author does not clearly state this. The researchers who developed it were trying to devise a statistical approach for the biological and social sciences. Though each was interested in several areas within those sciences, Quételet and Lexis can be associated with population science, Galton and Pearson with the study of heredity and biometry, Edgeworth and Yule with economics, Fisher with biology and genetics, and so forth.
Laplace’s methods were applied either to astronomical or geodesic data, fields that had already been theorized, or to simple data for which the probability law had already been established, such as sex ratio at birth (binomial law). In the life and social sciences, the problem is linked to the mass of causes of the phenomena under study and their nontrivial effects: given that the hypothesis of population homogeneity is untenable, how can we take account of observation complexity? The entire effort of these statisticians was to devise tools – correlation, regression analysis, multivariate analysis, contingency tables, and others – to disentangle causal ties. This analysis culminated in Fisher’s theory of statistical estimation, which Gorroochurn describes in great detail, including the various disagreements between Fisher and a number of other statisticians: Bartlett, Jeffreys, Pearson and Neyman, to name a few. For Fisher, probability represented the limiting frequency of the event under study in a hypothetically infinite population. The point was to state the properties through a simple enumeration rather than to discover the principles behind an event. Ultimately, then, a frequentist probability theory was the most suitable. Fisher was decidedly an objectivist, and he stood opposed to Laplace, [6] whom he criticized repeatedly in his writings.
It is interesting to note that neither Fisher nor Gorroochurn ever cites or discusses Kolmogorov’s axiomatisation of objective probabilities, [7] whereas Part III presents Ramsey’s, de Finetti’s and Savage’s subjective probability axioms in detail. Kolmogorov’s axioms bear on the occurrence of events that are likely to be repeated in what are judged identical conditions but that are not usually associated with a general theory. Though this definition is slightly different from Fisher’s, it comes very close to it.
Part III, shorter than the others, explores the extensions of Fisher’s theory of statistical estimation and the renewal of Bayesian methods to the early 1960s. First, as Gorroochurn judiciously remarks, Fisher was opposed to the “mathematization of statistics” (p. 593) and his own demonstrations were sometimes lacking in “mathematical rigor”. A number of the statisticians Gorroochurn cites devoted themselves to demonstrating Fisher’s ideas with greater precision and to extending them. Here we may cite the author’s discussion of Wald’s statistical decision theory, which generalizes Fisher’s estimation problem and connects up with the discussion of the Neyman-Pearson hypothesis test. However, the author defines Wald’s approach as entirely frequentist whereas Wald himself described his decision rules as “Bayesian strategies”.
The return of Bayesian methods in the 1920s, a time when Fisher’s frequentist theories were enjoying great success, marked the beginning of a new period, but one that only came into its own fifty years later with the beginnings of computer technology. Simulation methods involving a great number of calculations then made inference possible for all a posteriori parameters. Here the author cites Keynes, Ramsey, de Finetti, Jeffreys, Savage and Robbins for the period 1920 to 1956, discussing Ramsey’s, de Finetti’s and Savage’s axiomatisations but leaving aside that of Jeffreys, already cited at length in Part II in connection with his disagreement with Fisher.
While the discussion of subjectivist epistemic probability seems to me a welcome contribution, relatively little is said on logicist epistemic probability. [8] This last approach, followed by Jeffreys [9] and axiomatized by Richard Cox [10], is perfectly compatible with Laplace’s and provided his theory with a solid foundation. Jeffreys returned to Bacon’s notion of induction, explaining that it leads to postulates or axioms drawn from all the information available to the researcher at the moment they are made, the difference being that that information no longer pertains to events, as in Kolmogorov’s thinking, but to propositions.
In direct contrast, subjective probability has continued to follow Hume’s definition of induction, specifically his notion that an individual is free to adopt any evaluation of the probability of an event from zero to one and that “each of these evaluations corresponds to a coherent opinion” [11].
In conclusion, this book offers a richly detailed presentation of the work of statisticians over two centuries. Indeed, I would go further than Gorroochurn and call his thinkers probabilists, for many of them have enhanced our view of that more theoretical discipline. It is regrettable, however, that the author chose to discuss the issue of induction – a crucial one in probability – on Hume’s terms, which suggest that probability and statistics involve little more than listing the properties of a supposedly infinite population. Those disciplines go far beyond such enumeration; they developed a new type of logic, based on reliable axioms, for understanding an empirically observed population.
Notes

[1]
T. R. Bayes, 1763, “An essay towards solving a problem in the doctrine of chances”, Philosophical Transactions of the Royal Society of London, 53, pp. 370–418.

[2]
S. M. Stigler, 1986, The History of Statistics: The Measurement of Uncertainty before 1900, Cambridge, Ma., The Belknap Press of Harvard University Press, 432 p.

[3]
F. Bacon, 1620, Novum Organum, London, J. Bill.

[4]
D. Hume, 1748, Philosophical Essays Concerning Human Understanding, London, A. Millar.

[5]
[Laplace, A Philosophical Essay on Probabilities, trans. F. W. Truscott and F. L. Emory, New York, Dover, 1951, p. 182.]

[6]
On this question see J. Aldrich, 2008, “R. A. Fisher on Bayes and Bayes’ theorem”, Bayesian Analysis 3(1), pp. 161–170.

[7]
A. Kolmogorov, 1933, “Grundbegriffe der Wahrscheinlichkeitsrechnung”, Ergebnisse der Mathematik, vol. 2, Berlin, Springer.

[8]
On the difference between the two approaches, see D. Courgeau, 2012, “Probability and social science. Methodological relationship between the two approaches”, Dordrecht, Heidelberg, London, New York, Springer, Methodos Series, vol. 10, 35 p.

[9]
H. Jeffreys, 1939, Theory of Probability, New York, Clarendon Press.

[10]
R. Cox, 1961, The Algebra of Probable Inference, Baltimore, The Johns Hopkins Press, 114 p.

[11]
Cf. B. de Finetti, 1937, “La prévision: ses lois logiques, ses sources subjectives”, Annales de l’Institut Henri Poincaré, 7(1), pp. 1–68. Quoted in English in G. Parmigiani and L. Inoue, Decision Theory: Principles and Approaches, John Wiley and Sons, 2009, p. 22.