1INED has published a new book presenting a unique survey of the French population in the nineteenth century, based on a sample of all individuals present in metropolitan France [1] whose name began with the letters TRA. To mark the release of individual-level data covering the nineteenth century and their accompanying documentation, Jérôme Bourdieu, Lionel Kesztenbaum and Gilles Postel-Vinay retrace the construction of the database, looking at its originality, its strengths and weaknesses, and present an example of results concerning the assets left behind by individuals who died between 1825 and 1939. A second volume will be published on the data collected for the twentieth century. Based on the collection of data from civil records, and supplemented by data from the fiscal administration (Enregistrement) on the assets of deceased persons, the database can be extended to include information from a range of sources, providing that the individuals are identified by name.
2The TRA project is a research tool aimed at building a representative nationwide sample of individuals having lived in France in the nineteenth and twentieth centuries. Containing an extensive set of data on these individuals, including their occupation, place of residence, education, marital status and assets, the sample is an exceptional instrument for analysing long-term historical developments and the deep-seated changes that have affected French society – and other societies besides – over the course of two centuries.
3The TRA sample was initially based on the “3,000 familles” sample designed some 30 years ago by Jacques Dupâquier, but has since been further developed. It will continue to be enriched as access widens to additional sources of information on mobility, income, education or health, among others.
4Besides the data it provides and the range of historiographical questions it poses, the sample has strong potential for a number of other uses. Combined with other existing or future sources of data collected on the same name-based principle, it constitutes a common matrix for extensive work on a range of topics, both qualitative and quantitative, that can be associated with existing data as part of a cumulative process. New projects will benefit from the data already acquired while in turn expanding the data set at a number of levels. The sample thus provides a two-faceted tool: it is both an autonomous analysis instrument, which can be used to ask new questions, and an evolving one, which can be enhanced through cross-matching with all possible nominative sources and with a considerable range of historical sources (including data on municipalities and occupations) that characterize the socioeconomic environment.
5To achieve this goal, the data produced by the sample must be made accessible to the scientific community. This is the goal of the dedicated website (www.enquetetra.fr) and the book published recently by INED containing all the TRA data for the nineteenth century (Bourdieu et al., 2013). The database, presented in the form of a series of entries providing information on the demographic characteristics and assets of each TRA individual, can be used as it stands, but two sets of constraints should be taken into account when using the information, the first arising from the sources used and the second from the way the information is organized. We review these points in the first part of this article before going on to discuss the representativeness of the TRA sample. We then conclude with an illustration of the sample’s potential by analysing trends in primary wealth inequality – i.e. the share of individuals who die without any assets – over a period of 150 years.
I – A national sample of individuals
6The TRA database was developed against a singular historical backdrop. The sources it uses are inextricably linked to the construction of the state in France, i.e. to the existence of an institutional framework providing both continuity and homogeneity. Throughout the study period, wealth as defined and measured, and even the most basic demographic phenomena such as births and deaths, are seen not simply from the viewpoint of the state but were also produced with a view to furthering the state’s efforts to unify and standardize, both continuing the work of the Ancien Régime and informed by the changes wrought by the Revolution. In addition, because it was built up gradually over time, the TRA sample has a history of its own and has a particular relationship with the work of French and international historians and demographers (Bourdelais, 2004).
The first steps in the TRA project
7The TRA project was initiated by Jacques Dupâquier in the early 1980s as an extension of the sample constructed by Louis Henry, known as the “enquête Henry”, and its remarkable intellectual yield. Henry’s analyses established the initial conceptual framework of historical demography in its early days, in particular through the creation of a national sample of the French population from 1740 to 1829 that immediately became a reference (Rosental, 2003). Carried out from 1959 to 1966, this work resulted in a series of studies on the French population (Blayo, 1975a, 1975b; Henry and Blayo, 1975; Henry, 1978; Gutierrez and Houdaille, 1983; Dupâquier, 1988; bibliographies of Renard, 1997; Séguy, 2001, pp. 75-78).
8The sample successfully drew on the extensive documentation to be found in parish registers (Rosental, 1997). Despite the sample’s success, the method had its limits, resulting as much from the nature of the sources as from the use made of them. Although the Henry sample is based on a nationwide sample (Séguy, 2001, pp. 91-133), it is local by construction since it studies the demographic behaviour of individuals only so long as they remain in the same place during their childbearing lives. Individuals who relocated between marriage and death are observed only partially as the other phases in their lives are not recorded in the sample; individuals cannot be followed beyond their parish or municipality.
9Based on this observation, and with a view to studying migration patterns in the nineteenth century, Dupâquier set out to construct an observatory that followed individuals (and their families) in all their movements [2] and was not limited by the geographical scale of the parish or municipality. To that end, he drew on an invention of the French Revolution, the so-called “decennial tables” (tables décennales) of births, marriages and deaths. These tables, established once every ten years by each municipality, sorted in alphabetical order each person who was born, married or died in the municipality. They can thus be used to follow individuals from one place to another, and thereby to build a sample of the French population unimpeded by the geographical movements of each individual.
10The sampling method chosen for this approach consisted in selecting a set of family names and collecting all the individuals with those names in the civil records. Any other method would have required going through millions of tables across France in search of a few individuals. This type of work may be carried out for a handful of people or for particular areas, such as a town (Pinol, 1991) or a small region (Bideau and Brunet, 2007), but obviously not at national level.
11Three criteria were considered when selecting the family names. They had to be “robust”, i.e. resistant to written and oral distortion; sufficiently represented in the early nineteenth century in all French départements and ethnic groups (even foreigners); and not too numerous, so as to avoid overly extensive processing operations that would be only marginally useful. After a series of preliminary studies, family names beginning with the TRA trigram, such as Trabit and Trarieux, were selected (Dupâquier, 2004, pp. 9-10). Using these three letters offered the best guarantee as to the stability and regional representativeness of the family names while keeping the sample down to a reasonable size.
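The patronymic selection rule can be illustrated with a minimal sketch. It assumes that each civil record carries a surname field and that accents and letter case should be ignored when comparing spellings; the function names and the normalization step are illustrative assumptions, not the procedure actually used to compile the decennial tables.

```python
import unicodedata

def normalize(name: str) -> str:
    """Uppercase a surname and strip accents so spelling variants compare equal."""
    decomposed = unicodedata.normalize("NFD", name.strip())
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return without_accents.upper()

def in_sample(surname: str, trigram: str = "TRA") -> bool:
    """Return True if the normalized surname begins with the chosen trigram."""
    return normalize(surname).startswith(trigram)

# Trabit and Trarieux are cited in the text; the other surname is a contrast case.
surnames = ["Trabit", "trarieux", "Dupâquier"]
selected = [s for s in surnames if in_sample(s)]
```

A rule of this kind is cheap to apply to alphabetically sorted tables, which is what makes a name-based sample feasible at national level.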
12Dupâquier’s 3,000 familles sample aimed to geographically follow individuals as well as their families, the final objective being to reconstitute descending patrilineal genealogies. Consequently, only a small part of the corpus of TRA individuals was used, namely the descendants of a certain number of TRA couples married between 1803 and 1832, selected in each département in proportion to their population in 1806. The aim was to “take a representative sample of 3,000 couples formed under the First Empire and follow their descendants” (Dupâquier, 1981, p. 1164). In the end, the “number of marriages thus defined being roughly 7,500 for the 1803-1832 period, while the sample itself, in its geographical stratification, totalled just 2,965 families, the utilization rate of preliminary records came to around 40%” (Dupâquier, 2004, p. 10).
13This initial project was soon joined by a second – a survey of wealth set up on the initiative of Denis Kessler. Using data from the Enregistrement, or fiscal administration, Kessler sought to analyse the ways in which wealth was accumulated and transmitted in the nineteenth and twentieth centuries (Kessler and Le Bras, 1985, pp. 27-29; Bourdieu et al., 2013, pp. 32-38). Consequently, he gravitated toward hypotheses other than those of the economic and social history agenda defined by Ernest Labrousse (1955, 1967) and implemented by Adeline Daumard (1973), or by macro-economic studies such as those of Maurice Lévy-Leboyer (Lévy-Leboyer and Bourguignon, 1985).
14Starting in the 1980s, then, two coordinated samples were set up based on the TRA trigram. Their originality lies in their patronymic method, their national reach and the simultaneously longitudinal and genealogical nature of the data produced. But they also form part of a wave of major historical samples developed in a number of countries that reflects a shift in the social sciences toward analyses involving the use of data on individuals. These samples share a number of points in common. They collect information from the past, on individuals, anonymously or otherwise, and on a scale large enough to carry out statistical analyses.
Major historical samples based on individuals
15The first historical demography samples, dating from the late 1950s to the early 1970s, were mostly based on parish registers and the demographic events regularly recorded in them. This is because the samples addressed a period of time preceding the development of standardized administrative sources and also because they aimed to provide a precise measure of demographic behaviour while guarding against any potential biases resulting from the way the study populations were selected. This is true of the Henry sample, as well as of the first major UK sample (Wrigley and Schofield, 1981; Wrigley et al., 1997) and of the large-scale reconstitution of the Quebecois population by the Balsac project (Bouchard, 1989; Bouchard, 1996).
16However, these samples observe individuals only when specific life events occur (such as death), or solely those individuals concerned by a particular event (those who marry, for example). By their construction, the databases therefore tend to exclude greater or lesser fractions of the population, such as individuals who migrate, who do not marry, who are not Catholic, etc. Advances in computer technology have now made it possible to address this problem by exploiting data from population censuses on a large scale. For example, the complete archive of data from past US population censuses developed by the Integrated Public Use Microdata Services (IPUMS) project is emblematic of a new type of historical database, one that can be referred to as “all-encompassing” in that it is built not as part of a specific scientific research project but to provide a sample of the entire population (Ruggles et al., 2010; Sobek et al., 2011). This type of database can be used as a non-discriminatory basis for a broad range of research.
17But the real transformation in historical demography in recent decades came with the development of longitudinal data at the individual level and the statistical techniques required for their analysis. These data are produced either by building on previous samples, in particular through genealogical reconstitutions and data-matching on the same individuals at different dates, or by using longitudinal sources such as population registers.
18Population registers provide a continuous record of populations, making it possible to study the life trajectories of individuals, while situating these trajectories in a family or local context. The emblematic example here is the Eurasia project, an international network grouping longitudinal data gathered in a small number of villages in five countries (Belgium, China, Italy, Japan and Sweden) to study family structures and demographic behaviour (Lee et al., 2004; Allen et al., 2005). Samples based on population registers considerably improve and broaden the scope of historical demography samples based on the Henry method. Yet they share the same defect of being geographically circumscribed. What they gain in temporal depth they lose in geographical breadth, as tracing individuals’ lives year after year is so time-consuming that it is difficult to achieve on a large scale, obliging researchers to limit themselves to small areas or populations.
19In the absence of longitudinal sources, an initial solution consists in cross-matching information through links made at individual level. This is done either by using the same source at different moments in time – for example, by looking for the same individual in successive censuses (Ferrie, 1996) – or by cross-matching the largest possible number of sources to follow individual trajectories. One example of the latter is the data on Union army veterans of the US Civil War collected on the initiative of Robert W. Fogel of the University of Chicago (Fogel, 1993). Free from the geographical limits of other databases, this base can be used to follow a large sample of men from the generation that fought in the Civil War, from enrolment through to death (Costa, 2003; Lee, 2007; Costa and Kahn, 2008; Lee, 2008).
20Another solution lies in building genealogical links between the individuals studied. At the University of Utah, the Mormons have built the most extensive genealogical database in the United States, the Utah Population Database, with over 15 million records (Bean et al., 1978, 1990). In similar fashion, data on the Quebecois population drawn from Balsac have been used as a basis for large-scale genealogical reconstructions (Bouchard et al., 1985; Bouchard, 1989; Gauvreau et al., 2010).
21The international landscape of large historical individual-level databases has thus been structured both by specific scientific questions and by the nature of the sources used to build the databases (Kelly Hall et al., 2000). The considerable expansion and sustained development in this field over the last ten to fifteen years has spawned a large research community with its own journals (including Historical Methods), experts and organizations.
22The TRA project is part of this international landscape and shares some characteristics with the approaches mentioned above. It stands apart through its cumulative approach, aimed at offsetting the weakness of sources in France, a country with no population registers (unlike the Eurasia project) and no systematic archiving of census data (unlike IPUMS). This weakness is linked to the very history of these sources (Rosental, 2012). The project thus resembles other recent attempts to build representative long-term national samples. The Historical Sample of the Netherlands (HSN), similar to the TRA sample but beginning at a later date, differs from TRA through the choice of a random sample (after rejecting the option of a patronymic sample): the individuals in the HSN were picked randomly from birth certificates dated between 1811 and 1922, stratified by decade, province and degree of urbanization (within each province of the Netherlands, one or more groups were established according to population density – see Mandemakers, 2000). The Dutch project then sought to find traces of these individuals in a range of sources, including population registers (Mandemakers, 2012). These fairly significant differences aside, the HSN is relatively similar to the TRA project. It is a sample of individuals representative at national level, comparable in size (77,000 individuals born between 1811 and 1922) and based on the same starting points (Napoleonic civil records and decennial tables).
23Yet the existence of directly longitudinal sources is not a prerequisite, as shown in the work carried out on US censuses, which are now in their large majority indexed. In addition to the 1% or 10% sample for which all sets of information have now been processed, identification fields – last name, first names and birth date – have been digitized for all individuals, which makes matching across censuses a great deal easier (Ferrie and Rolf, 2011).
24The TRA project brings together all the qualities of its foreign counterparts, notably thanks to the wealth of French administrative sources that it uses and its ability to combine them over a long period of time. The TRA database provides a wide range of information – on the wealth of individuals, for example – rarely available with such a degree of precision in other samples. It can also serve as a basis for addressing a diverse range of questions, whose scope and interest are limited only by the availability of the ever-growing number of sources (such as electoral registers, military registers and hospital files, to mention but a few).
Construction of the sample
25The TRA project, built up in successive phases with data extracted from a growing number of sources, was initially based on two main pillars: the 3,000 familles sample and the wealth module. Both have moved forward simultaneously, but not at the same pace. The complexity of the Enregistrement sources was a major obstacle, and faced with the difficulty of rapidly reconstituting the descendants of the 3,000 families, marriage registers were used instead (Dupâquier et al., 1986; Dupâquier and Kessler, 1992). In the end, Dupâquier’s goal of creating a new Henry sample with an entirely different framework and structure turned out to be partly unattainable. But while Dupâquier’s achievements were to remain incomplete, the potential of his project remained, ultimately giving rise to a larger and more comprehensive sample than ever initially imagined.
26The 3,000 familles sample was overtaken by and incorporated into a more extensive collective project, forming a stem protocol from which numerous projects can branch off. The TRA project is producing a deep-seated transformation in French historical demography because it extends the analytical framework introduced by Louis Henry in at least three respects: it is individual-based, national and cumulative.
27While designed as a large-scale sample survey, the Henry sample also includes a nominative part, although the name-based data exist only at the reduced scale of the couple. Individuals exist only as part of a couple, as a parent (mother or father) or as a child. And the same applies to families, since they begin with the marriage of a couple and end with the death of one of the spouses. In the TRA project, however, the individual is central and forms the hub around which data-matching and, where relevant, family reconstitutions are carried out. More specifically, each individual is considered in two ways: on a personal basis throughout their life, by following their trajectory via a combined set of sources, and in relation to other TRA individuals and, more specifically, to their family members, whose trajectory is retraced using identical means. As such, the TRA project combines life-course approaches and genealogical approaches (these last, however, solely for the descendants of men, as children take the name of their father rather than the maiden TRA name of their mother).
28The Henry sample is national, but its data are not. Although the survey comprises a random sample of the French population, couples are considered only at the micro-geographical level where they are observed, i.e. their parish or municipality. The data collected for the TRA project are exhaustive in terms of the patronymic sample, making it possible to reconstitute the trajectory of individuals throughout their life and across the country, independently of their place of residence.
29Last, the TRA project is also cumulative, as it forms a hub around which a wide range of further work can be built. The Henry sample focused on measuring fertility. While other uses have been made of the database, notably a wide-ranging analysis of mortality (Gutierrez and Houdaille, 1983), it cannot easily be extended beyond its initial remit. The method can be reproduced ad infinitum, providing scope for the numerous spatial and temporal comparisons that have contributed to the success of the Henry project, but no further data can be added to the completed sample. The TRA project has no such limitation. Cumulative by construction and aim, it can be supplemented by additional work on particular research questions. And this last task can be done easily, simply by collecting TRA individuals from a specific source (for example, prison registers or nominative census lists) and matching them with the central core of the project, presented and supplied in the book (Bourdieu et al., 2013) in order to address a new area of research.
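As a hedged illustration of this matching step, the sketch below links the rows of a new nominative source to a core file by exact agreement on surname, first name and birth year. The field names, the identifiers and the rule of keeping only unambiguous matches are assumptions made for the example; the records are invented, and actual linkage would have to cope with spelling variants and missing dates.

```python
from collections import defaultdict

def link_key(record: dict) -> tuple:
    """Build a comparison key from surname, first name and birth year (hypothetical fields)."""
    return (
        record["surname"].strip().upper(),
        record["first_name"].strip().upper(),
        record["birth_year"],
    )

def match_to_core(core: list, new_source: list) -> dict:
    """Map each new-source row index to a core id, or None if absent or ambiguous."""
    index = defaultdict(list)
    for person in core:
        index[link_key(person)].append(person["id"])
    links = {}
    for position, record in enumerate(new_source):
        candidates = index.get(link_key(record), [])
        # Keep a link only when exactly one core individual shares the key.
        links[position] = candidates[0] if len(candidates) == 1 else None
    return links

# Invented records, for illustration only.
core = [
    {"id": "c1", "surname": "Trabit", "first_name": "Jean", "birth_year": 1820},
    {"id": "c2", "surname": "Trarieux", "first_name": "Marie", "birth_year": 1834},
]
prison_register = [
    {"surname": "TRABIT", "first_name": "jean", "birth_year": 1820},
    {"surname": "Trarieux", "first_name": "Paul", "birth_year": 1834},
]
links = match_to_core(core, prison_register)
```

Rows left unmatched are not discarded in this design: they remain available for manual checking or for a looser matching rule.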
30The TRA project does have its limits, or, at least, an angle of view that should be taken into account by users, while keeping in mind that this perception bias evolves over time and will continue to do so as the base incorporates new sources.
31Based on civil records (including data from the Enregistrement), the TRA database contains information on the demographic events of birth, marriage and death in metropolitan France, a country whose borders have shifted over time (in the areas of Nice, Savoie and Alsace-Lorraine). The individuals identified are not necessarily all French and may have been born or even married outside France, and the corresponding documents may be missing. Similarly, individuals who emigrated or died outside France may not be followed through to their deaths.
32The data collected also have their limits, stemming from the practices of the government bodies that produced the sources from which they are taken. The civil records may be practically exhaustive but they do not provide completely homogeneous and consistent data. A striking example here concerns occupations, which are mentioned with varying frequency (much lower for women) and with varying precision and exactness. But when different sources are available for the same individual, consistency between the sources allows for occupations to be coded with reasonable accuracy. The coding of occupation names involves several operations, however, and is never a neutral process.
33Data conservation may also be a problem. And because destruction never occurs entirely by chance – archives destroyed in wartime, for example – the data collected may contain geographical biases.
34The initial objectives of the project introduce two further biases. First, building male lines of descent has an undeniable impact on the way families are perceived. In this respect, the TRA base has a sexist bias, one undoubtedly underscored by the way the sources treat women. For example, women’s occupational status is less well established than that of men, and women are often wrongly considered as not having an occupation. Further, some supplementary sources, such as military conscription archives, concern men only. Nevertheless, in the civil records and those of the Enregistrement, women are listed by their maiden names, so they can be followed throughout their life. The observed sex ratios show that women are slightly under-recorded as they may fail to be detected when identified by the administrative source under their husband’s name.
35The second bias lies in the fact that selecting individuals via the random drawing of family names leads to a form of cluster sampling and, hence, to over-sampling of individuals from the same family in comparison with a strictly random sample of the French population. The TRA individuals dying in a given year are not fully representative of the deaths of the year in question because they were all selected on the basis of their family names and belong to the same families more often than if they had been drawn at random. A further problem is that, if the sample size is measured not by the number of individuals but by the number of family names, its relatively small size may result in major random variations. Some family names may be concentrated in certain places and occupations. For example, some TRA individuals are chair makers in a small town in northern France, Iwuy. This undoubtedly results in an over-representation of chair makers in the database, to the detriment of, say, lace makers, few of whom have family names starting with the TRA trigram. But this problem should not be a source of alarm for a number of reasons. First of all, not all the TRA individuals with the same family name are from the same family. Also, while the database documents the situation of chair makers rather than that of lace makers, it still provides access to information on a type of activity that plays a large role in the economy. Lastly, capturing a family dimension in the TRA data is also an advantage for analysing the family dimensions of historically observed economic and social relationships. The family-oriented design of the database produces a parallax effect, in that it includes family members more than neighbours, people of the same religious denomination, colleagues or regimental comrades. But it has the advantage of giving visibility to the family as a source of socialization.
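One standard way of gauging the cost of such clustering, which the source itself does not use, is Kish's design effect for clusters of equal size, deff = 1 + (m − 1)ρ, where m is the number of individuals per cluster and ρ the intraclass correlation of the variable studied. The sketch below applies it under purely illustrative values; neither the cluster size nor the correlation is taken from the TRA data.

```python
def design_effect(cluster_size: float, intraclass_corr: float) -> float:
    """Kish design effect for equal-sized clusters: 1 + (m - 1) * rho."""
    return 1.0 + (cluster_size - 1.0) * intraclass_corr

def effective_sample_size(n_individuals: int, cluster_size: float,
                          intraclass_corr: float) -> float:
    """Size of a simple random sample with the same precision as the clustered one."""
    return n_individuals / design_effect(cluster_size, intraclass_corr)

# Illustrative assumption: 1,000 individuals in families of 5 with rho = 0.1.
n_eff = effective_sample_size(1000, 5, 0.1)
```

When ρ is zero, that is, when members of the same family resemble each other no more than strangers, the design effect equals 1 and clustering costs nothing in precision.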
II – A representative sample of the French population
36It could be argued that the TRA sample, based on individual-level data, is not in a position to compete with aggregate data for France as a whole. It is nonetheless possible and useful to place the sample in the national context and assess its representativeness in space and time. Beyond its technical aspect, the question of representativeness is also that of the appropriate scale for studying social phenomena on the basis of individual-level sources. The aim of the following discussion is to assess the general reach of the sample in its ability to give an account of a century of economic and social history in France on the basis of individual-level data. Unless we consider timeless universal models of behaviour to be the only possible point of reference, then using the sample pre-supposes, and helps to maintain, a critical view of the conditions for its validity.
Representativeness of the TRA sample in space and time
37One of the construction criteria of the initial sample was a balanced geographical distribution. TRA individuals are found across France (Figure 1). We take as an indicator the municipalities where at least one mention of a TRA was found between 1803 and 1902, either in the civil records or those of the Enregistrement – in other words, the municipalities where a TRA was born, married or died. The total comes to 16,840 municipalities, or nearly half of the French total (Bourdieu et al., 2013, p. 164). But they are not always present in the same proportion (it would be highly unlikely for the proportion of TRAs to be identical from one département to the next). The task is then to measure the spatial and temporal variations of the sample with respect to the French population.
Figure 1. Map of municipalities with at least one mention of a TRA individual, 1803-1902
38A simple way of studying the sample’s representativeness is to determine the proportion of TRAs in the overall population. The resulting ratio is a sort of “sampling rate” of the TRA sample, except that it is determined not before but after data collection. As such, its inverse expresses the number of inhabitants in France represented by each TRA. In practice, this number can simply be multiplied by the sample size to obtain the French population. It also provides a means to determine if and how the representativeness of the TRAs varies. Taking the civil records from 1803 to 1902, one TRA record exists for 1,055 French records, giving a sampling rate of close to 10 per 10,000, a rate that remains consistent for birth certificates (9.7/10,000), death certificates (9.5) and marriage certificates (8.8). The rate is lower for data on wealth, which are affected by two biases: the Enregistrement sources were established at a slower pace than those of the civil records, and for a long time took scant account of children (Bourdieu et al., 2013).
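The arithmetic behind these figures can be sketched in a few lines. The only input taken from the text is the ratio of one TRA record per 1,055 French records; the function names and the sample count of 1,000 used for scaling up are hypothetical.

```python
def rate_per_10000(records_per_tra: float) -> float:
    """Convert 'one TRA record per N French records' into a rate per 10,000."""
    return 10_000 / records_per_tra

def scale_up(sample_count: float, records_per_tra: float) -> float:
    """Estimate a national total by multiplying a TRA count by the inverse rate."""
    return sample_count * records_per_tra

# Figure from the text: one TRA record per 1,055 French records.
overall_rate = rate_per_10000(1055)  # roughly 9.5 per 10,000

# Hypothetical count: 1,000 TRA events would stand for about 1,055,000 events.
estimated_total = scale_up(1000, 1055)
```

The same conversion works in reverse: a rate of 9.7 per 10,000 for births corresponds to roughly one TRA birth per 1,031 French births.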
39While relatively stable in spatial terms, the ratio is not the same everywhere. It may vary across départements and over time for two reasons. First, the TRA sample is based on an alphabetical draw, and the share of TRAs in the population varies from one département to another. Second, loss of records due to war and other causes of destruction – from fire to mice – occurs unevenly across the country. The TRA sample is a national sample. It would be pointless to use it at canton level, or even at département level, both of which are too small for the sample to express itself fully.
40Mapping this ratio reveals areas of homogeneity. In a strip of départements extending from west to east, from Normandy towards the Mediterranean coast, the proportion of TRAs is higher than the average. And in a line of départements going from southwest to northeast, from the Atlantic to the Massif Central and Alsace, the TRA proportion is lower. The dispersion of rates measured at département level is high (with a sampling rate of 2.5 to 25 per 10,000), but if ten or so départements are excluded, the proportion of TRAs appears sufficient to represent the variability of demographic and socioeconomic behaviour in the country (Bourdieu et al., 2013, pp. 166-170).
41This spatial distribution is identical for births, marriages and deaths, the latter being obtainable from the civil records or the Enregistrement. The resulting TRA geography reflects the intrinsic distribution of TRA family names in France in a wide strip extending from Normandy to the Mediterranean and, conversely, the under-representation of TRAs in the southwest and northeast (Bourdieu et al., 2013, pp. 168, 169 and 173).
42Jacques Dupâquier assumed that drawing his initial sample in “special alphabetical order” would make it representative. Yet even assuming that the 3,000 couples designated by the family name of one of the spouses are representative of the population on the date of the census used as a reference (1806), nothing guarantees that this representativeness remains stable over time. Moreover, the decision to reduce the sample to 3,000 couples – even if distributed proportionately to the population of the départements in 1806 – by no means guarantees representativeness over the entire nineteenth century. Limiting the sample size increases its chances of shrinking over time in some départements and becoming overly concentrated in others, such that representativeness is weakened. Not to mention the effects of newly arrived foreign populations whose family names also begin with TRA.
43Ideally, a perfectly built sample would have the same sampling rate across time and space. In our case, it is significantly lower for the beginning of the century owing to the poor conservation of sources (Bourdieu et al., 2013, pp. 44-49), although afterwards it varies very little (Figure 2). In addition, we observe that the TRA sample also captures shorter-term movements of the population (Bourdieu et al., 2013, p. 173).
Sampling rate of TRA births, marriages and deaths

44While the sample captures short-term changes and general trends at national level, this is not necessarily the case at a more local level. In particular, it is likely that the variations seen over time for the country as a whole and those seen simultaneously at local (département) level are linked, mainly because the quality of sources varies according to local practices. Comparing the ratio of TRAs to the French population at département level shows that some départements remain at an average level of representativeness (around one TRA per 1,100 French people) with no major changes over time. Figure 3 takes the Rhône département as an example, but the same is true in many others, which means that in this type of département, the sample held steady over the century. By contrast, in a smaller but still consequential number of départements, considerable fluctuations in the sampling rate are observed. There are fewer TRAs in these areas, which tends to result in major variations in sample representativeness over time (in Isère, for example). Lastly, in a third group of départements, the number of TRAs varies over time, but in this case, apparently, due to inconsistent data quality. For example, in the Seine département, the ratio stabilizes only in the second half of the century, as occurrences of TRAs are rare at the start of the period.
TRA representativeness in three selected départements, as measured by death certificates

Reconstituting the wealth of the French population
45As discussed earlier, the founding objective of the TRA project was to transpose knowledge of demographic and social phenomena from local to national level, acknowledging that the latter was the only meaningful level for measuring such phenomena, given the extent of migration. The sample size and design were established with the aim of avoiding as far as possible any temporal or geographical bias in the selection of the individuals recorded in the sample.
46Data on wealth did pose some specific problems, however, as children are under-recorded, especially the youngest (Bourdieu et al., 2013, pp. 170-171). The aim being to observe data on wealth, from which children are generally excluded, it was logical to focus on individuals who died at age 20 and above. An additional – and thornier – problem concerns changes over time in how precisely the country’s fiscal administration recorded information on wealth.
47The most accessible sources of tax data provide information on the value of each person’s assets but do so unevenly over time. Until 1825, the tax system included a multitude of partial sources, making it hard to gauge the wealth of an individual (Bourdieu et al., 2013, p. 100). From 1825 to 1865, on the other hand, the total wealth of each individual dying in a given place was systematically recorded in the table of deceased and missing individuals (Table de successions et absences, or TSA). After 1865, improvements in the tax information system relegated the TSA to a position of secondary importance, and it was no longer used to record the assets of the deceased (Bourdieu et al., 2013, pp. 101-103).
48For deaths after this date, information on individuals’ assets must be sought in the actual documents declaring the transfer of wealth to the heirs. Given the complexity of this source (which records both the amount and composition of assets), only a small fraction of the declarations (16% in the nineteenth century) were included in the sample, resulting in weaker representation over time and space (Bourdieu et al., 2013, p. 131). Indeed, when focusing solely on those individuals whose declaration of inheritance has been included, it is important to keep in mind that they form a selected group. Using these data to analyse a social property of any kind requires taking proper account of the distortions generated by this selection. For example, a person seeking to use the TRA project to write a history of land ownership in France would have to take account of sampling effects whereby TRA individuals with a known declaration of inheritance are not distributed evenly across all the types of land ownership structure characteristic of France.
49More generally, the sample serving as a reference for analysing the wealth of French people must be fully specified. While it provides information on the existence or non-existence of assets at the time of death – and so allows us to clearly tell “rich” and “poor” individuals apart – the sample, from the last third of the nineteenth century on, does not include information on the amount of assets held by a share of the French moneyed classes, i.e. those whose declaration of inheritance was not recorded in the sample. We can, however, test the selection of these individuals using other characteristics, including sex and age, even if that means subsequently correcting the sample through appropriate weighting. To that end, the proportion of wealthy individuals with known total assets was compared on the basis of various characteristics (Bourdieu et al., 2013, pp. 179-188). This proportion results from the sampling strategy used, which varied over time, with a higher sampling rate in cities.
50We propose a system of weighting by period and municipality size to correct for these two effects, ignoring other characteristics that generate little or no bias. The weightings are established on the basis of 27 groups: nine periods and three spatial ensembles (Bourdieu et al., 2013, p. 186). The number of individuals in each group serving as a basis for the weighting is of sufficient size (at least 30, with two exceptions, and always over 20), while the number of weighted individuals (those whose wealth is not known) is not excessive.
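The logic of such a cell-based weighting can be sketched as follows. This is an illustrative assumption, not the procedure documented in Bourdieu et al. (2013): within each period-by-area cell, individuals whose wealth is known are given a weight equal to the total number of wealth holders in the cell divided by the number with known wealth, so that they stand in for those whose amount is unknown. The field names, cell labels and amounts are hypothetical.

```python
from collections import defaultdict

def cell_weights(wealth_holders):
    """Return {(period, area): weight} for individuals with known wealth.

    Each record describes a deceased person known to have left assets;
    "wealth" is None when the amount was not collected.
    Assumed rule: weight = all wealth holders in cell / those with known amount.
    """
    totals = defaultdict(int)
    known = defaultdict(int)
    for person in wealth_holders:
        cell = (person["period"], person["area"])
        totals[cell] += 1
        if person["wealth"] is not None:
            known[cell] += 1
    # Cells with no known amount cannot be weighted and are left out.
    return {cell: totals[cell] / known[cell] for cell in totals if known[cell] > 0}

# Hypothetical records (illustration only).
sample = [
    {"period": "1870-1879", "area": "rural", "wealth": 1000.0},
    {"period": "1870-1879", "area": "rural", "wealth": None},
    {"period": "1870-1879", "area": "rural", "wealth": 3000.0},
    {"period": "1870-1879", "area": "rural", "wealth": None},
    {"period": "1870-1879", "area": "Paris", "wealth": 5000.0},
    {"period": "1870-1879", "area": "Paris", "wealth": 7000.0},
]

weights = cell_weights(sample)

# Sum of weights over individuals with known wealth.
weighted_total = sum(
    weights[(p["period"], p["area"])] for p in sample if p["wealth"] is not None
)
```

In this toy case the weights of the individuals with known wealth sum to the total number of wealth holders, which is the property a cell weighting of this kind is meant to guarantee.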
51One last problem remains. As wealth is particularly concentrated, the TRA sample cannot claim to include the wealthiest individuals. Those in the top wealth percentile, for example, represent a very small number of individuals, and the number in the TRA sample is obviously even smaller. Present some years and absent others, these very rich but very rare individuals are liable to confound any assessment of wealth. We made the choice to exclude them, so as to maintain a homogeneous observatory of wealth, i.e. one that satisfactorily covers not all but 99% of the population – the 99% least wealthy. To that end, we used wealth thresholds extrapolated from an exhaustive sample of wealthy Parisians (Piketty et al., 2006). Individuals with known wealth above that threshold are excluded from the weighting calculation.
52The final sample (after weighting) is a tool for analysing changes in wealth inequality over time that is representative of the French population as a whole – or, at least, 99% of the population.
III – Wealth inequality in France
53Thanks to its unique characteristics, the project is able to address a set of original questions, as illustrated by the example of wealth inequality. Leaving aside the internal inequality of those having acquired some wealth, we will look at the fundamental differences between those with sufficient wealth to leave behind an estate and the rest. The share of the former in the population as a whole may be designated as “primary inequality”, distinguishing it from the more frequently studied “secondary inequality” concerning levels of personal fortunes among the wealthy. In fact, samples documenting wealth at the time of death are widely used, even if that means leaving out all those who have no estate.
The possession of wealth over time
54The TRA sample can be used to reliably establish who possesses wealth and who does not. One of the goals of the French Revolution being to build an economy of small property owners, one might expect a century and a half of economic growth to have brought France closer to that ideal, or at least to have made property ownership a more widespread phenomenon. But seen through the prism of wealth at death, the precise opposite occurred, with a decline in the share of positive bequests (Figure 4). That share fell steadily over the course of the nineteenth century, before stabilizing in the first half of the twentieth century. Contrary to expectations, the industrial revolution and the unprecedented period of growth to which it gave rise led to a twenty-point fall in the share of those possessing wealth. Inversely, primary inequality increased, with a twenty-point rise in the share of those for whom private property, to use the words of Marx and Engels, was “abolished”.
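The indicator discussed here, the share of deaths leaving a positive bequest, is simple to compute once each death is flagged as leaving an estate or not. The sketch below uses hypothetical period labels and made-up records purely to show the calculation; it does not reproduce the figures of the TRA sample.

```python
def share_with_estate(deaths):
    """Return {period: percentage of deaths leaving an estate}.

    deaths is an iterable of (period, left_estate) pairs, where left_estate
    is True when the deceased left a positive bequest.
    """
    counts = {}
    for period, left_estate in deaths:
        n_deaths, n_estates = counts.get(period, (0, 0))
        counts[period] = (n_deaths + 1, n_estates + (1 if left_estate else 0))
    return {p: 100 * k / n for p, (n, k) in counts.items()}

# Hypothetical records (illustration only).
deaths = [
    ("early", True), ("early", True), ("early", True), ("early", False),
    ("late", True), ("late", False), ("late", True), ("late", False),
]

shares = share_with_estate(deaths)
# Primary inequality rises when the share of estate holders falls.
change_in_points = shares["late"] - shares["early"]
```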
Percentage of TRAs leaving an estate on death

55This cumulative and macro-economic result was obtained thanks to the nationwide scope of our project, as only a sample representative of France at national level could achieve such an end. Any local study reaching a similar conclusion on a particular social group or selected population would rightly be greeted with scepticism and doubts about the potential for generalizing such findings. The value of a result of this order lies in its overall reach, in that it reveals a previously unseen trait that is hard to perceive without statistical objectivization because it characterizes not just all or part of the members of a society – no-one is 45% or 55% rich – but a society in its entirety. One of the advantages of rate-based statistics is that they make analysis in terms of an average (or representative) individual impossible.
56It may be objected that such a long and costly individual-level sample would be a needless luxury if it resulted merely in cumulative statistics on primary inequality such as these. But in reality, this kind of result can be achieved only by drawing on nominative data. Naturally, the public authorities annually publish the number and amount of bequeathed estates, known in France as the annuité successorale, a document initially established at national level and published at département level from the early twentieth century on (Bulletin de statistique et de législation comparée, 1884; Daumard, 1973).
57This approach is not problem-free, however. The annuité successorale published for a given year mainly concerns the estates bequeathed by individuals having died one or several years previously. Consequently, it records fluctuations in estates with a delay that varies according to mortality levels and the difficulties encountered by the tax services. In addition, establishing whether a given individual leaves behind a measurable fortune necessarily involves follow-up work that the tax administration did not need to carry out before 1902. This involves attributing to the same individual the various declarations of his or her wealth, whether these last are made at the tax office where the individual was domiciled or elsewhere. The existence and amount of individual wealth can be identified only if all the declarations of the same individual made in all the tax offices where he or she possessed his or her assets are gathered together – something that only the TRA project has succeeded in doing (Bourdieu et al., 2013, pp. 142-153).
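The consolidation step described above can be sketched as a grouping operation. The identifiers, office names and amounts below are hypothetical, and the sketch assumes that each declaration has already been linked to a person identifier, which in practice is the difficult nominative matching work.

```python
from collections import defaultdict

def total_wealth_by_individual(declarations):
    """Sum the declarations filed for each individual across tax offices.

    Returns {person_id: {"total": summed amount, "offices": sorted office names}}.
    """
    totals = defaultdict(float)
    offices = defaultdict(set)
    for declaration in declarations:
        person = declaration["person_id"]
        totals[person] += declaration["amount"]
        offices[person].add(declaration["office"])
    return {
        person: {"total": totals[person], "offices": sorted(offices[person])}
        for person in totals
    }

# Hypothetical declarations (illustration only): T001 owned assets in two places.
declarations = [
    {"person_id": "T001", "office": "Lyon", "amount": 1200.0},
    {"person_id": "T001", "office": "Vienne", "amount": 300.0},
    {"person_id": "T002", "office": "Rouen", "amount": 50.0},
]

wealth = total_wealth_by_individual(declarations)
```

Counting only the declaration made at the office of domicile would understate the wealth of T001 in this toy example, which is the bias the gathering of all declarations is meant to avoid.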
58Yet even when supplemented by this process of collecting information at individual level, the choice of relying on tax sources poses obvious problems, first of all because the tax system is liable to affect the nature of wealth declarations and wealth-related behaviour in general. But these two impacts do not play a role in all cases. While they may weigh on large estates, they have a considerably smaller effect on modest ones. And only these last are of concern in an approach such as ours that seeks to identify changes in primary inequality, thereby tracing the line between those who have nothing and those who have little.
59The tax system may well have played a part in shifting that line by changing wealth-related behaviour. In particular, differential taxation on gifts and bequests may have resulted in the earlier or later transmission of the estate from one generation to the next. That said, while the choices in that respect have at times changed significantly – and in the twentieth century in particular – they may have affected large estates but have had only marginal impact on the proportion of individuals who die with no assets to bequeath.
60More generally speaking, changes in the tax system have also led to increases and decreases in the (overall or selective) under-reporting of wealth, whether by the tax administration or individual tax filers.
61From the taxpayer’s standpoint, it is obvious that under-declaring or not declaring wealth at death reduces the amount of tax to be paid. On the other hand, a faithful declaration provides a guarantee for the declarant that the inheritance rights relative to his or her estate are clearly established. [3] For this reason, tax fraud has a real disadvantage for the heirs – denying them a precise view of their wealth – in exchange for what is generally a meagre advantage. Throughout the nineteenth century, taxation was as a whole extremely low – at roughly 1% of a directly inherited estate. And although the situation changed in the twentieth century as tax became increasingly progressive, particularly between the two World Wars, only the largest estates were affected. Small estates were not liable and the individuals concerned were neither less nor more tempted by tax fraud than before (Bourdieu et al., 2003).
62The practices of the tax authorities correspond to constraints that have also changed over time. But these changes have not significantly altered the way small declarations are handled. The legal framework inherited from the French Revolution stipulated that the wealth left behind on death was taxed from the first franc. At the same time, because small estates, like any others, required a certain amount of work while generating minute yields, they constituted an expense rather than an income. As a result, the tax authorities may have been tempted to write them off. However, as tax office employees’ wages gradually became less closely pegged to the size of the estates handled, the temptation to neglect the smallest estates tended to decrease rather than increase. The rise in the number of local tax offices also had the same effect.
63These various factors suggest, at the very least, that under-declaration of small estates did not increase during the period in question. But other factors played a role, creating a nebulous area on each side of the dividing line between people who owned assets at the time of their death and those who did not. Yet this uncertainty is confined to narrow limits.
Those with wealth and those without
64After confirming that the measure of primary inequality used to build the sample is both meaningful and robust, we can take advantage of the individual nature of the data and establish results concerning society as a whole before breaking them down into several dimensions (Table 1). These findings can then be discussed, detailed and cross-referenced. The share of individuals with assets increases with age, stabilizes around the age of 60, and then decreases slightly. It also decreases in line with the size of the city, being particularly low in Paris, where the purchase of real-estate was made more difficult by the structure of ownership, at a time when co-ownership did not exist. The trend by age should take account of mortality differentials between the rich and the poor, which, depending on the chosen assumption, may either increase or decrease the “decumulation” of assets at the end of life (Attanasio and Hoynes, 2001; Bourdieu and Kesztenbaum, 2004; Ferrie, 2003).
Proportion (%) and number of the deceased leaving assets behind on death, by various characteristics

65The variations observed at individual level can themselves be viewed in relation to the overall historical development of the phenomenon under study. The availability of individual data makes it possible to analyse variations in the results obtained (a decrease in the share of the population leaving assets behind at death) on a cumulative scale, in terms of both macro-social variables such as geographical location and individual variables such as age. It can then be seen that the increase in the share of people with no wealth at death is still observed even when the data are broken down by age group or place of residence (Table 1; details in Bourdieu et al., 2003). In other words, the decline in the number of wealthy individuals does not result from the combination of the ageing population and the decline in wealth with age; neither does it result from the rise in urbanization (Bourdieu et al., 2003).
66If a share of the population has no wealth at death and that share grows over time, it follows that the economic role of wealth is not that proposed by ordinary economic approaches and that it is not evolving in the expected direction (Ando and Modigliani, 1963). One of the main reasons for an individual to save money and create wealth is to build up a reserve for use when income decreases or dries up altogether (Masson, 2009). In the period under consideration, in a society such as that of France, with no unemployment benefits or guaranteed minimum income and where only a minority of the population had a pension, one would thus expect asset ownership to be the general and increasingly widespread goal. The decline in independent occupations (especially farmers and self-employed tradespeople) involving the holding of professional capital should logically have been offset, for wage-earners, by an increase in savings, in the form of money or property. But this was not the case. In the nineteenth century and first half of the twentieth century, saving was not (or, at least, not principally) the way that most individuals handled the ebb and flow of income over the course of their lives.
67This changes our view of primary inequality, raising a whole new set of questions and prompting new avenues of research. But for most of these lines of enquiry, nominative data are required. And while some data have been produced and are available in the existing base, others could be added, providing they are collected on the same nominative basis.
68Let us return to one of the biggest paradoxes in a database on the wealth of the French population, namely that it allows us to establish exactly how many people had none. But given the growing size of that population, we need to look beyond that observation alone. The seemingly homogeneous absence of economic capital appears highly implausible given the large size of the group under consideration. There are many ways of “having nothing” and they change over time. It is useful here to call on indicators that, at a given moment in time, indicate a gradation in forms of poverty. Town mayors long had the power to deliver certificates of poverty giving the holders a number of rights. As these certificates were liable to affect a person’s tax situation, they were often mentioned in the sources used for the database and thus constitute an effective marker for identifying poor people recognized as such. Similarly, the occupation given in such certificates – providing it is interpreted carefully, taking age and place into account – can be used to socially place individuals with no assets. Many other nominative sources could also be matched with the project data to supplement them with relevant individual-level information such as the health status, height and educational level of conscripts (Farcy and Faure, 2003; Kesztenbaum, 2008). These are just examples, but they suffice to show how the TRA project, initially designed to look at economic wealth, can be used equally well to define poverty and, ultimately, to arrive at a more exact understanding of inequalities.
69In addition, as new results are progressively obtained in the analysis of changes in wealth inequality over time, we must be able to review the ways in which these results are interpreted by using finer indicators. Our sample provides scope for this. For example, it can be used to examine how changes in inequalities in France are linked to the process of urbanization. It underlines to what extent changes in wealth inequality across the country are dependent on internal changes within the rural world, despite the often fast growth of cities and the inequalities that characterize them (Bourdieu et al., 2008). In rural areas, as in France as a whole, the decrease in the share of the wealth-owning population mainly concerns people with modest assets, thus increasing wealth differentials.
The elderly
70The more primary inequality grows, the more important it becomes to understand what is happening within the group of those leaving no estate behind, and particularly within the specific group constituted by the elderly poor. The fact that so many people coming to the end of their working lives have no wealth and are thus deprived of the income necessary for their final years is not consistent with a life-cycle hypothesis. To clarify this point, it is useful to maintain the same approach, keeping the goal of analysing phenomena on a macro-economic scale while seeking to characterize those phenomena at individual level. We can apply that approach to assessing who among the elderly has sufficient economic capital to support themselves. Assumptions obviously play a part in this exercise, but let us suppose that the wealth left behind by an individual is the wealth that they would have had at their disposal if they had continued to live for a period equivalent to their remaining life expectancy. It appears that towards 1900 only a third of the population had enough wealth from which to draw sufficient income to cover their needs during their last years of life. Furthermore, that share of the population, after decreasing during the long crisis at the end of the nineteenth century, again shrank considerably during the interwar period, as corroborated by accounts from contemporary witnesses confronted by the poverty of the elderly (Bourdieu et al., 2011).
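One way to make this thought experiment concrete is sketched below. It is an assumption on our part, not the method of Bourdieu et al. (2011): wealth at death is treated as a stock drawn down in equal annual payments over the remaining life expectancy, optionally earning a constant interest rate, and the resulting annual income is compared with a needs threshold. All names and values are hypothetical.

```python
def sustainable_annual_income(wealth, years, rate=0.0):
    """Level annual payment that exhausts `wealth` after `years` years.

    With rate == 0 this is a plain drawdown (wealth / years); otherwise it is
    the standard annuity payment, paid at the end of each year.
    """
    if years <= 0:
        raise ValueError("years must be positive")
    if rate == 0:
        return wealth / years
    return wealth * rate / (1 - (1 + rate) ** -years)

def can_self_support(wealth, years, annual_needs, rate=0.0):
    """True if the sustainable income covers the assumed annual needs."""
    return sustainable_annual_income(wealth, years, rate) >= annual_needs

# Hypothetical values (illustration only): 1,000 francs over 10 remaining years.
income = sustainable_annual_income(1000.0, 10)
```

The share of the elderly able to support themselves would then be the (weighted) proportion of individuals for whom `can_self_support` is true, and it is sensitive to the assumed needs threshold and interest rate.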
71As a consequence, many elderly individuals were unable to retire and had to carry on working until the end of their lives. The phenomenon is difficult to measure, but one viable way of doing so is to look at the occupation of the deceased. Some of the stated occupations may no longer have been exercised, especially those representing a sort of “title of nobility”: generals keep their title until death. But the occupation is most likely to be reported if it is exercised just before death. One exception here – reflecting the limits of the sources as much as the complexities of reality – is the significant under-reporting of women’s occupations, obviously varying according to age, social group and marital status, to the point that it is difficult to draw valid conclusions. Such analysis is therefore more instructive for men, with the observation that, with the exception of the farming sector, the proportion of occupations reported on death (including for the least qualified) is higher for those leaving no wealth behind than for those who do (Bourdieu and Kesztenbaum, 2007). But it is also certain that work was not the only solution for all the elderly, if only because many of them no longer had the physical strength to carry out the arduous work that was generally the lot of the unqualified labourer (Lee, 2005).
72Elderly individuals failing to support themselves could choose to live with their families (Ruggles, 2007). The TRA project is lacking in this area because the sources are insufficient for determining who a given individual lived with towards the end of his or her life. But nominative lists from censuses can be used to gauge the extent of cohabitation, even if more in-depth research is required to gain a more precise picture of family living arrangements (Claverie and Lamaison, 1982; Fauve-Chamoux, 2006; Grafmeyer and Dansereau, 1998; Laslett and Wall, 1972). By drawing on family reconstitutions and matching TRA data with census data, this issue can be addressed upstream to determine how and to what extent mutual family support was a potential resource for the elderly (by examining if elderly persons lived close to family members able to provide them with regular or occasional assistance).
73One last consideration opens up another possible field of interest for the TRA project. Without the help of their families, elderly individuals were obliged to seek public assistance. TRA individuals thus represent a potential observatory for trends in social protection. Considerable changes over time in the number of TRAs dying in a hospice, or the fact that municipalities with a welfare office took in a higher proportion of poor people, are clues as to the extent of individual needs and the reach of local assistance policies.
74In a more general sense, the project can be used to measure the effects of the gradual introduction of pension systems. To do so, we first need to know who was receiving a pension. This can be done in a number of ways. Some of these are direct, such as the mention of a pension in an individual’s declaration of inheritance or in another nominative document, for example the “Bulletin des Lois” which, from 1800 to 1908, published an annual nominative list of people drawing a state pension, from which the TRA individuals concerned can easily be extracted. Other clues exist – chief among them the occupation of the individuals in the sample – that, though indirect, are sufficient for verifying pension entitlements. Comparing these individuals with the rest of the population reveals that individuals drawing a pension were more likely to leave wealth behind on death (at a comparable age and socio-occupational level). This result is consistent with the idea put forward in the nineteenth and early twentieth centuries by supporters of a mandatory state pension system, which they saw as the only way of enabling people with low income to build up the resources needed for the end of their lives and which, they believed, would encourage older people to save by giving them a longer-term view of their future (Bourdieu et al., 2011).
75Just a few of the findings obtained using the TRA project have been presented here. They nevertheless suffice to illustrate the variety and interest of the questions that can be studied using TRA data. These results also show that the project has a dual dimension. The first is its individual-based yet nationally representative design, which ensures a clearly differentiated vision of the scales of the phenomena under consideration, making it possible to identify them as components of processes that, in some cases, are visible only at macro-economic level, while grasping the distinctive nature of their realization on the micro-economic scale. The second is that the database is designed to grow and evolve. The new questions arising from the results it generates provide regular suggestions for new additions, with the guarantee that any additional new data will benefit in a cumulative fashion from all the information already available.
Conclusion: a matrix for historical research
76Through its extensive geographical and temporal reach, its ability to represent the evolution of (metropolitan) France in its entirety, and the wealth of information it provides at social, demographic and economic levels, the TRA database sheds new light on considerable swathes of contemporary history and can be used to find out more about the social mechanisms and structures inherent to transformations in French society over the long term.
77This article highlights two essential characteristics of the project that underpin its originality and its ability to produce new knowledge: it is a cumulative instrument and a representative base of individual-level data.
78Following in the footsteps of Louis Henry, the TRA project aims to construct a national observatory within a contained and controlled framework, but does so in a dynamic and cumulative manner, since any nominative source can be added to the database to enhance it while at the same time benefitting from what the project has to offer as it stands. And while it works first and foremost at individual level, it can also address other dimensions. Research on the mobility of the French population in the last two centuries – the importance of which is shown by the presence of TRA individuals in nearly half of all municipalities – will be furthered by taking account of the information available on the characteristics of the municipalities themselves (Bourdieu et al., 2000). [4] The project allows for multiple forms of data accumulation.
79This is why TRA should be used, appropriated and approached as a resource that, far from being complete, exists to be improved and enriched. A second volume pursuing the same objectives as the first will provide additional data on the twentieth century that, subject to a certain number of precautions stemming from changes in sources, will extend the current base to the 1960s. Further databases benefitting from the nominative nature of the TRA base will also be added to the data already available. They simply need to adopt the TRA project’s unique sampling principle applicable identically to any nominative source: individuals with family names beginning with the letters TRA. The process has already begun, for example with the collection of military records (Kesztenbaum, 2008), pension files (Bourdieu et al., 2011) and the nominative census lists. In each case, the resulting databases can be used to extend the TRA approach to other research fields, while benefitting from the information already available. This is the principle and the remit of the TRA project, which, far from being closed, serves as a matrix for writing a history that intertwines the micro- and macro-social.
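The sampling principle lends itself to a very short sketch. The function below is a hypothetical illustration: it keeps any record whose family name begins with TRA, ignoring case, surrounding spaces and accents. How the project itself treats spelling variants, particles or compound names is not specified here, so those choices are assumptions.

```python
import unicodedata

def is_tra(surname):
    """True if the family name begins with the letters TRA.

    Assumptions: comparison is case-insensitive, ignores leading and trailing
    spaces, and strips accents (so that "Trân" counts as starting with TRA).
    """
    decomposed = unicodedata.normalize("NFKD", surname.strip())
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return without_accents.upper().startswith("TRA")

# Hypothetical names drawn from any nominative source (illustration only).
names = ["Travers", " tramier", "Tréboul", "Durand", "Trân"]
selected = [name for name in names if is_tra(name)]
```

Because the rule depends only on the name, it can be applied identically to military records, pension files or census lists, and the resulting extracts can then be matched with the existing base.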
80This approach to data aggregation generates a considerable corpus of information that measures basic social structures (civil status, occupation, family situation, etc.) and makes them directly useable on objects for which they may not be central, yet for which they define an extremely useful framework for analysis and verification. New quantitative samples stand to benefit from the TRA project not just by making indirect use of the database’s descriptive statistics but by integrating all that the TRA project already brings to their own data, providing these last are nominative and include TRA individuals. In return, it is hoped that via this cumulative approach, the individual-level information produced by the samples in question will then in turn be made available in the TRA project.
81The second characteristic of the TRA project lies in its ambition of representativeness. In other words, it seeks to establish phenomena that occur in a statistical manner and on a certain scale. Representativeness implies the idea of comparability, of using the same terms and identical variables to analyse phenomena occurring in other temporal, spatial and social worlds. It implies that there is something in common between the wealth of a farmer in southwest France in 1823 and that of a Prefect in Normandy in 1912. Quantifying wealth on a monetary basis is liable to make such comparability too immediately acceptable. The alternative stance – in which each historical occurrence is considered as irreducible, thereby making any generalization impossible – is no less aporetic if it leads to an isolated narrative that more than likely fails to perceive the global patterns in the reality it describes. The idea here is not to rhetorically pit two equally extreme and untenable positions against each other, but rather to show that the construction and use of the TRA sample provide an empirical means of moving beyond these opposing positions. The categories of analysis it sets out to observe and measure are built on a scale that corresponds to a meaningful historical reality. Because France is a nation, and because the country’s legal, institutional, political and linguistic systems are defined on a national scale, the project provides an observatory of social phenomena on a national scale. While based on individual data, it nevertheless recognizes that the individuals referred to in the social sciences are social agents whose properties can be expressed only at a macro-social level of analysis.
And that comes at a price. First of all, the scale is far too narrow: some of the social phenomena of the nineteenth and twentieth centuries took place at European and world level, not just in the sense that some individuals broke free of social or geographical frontiers through their individual mobility, but in the sense that the analytical categories themselves (industrialization, the rise of wage employment, the inception of the nation state) often reflect historical processes that played out across Europe and far beyond in the confrontation of nascent national frameworks. Second, the scale is no doubt also too broad to detect partially autonomous social forms taking shape “locally”, in smaller territorial units or in more specific spheres of social life. These forms can, to some extent, always be revisited, for example by using geographical variations in the terms designating the “same” occupations.
Yet these two limits are neither an obstacle nor a weakness for the TRA project. They sketch out a two-pronged extension of the database beyond the additional sources that may in future be added to it. The TRA project already offers a possible gold standard for all samples specific to a particular local space or group. These samples may form part of the national portrait built up by the TRA project and potentially lead to changes in the construction of certain variables and even in the overall picture. But the TRA project is also part of an international movement that is breathing new life into historiography through individual-level data. In some respects it stands as the French component of this more global movement, and will no doubt long remain its sole representative for France.
Notes
[1] Mainland France and Corsica.
[2] Or at least in metropolitan France (Bourdieu et al., 2013, pp. 29 and 79).
[3] This dual role is underscored by the particular case of soldiers who died at war, whose estate, while not taxed, was subject to a declaration that, like any other, established a detailed list of their assets.
[4] Matching municipal information with the data in the base does, however, require that the shifting boundaries and composition of municipalities since the Revolution be taken into account (Motte et al., 2003).