Researchers in the human sciences are often asked to describe the ideal database, the one that would perfectly meet their needs. It is then up to them to find or develop a database that approaches this ideal. While specific needs vary from one project to another, researchers widely agree that the ideal database would be representative of the population at both the national and local levels over an extended time period; track individual trajectories from birth, or soon after entry into the country, all the way to emigration or death; and comprise large samples allowing for fine distinctions across a broad choice of variables and response categories, ranging from individual to family and occupational situations. And why not surrender to the myth of eternal youth and demand regular renewal of observations so that the database would never become obsolete?

Just such an ageless database actually does exist in the French statistical system: the permanent demographic sample (Échantillon démographique permanent, EDP). EDP was developed in the late 1960s by France’s National Institute of Statistics and Economic Studies (INSEE). Sample individuals are selected by date of birth; special census and civil registry forms have been designed for EDP respondents. The database compiles information from France’s five exhaustive censuses, 1968 to 1999, and from the annual census surveys, first conducted in 2004. Civil registry information is collected on the selected individuals, including birth, marriage, birth of children, and death. EDP had a total of 2.7 million respondents in late 2013, and is being enhanced through tracking of existing trajectories and inclusion of new respondents born on the selected days.

EDP is actually an administrative-type panel survey that has been running for 45 years. The way it is constructed frees it from the traditional problems of panel surveys, namely attrition and distortion over time. First, the fact that it draws on exhaustive information-collecting sources ensures that tracking is systematic throughout France and not disrupted by address changes. Up to 1999 this condition was met by the censuses; now the database has been linked to other sources to the same effect. Second, the way samples are drawn makes it possible to integrate new respondents as they come into existence in France, thereby maintaining panel representativeness.
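The sampling rule described above can be illustrated with a minimal sketch. It assumes that sample membership depends only on the day and month of birth; the selection days, record layout, and function names below are hypothetical placeholders, not INSEE's actual parameters or procedures.

```python
from datetime import date

# Hypothetical selection days as (month, day) pairs.
# The text does not state the real EDP days; these are placeholders.
SELECTED_DAYS = {(3, 10), (3, 11), (9, 20), (9, 21)}


def in_sample(birth_date: date) -> bool:
    """Return True if the birth date falls on one of the selected days."""
    return (birth_date.month, birth_date.day) in SELECTED_DAYS


def panel_members(source_records):
    """Filter an exhaustive source (e.g. a census file) down to sample members.

    Each record is assumed to be a dict with a 'birth_date' key.
    Address and other attributes play no role in selection.
    """
    return [r for r in source_records if in_sample(r["birth_date"])]


# Two hypothetical exhaustive collections, several years apart.
census_a = [
    {"id": "A", "birth_date": date(1950, 3, 10), "town": "Lille"},
    {"id": "B", "birth_date": date(1962, 5, 2), "town": "Lyon"},
]
census_b = [
    {"id": "A", "birth_date": date(1950, 3, 10), "town": "Nantes"},  # moved
    {"id": "B", "birth_date": date(1962, 5, 2), "town": "Lyon"},
    {"id": "C", "birth_date": date(1995, 9, 21), "town": "Paris"},   # born since
]

ids_a = [r["id"] for r in panel_members(census_a)]
ids_b = [r["id"] for r in panel_members(census_b)]
```

In this toy example, person A is retained in both collections despite the change of town, and person C, born between the two collections on a selected day, enters the panel automatically. This mirrors the two properties the review attributes to EDP: no attrition through address changes, and continuous renewal of the sample.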

Given the relatively low volume of studies that use the database, EDP does not seem to have received the recognition it deserves. But changes in access modes, the survey’s further enhancement through other data, and the central position it has reassumed in the French statistical system should increase researcher interest. Stéphane Jugnot’s working paper can play an important role here.

The paper’s first strength is that it offers a clear, precise and up-to-date overview of EDP. It makes accessible information that used to be scattered across documents that were difficult to access or abstruse (at least for non-specialists). Jugnot reviews the history of how the database was developed, explaining both continuity and breaks over time. Switching from exhaustive censuses to annual census surveys was a major change. Jugnot’s analysis distinguishes between new database characteristics (such as increased participant tracking frequency), permanent characteristics, and characteristics that have been affected by other changes (such as geographic mobility tracking). What makes this historical perspective particularly relevant is that the changes that have occurred during EDP’s “lifetime” are working to provide it with a new foundation, made up once again of exhaustive information sources. A survey entitled EDP++ has been developed that will include administrative data such as annual payroll information reports (DADS) and tax statements. EDP will thus acquire new analytic dimensions by way of information on individuals’ wages and income, which could not previously be included.

In this highly up-to-date introduction to EDP, the author also imagines the immediate future of the database, which, though permanent, is also being perpetually renewed. The second strong point of the paper is its account of the complexity of EDP. The author’s explanation of the database’s potential uses takes full account of the difficulties involved. EDP is a perfect illustration of the adage that “les données ne sont jamais données” [data are never just given]. In addition to the highly particular access conditions, there are several challenges to using this material, and all manner of pitfalls. While the way EDP is built makes it akin to a panel survey, its panel characteristics exist only as potential and require considerable processing. Since this is secondary use, the data were not collected with panel use in mind. And despite efforts to keep data collection continuous, few variable values are available for all individuals in the same form for all censuses. Only partial census form information is available on EDP individuals for 1968, and full information only on varying proportions of EDP individuals for other censuses (1/10 for 1975, 3/4 for 1982). Likewise, there are several discontinuities in civil registry information recovery.

Jugnot’s detailed description of internal INSEE management procedures may seem a bit dry at times, but that information is crucial for assessing the possibilities offered by the database over time. Changes made in 2010-2011 to the INSEE procedure for managing personal identification have had considerable repercussions, making follow-up of persons born outside metropolitan France more difficult. In the end, researchers wishing to use EDP for longitudinal analysis have to make their way amid all these stumbling blocks, and Jugnot’s working paper is a valuable guide to steering clear of them.

Another original feature of the paper is its variety of presentation modes. Jugnot’s approach to changes in EDP is not merely chronological. There are several topic entries, such as EDP data sources and matching methods. The presentation of EDP++ components is an extremely useful tool. The author also takes up the question of using this database in conjunction with others, either regular surveys such as the voter turnout survey and the DADS panel survey [1] or one-off surveys. He also presents the legal framework for EDP access and, on the practical side, a set of exploratory data. The graph series is a highly effective illustration of his discussion of the strengths and particular challenges of EDP, providing orders of magnitude that should be useful to potential EDP users.

The paper lends itself to various cross-readings, which makes it a valuable tool for EDP users throughout their work. However, it makes no claim to be exhaustive, and it cannot substitute for attentive analysis of questionnaires, forms and code dictionaries. It may be supplemented by other works on EDP indicated in the up-to-date reference list. A practical guide to access conditions and application procedures would have made a useful appendix. Though access is easier now than in the past, it can only be granted after a fairly long set of administrative approval procedures and only within a specific, secured environment, the CASD (centre for secure data access). All in all, though, this working paper offers the lay reader a clear idea of the properties of EDP, while the trained reader will find detailed information on recent and upcoming developments in the database. This guide has its rightful place on the desk of anyone interested in EDP.


