Because of the structure and demographic history of its population, Quebec, which developed from a small pool of founders and whose rapid expansion was primarily the result of natural increase, constitutes a remarkable laboratory for population genetics studies. The genealogies that can be reconstructed in this context possess levels of completeness and depth rarely obtained elsewhere. Marc Tremblay, Julie Arsenault, and Évelyne Heyer draw on these rich genealogical data, which they subject to modern techniques of statistical genetics. They identify the founders for five regions of Quebec and compare their probabilities of transmitting their genes to a significant proportion of the contemporary population. In two of the five regions, several founders have a non-negligible probability of having transmitted their genes to 5% of the population, a finding that can be related to the unusually high levels of certain rare hereditary disorders.
Population genetics studies seek to determine the magnitude of genetic variability in natural populations and to explain its origin and persistence and its relevance to the evolution of populations (Hartl, 1984). When a mutation occurs in any given gene, leading to a malfunction of this gene and a risk of disease, the diffusion or disappearance of the mutated gene in the population is determined primarily by the demographic behaviour of the carriers. Although new mutations are rare, all individuals carry abnormal genes (between four and five lethal recessive genes) among the genes present in their cells (Vogel and Motulsky, 1986).
Several studies have shown certain rare hereditary disorders to have a particularly high incidence in Quebec (Bouchard and De Braekeleer, 1991 and 1992; Vézina, 1996; Scriver, 2001). Conversely, other disorders commonly observed in European populations are practically non-existent in some regions of Quebec (Bouchard and De Braekeleer, 1992). The high (or low) carrier frequencies of deleterious genes can be explained by founder effects or by the phenomenon of genetic drift. A founder effect occurs when a new population is formed from a relatively small number of individuals belonging to the same parent population (Bouchard et al., 1991a; O’Brien et al., 1994). Genetic drift is a change in gene frequencies in a population resulting from chance in the transmission of genes from one generation to the next (Hartl, 1994). In the case of founder effects, people with a given hereditary disorder have a very high probability of carrying the same mutation. It thus becomes possible to develop tests for carriers in order to improve detection and prevention of these disorders. The high frequencies of hereditary disorders observed in Quebec are generally attributed to one or a combination of several founder effects.
These diseases are not evenly distributed across the various regions of Quebec (Table 1). Certain genopathies are observed mainly in the north-east regions of the province (notably those of Saguenay and Charlevoix), while others are more frequent in the south and the west. These differences derive largely from founder effects and from regional variation in migration patterns; in this study, we have attempted to bring some of these effects into sharper focus. Using the method of gene dropping on reconstructed genealogies, we have measured and compared the transmission of founder genes in five regional populations of Quebec.
Some of the principal hereditary diseases found in the population of Quebec
I – Data and methods
1 – Regions studied and genealogical data
The data used come from genealogies reconstructed with information about 756 individuals who were married in five regions of Quebec (Beauce, Terrebonne, Charlevoix, Rimouski and Saguenay) between 1900 and 1971 (Table 2). These regions were chosen for their distribution across the territory of Quebec (Map 1) and for the availability of data at the time the study was conducted. Data from the Charlevoix and Rimouski regions were first used in the study by Tremblay et al. (2000). Heyer et al. (1997) used data from the Saguenay, and Bergeron (2001) used data from the Beauce. Finally, data for the Terrebonne region were compiled specifically for the present study.
Characteristics of the starting points for the genealogies (subjects) by region
Geographical location of the five populations studied
With the exception of the Beauce region, all the starting points for the genealogies were taken from the BALSAC Population Register, which contains information derived from marriage, baptism, and burial records (Bouchard et al., 1991b). In the case of the Saguenay and Charlevoix regions, we started with parents of children born between 1950 and 1971 (chosen at random from baptism records). For Rimouski and Terrebonne, marriage records were randomly chosen from those available for each period. For the Beauce region, the initial data come from the marriage records published for this region by the Quebec Genealogy Society (1995); these are marriages that served as control cases for an earlier study (Bergeron, 2001).
The genealogies were reconstructed mainly by means of the BALSAC Population Register and the BALSAC-RETRO Genealogical Database. Initially planned to cover the population of the Saguenay region (between 1842 and 1971), the BALSAC register now extends to the whole population of Quebec for the nineteenth and twentieth centuries (marriage records); at present, it contains at least 1.8 million computerized records, of which about half have been linked (Bouchard, 2002). The BALSAC-RETRO Genealogical Database is a peripheral database of the BALSAC Population Register, developed through various research projects (Jomphe and Casgrain, 1997). It contains genealogical information on approximately 315,000 individuals from all regions of Quebec and extends back to the beginning of the seventeenth century, i.e. to the founding of the population of Quebec. Other data sources were also used. For the more recent generations, several records of marriages published by genealogical societies in Quebec were consulted. We used information from the Population Register of Old Quebec (Charbonneau et al., 1987) and from Jetté’s genealogical dictionary (Jetté, 1983) for the period covering the seventeenth and eighteenth centuries. All the genealogical paths were followed back until the necessary information (i.e. the names and marriage dates of the parents) became impossible to find in the available sources or referred to individuals married outside Quebec. In most cases, the genealogical paths extended back to the seventeenth century, to the early founders.
2 – Genealogical completeness and depth
For each regional genealogical sample, we calculated the completeness of the genealogies by generation level and the average genealogical depth. The completeness for a generation level g (Cg) corresponds to the ratio between the number of ancestors identified (repeated ancestors are counted as many times as they appear) and the number of ancestors expected for this generation level (Jomphe et al., 2000):
$$C_g = \frac{A_g}{2^g \, N}$$
where:
g = generation (the first generation is that of the subjects’ parents);
Ag = number of ancestors known for generation g (over all the genealogies);
N = number of genealogies.
The mean genealogical depth (D) is calculated by taking the sum of Cg over all generations:
$$D = \sum_{g=1}^{m} C_g$$
where: m = highest generation level.
This indicator can be interpreted as a measure of the average generation at which the genealogical paths are interrupted (Cazes and Cazes, 1996).
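As a minimal sketch of the two indicators defined above, the following Python fragment computes the completeness at each generation and the mean depth from counts of known ancestors. The function names and the counts are hypothetical and serve only to illustrate the arithmetic; they are not taken from the study's data or software.

```python
def completeness(ancestors_known, n_genealogies):
    """Return {g: C_g}, with C_g = A_g / (2**g * N).

    ancestors_known maps a generation level g (1 = the subjects' parents)
    to A_g, the number of ancestors identified at that level over all
    genealogies, repeated ancestors being counted each time they appear.
    """
    return {g: a_g / (2 ** g * n_genealogies)
            for g, a_g in ancestors_known.items()}


def mean_depth(completeness_by_generation):
    """Mean genealogical depth D: the sum of C_g over all generations."""
    return sum(completeness_by_generation.values())


# Hypothetical counts for N = 2 genealogies followed over 4 generations.
known = {1: 4, 2: 8, 3: 12, 4: 8}
c = completeness(known, n_genealogies=2)
d = mean_depth(c)
print(c)  # {1: 1.0, 2: 1.0, 3: 0.75, 4: 0.25}
print(d)  # 3.0
```

In this toy case the genealogies are complete for two generations and then thin out, so the paths are interrupted, on average, at the third generation.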
3 – Identifying the founders
The concept of founder refers here to the genealogical “end of the line”, that is, to the individuals whose parents are unknown. All the founders of each group of genealogies were identified and counted. The groups of founders were also compared in order to reveal similarities between regions (common founders). These founders served as the starting point for application of gene dropping to calculate genetic transmission probabilities.
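The definition above can be sketched in a few lines of Python. The representation of a genealogy as a dictionary mapping each individual to a (father, mother) pair, with None for an unknown parent, is an assumption made for this illustration, as are the identifiers; the sketch also assumes that a founder has both parents unknown.

```python
def find_founders(pedigree):
    """Return the individuals whose parents are both unknown.

    pedigree maps each individual to a (father, mother) tuple;
    None stands for a parent who could not be identified.
    """
    return {ind for ind, (father, mother) in pedigree.items()
            if father is None and mother is None}


# Hypothetical genealogy: subject S, a parent A with known parents,
# and three individuals at the "end of the line".
pedigree = {
    "F1": (None, None),
    "F2": (None, None),
    "F3": (None, None),
    "A": ("F1", "F2"),
    "S": ("A", "F3"),
}
founders = find_founders(pedigree)
print(sorted(founders))  # ['F1', 'F2', 'F3']
```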
4 – Transmission of founder genes: the gene dropping method
In the late 1980s, simulations of genotypes based on real ascending genealogies began to be performed regularly (Thomas, 1990). Hitherto, simulation was more often used to reproduce the structure of ancestries (Edwards, 1968). Gene dropping is used for various purposes, notably to estimate inbreeding coefficients and to predict the risk of losing genes passed on from founders (MacCluer et al., 1986). Heyer (1999) applied this method to a set of genealogies of individuals from the Saguenay region in order to test the hypothesis that a deleterious gene could be introduced by a single founder.
The gene dropping procedure begins with the attribution of two hypothetical unique alleles to each founder in the sample of genealogies. Figure 1 illustrates the process of attribution of the alleles in a simple ascending genealogy.
Attribution of two unique alleles (G1 to G10) to each founder (F1 to F5)
A random number generator is then used to determine which of each founder’s two alleles is transmitted to each child; the same draw is then made for the alleles of each child, and so on down the genealogy to the subjects. At the end of the first simulation, all the individuals in the genealogy have a genotype (Figure 2). The procedure is then repeated a large number of times; after each simulation, the number of subjects to whom each founder allele has been attributed is counted, and the results are used to calculate probability distributions for the occurrence of the founder alleles. In total, 50,000 simulations were carried out on all 756 genealogies. The simulations and analyses were written in the C++ programming language (Borland C++ and Visual Studio software).
Application of the gene dropping method: genotypes are attributed to descendants by Mendelian segregation of founder alleles
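The procedure just described can be sketched as follows. This is an illustrative Python version, not the authors' C++ program: the data layout, the function names, the toy genealogy and the number of simulations are all assumptions. It further assumes that every non-founder has two known parents, and it counts a subject as a carrier if he or she holds either of a founder's two alleles.

```python
import random


def gene_drop(pedigree, order, rng):
    """Run one simulation and return {individual: (paternal, maternal)}.

    pedigree maps each individual to (father, mother), or (None, None)
    for a founder; order lists individuals with parents before children.
    """
    genotype = {}
    for ind in order:
        father, mother = pedigree[ind]
        if father is None and mother is None:
            # Each founder receives two unique, labelled alleles.
            genotype[ind] = ((ind, 1), (ind, 2))
        else:
            # Mendelian segregation: one allele drawn at random per parent.
            genotype[ind] = (rng.choice(genotype[father]),
                             rng.choice(genotype[mother]))
    return genotype


def simulate_carrier_counts(pedigree, order, subjects, n_sims, seed=0):
    """For each founder, list the number of carrier subjects per simulation."""
    rng = random.Random(seed)
    founders = [i for i in order if pedigree[i] == (None, None)]
    counts = {f: [] for f in founders}
    for _ in range(n_sims):
        genotype = gene_drop(pedigree, order, rng)
        for f in founders:
            carriers = sum(1 for s in subjects
                           if any(allele[0] == f for allele in genotype[s]))
            counts[f].append(carriers)
    return counts


# Hypothetical genealogy: two subjects who are siblings, four founders.
pedigree = {
    "F1": (None, None), "F2": (None, None),
    "F3": (None, None), "F4": (None, None),
    "A": ("F1", "F2"), "B": ("F3", "F4"),
    "S1": ("A", "B"), "S2": ("A", "B"),
}
order = ["F1", "F2", "F3", "F4", "A", "B", "S1", "S2"]
counts = simulate_carrier_counts(pedigree, order, ["S1", "S2"], n_sims=1000)
```

In this toy genealogy each subject inherits, through A, an allele from either F1 or F2, so the carrier counts of those two founders always sum to the number of subjects. Real genealogies, with many overlapping lines of descent, do not have such a simple structure.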
The method and basic principles of gene dropping are relatively simple. Since each parent possesses two alleles at a given locus and since only one of these alleles will be passed on to a child, each allele has a one in two chance of being transmitted to each descendant, this probability being independent of what the other parent transmits. This is quite simply a simulation of the Mendelian transmission of a gene at a given locus. The method can be extended to the case of several loci, thus making it possible to measure the change in the linkage disequilibrium (a measure of the association between two loci) (Austerlitz and Heyer, 2000). Application of this method, however, rests on certain hypotheses. First, selection at the simulated locus is not taken into account, since we are seeking only to replicate the effect of chance due to the Mendelian transmission of genes. In reality, besides random variations, allele frequencies are subject to evolutionary forces such as migration, mutation, and natural selection, which produce a specific and non-random variation in allele frequency in populations (Hartl, 1994). With the method used for this study, the effects of migration are captured through the reconstructed genealogies. Mutations are not taken into account, but their effects are negligible at the historical scale of the population of Quebec (a dozen generations). Regarding the force of selection, Heyer (1999) has shown, by comparing a neutral recessive gene with a lethal gene, that in the case of recessive disorders the results of the simulations differ little.
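As a worked illustration of the one-in-two rule (the symbol k is introduced here and is not part of the original notation), consider a given founder allele passed down a single line of descent through k successive parent-to-child transmissions. Because the transmissions are independent, the probability that the allele reaches the end of that line is:

$$P_k = \left(\frac{1}{2}\right)^{k}, \qquad \text{so that, for } k = 12, \quad P_{12} = \frac{1}{4096} \approx 0.024\%.$$

Along a single path of a dozen generations, transmission is therefore very unlikely. A founder allele can reach an appreciable carrier frequency only if the founder is connected to the subjects by a large number of genealogical paths.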
On the whole, the simulations performed for this study take into account all the available and relevant information concerning the ancestral origins of the individuals in each of the populations studied. Although the quantity of information varies from one group to another, on average the completeness and depth of the genealogies used reach levels rarely observed in other populations, thereby increasing the accuracy of the results obtained from these simulations. Finally, since each of the groups constitutes a random sample of the regional populations selected for this study, different results might be obtained with different samples. Nevertheless, taking into account the structure and demographic history of the population of Quebec and its regions (relatively small pools of founders, rapid expansion primarily through natural increase), we believe that the results presented accurately reflect the populations studied. Furthermore, tests carried out on different genealogical samples have established that beyond a certain number of genealogies (between 60 and 100, depending on the region of origin of the subjects), the amount of information introduced has relatively little impact on the results.
5 – Carrier frequency and transmission probabilities
For a given founder, the carrier frequency of the founder allele is determined by the number of subjects, in each sample, to whom the simulated transmission of alleles attributed one of the two alleles of the founder in question:
$$\text{carrier frequency} = \frac{\text{number of subjects carrying one of the founder's two alleles}}{\text{number of subjects in the sample}}$$
This calculation is performed after each simulation, and the number of simulations where the carrier frequency reaches a given value permits estimation of the transmission probabilities for a given frequency interval. These probabilities are calculated assuming a normal distribution around the frequencies calculated in the sample (Heyer, 1999).
Two frequency intervals were analysed: all frequencies greater than or equal to 5%, and those between 1% and 5%. The first interval is used to identify the most important founders for each region, while the second interval corresponds to the carrier frequencies of several recessive genes in the population (Bouchard and De Braekeleer, 1992; Scriver, 2001).
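A simplified sketch of this estimation step is given below. It takes the share of simulations falling in each interval directly and leaves out the normal-distribution adjustment mentioned above, whose details are not given here. The function name, the sample size and the counts are hypothetical, and placing a frequency of exactly 5% in the upper interval is an assumption.

```python
def transmission_probabilities(carrier_counts, n_subjects):
    """Estimate the two transmission probabilities for one founder.

    carrier_counts holds, for each simulation, the number of subjects
    carrying one of the founder's alleles. Returns a pair
    (P[frequency >= 5%], P[1% <= frequency < 5%]).
    """
    n_sims = len(carrier_counts)
    # Integer comparisons avoid floating-point trouble at the thresholds:
    # c / n >= 0.05 is equivalent to 20 * c >= n, and so on.
    high = sum(1 for c in carrier_counts if 20 * c >= n_subjects)
    mid = sum(1 for c in carrier_counts
              if 100 * c >= n_subjects and 20 * c < n_subjects)
    return high / n_sims, mid / n_sims


# Hypothetical results of 10 simulations for a sample of 100 subjects.
sim_counts = [0, 0, 1, 3, 4, 5, 7, 0, 2, 6]
p_high, p_mid = transmission_probabilities(sim_counts, n_subjects=100)
print(p_high, p_mid)  # 0.3 0.4
```

With the 50,000 simulations used in the study, the smallest non-zero share that such a direct count can resolve is 1 in 50,000, or 0.002%.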
II – Results
1 – Genealogical completeness and depth
Table 3 shows the completeness of the genealogies by generation as well as the average genealogical depth. The genealogies are complete (Cg = 100%) up to the second generation for the regions of Charlevoix, Rimouski and Saguenay; up to the third generation for Beauce; and up to the fourth generation for Terrebonne. The completeness measures stay very high (above 90%) for all regions up to the seventh generation. From the ninth generation onward, the values drop more rapidly (especially for Rimouski). More than two thirds of the ancestors of the population of the Charlevoix and Saguenay regions were nevertheless identified up to the eleventh generation. Completeness falls practically to zero beyond the fourteenth generation (start of the settlement of Quebec). For the recent generations of forebears, most marriages took place in the same region as that of the subjects. For older generations, the zone of the places of marriage of the forebears expands to other regions of Quebec, and then outside the province.
Completeness of genealogies at each generation (%) and average depth of genealogies, by region
The average genealogical depths are all greater than 10, except for the Rimouski region. The slightly lower depth obtained for Rimouski (9.51) is attributable to the fact that the marriage dates of the subjects in this region are slightly earlier, on average, than in the other four regions (see Table 2).
2 – Distribution of the founders
The total number of founders identified varies appreciably by region (Table 4). Thus, 2,624 founders were traced for the population of the Beauce region, as opposed to 6,468 for that of Terrebonne. These results are directly related to the sample size for each region (the Beauce sample being the smallest), but they also depend on other factors, notably the genealogical depth and the structure of the genealogies (frequency of appearance of each ancestor). Thus, on average, the Beauce genealogies yield a larger number of founders than those of the Charlevoix, Rimouski or Saguenay regions (between 19 and 28). The Terrebonne group has the most diversified origins, with an average of 43.1 distinct founders per genealogy.
Number of distinct founders identified in each region
The five groups of founders are not completely separate. Several founders are in fact observed in more than one group (or even in all the groups). Table 5 presents, for each group of founders, the proportion of founders shared with each of the other groups. Here again, the difference in sample sizes must be taken into account when comparing results. Still, it can be noted that a large proportion of the founders in each region (at least 35% in each case) are also found in other regions. More than 80% of the founders of the Beauce, Charlevoix, Rimouski and Saguenay regions appear among the founders of the Terrebonne region. Similarly, 91% of the founders of the Beauce and Charlevoix populations also appear in the Saguenay genealogies. Considering the number of subjects involved, this similarity is rather remarkable.
Proportion of population founders of each region shared with each of the other regions (%)
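The comparison summarized in this table amounts to a set intersection for each ordered pair of regions. The Python sketch below illustrates the calculation; the function name, the region labels and the founder identifiers are hypothetical and do not reproduce the study's data.

```python
def shared_proportions(founders_by_region):
    """Return {(a, b): percentage of region a's founders also found in b}."""
    shared = {}
    for a, founders_a in founders_by_region.items():
        for b, founders_b in founders_by_region.items():
            if a != b:
                shared[(a, b)] = 100 * len(founders_a & founders_b) / len(founders_a)
    return shared


# Hypothetical founder sets: region Y has twice as many founders as region X.
founders_by_region = {
    "X": {1, 2, 3, 4},
    "Y": {3, 4, 5, 6, 7, 8, 9, 10},
}
shared = shared_proportions(founders_by_region)
print(shared[("X", "Y")])  # 50.0
print(shared[("Y", "X")])  # 25.0
```

The measure is not symmetric: the same two common founders represent 50% of the smaller group but only 25% of the larger one, which is why differences in sample size have to be kept in mind when reading such proportions.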
3 – Probabilities of reaching a carrier frequency greater than or equal to 5%
Table 6 presents, for each region, the 20 founders with the highest probabilities of having transmitted their genes to 5% or more of the population, meaning that at least one person in 20 would be a carrier of one of the founder’s two alleles.
The 20 highest probabilities of reaching a carrier frequency greater than or equal to 5%, by region
The highest probabilities are found in the Charlevoix region. Four founders of Charlevoix (numbers 38, 239, 316, and 323) have probabilities of around 22%. In the Saguenay region, the probabilities for these same four founders are close to 10%. In none of the other three regions do the probabilities exceed 1%. The Charlevoix and Saguenay regions are clearly differentiated from the others: of their 20 most important founders, 19 are the same in both regions. The probabilities calculated for the Charlevoix region are nevertheless consistently higher than those calculated for the same founders in the Saguenay region. These results clearly illustrate the gene flow that occurred between these two populations at the start of the colonization of the Saguenay region through the settling there of immigrants from Charlevoix (Pouyez and Lavoie et al., 1983; Gauvreau and Bourque, 1988). It is estimated that between 60% and 70% of the genetic endowment of the Saguenay population derives from that of Charlevoix (Lambert, 2002).
In the Rimouski region, two founders (numbers 118 and 70) stand out from the others, though the probabilities that their genes were transmitted to at least 5% of the population only reach 0.03% and 0.04%. In the Beauce region, the highest probability for a founder of having this carrier frequency is 0.43%. The probabilities for the other founders of this region decrease gradually, and none seems to stand out from the others. The lowest probabilities of reaching a carrier frequency of at least 5% are observed in the Terrebonne region: all are below 0.01%.
These results show that there is very little chance that founders’ genes could have been transmitted to at least 5% of the population in each of the five regions. With the exception of those common to Charlevoix and Saguenay exclusively, only six founders are among the top twenty in more than one region (shaded areas in Table 6). Founders 295 and 326 appear in the top twenty in four of the five regions (Terrebonne being the exception), founders 196 and 297 are among the top twenty in the Beauce, Charlevoix and Saguenay regions, while founders 70 and 118 are found in Beauce and Rimouski. It should be noted that Terrebonne is the only region whose top twenty founders include none of the top twenty founders of another region.
4 – Probabilities of reaching a carrier frequency between 1% and 5%
The twenty founders with the highest probabilities of having transmitted their genes to between 1% and 5% of the population (Table 7) are broadly the same as those observed previously. The order in which they appear, however, is not always the same as in Table 6, especially in the Charlevoix and Saguenay regions.
The 20 highest probabilities of reaching a carrier frequency between 1% and 5%, by region
Furthermore, at this level of frequency, the transmission probabilities are much higher than in the previous case. The founders in the Charlevoix and Saguenay regions are again distinguished by the highest probabilities, but the differences compared with the other regions are much smaller. The highest transmission probabilities are almost 51% for the most important founders of Charlevoix (numbers 217 and 7), 50% for those of Saguenay (numbers 323, 316, 38, and 239), 45% for Rimouski (number 118), 44% for Beauce (number 2301) and 36% for Terrebonne (numbers 1161 and 1155). Thus, if one or other of these founders had been a carrier of a deleterious gene, the chances of this gene being present in the population in question, at a frequency between 1% and 5%, vary from 36% to 51%.
Once again, of the 20 principal founders of the Saguenay and Charlevoix populations, 19 are common to both regions. For some of these founders, however, the probabilities observed in Saguenay exceed those obtained in Charlevoix, which was not the case for carrier frequencies greater than or equal to 5%. The founders of the Terrebonne region are again characterized by the lowest probabilities: only five of them have more than a 20% probability of having reached a carrier frequency between 1% and 5%, while no probability is below 23% for the principal founders of the other regions.
Here too, excluding the founders common to the Charlevoix and Saguenay regions exclusively, very few of the founders are among the top 20 in more than one region (shaded areas in Table 7). The six founders mentioned earlier are again present, plus two others. Founders 295 and 326 are in the top 20 in all five regions and founders 196 and 297 are among the top 20 in Beauce, Charlevoix and Saguenay. Founders 70 and 118 still appear in Beauce and Rimouski, and founder number 70 also appears in Terrebonne. One of the other two founders is common to Terrebonne and Rimouski (number 110), and the other is common to Charlevoix, Rimouski and Saguenay (number 71).
Compared with the frequencies greater than or equal to 5%, the number of founders with a non-zero probability of having a carrier frequency between 1% and 5% is much higher, although for the majority of them (at least 67% in each case), this probability is less than or equal to 0.01% (Figure 3). Around 20% of the founders of the Beauce region (less than 10% in the other regions) have more than a 1% probability of having reached this frequency level. The Beauce is also characterized by the highest proportion of founders (6%) with probabilities over 10%. Finally, the distribution of the founders of Charlevoix by the probability of having transmitted their genes is closer to that of the founders of Rimouski than to that of the founders of Saguenay.
Distribution of founders of each region by their probability of reaching a carrier frequency between 1% and 5%
Conclusion
The results of these simulations have uncovered important differences between the transmission probabilities of founder genes in the five regions studied. Overall, very few founders have relatively high probabilities of having transmitted their genes to at least 5% of the population. At this level of carrier frequency, the highest probabilities are observed in the Charlevoix region (four founders have a probability above 20%). For carrier frequencies between 1% and 5%, the transmission probabilities are much higher in all five regions. Charlevoix and Saguenay are differentiated from the other three regions, however, with probabilities higher than 34% for their 20 principal founders. In every case, probabilities are lowest in the Terrebonne region; as noted above, this region has the highest average number of founders per genealogy, implying greater genetic heterogeneity.
The comparison of the founders by region showed that Charlevoix and Saguenay share many founders. For carrier frequencies between 1% and 5%, the most important founders have similar transmission probabilities in both regions. The demonstrated presence of rare diseases or of disorders specific to the populations of these two regions, such as the spastic ataxia of Charlevoix-Saguenay and sensorimotor polyneuropathy with or without agenesis of the corpus callosum (see Table 1), is in all likelihood a consequence of the genetic contribution of these founders common to Saguenay and Charlevoix. For the other regions, we find very few common founders among the top twenty. Only two of these founders are common to all five regions for carrier frequencies between 1% and 5%. Their probabilities of having transmitted their genes at that frequency nevertheless vary by region, with high values for those of Charlevoix, Saguenay and Beauce (from 42% to 47%), intermediate values for Rimouski (25%) and lower values for Terrebonne (8% to 9%). Thus, although the five populations studied share, in total, a large number of founders, their genetic contribution varies significantly from one region to another. Viewed as a whole, the results suggest that differences become more pronounced with greater geographical distance between regions, but other regional populations need to be studied to improve our understanding of this phenomenon. These results nevertheless bear out the possibility that certain deleterious genes could have been transmitted in several regions of Quebec.
Because the great majority of founders of each region have less than one chance in 100 of having transmitted their genes to between 1% and 5% of the population, it is likely that the presence of deleterious genes at such frequencies in these regions is attributable to only a few of the principal founders. If these simulations had been carried out for genes for recessive diseases, the results would not have been very different from those for neutral genes, since recessive genes hardly ever reach a high enough frequency for the process of selection to have a significant effect (Heyer, 1999). These results do not exclude the possibility that other founders might have high probabilities of transmitting their genes at frequencies lower than 1%, although in this case recessive diseases become much rarer. Finally, regional comparisons are a promising approach for identifying the founder or founders who introduced a given genetic disorder into the population of Quebec. Previous studies have shown that it was impossible to determine, in the groups of founders common to various groups of individuals with hereditary diseases in the Saguenay region, which had introduced one or other of these disorders (Heyer and Tremblay, 1995; Heyer et al., 1997). On the other hand, when the affected individuals are distributed over a wider geographic scale, it is possible to identify in certain cases the founders who are likely to have introduced the disease (Gagnon et al., 2001; Vézina et al., 2001). A more in-depth study including a larger number of regions and more easily comparable genealogical samples (as regards sample size and the marriage dates of the subjects) would make it possible to further specify the similarities and differences in the regional structure of the gene pool of Quebec.
Note: This study received financial support from the Social Sciences and Humanities Research Council of Canada, the Fonds pour la formation des chercheurs et l’aide à la recherche du Québec, and the Asselin Foundation (Saguenay).