1Nineteenth-century Germany was very socioeconomically and culturally diverse. Was the country’s major east-west divide reflected in the spatial distribution of household structures? Mikołaj Szołtysek, Siegfried Gruber, Sebastian Klüsener and Joshua Goldstein draw on the census of 1885 to answer this question, using data published for the 83 regions and smaller states to build an indicator of household complexity based on the mean number of marital units per household. Taking advantage of the homogeneity of their data at national level, they reveal spatial disparities marked by a north-south rather than an east-west pattern, with greater household complexity in the north of the country. This disparity can be associated in structural terms with economic and socioeconomic characteristics of the regions, notably with spatial variation in agricultural employment, fertility and age structure.
2Post-World War II Germany – and, more recently, post-unification Germany – has provided scholars with a unique opportunity to study demographic differentials in a single linguistic area in which the populations living in the eastern and the western parts of the country were exposed to two very distinct political systems over a period of 45 years. The differences in the ideological principles that guided the policies implemented in the two parts of Germany unquestionably altered the demographic development of their respective populations (Kreyenfeld, 2004). These disparities were later confirmed by the unification process, which brought to the surface pervasive differences in individual demographic behaviour, particularly regarding marriage and fertility (Conrad et al., 1996; Kreyenfeld, 2003).
3Pre-1945 Germany offers researchers an even more interesting setting. For centuries, Germany was governed by a weak central power structure, as most of the political power was concentrated in the hands of the rulers of the dozens of German states that constituted the Holy Roman Empire, which was succeeded by the German Confederation and later the German Empire. As a result, the German Empire exhibited a high degree of spatial variation in demographic behaviour and socioeconomic characteristics. As historical Germany was situated in the centre of Europe, it had within its borders many of the cultural, economic, and religious variations found across the European continent. Thus, it represents an ideal laboratory for studying interregional demographic differences, as has been shown in studies on fertility, family formation, mortality, and migration (Knodel, 1974; Vögele, 1998; Hochstadt, 1999; Lee, 2001). However, the co-residence patterns in historical Germany have thus far been largely unexplored (Janas, 2005; Rosenbaum, 1996; Hennings, 1995; Weber-Kellermann, 1982; Lee, 1981). An exception is a study by Kemper (1983), which looked at the spatial variation of household complexity in 1933. But by that period, the pattern was already heavily influenced by industrialization and modernization processes.
4This article aims to fill this significant gap in demographic studies of historical family structures in Europe by analysing spatial variation in household structures in nineteenth-century Germany. We use aggregate data from published statistics of the census of 1885 to document the spatial variability of co-residence patterns based on a measure of household complexity (marital units per household, or MUH). Four main research questions will be addressed:
- Are the household complexity patterns in the German Empire consistent with the hypothesized European east-west distinction proposed by Hajnal and others (Hajnal, 1965, 1982)? In this case we are particularly interested in cultural interpretations of the Hajnal line that have emphasized differences between Slavic and non-Slavic populations (Macfarlane, 1981).
- Are the historical German east-west differences in land ownership and agricultural organization, also referred to as the East Elbian socioeconomic divide, an important organizing principle of household and family structure, as some scholars believe (Alderson and Sanderson, 1991)?
- To what extent are spatial household structure patterns consistent with the arguments advocated by Ruggles (2009, 2010), who stated that most of the spatial variation can be explained by agricultural employment levels and demographic characteristics?
- Are patterns of household complexity within Germany congruent with the spatial distribution of inheritance practices (Berkner, 1976; Robisheaux, 1998)?
I – What is so special about Germany?
5According to Hajnal, the entire German Empire was dominated by the classic “(Western) European marriage pattern” (Hajnal, 1965, 1982), in which family formation was contingent upon an individual’s ability to establish an adequate, independent livelihood. Local studies have confirmed this view (Imhof, 1976). Other researchers have tentatively argued that the German household and family pattern represented an intermediate category between the extremes of the “Western” (nuclear or stem) and “Eastern” (joint) family types (Laslett, 1983; Robisheaux, 1998; Rothenbacher, 2002). This intermediate form was characterized by a high age at marriage, high proportions of stem-family households, and high proportions of households with life-cycle servants, as well as by generally low proportions of co-resident kin and of other types of complex residential arrangements (Laslett, 1983). Laterally extended households were said to be non-existent in Germany’s distant past, as well as in more recent years (Rothenbacher, 2002). Local studies have indicated, however, that such a homogeneous picture of family patterns in Germany may be misleading (Berkner, 1976; Schlumbohm, 1994).
6German ethnologists and demographers of the nineteenth and early twentieth centuries have generally asserted that the “typical” German family type has always been the paternalistically administrated two-generation small family with co-residing servants. Researchers have also argued that there was a fundamental contrast between German and Slavic patterns of family composition and household formation (Schlumbohm, 2009). This “familial divide” was believed to have still existed at the end of the nineteenth century, and to have determined the diverging demographic trajectories of the Germanic and Slavic populations during the demographic changes associated with the first demographic transition (Knodel, 1974; Conze, 1966).
7In a similar vein, Macfarlane observed that Hajnal’s division of Europe seemed to follow the Slav/non-Slav division (Macfarlane, 1981), and suggested that the family and household patterns uncovered in Europe by historical demographers were coterminous with broad “cultural regions”. Laslett and his associates from the Cambridge Group also argued for the presence of a strong “cultural element in the shaping of the domestic group organization” on the continent, and asserted that the pattern of household composition across Europe cannot be interpreted in purely economic terms (Laslett, 1983). The latter view has recently been reiterated in sociological research (Therborn, 2004).
8Scholars have cited a range of institutional, economic, and/or environmental factors in seeking to explain the variation in household structures. Alderson and Sanderson (1991) suggested that the key element in the formation of co-residence groups in historic East Central Europe (east of the river Elbe) was the pattern of land ownership and agricultural organization, which was dominated by the agrarian estate system of manorialism. Ruggles (2009) also stressed the role of economic factors, arguing that the differences in the family systems in both historical north-west Europe and North America, as well as in contemporary developed and developing countries, can be explained by agricultural employment levels and demographic characteristics (fertility and mortality in particular), with no recourse to geographical or cultural hypotheses. In a later article, however, Ruggles (2010) stated that this explanation applied to stem families only, and not to joint families. Elsewhere scholars have argued that household composition strategies were also determined by inheritance practices (Rudolph, 1995). With direct reference to Germany, the role of inheritance rules in determining residential patterns was demonstrated by Berkner, who found differences in peasant household structures in two micro regions of Germany that resulted from different patterns of property transfer (Berkner, 1976).
9A systematic analysis of the regional distribution of household patterns appears to be particularly worthwhile in the German context. The area of the German Empire represents a missing link in the existing spatial models of the European family, following recent comprehensive investigations of historical Iberian, French, and even Eastern European patterns (Le Bras and Todd, 1981; Rowland, 2002; Szołtysek, 2008). A European geography of family forms is not complete without a spatial reconstruction of household composition within Germany. Inter-regional comparisons of co-residence patterns from the published statistics in Germany provide an excellent background against which more detailed studies of family composition in the nineteenth-century German Empire might be carried out in the future. Studying patterns of co-residence in the German context might also contribute substantially to the formulation of further theories regarding the underlying factors of differentials in household composition.
II – Data and Methodology
10In this study we make use of published aggregate-level statistics from the German censuses (see the references for the sources used). Although census micro-data have survived for individual locations, and even for several regions of eighteenth- and nineteenth-century Germany, the options for using these data to construct a “nationally representative sample” are still very limited (Gehrmann, 2009). An aggregate-level approach therefore seems indispensable when attempting to determine the spatial variation in household complexity in historical Germany.
11There are a number of reasons why the 1885 census is used as the basis for this study, in which we analyse Germany at the level of 83 regions (Regierungsbezirke and smaller states). First, the information contained in the census allows us to conduct a systematic analysis of household complexity patterns for all of Germany just before the onset of the fertility decline. While the later census of 1910 provided more informative statistics on households than the census of 1885, it did so for the largest administrative units only (states and provinces of bigger states). In addition, the 1885 census provided the data needed for the MUH measure for 892 districts, making it possible to explore spatial patterns at a high level of spatial detail.
12The German census data of 1885 are in general of good quality, as the German Empire had already established high standards for census-taking by that period (see Gehrmann, 2009, on the evolution until 1871; and Lee and Schneider, 2005, on developments between 1872 and 1939). Since the foundation of the German Empire in 1871, the census results for the whole of Germany had been published by the German Imperial Statistical Office. However, census implementation remained in the hands of the statistical offices of the federated German states, which then communicated their results to the German Imperial Statistical Office. This decentralized arrangement had some implications for the comparability of the data across the federated states, as the German states had different traditions in census-taking and were sometimes reluctant to give up longstanding statistical definitions which deviated from standards used in other states of the German Empire (Lee and Schneider, 2005). While by 1885 an agreement had been reached on how to define a household (Rothenbacher, 2002), differences remained in other areas, such as on the question of whether to count the “de facto” or the “de jure” population. Most German states counted the de facto population, but Saxony and the Hanseatic towns of Hamburg, Lübeck, and Bremen reported the de jure population. A unified standard on this issue was not achieved until after World War I (Lee and Schneider, 2005).
13Another matter with potential implications for spatial variation in data quality was that the financial resources of the statistical offices varied across the German states. While Prussia spent 170,000 Reichsmark (RM) on its statistical office (5.77 RM per 1,000 inhabitants) in 1889, some of the smaller states spent substantially less. For example, the joint statistical office of six Thuringian states had a budget of just 768 RM (0.93 RM per 1,000 inhabitants) (Lee and Schneider, 2005; own calculations). The state of Mecklenburg-Strelitz did not even have a separate statistical office in its administration. The smaller states were also unable to invest in cost-intensive technologies such as counting machines that partly automated the counting process (Lee and Schneider, 2005). As a result, some of the spatial variation visible in our data might stem from regional differences in census-taking standards and the resources invested in collecting and checking the census data, as these differences have potential implications for the quality of the data.
Measure of household complexity
14Using the tabulated returns of the 1885 census for household structure analysis does, of course, have some limitations. Because the census did not specify kin relationships within co-residence groups, the available data cannot be used to provide a more detailed breakdown of living arrangements. However, household complexity can be measured from routine aggregate census data on the number of households, and on the population classified by age, sex, and marital status, by using the indices commonly applied in family demography (Burch, 1980; Burch et al., 1987; Parish and Schwartz, 1972). The number of marital units per household (MUH) is obtained by dividing the sum of the absolute numbers of married, widowed, and divorced males, as well as of widowed and divorced females, by the total number of households in a given region (Parish and Schwartz, 1972). In this article, we provide this measure per 100 households. In an ideal population that follows neo-local household formation rules and practices universal marriage, no married individual would co-reside with anyone except his or her spouse and unmarried offspring, and all widowed and divorced persons would live alone. In such a society, the index MUH per 100 households would be expected to equal 100. Figures above 100 indicate either the co-residence of married couples or the co-residence of a married couple with widowed or divorced individuals.
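The definition above translates directly into a short calculation. The following is a minimal sketch in Python; the function name, argument names, and the example counts are hypothetical and are not taken from the census tables.

```python
def marital_units_per_100_households(married_males, widowed_males, divorced_males,
                                     widowed_females, divorced_females, households):
    """MUH per 100 households, following the definition given in the text.

    Married women are not counted separately: each married couple is
    represented once, through the husband.
    """
    if households <= 0:
        raise ValueError("households must be positive")
    marital_units = (married_males + widowed_males + divorced_males
                     + widowed_females + divorced_females)
    return 100.0 * marital_units / households


# Hypothetical region: 950 marital units living in 880 households.
example_muh = marital_units_per_100_households(
    married_males=800, widowed_males=40, divorced_males=5,
    widowed_females=100, divorced_females=5, households=880)
print(round(example_muh, 1))
```

In the idealized neo-local case described above, every marital unit heads its own household, so the number of marital units equals the number of households and the function returns 100.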
15Since most servants were unmarried, the presence of servants in a household does not increase the number of marital units, and therefore does not interfere with the MUH index. Meanwhile, married, widowed, or divorced individuals who co-resided in the household, but who were not related to the head’s family (and were not counted as a separate economic unit), increase this figure, even if there is no direct indication that more kin were co-resident in the household. On the other hand, the measure is insensitive to unmarried co-resident relatives, even though their presence in the household served to extend the family beyond the conjugal core.
16It should be noted that a measure such as marital units per household is a very crude indicator of household behaviour and household composition strategies. MUH represents household complexity only in the broadest possible sense; i.e. it suggests the extent to which adults of all types tended to co-reside, rather than to live independently in their own households (Burch, 1980). Relying upon MUH is much more problematic if the goal is to gain greater insight into the nature and character of the actual co-residence, and to arrive at a more elaborate classification of household types or living arrangements (Hammel and Laslett, 1974).
17For nineteenth-century France, Parish and Schwartz (1972) proposed two inflection points between the nuclear and stem family systems, and between the stem and joint family systems, of 106 and 123, respectively (the “full” joint family system value of MUH would equal 144). The German median MUH value of 108 is very close to the first of the two inflection points, which might suggest a prevalence of stem family composition in 50% of the German regions studied. However, without more detailed household-level statistics, it is difficult to determine whether the household extensions observed in Germany result primarily from the stem family life cycle and its related pattern of headship transmission, or whether they represent the reincorporation of extended kin at some point in the development of the household (Berkner, 1976; Ehmer, 2009).
18In order to justify the use of MUH as a measure of household complexity, Parish and Schwartz (1972) compared MUH with information on secondary families, as well as on ascendants, from the 1962 census in France. In both cases, they obtained a correlation coefficient of 0.92. But as these findings are not directly applicable to the German context, we also performed a check for Germany. Unfortunately, there are no micro-data available for the whole of Germany in the nineteenth century. But we were able to obtain a sample of ten spatially dispersed rural locations from the 1846 census of the German Customs Union (MPIDR, 2012), making it possible to compare, location by location, the MUH levels with the prevalence of household types based on the Hammel-Laslett classification (Hammel and Laslett, 1974). This comparison yields a correlation of 0.78 (p = 0.008) between MUH and the proportion of complex family households (extended and multiple family households combined) for these ten German rural regions (Table 1). If we exclude one outlier (Saxony-Coburg with a low MUH and a high proportion of complex households), the correlation increases to 0.96 (p < 0.001). These results are based on a small number of regions, but they at least give no reason to doubt our assumption that the MUH measure can be used to investigate household complexity in Germany with aggregate data, if individual-level data cannot be obtained.
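The validation step described above (a correlation across locations, recomputed after dropping one outlier) can be sketched as follows. The numbers are invented for illustration and are not the values of Table 1; the function names are likewise hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def pearson_without(x, y, drop_index):
    """Correlation after excluding one observation (e.g. a known outlier)."""
    keep = [i for i in range(len(x)) if i != drop_index]
    return pearson([x[i] for i in keep], [y[i] for i in keep])


# Invented values: MUH per 100 households and % complex households.
# The last location combines a low MUH with a high share of complex households.
muh = [100, 104, 108, 112, 116, 98]
complex_share = [5, 9, 13, 17, 21, 25]

r_all = pearson(muh, complex_share)
r_trimmed = pearson_without(muh, complex_share, drop_index=5)
print(round(r_all, 2), round(r_trimmed, 2))
```

With only a handful of locations, a single observation can move the coefficient substantially, which is why the text reports the correlation both with and without the outlier.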
19Yet another significant limitation of the MUH measure is that it does not account for horizontal extension of domestic groups, which results from the presence of unmarried siblings, aunts, uncles, nephews and so on, of the head or head’s spouse. It has been pointed out that unmarried co-resident kin were responsible for a substantial proportion of extended living arrangements in most populations for which data are available (Anderson, 1971; Ruggles, 1987), and there is no reason to believe that nineteenth-century Germany was different in this respect.
Table 1. Comparison of marital units per household (MUH) and percentages of household types by region, Germany, 1846
Note: Household types according to Hammel-Laslett classification.
20In the descriptive part of our analysis, we use maps to provide a general overview of the spatial variation in household structures. In addition, we are able to explore the question of whether the macro-regional data used in the regression models accurately reflect the spatial variation that existed in Germany in 1885, or whether these regional values conceal substantial spatial heterogeneity at the more finely-gridded district level. To answer this question, we apply hierarchical measures of inequality based on the Theil index (Theil, 1965), which allow us to decompose the overall variation in MUH values observed at the district level into the variation observed between and within the larger regions (Regierungsbezirke and smaller German states). The between-region variation relates to dissimilarities between the means of the district values derived for each region, while the within-region differences comprise the variation observed between the districts of each region. Formally, the hierarchically decomposed Theil index is defined as follows:
26with y_i denoting the MUH value in district i, n standing for the total number of districts, and n_r denoting the number of districts in each region r. The index can range from zero (no inequality/differences between districts) to log(n) (total inequality). Equation 1 can be rewritten as follows:
28where T_B represents the between-region component of inequality, and T_W denotes the within-region component.
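One common way of writing the Theil index that is consistent with the symbols and the stated range (zero to log(n)) uses district shares s_i = y_i / Σ_j y_j, so that T = Σ_i s_i log(n s_i). With S_r denoting the summed shares of the n_r districts in region r, the between-region part is T_B = Σ_r S_r log(S_r / (n_r / n)) and the within-region part is T_W = Σ_r S_r Σ_{i in r} (s_i / S_r) log(n_r s_i / S_r). This notation is an assumption and may differ from the article's own equations. A minimal Python sketch under that assumption, with hypothetical function and variable names:

```python
import math
from collections import defaultdict


def theil_decomposition(values, regions):
    """Return (T, T_B, T_W) for positive district values grouped by region.

    values  -- one positive number per district (e.g. MUH)
    regions -- region label of each district, same order as values
    """
    if not values or len(values) != len(regions):
        raise ValueError("values and regions must be non-empty and equally long")
    if any(v <= 0 for v in values):
        raise ValueError("all values must be positive")

    n = len(values)
    total = sum(values)
    shares = [v / total for v in values]

    # Overall index: zero when all districts are equal, at most log(n).
    t_total = sum(s * math.log(n * s) for s in shares)

    by_region = defaultdict(list)
    for share, region in zip(shares, regions):
        by_region[region].append(share)

    t_between = 0.0
    t_within = 0.0
    for members in by_region.values():
        n_r = len(members)
        s_r = sum(members)
        # Region's share of the total relative to its share of districts.
        t_between += s_r * math.log(s_r / (n_r / n))
        # Inequality among the districts of this region, weighted by s_r.
        t_within += s_r * sum((s / s_r) * math.log(n_r * s / s_r) for s in members)
    return t_total, t_between, t_within


# Invented example: two regions with two districts each.
t, t_b, t_w = theil_decomposition([100, 104, 118, 122], ["A", "A", "B", "B"])
print(t_b / t)  # share of total inequality found between regions
```

The ratio T_B / T indicates how much of the district-level variation is already captured by the regional means; a value close to one would suggest that the regional data conceal little within-region heterogeneity.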
29In the second part of the analysis, we specify regression models based on the Regierungsbezirk-level dataset. Due to limitations in the available variables and the usage of aggregate regional data, the regression models should not be interpreted as attempts to establish causality. In total, three models are calculated. The first includes only two geographical variables denoting the longitude and latitude of the regional centroid to investigate to what degree spatial trends are visible in the dependent variable. The second model includes four variables related to the four central research questions: the percentage of the population who were Slavic (in reference to the cultural interpretations of the Hajnal line), the ratio of day labourers to farmers (in reference to the East Elbian agricultural divide), the percentage of the population working in agriculture (in reference to Ruggles’ agricultural employment hypothesis), and a regional dummy variable for partible inheritance. The third, full model also includes five demographic and cultural covariates: the male singulate mean age at marriage, an indirect control for fertility differences (children under age five per 1,000 females aged 15-49 years), the percentage of widowed and divorced people, the percentage of the population aged 65 and above, and the percentage of the population who were Catholic. Models 2 and 3 will then be compared to see whether the introduction of important demographic and cultural covariates influences the estimates obtained for the four variables related to our main research questions. In all three model specifications population weights are applied to control for variation in the population size of the regions. In order to examine to what degree the models are able to explain the variation in the dependent variable, we also use plots to compare the observed MUH values with the predicted values.
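A population-weighted linear model of this kind can be sketched with weighted least squares. The text does not state exactly how the weights enter the estimation, so the approach below (minimising the population-weighted sum of squared residuals) is an assumption, and the data are invented. The sketch relies on NumPy.

```python
import numpy as np


def weighted_ols(covariates, outcome, weights):
    """Weighted least squares with an intercept.

    Minimises sum_i w_i * (y_i - b0 - x_i . b)^2 and returns
    (coefficients, fitted values); coefficients[0] is the intercept.
    """
    X = np.asarray(covariates, dtype=float)
    y = np.asarray(outcome, dtype=float)
    w = np.asarray(weights, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    if np.any(w <= 0):
        raise ValueError("weights must be positive")
    design = np.column_stack([np.ones(len(y)), X])
    # Scaling each row by sqrt(w_i) turns the weighted problem into plain OLS.
    root_w = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(design * root_w[:, None], y * root_w, rcond=None)
    return coef, design @ coef


# Invented toy data in the spirit of Model 1: longitude and latitude of
# regional centroids, MUH per 100 households, and population (thousands).
coords = [[6.9, 50.9], [13.4, 52.5], [11.6, 48.1], [20.5, 54.7], [9.2, 48.8]]
muh = [104.0, 110.0, 103.0, 118.0, 101.0]
population = [750.0, 1300.0, 1000.0, 1150.0, 620.0]

coef, fitted = weighted_ols(coords, muh, population)
residuals = np.asarray(muh) - fitted  # observed minus predicted MUH
```

The residuals returned here are the quantities that would be plotted against the observed values and later tested for remaining spatial structure.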
30As spatial data are used in these models, the model estimates may potentially be distorted by spatial autocorrelation problems (Anselin, 1988). One of the underlying assumptions of an OLS regression model is that the sample consists of independently drawn observations. This assumption is often violated in spatial analyses of regional data, as adjacent spatial units are likely to share many similarities. Nevertheless, standard regression models treat these adjacent observations as independent, which could lead to biases in coefficient estimates and derived significance levels.
31In order to test for spatial autocorrelation, we compute Moran’s I for the dependent variable in our dataset of 83 regions, which yields an index of 0.32 (p < 0.001). This indicates that positive spatial autocorrelation might cause problems in our models. Spatial autocorrelation is not a problem for the modelling process as long as similar spatial autocorrelation patterns are also present in the covariates. Thus, in order to determine whether the models are able to account for the spatial autocorrelation pattern present in the dependent variable, we perform Moran’s I tests on the model residuals. If these tests report insignificant results, we can be reasonably confident that our model outcomes are not biased by spatial autocorrelation. To address the question of whether our models are able to explain the spatial pattern of household complexity, we check whether any spatial trends are still visible in the residuals by regressing them on our latitude and longitude variables.
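Moran’s I can be computed from a vector of regional values and a spatial weights matrix. The sketch below uses the standard formula I = (n / S_0) · Σ_i Σ_j w_ij z_i z_j / Σ_i z_i², where z_i are deviations from the mean and S_0 is the sum of all weights. How neighbours were defined for the 83 regions is not stated in this text, so the simple contiguity weights in the example are an assumption, as are the function names.

```python
def morans_i(values, weights):
    """Moran's I for values observed on n spatial units.

    weights -- n x n nested list; weights[i][j] > 0 if unit j is treated
               as a neighbour of unit i, with zeros on the diagonal.
    """
    n = len(values)
    if n < 2 or len(weights) != n or any(len(row) != n for row in weights):
        raise ValueError("weights must be an n x n matrix matching values")
    mean = sum(values) / n
    dev = [v - mean for v in values]
    total_weight = sum(sum(row) for row in weights)
    sum_sq = sum(d * d for d in dev)
    if total_weight == 0 or sum_sq == 0:
        raise ValueError("need non-zero weights and non-constant values")
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    return (n / total_weight) * (cross / sum_sq)


def chain_contiguity(n):
    """Binary weights for n units arranged in a line (illustration only)."""
    return [[1 if abs(i - j) == 1 else 0 for j in range(n)] for i in range(n)]


w = chain_contiguity(4)
clustered = morans_i([1, 1, 5, 5], w)    # similar neighbours: positive I
alternating = morans_i([1, 5, 1, 5], w)  # dissimilar neighbours: negative I
```

Applying the same function to regression residuals gives the diagnostic described above. Assessing significance additionally requires a reference distribution, for example from random permutations of the values across units, which is omitted here.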
32Some general statistics on these variables and information on the sources from which we obtained the data are provided in Table 2.
- Slavic: as some authors have linked the Hajnal line to cultural differences between Slavs and non-Slavs (Macfarlane, 1981), we want to control for this effect. For this variable, we use the percentage of the population who spoke Slavic languages. Our assumption is that a higher share of Slavs in the population is associated with a higher number of marital units per household (Le Play, 1982).
- East Elbia: to account for differences between West and East Elbia, we use the ratio of day labourers to farmers in 1882 as our variable. In regions with a high ratio of day labourers (such as East Elbia), we expect to find a lower degree of neo-local household formation.
- Agriculture: in the specification of the third variable on agricultural employment, we follow Ruggles (2009), who found a positive association with household complexity in a worldwide comparison. Our variable is the proportion of persons employed in agriculture (including relatives without an occupation and servants) in the total population based on regional data from the employment census of 1882.
- Inheritance: in order to control for differences in inheritance systems, we use a dummy variable for all of the regions in which partible inheritance dominated. Regions with partible inheritance are expected to have fewer marital units per household (Berkner and Mendels, 1978).
- Age at marriage: regional variation in the number of married persons (“marital units”) might depend on differences in the age at marriage. We therefore include the male singulate mean age at marriage (SMAM) as a measure of nuptiality. In this context, the male age at marriage is considered to be the most important influence on generation length. In turn, generation length is – along with life expectancy – a principal determinant of the extent of the overlap of the generations (Ruggles, 1987). If men form families later in life, the prospects for many three-generational families will be limited.
- Fertility: the likelihood of co-residing with a married child might also depend on the overall number of children a couple have. We therefore use as a measure of fertility the number of children under age five per 1,000 females aged 15-49.
- Widowhood and divorce: we also control for the share of widowed and divorced people, as most of the complex households detected in Germany in 1885 were likely to have included widowed people. The variable is defined as the proportion of widowed and divorced people in the population in 1885.
- Elderly people: in order to account for the potential for multigenerational families, we use the share of the population aged 65 and older as another variable. A higher share of elderly people could lead to a higher number of marital units per household (Preston and King, 1990).
- Catholic: in addition to looking at the effects of demographic variables, in Model 3 we also control for the effect of Catholics in the population. As a legacy of the Augsburg Peace Agreement (1555), the German Empire was still divided into predominantly Protestant or Catholic territories at the end of the nineteenth century. Following the findings of many contemporary social scientists on the role of religion in residence patterns and intergenerational support, we assume that Catholics would have had more marital units per household than Protestants (Goldscheider and DaVanzo, 1989; Pampel, 1992; Treas and Cohen, 2006). The percentage of Catholics in 1885 is therefore included as a variable in our model.
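Two of the demographic covariates listed above are derived indicators rather than simple shares. The sketch below shows how they are commonly computed: the child-woman ratio as defined in the text, and the singulate mean age at marriage using Hajnal's standard procedure based on proportions never married in five-year age groups. The text does not say which age groups or which variant of the procedure was used, so these details, like the function names and example values, are assumptions.

```python
def child_woman_ratio(children_under_5, women_15_49):
    """Children under age five per 1,000 women aged 15-49."""
    if women_15_49 <= 0:
        raise ValueError("women_15_49 must be positive")
    return 1000.0 * children_under_5 / women_15_49


def singulate_mean_age_at_marriage(prop_single):
    """SMAM from proportions never married in eight five-year age groups.

    prop_single -- proportions for ages 15-19, 20-24, ..., 45-49, 50-54.
    Assumes nobody marries before age 15 and that first marriages after
    age 50 can be ignored.
    """
    if len(prop_single) != 8:
        raise ValueError("expected eight age groups from 15-19 to 50-54")
    # Person-years lived single up to age 50 per person.
    years_single = 15.0 + 5.0 * sum(prop_single[:7])
    # Proportion still single at exact age 50.
    single_at_50 = (prop_single[6] + prop_single[7]) / 2.0
    if single_at_50 >= 1.0:
        raise ValueError("no one ever marries; SMAM is undefined")
    return (years_single - 50.0 * single_at_50) / (1.0 - single_at_50)


# Invented proportions never married for men in one region.
example_smam = singulate_mean_age_at_marriage(
    [1.00, 0.90, 0.50, 0.25, 0.15, 0.12, 0.10, 0.10])
example_cwr = child_woman_ratio(children_under_5=52000, women_15_49=100000)
```

The SMAM is expressed in years and refers only to those who marry before age 50, which is why the proportion remaining single at 50 is subtracted in both the numerator and the denominator.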
Table 2. Dependent and independent variables. Descriptive statistics
33Apart from the dummy variable on inheritance pattern, we assume a linear association between the covariates and the dependent variable. Descriptive analyses did not provide any indication that this assumption is inappropriate for any of the variables considered. A challenge is that two of the main variables are substantially correlated. The Pearson correlation coefficient between the Slavic share and the ratio of day labourers to farmers is 0.61, as both increase from west to east. Apart from this, none of the other variable pairs has a correlation coefficient above 0.5.
III – Results
34In presenting our results, we will first turn to our descriptive analysis of spatial variation in household complexity in the German Empire in 1885. While data availability constraints forced us to run our models at the level of the 83 Regierungsbezirke, for our dependent variable of marital units per household we were able to obtain data at the more finely-gridded district level for most of the German states (in total, 892 districts and small regions). This allows us to investigate to what extent the regional-level data mask variation at a smaller geographical scale. The maps displaying the spatial variation in MUH at the Regierungsbezirk and the district levels are presented in Figure 1.
35The maps reveal a relatively well-defined clustering of the MUH levels across space. Generally, the MUH levels were higher in northern and north-eastern Germany, and lower in the southern and south-western areas of the empire. Among the regions with the highest MUH levels were the easternmost part of the German Empire that bordered the Russian Empire (East Prussia), an area in north-western Germany south of the city of Hamburg (part of the Prussian province of Hanover), and an area in the central part of western Germany north of Frankfurt (Oberhessen) (Figure 1). The “hot spot” in East Prussia was situated in an area with a high share of non-German speakers, where the agricultural structure was dominated by big estates. While this cluster appears to be in line with the Slavic and East Elbia hypotheses, the other two hot spots are not. The hot spot in the province of Hanover is situated in a region characterized by poor soil quality (mostly heath and moorlands), low population density, and family farmsteads. More detailed household statistics, which are available for Prussia in 1880 (Galloway et al., 1994), reveal that the two MUH hot spots in East Prussia and the province of Hanover were both characterized by a low share of (male) family members in the household and a high share of servants and helpers. By contrast, the MUH hot spot in the central part of western Germany (Oberhessen) appears to be largely attributable to a high number of widows per household.
Figure 1. Spatial variation in marital units per 100 households in Germany*, **
* The maps are based on a standard deviation categorization centred on the mean. The histogram shows the distribution and the chosen categories.
** The district data for Bavaria and Elsass-Lothringen originate from the 1880 census; as it was not possible to obtain information on the number of non-institutional households for these two territories, the MUH for their districts was derived by dividing the number of marital units by the total number of households (including institutional households).
36Areas with low MUH values were mostly concentrated in the south-western part of Germany (Elsass-Lothringen, Württemberg (Swabia), Bavarian Swabia). Smaller territories with low MUH levels included the Lower Rhine valley area around Cologne, the Bavarian Forest region along the border to Bohemia, and a region in the south-eastern part of the German Empire (in Upper and Middle Silesia). The majority – but not all – of these regions were in areas of Germany in which partible inheritance practises dominated (e.g. Württemberg, Elsass-Lothringen and parts of Upper Silesia; Huppertz, 1939).
37However, we can also find smaller regional “cold spots” with low MUH levels in the north-eastern part of the German Empire, an area that was generally characterized by high MUH levels. These included two Catholic exclaves: Ermland in East Prussia and Eichsfeld in Thuringia. The latter was also an exclave in terms of inheritance practices, as partible inheritance was dominant there (Grabein, 1900). However, as these small cold spots were embedded in bigger regions that were predominantly Protestant and were dominated by non-partible inheritance, they are hidden in our models. Mecklenburg-Strelitz, situated north of Berlin, also stood out as a cold spot of household complexity, but we have reasons to believe that this finding is related to data quality problems (see below). On the other hand, there were areas with high MUH levels in southern Germany as well, such as a region in Upper Bavaria north of Munich, as well as a region along the border with Luxembourg (in the Eifel and Saar area).
38Overall, the maps suggest that a clear east-west division in MUH cannot be detected in Germany in 1885. A larger concentration of marital units in the households of the east was counterbalanced by similar tendencies seen among the regions situated in the north-west, despite the large Slavic population and the dominance of a manorial agrarian regime in the east. But there are indications that areas with partible inheritance had lower MUH levels. It should be noted, however, that the regional-level spatial variation in co-residence patterns observed in Germany in 1885 seems to have been smaller than the variation found in other European areas at or before that time, including France (Parish and Schwartz, 1972), Italy (Barbagli, 1991), the Iberian Peninsula (Rowland, 2002), and historical Poland-Lithuania (Szołtysek, 2008). Looking at Germany’s within-country variation provides certain insights into the differences in family organization in different regions in the mid-1880s. However, compared to historical Germany’s spatial heterogeneity in political and socioeconomic structures, the within-country variation visible in the MUH is comparatively small.
39The Theil analysis allows us to investigate whether the spatial variation in household complexity was characterized mainly by large-scale between-region variation or by small-scale within-region variation. A finding that the between-region variation dominated would provide support for the view that the spatial variation across Germany was predominantly shaped by large-scale differences in economic or cultural characteristics. By contrast, a finding that the within-region variation dominated would suggest that the variation in local conditions was very important, and would seem to refute broad cultural explanations, as the German regions were internally quite homogeneous in terms of culture in 1885. This part of the analysis will also allow us to explore the extent to which small-scale spatial variation at the district level is hidden in the macro-regional dataset that we use for our models.
40For the Theil analysis, we had to exclude a number of medium-sized and smaller German states for which we were unable to obtain district-level data. We therefore restrict ourselves here to six German states and territories which together represent approximately 84% of the German population at that time. This reduced sample contains information for 58 of our 83 regions, which are subdivided into 850 districts. Based on the Theil index, we decompose the overall variation in MUH in our dataset of 850 districts into differences between the mean district values of each region (between-region variation) and the differences observed between the districts of each region (within-region variation). Our results show that approximately 50.4% of the overall variation can be attributed to regional differences between the Regierungsbezirke, while 49.6% of the variation relates to differences within the Regierungsbezirke. This suggests that almost 50% of the variation was operating not at a large scale, but rather at a medium to small scale.
41In order to examine these findings at a higher level of detail, we further decompose the variation observed within our 58 regions. To do this we make use of the fact that, in the Theil index, the overall within-region variation is obtained by summing up the within-region variation contributed by each region (Equation 1). The results of this decomposition show that large parts of the within-Regierungsbezirk variation were concentrated in a small number of regions: 50% of the within-Regierungsbezirk variation is contributed by just 12 of the 58 regions. The level of internal variation was especially high in Bavaria, where the regions Upper Bavaria, Lower Bavaria, and Middle Franconia alone contribute 21.4% of the total within-region variation observed across all 58 regions. In these Bavarian regions, the variation mostly stems from the fact that the urban areas had very low MUH values relative to the rural districts. This descriptive finding for Bavaria (and a number of other regions) is in line with Ruggles’ hypothesis (2009) that a large part of the variation stems from differences in the shares of the population employed in agriculture. However, alongside the within-region variation, a quite substantial share of the variation also existed between regions.
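The between/within split described in the two preceding paragraphs can be illustrated with a minimal Python sketch. It assumes an unweighted Theil T index computed over district-level MUH values, with regions as the grouping units; whether this matches the exact form of Equation 1 (for example, in its use of population weights) is an assumption. The function name `theil_decomposition` and the district values are hypothetical and serve only as an illustration.

```python
import math
from collections import defaultdict


def theil_decomposition(values, groups):
    """Split the Theil T index of `values` into a between-group part
    and one within-group contribution per group.

    Assumption: unweighted Theil T; all values must be positive.
    """
    n = len(values)
    mean_all = sum(values) / n

    by_group = defaultdict(list)
    for v, g in zip(values, groups):
        by_group[g].append(v)

    between = 0.0
    within_by_group = {}
    for g, vals in by_group.items():
        mean_g = sum(vals) / len(vals)
        # Group's share of the overall sum of values.
        share = (len(vals) / n) * (mean_g / mean_all)
        # Theil T computed inside the group only.
        theil_g = sum((v / mean_g) * math.log(v / mean_g) for v in vals) / len(vals)
        between += share * math.log(mean_g / mean_all)
        within_by_group[g] = share * theil_g

    total = sum((v / mean_all) * math.log(v / mean_all) for v in values) / n
    return total, between, within_by_group


# Hypothetical MUH values (marital units per 100 households) for six
# districts in two made-up regions; not taken from the 1885 census.
muh = [95.0, 105.0, 100.0, 80.0, 120.0, 110.0]
region = ["A", "A", "A", "B", "B", "B"]

total, between, within = theil_decomposition(muh, region)
print("between-region share:", between / total)
for name, contribution in within.items():
    print("within-region share contributed by", name, ":", contribution / total)
```

Because the within-region component is a plain sum over regions, the per-region contributions can be ranked directly, which is the kind of step needed to identify the small number of regions that account for most of the within-region variation.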
42As noted in the methods section, the models are designed to explore the extent to which existing spatial patterns persist after controlling for covariates related to the main research questions and a number of socioeconomic and demographic covariates. In the first model, we only include the latitude and longitude values of the centroids of the 83 regions in order to explore to what extent the model detects significant spatial trends in the data (Table 3). As was already suggested by the maps, the model emphasizes a north-south rather than an east-west pattern, which is contrary to our expectations. The longitude variable is positive, indicating that MUH values increased towards the east, but it is not significant. The latitude variable, on the other hand, shows a highly significant increase in the MUH values the farther north a region is located.
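As an illustration of what a model containing only centroid coordinates estimates, the following Python sketch fits a linear trend surface by ordinary least squares. The choice of ordinary least squares, the function names `ols` and `trend_surface`, and all coordinates and MUH values are assumptions made for this example; the sketch does not reproduce the estimates in Table 3 and computes no significance tests.

```python
def ols(X, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(k)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(k)
    ]
    for col in range(k):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def trend_surface(lat, lon, muh):
    """Regress MUH on centred latitude and longitude.

    Returns (intercept, latitude slope, longitude slope). Centring the
    coordinates makes the intercept equal to the mean MUH.
    """
    mean_lat = sum(lat) / len(lat)
    mean_lon = sum(lon) / len(lon)
    X = [[1.0, la - mean_lat, lo - mean_lon] for la, lo in zip(lat, lon)]
    return tuple(ols(X, muh))


# Hypothetical region centroids (degrees) and MUH values, for illustration only.
lat = [47.8, 48.5, 49.9, 51.2, 52.4, 53.6, 54.3]
lon = [9.1, 12.0, 7.5, 13.8, 9.9, 20.4, 10.2]
muh = [84.0, 86.5, 88.0, 91.0, 93.5, 97.0, 96.0]

intercept, slope_lat, slope_lon = trend_surface(lat, lon, muh)
print("latitude slope (north-south trend):", slope_lat)
print("longitude slope (east-west trend):", slope_lon)
```

In such a specification, a positive latitude slope corresponds to MUH values rising towards the north, and a positive longitude slope to values rising towards the east.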
Model results. Regional data from the German census of 1885 (N=83)
Significance levels: *** 0.001; ** 0.01; * 0.05; ° 0.1.
43We will now turn to model 2, in which we control for the four variables directly related to our four main research questions. Of these variables, only agricultural employment and partible inheritance pattern provide significant results, with both exhibiting the expected sign. However, model 2 is able to explain the spatial pattern to a limited extent only, as the Moran’s I test on the residuals returns a highly significant value of 0.28, which is still very close to the value observed in the dependent variable (0.32). The residuals also continue to exhibit a significant spatial trend in the latitude variable, which implies that model 2 is only partially able to explain the north-south differences in household complexity in Germany.
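To make the Moran’s I statistic used in this comparison concrete, the sketch below computes it for a variable observed over a set of regions. It assumes row-standardized weights (each region is compared with the mean of its neighbours) and takes the adjacency structure as given; building a first-order queen adjacency list from a GIS file is outside its scope. The function name `morans_i`, the four toy regions, and their values are hypothetical, and no significance test is included.

```python
def morans_i(values, neighbours):
    """Moran's I with row-standardized weights.

    values:     dict mapping region -> observed value (e.g. MUH or a residual)
    neighbours: dict mapping region -> list of adjacent regions
    Regions without neighbours receive zero weight.
    """
    n = len(values)
    mean = sum(values.values()) / n
    dev = {r: v - mean for r, v in values.items()}

    cross = 0.0  # sum over regions of deviation times mean neighbour deviation
    s0 = 0.0     # sum of all weights (one per region with neighbours)
    for r, adj in neighbours.items():
        if not adj:
            continue
        lag = sum(dev[j] for j in adj) / len(adj)
        cross += dev[r] * lag
        s0 += 1.0

    denom = sum(d * d for d in dev.values())
    return (n / s0) * cross / denom


# Toy example: four hypothetical regions arranged in a row, r1 - r2 - r3 - r4.
adjacency = {"r1": ["r2"], "r2": ["r1", "r3"], "r3": ["r2", "r4"], "r4": ["r3"]}

smooth = {"r1": 1.0, "r2": 2.0, "r3": 3.0, "r4": 4.0}          # similar neighbours
alternating = {"r1": 1.0, "r2": -1.0, "r3": 1.0, "r4": -1.0}   # dissimilar neighbours

print("smooth gradient:", morans_i(smooth, adjacency))
print("alternating pattern:", morans_i(alternating, adjacency))
```

Applied to model residuals rather than to the dependent variable itself, the same computation indicates how much spatial clustering remains unexplained by the covariates.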
44Our full model 3, in which we also control for demographic and cultural covariates, is far better at explaining the spatial variation. The Moran’s I on the residuals is substantially reduced to a value of 0.07, which is only significant at the level of 0.1. This implies that there might still be some small biases in our estimates due to spatial autocorrelation, but the likelihood is much lower than for model 2. When we regress the residuals of model 3 on our longitude and latitude variables, we obtain a non-significant result for both, which implies that we are able to explain large-scale trends in the MUH pattern. When we introduce the demographic and cultural controls, the results for our four main variables change substantially only for partible inheritance, which becomes insignificant. The Slavic and day labourer variables remain insignificant, while the agricultural employment variable returns a higher coefficient. The other demographic and cultural control variables are, apart from the male singulate mean age at marriage, all significant at least at the 0.1 level, and most of them exhibit the expected sign. The exceptions are the percentage of elderly people and the Catholic variable. Among the possible explanations for the unexpected negative sign for the share of elderly people is that many retired people may have chosen to move to a separate hut/cottage on the farmstead rather than to live in the household of the younger generation; evidence that this was the case has been found for some parts of rural Germany (Berkner, 1972). The Catholic variable also has an unexpected negative sign. We believe this could be related to the fact that, with a few exceptions, almost all of the regions with a partible inheritance pattern were predominantly Catholic, while very few of the regions with an impartible inheritance pattern were Catholic. Thus, these two variables seem to be closely linked. If, for example, we omit the dummy variable of partible inheritance, the Catholic variable becomes even more negative and highly significant.
45To further investigate the spatial fit of Model 3, we plotted the predicted and observed values in a graph in order to determine the extent to which outliers are clustered in certain regions (Figure 2). Overall, the graph supports the view that the model is able to account for most of the spatial MUH variation at the macro-regional level. Some of the outliers belong to the regions identified above as having data quality issues. These areas include Mecklenburg-Strelitz and a number of Thuringian states (Saxony Weimar, Saxony Meiningen), where data collection activities were heavily constrained by small financial budgets. In addition, a substantial number of the outliers are regions with very small populations (e.g. Waldeck, Lippe-Detmold, the Principality of Lübeck, and the Principality of Birkenfeld). Meanwhile, Bremen’s outlier status might be related to the fact that the city counted the de jure population. Especially in a harbour town, where sailors were away on their ships for weeks or months, a de jure population count tended to result in higher MUH values than a de facto count would have done. But there are also a number of outliers for which we have no indications of data quality or data comparability problems. This suggests that we are missing some important covariates for these regions. This group of outliers includes the MUH hot spots in the province of Hanover and Oberhessen, as well as the Bavarian region of Upper Bavaria.
Observed versus predicted values of Model 3
Note: Region names are included for all of the regions in which the difference between the observed and the predicted values (residuals) is higher than the standard deviation of the residuals across the whole dataset.
46In this paper, we investigated the spatial variation in household structures in late nineteenth-century Germany using an aggregate measure of household complexity (marital units per household, or MUH). The MUH measure represents household complexity only in the broadest possible sense. As it does not consider unmarried co-resident relatives, the MUH captures only part of the phenomenon of extended living arrangements. However, when country-wide microdata are not available, the use of this indirect measure of household structure is advantageous, because it allows working on the scale of Germany as a whole.
47Our analysis allowed us to examine the relevance of the existing theoretical considerations related to household complexity for understanding regional MUH differences in Germany. Several clusters with different incidence rates of marital units per domestic group existed in the German Empire at the time, with higher proportions of households with more than one marital unit found in the north and north-east, and lower proportions found in the south and south-west of the country. Contrary to our expectations, some of the supposedly decisive socioeconomic and cultural borders within late nineteenth-century Germany do not appear to correspond with the observed spatial patterns of family composition. In our descriptive analysis and our models based on 83 regions (Regierungsbezirke and smaller states), we did not find evidence to support the hypothesis that the East Elbian divide can also be seen in the spatial variation in household complexity. Although the East Elbian part of Germany had a higher degree of household complexity than southern Germany, this area of high MUH levels also extended into north-western Germany, where the agricultural structures were dominated by family farmsteads and life-cycle service. Neither did we find any indications that the Slavic population had higher levels of household complexity, with the possible exception of elevated levels in the far eastern part of Germany. Our results generally confirmed Hajnal’s view (which had been challenged by, for example, Laslett, 1983) that all of the German Empire was part of the Western European Family System.
48We found indications that differences in inheritance patterns might have played a role, as regions with partible inheritance initially appeared to have had lower levels of household complexity. But in our complete model, the variable was not significant. The most compelling pieces of our evidence support Ruggles’ “agricultural employment hypothesis” (2009), as the share of the population working in agriculture, as well as a number of the demographic indicators, were strongly associated with levels of household complexity. The results of our explorative analysis of household complexity at the more finely-gridded district level provided support for the view that urban-rural differences were even more closely related to household complexity at this geographic level. This suggests that, if we had been able to model household complexity at the district rather than the regional level, we might have found even stronger support for Ruggles’ hypothesis.
49However, to properly understand the geography of historical family systems, it is essential to consider the intrinsically complex interplay of economic, demographic, and cultural factors, which are, in turn, further differentiated by local and environmental contexts, and by historical path dependencies. This research can serve as a starting point for more contextual and place-specific future investigations. Linking together the analyses undertaken at different aggregation levels should allow us to account for aggregate constraints on people’s household strategies, without losing sight of individual behaviour and the complexities of local histories.
Max Planck Institute for Demographic Research, Germany.
University of California at Berkeley, United States.
Correspondence: Mikołaj Szołtysek, Laboratory for Historical Demography, Max Planck Institute for Demographic Research, Konrad-Zuse-Strasse 1, Rostock, 18057, Germany, email: Szoltysek@demogr.mpg.de
Empirical evidence shows that east-west differences in family formation patterns already existed before 1945 (Klüsener and Goldstein, 2012). This suggests that the East and West German policies between 1945 and 1990 did not create a new difference, but rather amplified an existing gap.
The terms “co-residence”, “living arrangements”, and “household patterns” are used interchangeably in this article.
In this article, the term region refers to the 83 Regierungsbezirke and smaller states. The more finely-gridded district-level data are explicitly called districts.
In Germany, the decline in fertility did not become a widespread phenomenon until 1890 (Knodel, 1974).
Unfortunately, it was not possible to obtain data for all of the socioeconomic covariates at the district level. Therefore, the regional-level dataset was used for the models.
The census definition of “household” between 1875 and 1910 encompassed both biological and other kin relations criteria, as well as socioeconomic criteria. A group of people was considered to be a co-residential household group if they were living together on the basis of shared resources. This category included not only biological members of the family and other related persons, but also servants, boarders, and lodgers (Rothenbacher, 2002).
Only family households were taken into account.
Neo-local residence rules imply that, upon marriage, each partner is expected to move out of his or her parents’ household and establish a new residence, thus forming the core of an independent nuclear family.
This sample comprises a total population of 20,318 persons.
Derived from a GIS file of the administrative borders of German regions in 1885 (MPIDR, 2012).
The Moran’s I index is very similar to Pearson’s product moment correlation coefficient, except that instead of looking for the correlation between the values of two variables x and y by each unit i, it looks for the correlation between the values of a variable x in each region i, with the mean value of the same variable x in the regions j, which are adjacent to region i. This adjacency can be defined in different ways. We use a first order queen definition of adjacency, which considers all of the regions which border each other at a minimum of one point as neighbours. The Moran’s I Index can take on values from –1 (strong negative spatial autocorrelation) through zero (no spatial autocorrelation) to +1 (strong positive spatial autocorrelation).
Data on the Slavic-speaking population are not available for 1885 or for any earlier period. For the Prussian Regierungsbezirke, we used values from 1890, while for the other 48 regions of the German Empire, we had to rely upon values for 1900. Both decisions are rather unproblematic, as the Slavic share of the population changed little over time in most regions. This holds true even if we consider the migration of Slavic people from eastern Prussia to the Ruhr area, as the share of Slavic inhabitants in the Ruhr area was still very low in 1890. This is also the case for the 48 regions outside of Prussia in 1900. Our definition of Slavic languages does not include the Baltic languages (e.g. Lithuanian).
Employment data were not available for the census year, but only for 1882, when an employment census was carried out in the German Empire. In the model, we used the 1882 data without further modifications, as the regional differences in the share of people working in specific jobs and sectors are unlikely to have changed drastically over such a short period of time.
This information is based on a map showing the regional distribution of inheritance patterns (Huppertz, 1939). The map was compiled on the basis of extensive local studies from the 1890s.
This, however, depends very much on the household living arrangements of older adults.
These six states and territories are Prussia, Bavaria, Württemberg, Baden, Hessen, and Elsass-Lothringen. The district data for Bavaria and Elsass-Lothringen come from the 1880 census. As it was not possible to obtain information on the number of non-institutional households for these two territories, the MUH values for the Bavarian and Elsass-Lothringian districts were derived by dividing the number of marital units by the total number of households (including institutional households).
As the Slavic share and the percentage of day labourers, two of our main variables, are highly correlated, we calculated variance inflation factors to see whether our results are affected by multicollinearity. The results indicate that multicollinearity does not severely affect our model outcomes. Omitting one of the two correlated variables also has little impact on the outcomes for the other variables.
The on-going data collection and harmonizing project “Mosaic” will soon bring to light historical census micro-data for several dozen local communities from different parts of nineteenth-century Germany (see www.censusmosaic.org).