Background
1In the field of demography, the need to age-standardize populations under comparison is taken for granted, for instance in comparing mortality or fertility rates (Ahmad et al., 2001 ; Preston, Heuveline and Guillot, 2001). Likewise, in the fields of medicine and public health, comparisons of disease incidence rates are considered misleading in the absence of appropriate adjustments for population age structure. Yet the influence of compositional effects in comparisons of educational participation measures is routinely neglected, even though the threat of biased results is the same as in other disciplines.
2It is easy to understand how education indicators are influenced by the age distribution. Suppose that participation rates are higher in the early grades than the later grades, after children have had more exposure to the risk of dropping out. In a “young” population, where the country’s birth cohorts are increasing in size, a greater share of the overall number of school-age children will be in the early, high-enrolment grades, and a relatively smaller share will be in the later, low-enrolment grades. In an “old” country with shrinking birth cohorts, the situation is reversed, so comparison of these two countries’ education systems would be biased by age compositional differences. Further, even if two countries have identical education systems (i.e. all student flow rates – promotion, grade repetition, drop-out – are exactly the same), but again have opposing cohort growth profiles, then the difference in age structure ipso facto leads to different overall participation ratios, even though for any given individual, schooling prospects are the same in both countries. So while for some policy questions, the unstandardized indicator might still be relevant, for many others, such apparent differences resulting from age composition are misleading.
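The second scenario can be illustrated with a small numerical sketch. All figures below are hypothetical and chosen only for illustration (they are not taken from the data analysed later): two countries share exactly the same age-specific participation ratios but have opposite cohort growth profiles.

```python
# Hypothetical age-specific participation ratios for ages 6..10,
# identical in both countries (illustrative values, not from the study).
ratios = [0.95, 0.90, 0.85, 0.75, 0.60]

# Hypothetical cohort sizes by age: the "young" country has larger
# cohorts at younger ages, the "old" country the reverse.
young = [120, 110, 100, 90, 80]
old = [80, 90, 100, 110, 120]


def overall_ratio(ratios, cohorts):
    """Participants divided by population, both summed over all ages."""
    participants = sum(r * n for r, n in zip(ratios, cohorts))
    return participants / sum(cohorts)


young_ratio = overall_ratio(ratios, young)
old_ratio = overall_ratio(ratios, old)
print(f"young: {young_ratio:.3f}, old: {old_ratio:.3f}")
```

Under these assumed values, the overall ratio is about 0.827 in the "young" country and 0.793 in the "old" one, a gap of more than three percentage points that arises from age composition alone.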
3Comparisons over time are equally affected. Commonly, education monitoring organizations such as UNESCO publish participation metrics demonstrating whether enrolment or attendance has changed between two different points in time. For instance, in UNESCO’s Education for All Global Monitoring Report (2010), progress in net and gross enrolment over the last decade is reported for a number of developing countries in the midst of a fertility transition, yet we find no discussion of the corresponding change in cohort sizes that may well be partly responsible for such improvements. Furthermore, the same publication also makes both implicit and explicit comparisons between countries’ participation ratios. The authors characterize as “disconcerting” the fact that 25 developing countries “experienced stagnating or declining” primary net enrolment ratios (NER) between 1999 and 2007, when in fact 11 of these were within two percentage points of zero change (UNESCO 2010). As shown below, this is well within the magnitude of potential age compositional effects. In age-standardized terms, these countries may in fact be slowly improving. While marginal positive growth might, of course, still count as “stagnant”, the nine countries showing improvements of less than two percentage points were not singled out for concern, even though some of them are equally far from the target.
4Concerns about data quality and age have been mentioned as possible explanations for discrepancies in traditional educational metrics calculated from register or survey data (UNESCO, 1997 ; UNESCO, Institute for Statistics 2010 ; UNESCO, 2010). However, such reports typically focus on age-reporting differences resulting from disparate data sources or from information collected at different time points over the year. Yet even with the most complete and accurate census, age compositional differences over time or across countries will still affect results. This phenomenon was alluded to in a report concerning international tertiary participation ratios. Adelman (2010) argued that the figures for OECD countries were not comparable due to vastly different demographic profiles and fertility trends. However, the current study is, to our knowledge, the first effort to systematically quantify the extent to which age composition can affect education participation measures. Here, we examine its impact on participation measures among compulsory school-aged youth.
5Whether or not the effect is found to be substantial, there is no reason to conclude a priori that it can safely be ignored. In the following analysis, we demonstrate the range of this effect on education participation measures and examine the circumstances under which adjustment might or might not matter.
I – Data and Methods
1 – Some education indicators
6In the following analysis, the process of interest is educational participation. We apply age-standardization to two typical measures : the gross enrolment ratio (GER) and the net enrolment ratio (NER).
7These measures are sometimes referred to as rates; regrettably, the terms “ratio” and “rate” are often used interchangeably in the educational context (compare, for example, the glossaries in UNESCO Institute for Statistics, 2012, UNESCO, 2012, or the use of both terms on the same page by UNDG, 2010). While the UNESCO Institute for Statistics appears to consistently distinguish between gross ratios and net rates, this usage is far from universal, even among UN agencies. In any case, while the net measures do restrict the numerator counts to the population at risk in the denominator, even these are not calculated in terms of demographic occurrence-exposure rates based on person-time units (and neither are school drop-out “rates”, for that matter). Accordingly, we employ the term “ratio” for all participation ratios in our discussion.
Gross enrolment ratio (GER)
8This is defined as
$$\mathrm{GER}_{j,t} = \frac{E_{j,t}}{\sum_{a \in A_j} P_{a,t}}$$
10where
11- $E_{j,t}$ is the number of pupils enrolled in a level of education j, in school-year t, regardless of age ;
12- $A_j$ is the age group which officially corresponds to level of education j ;
13- $P_{a,t}$ is the population of age a in a school-year t.
Net enrolment ratio (NER)
14This is defined as
$$\mathrm{NER}_{j,t} = \frac{\sum_{a \in A_j} E_{a,j,t}}{\sum_{a \in A_j} P_{a,t}}$$
16where
17- $E_{a,j,t}$ is the number of pupils of age a enrolled in a level of education j in school-year t ;
18- $A_j$ is the age group which officially corresponds to level of education j ;
19- $P_{a,t}$ is the population of age a in a school-year t.
20A variant of NER additionally includes those who entered school early, or progressed more rapidly than expected, and are already enrolled at the secondary level while they are still of primary-school age. This measure is variously called “adjusted NER”, “total NER”, or simply “NER+”. It is obtained by substituting $E_{a,t}$, the number of pupils of age a enrolled in any level of education, into the numerator for NER above, so that NER+ contains children of the appropriate age who are enrolled in any level of education. We make no use of NER+ in the present analysis, but wish to clarify the distinction to facilitate comparisons with figures published elsewhere.
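The three definitions can be compared side by side in a short sketch. The counts below are hypothetical, the primary age range of 6 to 10 is assumed for illustration, and the variable names are ours rather than part of any published specification.

```python
# Hypothetical counts for one school-year; primary ages assumed to be 6-10.
PRIMARY_AGES = range(6, 11)

# population[a]: number of persons of age a
population = {5: 100, 6: 100, 7: 100, 8: 100, 9: 100, 10: 100, 11: 100, 12: 100}
# primary[a]: pupils of age a enrolled in primary school
primary = {5: 5, 6: 80, 7: 90, 8: 90, 9: 85, 10: 75, 11: 30, 12: 10}
# secondary[a]: pupils of age a enrolled in secondary school
secondary = {10: 10, 11: 50, 12: 70}

# Denominator shared by all three measures: population of official primary age.
official_pop = sum(population[a] for a in PRIMARY_AGES)

# GER: all primary pupils, regardless of age.
ger = sum(primary.values()) / official_pop

# NER: primary pupils of official primary age only.
ner = sum(primary[a] for a in PRIMARY_AGES) / official_pop

# NER+: children of official primary age enrolled at any level.
ner_plus = sum(primary[a] + secondary.get(a, 0) for a in PRIMARY_AGES) / official_pop

print(f"GER={ger:.2f}  NER={ner:.2f}  NER+={ner_plus:.2f}")
```

With these assumed counts the measures are ordered NER (0.84) < NER+ (0.86) < GER (0.93), reflecting the successively broader numerators.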
21The relationship between the numerators of the different indicators is illustrated graphically in Figure 1.
Figure 1. Numerator populations for net ratio (A), adjusted/total net ratio (A+C), and gross participation ratio in primary schooling (A+B)
Note : The nominal age for primary participation is assumed to be 6–10 in completed years of age.
Gender parity index (GPI)
22An important dimension of educational development, and also an explicit Millennium Development Goal (specifically Target 3A), is the elimination of gender disparity at both primary and secondary levels of schooling. While true gender equity goes far beyond equal numerical representation and is affected by classroom culture, curricular content, and even sanitary facilities, equal access and participation in a quantitative sense is normally measured by means of a gender parity index (GPI). This index measures the female-to-male disparity in access to education, most commonly the ratio of gender-specific GERs. In other words :
$$\mathrm{GPI} = \frac{\mathrm{GER}_{\text{female}}}{\mathrm{GER}_{\text{male}}}$$
Enrolment versus attendance
24The notion of access to and participation in education can be operationalized using different measurable indicators. The most frequently encountered term, “enrolment”, refers to the inscription of an individual in a school register. It is of course possible for a child to be officially registered without attending class, or without attending any longer. The notion of school “attendance” focuses on this physical, de facto, participation. The choice between these measures is driven partly by substantive considerations, depending on the question to be answered, but also partly by pragmatism or empirical constraints based on the source of data. In particular, official registration data tend to capture enrolment, while individual and household surveys (including the data source for our empirical case study) tend to ask respondents (or their proxies) about their actual attendance at school. However, for the purposes of this exercise the distinction between enrolment and attendance is a nuisance factor, not a substantive difference to explicitly account for. We use the term “participation” to refer to either enrolment or attendance, depending on context. Importantly, all our conclusions regarding age-standardization of participation measures apply equally to net and gross metrics based on either enrolment or attendance, so in all the above, GER and NER could be replaced with the gross and net attendance ratios (GAR and NAR).
2 – Age-standardization
25In any standardization exercise, the choice of a standard is essential, but there is no simple rule about what to use. When comparing two or more countries in mortality research, standardization is often performed by applying the average of the populations’ distributions (Preston, Heuveline and Guillot 2001). The same could be done for a comparison of educational enrolment, but in educational policy contexts, comparisons across time within the same country are often made to gauge educational progress. In that case, an average of the starting and ending years’ age compositions might be a reasonable standard. But if the aim is to compare progress in enrolment between two countries, change over time in either country’s age distribution could affect the results.
26Therefore, in the empirical analysis below we present results using a “flat” age distribution. In other words, the total size of each age cohort is equalized. In our opinion this is the least arbitrary of standards, and has the potential to provide the fairest and most consistent comparisons across either time periods or contexts. Unlike an average, it does not need to be updated over time. Moreover, since school-age corresponds to a life stage of particularly low mortality, a flat age structure approximates what would be observed in a stationary population. Indeed, the ubiquitous “total fertility rate” (TFR) measure may be interpreted as a (rescaled) standardized fertility rate based on a flat age distribution, thus providing a well-established precedent.
27In order to age-standardize these ratios, we reweight the population under analysis by a standard age distribution. For a flat standard, this reduces to the sum of age-specific enrolment ratios, divided by the number of single-year age groups in the school-age population. As an aside, we note that this is proportional to an indicator already routinely calculated, but misinterpreted as an approximation to “school life expectancy” (Barakat, 2012).
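As a minimal sketch of the flat-standard case, the following computes the unstandardized ratio and the mean of the age-specific ratios for a single hypothetical population. The counts and function names are illustrative assumptions, and the example is restricted to a net-type ratio over a fixed age range.

```python
def crude_ratio(enrolled, population):
    """Unstandardized ratio: total enrolled over total population."""
    return sum(enrolled.values()) / sum(population.values())


def flat_standardized_ratio(enrolled, population):
    """Flat standard: the mean of the single-year age-specific ratios."""
    age_specific = [enrolled[a] / population[a] for a in population]
    return sum(age_specific) / len(age_specific)


# Hypothetical counts by single year of age (growing cohorts).
population = {6: 120, 7: 110, 8: 100, 9: 90, 10: 80}
enrolled = {6: 108, 7: 99, 8: 80, 9: 63, 10: 48}

crude = crude_ratio(enrolled, population)
flat = flat_standardized_ratio(enrolled, population)
print(f"crude={crude:.3f}  flat-standardized={flat:.3f}")
```

Here the unstandardized ratio is 0.796 and the flat-standardized ratio is 0.780: because the larger, younger cohorts also have the higher participation, the unstandardized figure is pulled upwards.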
28In the general case, for an arbitrary standard, the first step is to calculate the weighting factors, which are given by
$$w_a = \frac{p_a^{s}}{p_a}$$
30where
31- $w_a$ is the weighting factor at age a ;
32- $p_a$ is the proportion of the population that will be standardized at age a ;
33- $p_a^{s}$ is the proportion of the standard population at age a.
34The second step is to reweight the enrolled ($E_a$) and non-enrolled ($NE_a$) population in each age a using the weighting factors
$$\tilde{E}_a = w_a E_a, \qquad \widetilde{NE}_a = w_a NE_a$$
36Once we have the weighted total population ($\tilde{P}_a$) and weighted enrolled population ($\tilde{E}_a$) for each age, the age-standardized gross enrolment ratio (ASGER) and net enrolment ratio (ASNER) can be easily calculated by the following equations :
$$\mathrm{ASGER}_{j,t} = \frac{\tilde{E}_{j,t}}{\sum_{a \in A_j} \tilde{P}_{a,t}}, \qquad \mathrm{ASNER}_{j,t} = \frac{\sum_{a \in A_j} \tilde{E}_{a,j,t}}{\sum_{a \in A_j} \tilde{P}_{a,t}}$$
38where
39- $\tilde{E}_{j,t}$ is the weighted number of pupils enrolled in a level of education j, in school-year t ;
40- $\tilde{E}_{a,j,t}$ is the weighted number of enrolled pupils of age a in level j in school-year t ;
41- $A_j$ is the age group which officially corresponds to level of education j ;
42- $\tilde{P}_{a,t}$ is the weighted population in age-group a in school-year t.
43By using the same standard age distribution in two populations, the remaining differences between estimates may be attributed solely to differences in the performance of the educational system.
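The two-step procedure can be sketched as follows for a net-type ratio. This is an illustration under stated assumptions rather than the code used for the analysis: the function name, the dictionary-based inputs, and the counts are all hypothetical.

```python
def standardized_ratio(enrolled, not_enrolled, standard_shares):
    """Reweight enrolled and non-enrolled counts to a standard age distribution.

    enrolled, not_enrolled: counts by single year of age (dicts keyed by age).
    standard_shares: share of the standard population at each age (sums to 1).
    """
    total = sum(enrolled[a] + not_enrolled[a] for a in enrolled)
    weighted_enrolled = 0.0
    weighted_population = 0.0
    for a in enrolled:
        population_a = enrolled[a] + not_enrolled[a]
        p_a = population_a / total          # observed share at age a
        w_a = standard_shares[a] / p_a      # step 1: weighting factor
        weighted_enrolled += w_a * enrolled[a]      # step 2: reweight
        weighted_population += w_a * population_a
    return weighted_enrolled / weighted_population


# Hypothetical counts by single year of age.
enrolled = {6: 108, 7: 99, 8: 80, 9: 63, 10: 48}
not_enrolled = {6: 12, 7: 11, 8: 20, 9: 27, 10: 32}

# A flat standard gives every single-year age group the same share.
flat = {a: 1 / len(enrolled) for a in enrolled}
as_ratio = standardized_ratio(enrolled, not_enrolled, flat)

# Using the population's own age distribution as the standard
# leaves the unstandardized ratio unchanged.
total = sum(enrolled.values()) + sum(not_enrolled.values())
own = {a: (enrolled[a] + not_enrolled[a]) / total for a in enrolled}
unchanged = standardized_ratio(enrolled, not_enrolled, own)
print(f"flat standard: {as_ratio:.3f}, own distribution: {unchanged:.3f}")
```

With the flat standard the result (0.780) coincides with the mean of the age-specific ratios, while standardizing to the population's own distribution returns the unstandardized value (0.796), which is a convenient sanity check on the weights.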
3 – Empirical Data
44The further a population is from stationarity, the greater the effect of age standardization. In practice, the effect will be greater, all else equal, the higher the level or variation in fertility, and the greater the variation in educational participation by age. The most interesting cases are therefore likely to be non-industrialized countries.
45We present the example of India, since it provides for a range of potential scenarios across its constituent states that are likely to mirror the situations in many other non-industrialized settings. The analysis is based on samples of the National Sample Survey (NSS) extracted from the International Integrated Public-Use Microdata Series (IPUMS) repository (Minnesota Population Center, University of Minnesota, 2011). Micro-data are required in order to perform the reweighting by single years of age.
46India’s education system consists of five years of primary and seven years of secondary schooling (UNESCO International Bureau of Education, 2007). This gives a total of 12 grade levels in primary through secondary school (“primary+secondary”). Reflecting the official entry age into primary school, the lowest age included in our analysis was age 6. Theoretically, the definition of gross participation includes participants of any age in the numerator. In practice, however, enrolment in primary and secondary school tends to be negligible beyond the mid-20s, even in countries with extreme levels of late enrolment, and adults in basic education or literacy courses are not considered to be within the scope of what GER is designed to capture. Because late enrolment, repetition, and possibly re-entry after dropout are frequent occurrences in India (Mehta, 2007 ; World Bank, 2009), individuals up to age 25 were included in the analysis.
47Within this 6-25 age group, less than 0.01% of cases were missing essential data, and these were excluded from analysis. The remaining raw sample contains 150,266 female and 162,763 male unweighted individuals for 1983 ; 128,790 female and 142,818 male individuals for 1993 ; and 132,304 female and 144,373 male individuals for 2004. The data contain identifiers for the 35 sub-national Indian states and union territories. To ensure stable results, states were excluded from analysis if any sex × age × educational stage cells contained fewer than 20 sample members for any year, and appropriate person-level sampling weights were applied. As a robustness check, the analysis was repeated unweighted, and the substantive conclusions are unaffected. Since some of the age cohorts are present in more than one sample (e.g. those aged 6-10 in 1983 were 16-20 in 1993), it is worth pointing out that the samples are independent and not linked in any way. In any case, our analysis is unaffected by whether the same individuals are contained in the two samples or not. The variables analysed are age, sex, school attendance and level of education previously attained or currently attending, allowing us to calculate primary and secondary age distributions and age-specific attendance.
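The exclusion rule for sparse cells can be sketched as below. The record layout (a state label paired with a cell identifier) and the function name are assumptions made for illustration; only the threshold of 20 sample members comes from the text.

```python
from collections import Counter

MIN_CELL = 20  # minimum number of sample members per cell, as stated above


def states_to_keep(records, expected_cells, min_cell=MIN_CELL):
    """Return the states in which every expected cell has enough members.

    records: iterable of (state, cell) pairs, one per sample member, where
             cell is any hashable identifier such as (year, sex, age, stage).
    expected_cells: all cells that must be populated; cells with no sample
             members at all count as zero and therefore exclude the state.
    """
    records = list(records)
    counts = Counter(records)
    states = {state for state, _ in records}
    return {
        s for s in states
        if all(counts[(s, c)] >= min_cell for c in expected_cells)
    }


# Hypothetical example with two cells and two states.
cells = [("F", 6), ("M", 6)]
sample = (
    [("A", ("F", 6))] * 25 + [("A", ("M", 6))] * 30
    + [("B", ("F", 6))] * 25 + [("B", ("M", 6))] * 5
)
kept = states_to_keep(sample, cells)
print(sorted(kept))
```

Passing the full list of expected cells explicitly is a deliberate choice in this sketch: counting only the cells that occur in the data would miss cells that are entirely empty.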
II – A case study : Indian states
48We calculate and compare customary and age-standardized indicators for Indian states and territories between 1983 and 2004 to provide an empirical illustration of the significance of the above discussion. Home to the single largest population of illiterates and of out-of-school children, India is of particular intrinsic interest as a case study in international educational development. In addition, the variation at the sub-national level provides a well-defined set of examples which cover the entire parameter range of interest. In other words, across Indian states, substantial variation is evident in cohort growth and decline, and in school attendance trends : the NAR ranges from 10% to 96% among Indian states and over the time period considered, effectively bracketing the entire range encountered among developing countries worldwide. In addition, the size of the 17-year-old age group relative to that of 6-year-olds ranges from around 0.23 to over 2.4. Again, this covers the whole range from strongly positive to strongly negative cohort growth. This vast variation in the key characteristics allows us to show how age standardization could affect estimates of net and gross participation ratios for different combinations of scenarios, i.e. shrinking or expanding cohort sizes combined with changes in age-specific attendance.
49The results of age-standardizing the gross and net attendance ratios are shown in Figure 2. [1] These graphs present gender-specific box and whisker plots of the differences, or deltas, between the commonly calculated metric and the age-standardized metric for each year of data analysed. The differences yielded were similar for NAR/ASNAR and GAR/ASGAR, although the deltas for the latter were somewhat larger. Except for 2004, GAR/ASGAR deltas were larger for females. Although the median delta was in the 1.5 point range for females, in several cases the delta yielded by the different methods was more than 3 percentage points. Hence, for 1993, but especially for 2004, the deltas for both males and females range between –4 and +3 points.
Figure 2. Effect of age-standardization on net (left panel) and gross (right panel) participation rates in primary through secondary school across Indian states and territories, by year and sex
Note : Boxplots indicate the median and 25th and 75th percentiles ; whiskers extend to minimum and maximum values. Two extreme cases are highlighted and examined in Figure 3.
50The differences in NAR versus age-standardized NAR (ASNAR) also range between 1 and 2 percentage points, but in a number of places the state-specific differences between NAR and ASNAR are sizable. We highlight the cases of females in the Andaman and Nicobar Islands in 1983 and males in Delhi in 2004, which represent the largest negative and positive standardization effects, respectively. It is noteworthy that in the Delhi case, age-standardization raises the value by 2.5 percentage points, while for the Andaman/Nicobar Island case the value is lowered by 4.5 percentage points, highlighting that the effect can go either way.
51To show how age-specific participation and age distribution jointly determine the difference between unstandardized and standardized indicators, Figure 3 compares the cohort-age profiles of the two opposite example cases highlighted above.
52The rationale for age-standardization is that age distributions may vary over the age range included in typical educational enrolment metrics. If either the age distribution or age-specific participation ratios were uniform, age-standardization using a flat distribution would have no effect. In fact, if the age-specific participation ratios were uniform, standardization would be ineffective regardless of the reference age distribution.
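Both invariance properties can be checked numerically. The sketch below uses hypothetical age-specific ratios and cohort sizes; the helper function simply computes the average of the age-specific ratios weighted by a given age distribution.

```python
import math


def weighted_ratio(ratios, shares):
    """Average of age-specific ratios, weighted by an age distribution."""
    return sum(r * s for r, s in zip(ratios, shares)) / sum(shares)


flat = [1, 1, 1, 1, 1]             # flat standard
young = [120, 110, 100, 90, 80]    # hypothetical growing cohorts
old = [80, 90, 100, 110, 120]      # hypothetical shrinking cohorts

varying = [0.9, 0.9, 0.8, 0.7, 0.6]   # participation declines with age
uniform = [0.8] * 5                   # participation identical at all ages

# Uniform observed age distribution: flat standardization changes nothing.
same_when_age_uniform = math.isclose(
    weighted_ratio(varying, [100] * 5), weighted_ratio(varying, flat)
)

# Uniform participation: the result is the same under any age distribution.
same_when_ratio_uniform = all(
    math.isclose(weighted_ratio(uniform, dist), 0.8)
    for dist in (flat, young, old)
)

# Neither uniform: the unstandardized and standardized values differ.
differs_otherwise = not math.isclose(
    weighted_ratio(varying, young), weighted_ratio(varying, flat)
)
print(same_when_age_uniform, same_when_ratio_uniform, differs_otherwise)
```

Standardization therefore only matters when the age distribution and the age-specific ratios both vary over the age range considered.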
53While the age pattern might be expected to show steadily declining participation from the entry level, with improvements over entry cohorts reinforcing this pattern, reality is often more complex. While the example of boys in Delhi (Figure 3) shows this expected pattern, the example of girls in Andaman and Nicobar Islands shows another common pattern, which is more “hump-shaped”, with a peak in attendance around age 8 due to late enrolment, but also dwindling attendance at older ages.
Figure 3. School attendance ratios by age (left panel) and relative size of single-year age cohorts (right panel) for males in Delhi in 2004 and females on the Andaman and Nicobar Islands in 1983
Note : Straight lines in the right panel indicate a linear least-squares smoother.
54With respect to age, we observe larger birth cohorts at older ages in the case of Delhi males in 2004. In contrast, for the above example of females in Andaman and Nicobar Islands in 1983, the cohorts were larger at younger ages, indicating higher fertility. Average trend lines are also shown here to clarify the overall age patterns, as a fair amount of measurement error in the form of “age-heaping” in age reporting seems evident. This is the most likely explanation for the rather jagged appearance of the age distribution, as generally the even ages are more prevalent than is realistic, especially in the absence of any particular reason to believe fertility had fluctuated in the curious way suggested by the age reports.
55Obviously, if the reported age information is unreliable, transforming enrolment as a function of age will only lead to deterioration in data quality, and is not recommended. It would still be possible to base the standardization weights not on the shares of reported ages observed in the data, but on a known good estimate of the true age distribution, e.g. from other population statistics. This external data source would not need to contain any information on school participation. In doing so, some error could be introduced if the propensity to misreport age differs between the enrolled and the non-enrolled. However, to our knowledge, there are no studies on the relationship between enrolment and age misreporting.
56Intuitively, we can see that in the case of Delhi males, standardizing to a flat age distribution effectively reduces the weight of the low attendance ratio at higher ages, leading to an increase in the overall ratio. For the Andaman and Nicobar Islands, the relative weight of the higher ages with their low attendance is increased by standardizing to a flat age profile, leading to a lower attendance indicator.
57The differences between unadjusted and age-standardized figures for GAR are especially salient since the gender parity index (GPI) is derived from GER (or GAR, as here). If an age-standardized GER (GAR) substantially changes the conclusions about gross enrolment, then there will be similar differences between age-standardized GPI (ASGPI) and the commonly calculated GPI. The bias can actually be even greater if, for example, the ASGER (ASGAR) is in the positive range for one gender but negative for the other, thus increasing the disparity that would be observed if calculating GPI without age standardization.
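The amplification can be made concrete with a hypothetical example in which age-standardization lowers the female ratio by two percentage points and raises the male ratio by the same amount. The values are invented for illustration and do not correspond to any state in the data.

```python
def gpi(female_ratio, male_ratio):
    """Gender parity index: female ratio divided by male ratio."""
    return female_ratio / male_ratio


# Hypothetical unstandardized gross attendance ratios.
female_gar, male_gar = 0.80, 0.90

# Hypothetical age-standardized ratios: -2 points for females, +2 for males.
female_asgar, male_asgar = 0.78, 0.92

unstandardized_gpi = gpi(female_gar, male_gar)
standardized_gpi = gpi(female_asgar, male_asgar)
shift = unstandardized_gpi - standardized_gpi
print(f"GPI={unstandardized_gpi:.3f}  ASGPI={standardized_gpi:.3f}  shift={shift:.3f}")
```

In this sketch the parity index falls from about 0.889 to about 0.848, a shift of roughly four points, which is larger than the two-point change in either underlying ratio.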
58We focus here on changes in GPI over time, specifically over the period from 1983 to 2004, which are the earliest and latest observation years in our data. The spread in these changes over time, in both the customary and age-standardized indicators, is shown in Table 1. While the GPI dynamics at the secondary level are fairly robust to age composition effects, the same is not true at primary level. Comparing the time change in GPI and ASGPI reveals a nuanced picture. On the one hand, the overall range between the states with the smallest and largest improvements in gender parity points towards greater progress in the situation of females when age-standardization is performed, but on the other, the median improvement becomes smaller. Validating the policy-relevance of the standardization procedure, we note that in terms of unstandardized GPI change between 1983 and 2004, two clear outliers can be observed. The change in primary GPI in Delhi and Tripura over this time period is –17.8 and –17.1 points, respectively, according to the standard formula. This stands in contrast to the vast majority of states, which saw large increases in GPI. In fact, the next largest decrease is only –5.2 points. At the same time, these two outliers of Delhi and Tripura are precisely the ones where age standardization is found to have the largest effect on estimated GPI change over time. Taking age structure into account, the decline in ASGPI between 1983 and 2004 reduces to values of –6.4 and –1.6 points, respectively, which are much more in line with the pattern across all states.
Table 1. Maximal, median, and minimal values over Indian states of the change between 1983 and 2004 in the customary and age-standardized gender parity index (GPI and ASGPI), by school stage, in percentage points
III – Discussion and Conclusion
59The purpose of this investigation was not to challenge or dispute previously published enrolment figures for India. As mentioned above, UNESCO has already documented how disparate data sources (e.g. administrative versus household surveys) and collection at different time points may yield differences in reported age that result in over- or under-estimates (UNESCO Institute for Statistics, 2010). Our results are certain to differ from other sources, because we employed a different data set. Instead, we seek to provide an empirical demonstration of the sensitivity of education participation measures to age composition, and of the conditions under which they are affected. Given that less-developed countries are experiencing major changes in fertility rates, and that they are often in the midst of educational reforms or are characterized by school inefficiencies weighing on age-specific attendance, the issue of compositional effects is salient enough to warrant greater attention to how such effects may bias education metrics. These factors are especially important in analyses of enrolment growth over time within a single country. Further, countries with more inclusive age ranges (e.g. relatively longer primary periods) have non-comparable primary enrolment figures due to the age sensitivity of these metrics, something age standardization could also address.
60We acknowledge that in some cases the adjustment would yield a rather small difference – perhaps one or two percentage points in the case of NER or NAR – although we find GER/GAR to be somewhat more sensitive to these factors.
61As a case in point, after recent analyses of progress towards equity in primary education in Bangladesh, a publication claimed that gender disparities had been eliminated, primarily as a result of higher GER among girls than boys. However, as we have seen, the observed difference of five percentage points is well within the magnitude of the potential effect of age compositional differences and sex differences in age-specific school attendance ratios. Since these were not accounted for, we cannot be confident that the small apparent female advantage really exists. The authors further cite a higher GER in rural over urban Bangladesh, yet higher fertility in rural areas could easily have biased their finding of a three point difference (Chowdhury et al., 1999 ; Mushtaque et al., 2002).
62However, the significance of the size of the difference between customary and age-standardized indicators depends on the application. While a small magnitude may not drive general planning, the analysis shows that possible age distortions should be considered when specific policy conclusions are based on small changes in NER or GER (or one of the derived measures). What appears as a slight decline in non-standardized participation may in fact present as a slight increase in age-standardized participation, with qualitatively different policy implications.
63While we are sceptical of the value of indicator-based “rankings” in comparative education, we argue that when they are used, they should at least be accurate. We believe that age distortion may be sufficiently large in some cases to change the rank order of countries with similar levels of participation. Another concern is that for any kind of statistical inference using NER, GER or other participation measures either as an input or outcome, ignoring the age structure may well create a systematic bias, since the age composition, as an outcome of fertility change, is correlated with other predictors of general development.
64Even in the absence of explicit comparisons across time or place, with these metrics an implicit comparison is always being made – 100% indicates universal enrolment or perfect parity. Benchmarks are not set up in a vacuum, but based on what experience elsewhere has shown to be achievable. Whether age-standardization is relevant in principle when countries’ education indicators are compared to international benchmarks is open to debate. A comparative perspective would point towards benchmarking the age-standardized indicators. On the other hand, educational development benchmarks are not purely a “numbers game” : reaching an NER or GER close to 100% is about reducing the number of children who are – as a matter of fact – out-of-school. Counterfactuals concerning how many would be out-of-school if the age distribution were flat should not distract from this.
65In conclusion, countries face uneven demographic challenges in their pursuit of quantitative and equity goals in connection with educational participation. We can quantify this challenge by examining the extent to which apparent successes and failures as measured by common indicators can be attributed not to educational policy, but simply to age composition shifts. This example illustrates the under-explored potential for applying demographic techniques to education statistics and hopefully will serve as an incentive for further investigations in this direction.
Table A.1. Unadjusted and age-standardized gross attendance ratios (GAR and ASGAR) and gender parity indices (GPI and ASGPI) for joint primary+secondary school stages in the Indian states and territories in 1983*
* Figures are rounded, so the difference may diverge slightly at the level of one decimal point.
Note : States or territories for which any single gender/age group contained fewer than 20 observations have been omitted.
Table A.2. Unadjusted and age-standardized gross attendance ratios (GAR and ASGAR) and gender parity indices (GPI and ASGPI) for joint primary+secondary school stages in the Indian states and territories in 1993*
* Figures are rounded, so the difference may diverge slightly at the level of one decimal point.
Note : States or territories for which any single gender/age group contained fewer than 20 observations have been omitted.
Table A.3. Unadjusted and age-standardized gross attendance ratios (GAR and ASGAR) and gender parity indices (GPI and ASGPI) for joint primary+secondary school stages in the Indian states and territories in 2004*
* Figures are rounded, so the difference may diverge slightly at the level of one decimal point.
Note : States or territories for which any single gender/age group contained fewer than 20 observations have been omitted.
Notes
[*] Wittgenstein Centre for Demography and Global Human Capital (IIASA, ÖAW, WU).
[1] Detailed gross attendance ratios by state are tabulated in the Appendix Tables A.1, A.2 and A.3. Additional data for NAR are available from the authors on request.