# Measuring Working-Life Expectancies: Multistate Vector Regression Approach vs. Prevalence-Based Life Table Method

M Nurminen

###### Keywords

demographic aging, employment time, epidemiology, expectancy, health statistics, life table, multistate regression, stochastic process, working life

###### Citation

M Nurminen. *Measuring Working-Life Expectancies: Multistate Vector Regression Approach vs. Prevalence-Based Life Table Method*. The Internet Journal of Epidemiology. 2014 Volume 12 Number 1.

###### Abstract

Background. Demographic aging entails many adverse societal consequences. Extending working life has been proposed as a key measure for addressing the associated public health and socioeconomic problems. Measurement of the length of working careers is still not a standard practice.

Objective. This article reviews traditionally used and modern statistical techniques for the analysis of life tables with a view to measure accurately the duration of future employment time. It offers a reasoned discussion of the application of various estimates to calculate working-life expectancy.

Data and Methods. Because of the methodological nature of this review, the studies cited or referred to were not selected systematically; rather, they were included based on the author's subjective appraisal of their relevance, currency and high scientific quality. In particular, this article examines the advantages and limitations of two methods - Sullivan (1) and Davis et al. (2) - and compares their estimates contained in a population-based study that was designed to analyze aggregate sequential labor force survey data from Finland in 2000-2010.

Results. We provide cogent methodological arguments substantiated by empirical findings to evince the better performance of the preferred multistate vector regression approach over the commonly used period life table technique for measuring working-life expectancies in actuarial practice and epidemiological research.

Conclusions. The multistate modeling and estimation methodology presented makes possible a superior statistical analysis of stochastic processes in working life and deals with a real demographic problem with societal consequences encountered foremost in developed countries with aging populations.

### Introduction

Population aging confronts all developed countries and a variety of strategies have been undertaken to address it. In Finland (3), as in many other countries (4), extending the time spent in employment over the life course has been put forward as a key measure to adjust to the increase in the longevity of the population. However, measurement of the length of working careers is not easy and is not yet a standard practice. Thus it is important to apply accurate measures to quantify the duration of future employment time. This paper first refers to the definitions of alternative expectancy measures and analytically re-examines the practices of statistical methods employed for measuring working life or health expectancy. Second, it compares the estimates of working-life expectancies calculated using two competing approaches, the current or period life table technique (1) and the multistate vector regression approach (2). We then explain the reasons for discrepancies between the results, derived empirically from an application to a recent population-based study from Finland (5). Finally, although many of the issues raised here are specific to Finland, the questions concerning the development of time spent in employment and its measurement are especially relevant for several developed countries.

### Alternative expectancy measures

This section considers the attributes of working-life expectancy in comparison with some other approaches for measuring the length of working life, and recalls the definitions and historical trends of the alternative expectancy measures.

The estimation of the duration of future employment time is not a simple matter. In reviewing alternative employment activity measures, Hytti (6) discussed the relative advantages and limitations of the retirement exit age (i.e. the average age of withdrawal from the labor market) versus active-life expectancy. She pointed out that the labor market exit age reacts rapidly and in the correct direction to changes in the transitions to retirement. However, the exit age measure does this whilst ignoring the cumulative experience up to the present time. In comparison, the expectancy was said to react slowly to changes in labor market participation and in the usage of pension schemes. But the expectancy measure, which can be regarded as a far-sighted feature, is also influenced by the behavior of the studied population in the preceding years. Another advantage is that expectancy shows whether or not the development tends toward the targets set in official employment and pension policies.

For an individual at a particular age, working-life expectancy is the expected number of working years remaining in one's life (7). While this is a hypothetical construct that cannot be directly measured, it is an intuitive and broadly accessible concept. As such it can provide the means for summarizing and comparing the labor market status of surveyed populations as well as for monitoring the time trends in employment statistics.

This definition is interpreted statistically as the expected (average) value of the distribution of the length of working-life in the population, which is consistently estimated by the sample mean. It is the future duration or occupation time in the employed state, conditional on an individual's initial age. Note that 'future' time is ingrained in the concept of expectation. However, in the context of a birth cohort's follow-up it refers to the aging of its members within the range of a calendar year that was used as the population-time data base for the estimation of the multistate regression model (8). Therefore, the working-life expectancy should not be interpreted as a predicted value. Forecasting beyond the observed data base is treacherous, e.g., because the premises of the underlying regression model may change.

Summary measures for the working years can provide useful indicators for evaluating labor force potential and for evaluating the need for policy adjustments. Although the calculation of the working-life expectancy tables is complicated, their wide use demonstrates that they nevertheless are relevant and comprehensible.

Approaches to measure the length of working life vary according to what object exactly is being targeted and how it is formally estimated. Because of the abundance of early retirement in the past, discussion on extending working lives has focused largely on pension systems (9). Increasing life expectancy also suggests that a natural way to extend working lives is to push the final exit from working life further along the life course. However, over the past several years there has been a growing understanding in Finland and in other parts of Europe that it is useful to see the question of extending working lives in the context of the entire life span (10, 11, 12). The duration of working life is influenced not only by retirement but also by other age-related factors over the course of life, such as participation in education and child care, spells of unemployment or long term absence caused by sickness.

Measures of participation in working life over the life span may be founded on the experiences of actual cohorts. The time that cohort members spent in working life may be measured retrospectively by selecting persons alive at the end of the working-age years and calculating the time they have participated in working life over the course of their lives (13). Alternatively, an actual cohort may be followed prospectively from birth to beyond the working-age years, taking into account the mortality of the cohort. Both methods estimate the length of working lives of the cohort members in past circumstances in the history of the cohort. They do not, however, provide us with information on how the length of working life is evolving in time and under prevailing circumstances.

Expectancies aim to reveal the present state of a population. They estimate the average time spent alive in a defined state (student, employed, unemployed, disabled, retired, etc.) computed from age-specific probabilities or risks over a period of time. Partial life expectancies up to a given age were first used in health metrics to calculate the 'healthy life expectancy' of the total life expectancy as a response to the intriguing qualitative question of whether the increase in life expectancy adds to 'healthy' or 'un-healthy' years. Or, to put the question quantitatively: What share of the increase in life expectancy is found in a state of disability? For a demographic-epidemiologic application, see Davis et al. (14).

### Development of expectancy measures for working-life tables

To place the recently developed regression modeling methodologies using working-life tables in the general context of demographic life tables, we turn to the distinctions among the different approaches to the analysis of labor force participation.

Applications of working population health indicators (15) such as active life expectancy have been numerous (16, 17). They have been applied also for employment patterns by the Social Insurance Institution, Finland (10), and for retirement by the Finnish Centre for Pensions (18). Working-life applications of partial life expectancy have gained growing interest over the past few decades as an analogous issue of how participation in working life develops in relation to total life expectancy (6, 7). As a summary statistic of age-specific participation in working life, working-life expectancy measures the average length of working life remaining for an individual at a given point in time. It is usually gauged in units of expected years remaining in working-life that are meaningful to ordinary laypersons. The term 'partial life expectancy' refers to the average number of years remaining between exact ages x and z (where 15 ≤ x < z < 65, the limiting age) for persons alive at an exact age x. Summary measures of working-life years can provide useful indicators for evaluating labor force potential.
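In standard life table notation, the partial life expectancy just described can be written compactly. The symbols are assumptions of this sketch rather than notation taken from the cited studies: $l_a$ denotes the number of survivors at exact age $a$, and $L_a$ the person-years lived between exact ages $a$ and $a+1$.

$$
e_{x}^{(z)} \;=\; \frac{1}{l_x}\int_{x}^{z} l_a \, da \;\approx\; \frac{1}{l_x}\sum_{a=x}^{z-1} L_a , \qquad 0 \le e_{x}^{(z)} \le z - x .
$$

The upper bound $z - x$ is attained only if nobody dies between ages $x$ and $z$, which is why the partial life expectancy serves as a ceiling for any working-life expectancy over the same age span.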

A study for the EU Commission sought to investigate the working-life indicator which should complement the monitoring instruments of the European Strategy by focusing on the entire life cycle of active persons and persons in employment (11). The study recommended using the working-life expectancy as one of the core labor market indicators at European and national level. Recently the Employment Committee decided that the working-life expectancy will replace the average age of withdrawal from the labor force (or exit age) indicator. The expectancy indicator will be utilized for monitoring the European employment guidelines. The measure has already been included in the Joint Assessment Framework indicator package by experts who peruse the targets and trends in employment set for the EU's growth strategy.

Working-life expectancies have been applied using various definitions of working-life participation. An early study in the field measured active life time as opposed to the time spent in retirement (19). Several studies have defined participation in working life as equal to labor force participation (10, 11 [duration of active working life], 12 [labor market expectancy]). The problem with this definition is that time spent gainfully employed and time spent unemployed are merged. By pooling the employed and unemployed individuals these expectancies obscure the age-pattern of working-life participation because employment and unemployment are highly age-specific phenomena but in a mutually reversed manner. Some studies have, thus, defined the expectancy of labor market time as the expectancy of time spent employed (11, 12). Yet another way to measure the participation in working life is in terms of working time. This has been carried out by calculating the expected lifetime duration of working time in hours (11) or by dividing employment time into full-time and part-time employment (12), an action which would also include those working only part of the year due to the seasonal nature of their job. Thus, depending on a person's education, level of skill, and type of job, a non-trivial fraction of one's working life might be spent not working.

We prefer to define the working-life expectancy exclusively in terms of participation in gainful employment. However, to obtain an even clearer picture, we complement working-life expectancy with expectancies estimated for time spent in unemployment and for inactive time. By explicitly distinguishing the multiple relevant labor market positions ('states') and estimating their expectancies we aim to gain a better understanding of the interplay between the working and non-working life over the entire life span. While we recognize the significance of working time and, in particular, the role of part-time work in labor market policies aimed at lengthening life-time employment and adding flexibility to different phases of life, measuring working-time expectancy is beyond the scope of the present paper. In theory, the multistate vector regression approach could be applied to partially answer this question. This could be achieved by splitting the 'employed' state into several states based on an individual's usual number of weekly working hours during the observation period. For example, we could specify a model with the following state space: employed for at least 35 hours a week, employed for between 15 and 34 hours a week, employed for between one and 14 hours a week, unemployed, inactive.
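The extended state space suggested above could be encoded as follows. This is a minimal sketch: the names `LaborState` and `classify`, the status labels, and the treatment of fractional hours are all assumptions made for illustration and do not come from the cited studies.

```python
from enum import Enum


class LaborState(Enum):
    """Hypothetical five-state space: 'employed' split by usual weekly hours."""
    EMPLOYED_35_PLUS = "employed, at least 35 hours a week"
    EMPLOYED_15_TO_34 = "employed, 15 to 34 hours a week"
    EMPLOYED_1_TO_14 = "employed, 1 to 14 hours a week"
    UNEMPLOYED = "unemployed"
    INACTIVE = "inactive"


def classify(status: str, weekly_hours: float = 0.0) -> LaborState:
    """Map one survey record to a state; the status labels are assumed."""
    if status == "unemployed":
        return LaborState.UNEMPLOYED
    if status == "inactive":
        return LaborState.INACTIVE
    if status != "employed":
        raise ValueError(f"unknown status: {status!r}")
    # Assumption: fractional hours (e.g. 34.5) stay in the lower band
    # until the next threshold is reached.
    if weekly_hours >= 35:
        return LaborState.EMPLOYED_35_PLUS
    if weekly_hours >= 15:
        return LaborState.EMPLOYED_15_TO_34
    if weekly_hours >= 1:
        return LaborState.EMPLOYED_1_TO_14
    raise ValueError("an employed person needs at least one usual weekly hour")
```

With such a classification, each record contributes to exactly one state, so state-specific expectancies can later be added up without double counting.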

### Statistical modeling and estimating working-life expectancies

Working-life expectancy may be estimated using either marginal probabilities (prevalence rates) or transition probabilities (incidence rates). The actuarial prevalence-based working-life expectancy describes the prevailing work life participation rate in a cross-sectional sample of a population (1). This rate incorporates workers' past labor market experiences as these manifest in the age-specific frequencies of being occupied in a labor market state during a given short time period. Prevalence-based life tables, on the other hand, are based on external information on the proportion of a given health dimension, for instance disability, in each state. The Sullivan (1) method relies on widely available data (a period life table and the age-specific cross-sectional prevalence of disability). But it makes the 'stationarity' assumption, i.e. it assumes that the observed cross-sectional prevalence of disability/mortality is equal to that of the hypothetical ('synthetic') cohort comprised of data from actual cohorts that are present at different ages, in a specified year.
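A prevalence-based calculation of the Sullivan type can be sketched in a few lines. The function name, the one-year age intervals, the assumption that deaths are spread evenly within each interval, and the numbers in the example are all illustrative; none of them are taken from the Finnish study.

```python
def sullivan_expectancy(survivors, prevalence):
    """Prevalence-weighted expectancy at the first age in `survivors`.

    survivors  -- life table survivors at exact ages x, x+1, ..., z
    prevalence -- proportion in the state of interest (e.g. employed)
                  in each one-year interval [a, a+1), for a = x, ..., z-1
    """
    if len(survivors) != len(prevalence) + 1:
        raise ValueError("need one more survivor count than prevalence rates")
    # Person-years lived in each interval, assuming deaths occur evenly
    # over the year (a common simplification, assumed here).
    person_years = [
        (survivors[i] + survivors[i + 1]) / 2 for i in range(len(prevalence))
    ]
    # Weight the person-years by the cross-sectional prevalence and divide
    # by the number alive at the starting age.
    weighted = sum(L * p for L, p in zip(person_years, prevalence))
    return weighted / survivors[0]


# Made-up three-year illustration, not study data.
example_survivors = [100000, 99000, 98000, 97000]
example_prevalence = [0.5, 0.6, 0.7]
example_expectancy = sullivan_expectancy(example_survivors, example_prevalence)
```

Setting every prevalence to 1 returns the partial life expectancy over the same ages, which makes the role of the prevalence weights easy to see.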

Contemporary work treating marginal probabilities by use of the multistate vector regression modeling approach for estimating working-life expectancies was developed 30 years later in Davis et al. (2). Incidence-based expectancy is calculated from longitudinal cohort data that are required to estimate transitions among various states (5, 7).

While the ordinary current life table technique rests on a calculation of employment probabilities based on a period life table (1), we prefer to base the analysis on a cohort life table modeling approach (2). Indicators of the former type are limited when applied to intrinsically dynamic processes with multiple decrements, such as the labor force process. The life table calculated from prevalence rates cannot provide the occurrence/exposure rates in a continuous time frame. If labor force participation rates change over time, these trends are incorporated more accurately in the multistate life table method than in the prevalence-based technique.

Considering the advantages and limitations of the two comparative methods, our stand is that, while the period life table expectancy is the most common measure in a readily useable form, it is an estimate pertaining to a particular point in time. In contrast, regression-based cohort life table expectancy, which models and projects future labor force participation, is a more appropriate statistic for description and analysis of long-term behavioral and institutional conditions in national employment systems rather than short-term changes. For the latter purpose, period data on employment, activity rates, etc. should be used, while the multistate modeling approach provides weighted averages of the probabilities to be active over the whole lifetime that can be used for planning and policy development objectives.

Cohort life table expectancy can be theoretically based on large-sample, weighted least squares theory, and therefore allows stochastic data analysis and inference (inter alia, with respect to significance tests, interval estimates, interaction effects, time trends, and projections). The Markov property is not required for the estimation of the working-life expectancies using marginal probabilities (2), whereas the estimation that uses transition probabilities was done under the Markov assumption (20). The failure of the Markov condition would mean that the estimates are not statistically efficient but the method is still useable. When the Markov property does not hold, standard errors can be obtained using the method of Liang and Zeger (21). The multinomial regression approach is suited to the analysis of discrete-time aggregated data that are usually produced by official statistical agencies. Brunsdon and Smith (22) had used the same logistic transformation of the marginal probabilities (i.e. the logarithmic transformation of the ratios of probabilities) as Davis et al. (2). However, they used autoregressive integrated moving average modeling, rather than weighted least squares, to estimate the model parameters.
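The two ingredients named above, the log-ratio (logistic) transformation of the marginal probabilities and a weighted least squares fit, can be illustrated with a deliberately simplified sketch. It is not the estimator of Davis et al. (2), which treats the log-ratios of all states jointly as a vector with a full covariance matrix; here a single log-ratio is regressed on one covariate with scalar weights. All function names are hypothetical.

```python
import math


def log_ratios(probs, reference):
    """Log of each state's probability relative to the reference state."""
    return {
        state: math.log(p / probs[reference])
        for state, p in probs.items()
        if state != reference
    }


def inverse_log_ratios(etas, reference):
    """Recover state probabilities (summing to 1) from the log-ratios."""
    denom = 1.0 + sum(math.exp(eta) for eta in etas.values())
    probs = {state: math.exp(eta) / denom for state, eta in etas.items()}
    probs[reference] = 1.0 / denom
    return probs


def weighted_least_squares_line(x, y, w):
    """Fit y = a + b*x by minimizing sum of w_i * (y_i - a - b*x_i)**2."""
    total_w = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total_w
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total_w
    s_xx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    s_xy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope
```

In such a sketch the observed log-ratios at each age would play the role of `y`, age the role of `x`, and the weights would reflect how precisely each log-ratio is estimated; fitted values are turned back into probabilities with `inverse_log_ratios`.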

An important contribution to estimating work life expectancy was made in a methodological paper in the United States by Millimet et al. (23). In the US study, Bureau of Labor Statistics work life tables were subdivided simultaneously by a host of factors such as gender, race, and education, not only by a singular characteristic at a time. Pooling together multiple years of data, rather than using a single wave of the Current Population Survey, ensured that the estimates of working-life expectancy are not overly sensitive to the particular economic conditions that existed in the year the data were collected. The data-analytic approach was to apply an econometric model, instead of a simple relative frequency calculation. The modeling strategy allows one to draw greatly more information about persons' working-life behavior and also permits much more detailed working-life expectancy tables to be constructed. The US study was the first one to recognize explicitly the fact that because working-life tables are generated from survey data, sample variation may be significant.

The Millimet et al. (23) study resembles the research of Davis et al. (2) in many respects. The American model (23), like that of the Australian study (2), explicitly incorporated three labor force states: employed, unemployed and inactive (out of the labor force). However, Millimet et al. estimated their multinomial model on three subsets of data for the working-life states. Similarly Davis et al. use a logistic transform to estimate probabilities, but the estimation of the multistate model parameters is done for the three states together by weighted least squares for a large-sample form of vector regression equation.

The major difference and the novelty of the method of Davis et al. (2), compared to the related method of Millimet et al. (23), is that the former first proves the asymptotic normality of the empirical log-ratios. The next step is the estimation of the parameterized true log-ratios by way of generalized estimation equations. It is only possible to proceed in this way because the method deals with a large number of individuals. Millimet et al. did not exploit the large number of individuals and they used a standard package for maximizing the likelihood function. In a sense the method of Davis et al. (2) is not logistic regression since it ends up with weighted least squares as opposed to solving non-linear likelihood equations by Newton-Raphson or some other numerical device. That is why Davis et al. refer to their approach as a large-sample version of multiple regression.

Working-life expectancies are formally defined in terms of working-life table probabilities. Thus they have a direct probabilistic interpretation. The working-life expectancy measure may be expanded to multistate working-life expectancies by building it on the probability of occupancy in a given state amongst the multiple states of work ability or labor market activity (24). One can also construct a model for transition probabilities between labor or health states simultaneously and use the multistate life table (also known as increment-decrement life table) method (25, 26).

Davis et al.'s multistate vector regression approach has the distinct asset that it allows the summary to be multivariate. In other words, the parameter of the outcome state (i.e. the logarithm of the ratio of the probability of a given state to the probability of the referent state) is expressed in terms of multiple covariates of a temporal, spatial and socioeconomic nature. Given available data, the covariates may also represent alternative labor market or pension policies. An analysis of multistate working-life expectancies by using a multivariate regression model enables one to capture the joint impact of several simultaneously contributing causes of early retirement (e.g. via disability caused by work). Nurminen (5) utilized these attributes of the multistate regression method in order to gain a reliable picture of the multidimensional changes in Finnish working-life expectancies during the years 2000-2010.

Estimating working-life expectancies based on marginal probabilities for occupancy in labor states and setting prediction intervals around the estimates can also be done using the multistate regression modeling approach (5). Recently, Majer et al. (27) presented a theoretical framework for a multistate life table model that projects transition probabilities by the Lee-Carter (28) method, and illustrates how it can be used to forecast future health expectancy.

### Comparing multistate and prevalence-based life table methods

For illustration we present results from a national population-based study conducted in Finland (5) in which life table techniques were used for estimating expectancies from official statistics which are readily available.

The methodological interest in this paper focused on the appropriate usage of inferential tools for discrete time stochastic processes in practical research applications.

As already mentioned, the novel approach developed by Davis et al. (2) for estimating working-life expectancies differs from the traditional Sullivan (1) method in many fundamental facets. Although the advanced method uses data from the life tables and the annual Labour Force Surveys of Statistics Finland (29), it estimates the working-life expectancies jointly for multiple years throughout the entire study period. An alternative approach that has been applied to the analysis is to carry out separate estimations for a series of survey or census years and then fit a curve to describe trends, as was done in Hytti and Nio (9) in their monitoring of cross-sectional employment activity data over a number of years. The present analysis spanned eleven years (2000-2010) and a large number of individuals, so the results are not so sensitive to economic conditions as a survey that would rely on only a single year of data.

Nurminen (5) based the analysis of cohort or panel data on a large-sample regression model (Appendix A) fitted to a multistate life table (Appendix B), instead of a simple relative frequency calculation from the average demographic experiences of artificial cohorts (constructed using an arbitrary radix for the number of survivors at each given age). This stochastic inferential approach allows one to draw probabilistic inferences on several work-life characteristics and also permits much more detailed working-life tables to be estimated, e.g. stratified by socioeconomic factors. The study modeled the state probabilities as a function of polynomials of age and year, and Gross Domestic Product. The set of variates describing demographic and economic conditions faced by persons can be expanded, but not at will. This modeling approach enables one to circumvent the problem of small cell sizes encountered in a disaggregation of the data.

The multistate regression methods were developed to overcome the limitations of the traditional prevalence techniques. The states are defined to be multiple, some of which are transient (or recurrent) while others are assumed non-transient. In the quoted study (5), the customary life table was enhanced by explicitly defining a three-state employment state space: 1. 'employed' (permanently employed, employed for fixed-term, and self-employed); 2. 'unemployed'; 3. 'outside the labor force' (students, conscripts, disabled and old-age pensioners, etc.). State 4. 'dead' was taken as the reference state. In the context of this analysis, the sum of the working-life expectancy and the expectations of time spent in the states 'unemployed' and 'economically inactive' is equal to the partial life expectancy between exact ages 15 and 65 years, as normally defined by demographers. This decomposition is different to the two-state system which estimates the duration of 'active working life' by classifying persons as 'active' (in the labor force) or 'inactive' (out of the labor force) (9). The relative frequency approach relies on age group or other subgroup comparisons and produces estimates of the (marginal or transition) probabilities derived from average behaviors of the sample at each age. The tabular analysis of further disaggregated data (e.g. by allowing various modes of exit from the labor force) would necessarily turn out to be cumbersome or impossible without resorting to modeling. The regression analysis of panel or cohort data is applicable when the numbers are reasonably large.
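The additive decomposition mentioned above can be checked numerically. The sketch below assumes one-year age intervals in which the marginal probability of occupying each state is treated as constant, so that each probability contributes that fraction of a year; the probabilities themselves are made up and are not estimates from the Finnish study.

```python
def state_expectancies(occupancy):
    """Sum marginal state probabilities over one-year age intervals.

    occupancy -- list of dicts, one per age interval, giving the marginal
                 probability of each state (including 'dead') for a person
                 alive at the starting age; each dict must sum to 1.
    """
    totals = {}
    for row in occupancy:
        if abs(sum(row.values()) - 1.0) > 1e-9:
            raise ValueError("state probabilities must sum to 1")
        for state, p in row.items():
            totals[state] = totals.get(state, 0.0) + p
    return totals


# Made-up marginal probabilities for three consecutive ages.
occupancy = [
    {"employed": 0.30, "unemployed": 0.05, "outside": 0.649, "dead": 0.001},
    {"employed": 0.45, "unemployed": 0.08, "outside": 0.468, "dead": 0.002},
    {"employed": 0.60, "unemployed": 0.09, "outside": 0.307, "dead": 0.003},
]
expectancy = state_expectancies(occupancy)

# Partial life expectancy: expected years alive over the same intervals.
partial_life_expectancy = sum(1.0 - row["dead"] for row in occupancy)
living_total = (
    expectancy["employed"] + expectancy["unemployed"] + expectancy["outside"]
)
```

Because the three living states exhaust the ways of being alive, `living_total` and `partial_life_expectancy` agree up to rounding, which is the identity stated in the text.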

The results showed that the duration of working lives in Finland has lengthened for both genders in the 2000s (Table 1). For a 15-year-old male the expected length of working career up to age 64 in 2010 was 34.6 (95% confidence interval, 34.3-34.8) years, while for females it stood at 34.0 (33.6-34.4) years. The favorable trends are forecast to continue up to 2015 under the provision of economic equilibrium (5).

The working-life expectancy measure has been characterized as being sensitive to volatile labor market variations in a report of the working group for lengthening working careers (30). For example, in 2008 the Finnish working-life expectancy at age 15 was 34.6 years but it decreased due to the rapid decline in employment in the recession year 2009 by one whole year. Actually, this expectancy was computed using the traditional technique (1) on a year-by-year basis. The multistate vector regression (2) approach to expectancy, which is based on fitting a smooth model over the studied interval, herein 2000-2010, does not overestimate the effect of such changes on the total length of working careers (Figure 1). In this study (5), the model yielded the following estimates of male working-life expectancies for the years 2008, 2009, and 2010: 34.5, 34.2, 34.6. The drop from 2008 to 2009 was only 0.3 years and the counteractive rise from 2009 to 2010 was 0.4 years.
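The year-on-year changes quoted above follow directly from the three model-based estimates. The short sketch below reproduces that arithmetic; the helper name is hypothetical, and the values are those given in the text for males.

```python
# Model-based male working-life expectancies quoted in the text (years).
model_based = {2008: 34.5, 2009: 34.2, 2010: 34.6}


def year_on_year_changes(series):
    """Change from each year to the next, rounded to one decimal place."""
    years = sorted(series)
    return {
        later: round(series[later] - series[earlier], 1)
        for earlier, later in zip(years, years[1:])
    }


model_changes = year_on_year_changes(model_based)
```

The resulting drop of 0.3 years for 2009 can be set against the fall of one whole year reported for the year-by-year traditional calculation.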

Finally, because working-life tables are generated from survey data, sampling variation may be important (e.g. due to population dynamics, economic fluctuations, interview methods), especially in small samples. Although the Finnish official research institutes acknowledge this fact, they do not provide standard error estimates for their active working life expectancies (17). Under stationary conditions (i.e. independence of an initial health state), a new 'equilibrium' estimate of the prevalence rate and its approximate variance has been developed (31). In the Davis et al. (2) approach, standard errors (and covariances) can be found by using the delta method based on the loss function or alternatively estimated regression coefficients.
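The delta method mentioned above approximates the variance of a smooth function of estimated coefficients by a first-order expansion around the estimates. The following sketch uses a numerical gradient and is generic: the function names are hypothetical, and it is not the specific variance formula derived by Davis et al. (2).

```python
def numerical_gradient(g, theta, step=1e-6):
    """Central-difference gradient of a scalar function g at theta."""
    gradient = []
    for i in range(len(theta)):
        up = list(theta)
        down = list(theta)
        up[i] += step
        down[i] -= step
        gradient.append((g(up) - g(down)) / (2 * step))
    return gradient


def delta_method_variance(g, theta, cov, step=1e-6):
    """Approximate Var[g(theta_hat)] as grad' * cov * grad.

    theta -- estimated coefficients (list of floats)
    cov   -- their estimated covariance matrix (list of lists)
    """
    grad = numerical_gradient(g, theta, step)
    n = len(theta)
    return sum(grad[i] * cov[i][j] * grad[j] for i in range(n) for j in range(n))
```

The square root of the returned value gives an approximate standard error, from which a normal-theory confidence interval for an expectancy could be formed if the expectancy is expressed as a function `g` of the regression coefficients.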

We point out that standard errors, based on either the delta method or Monte Carlo simulation, only reflect the inaccuracy in model estimation from observed prevalence rates. These rate estimates have an inherent variance because they are obtained from a sample survey (rather than a census). That means that our standard error calculation does not incorporate the variance in the estimated rates.

To contrast results obtained with these two fundamentally different methods, Table 1 presents working-life expectancy estimates obtained for the 15-74 span of age for the years 2000-2010. The regression model-based estimates indicate lesser variation than the corresponding prevalence estimates for both genders. The current working-life table estimates show large fluctuations, especially for males around the economic upturn year 2008. Both series of estimates show lower figures for males in the economic recession year of 2009 compared to the neighboring years. The recession affected foremost men's employment. Preliminary statistics pointed to the fact that the recovery was delayed until 2011 among women. A finding is that the absolute difference between the two sets of expectancies has risen and then fallen over time.

Yet these estimates are parallel taking into consideration the basic methodological differences in the estimation approaches: viz. modeling versus tabulation. The two methods differ in that the multistate regression estimates are derived from cohort life table data for rates of change between the states (of being either (un)employed in the labor force or being outside it, or finally leaving the population through death), whereas the current life table estimates do not reflect dynamic changes but instead are computed directly from annual labor force participation rates during the period of sampling. Theoretically, the two methods should give identical results if the populations were stable and age-specific transitions did not alter from year-to-year. Current life tables formed in a recessionary period, during which labor force exits increase, present a bleak picture of working-life involvement. Conversely, those tables calculated during a subsequent period of recovery tend to exaggerate labor force attachment (32).

Uncertainty of the expectancy figures is inherent in any statistical estimation. Debating whether the actual working-life expectancy in a given year and a certain sub-population is exactly 35.5 or 35.2, or which method is somewhat less accurate, is missing the essence, unless one wants to look at a 'snapshot' rather than a trend to gain a wider perspective. Almost any period chosen for observation provides somewhat inaccurate (biased) working-life expectancies for at least some demographic groups. Nevertheless, the discrepancies between the working-life expectancies estimated using the two different methods are important and large, e.g., for year 2010: 2.0 y for males and 1.5 y for females in excess of the model-based estimates in lieu of the prevalence-based estimates.

### Discussion

Working-life expectancy is a period or cohort measure, depending on whether cross-sectional or longitudinal data are available. Usually the latter data are not readily accessible or are prohibitively expensive. Real people age and die as members of cohorts through successive periods subject to ever-changing rates. Period life expectancy is constructed as an entirely synthetic measure, referring to a fictitious cohort living its whole life according to the rates of a single period. Thus working-life expectancies, defined in this study abstractly through imaginary cohorts, can be paralleled with the experience of real cohorts, which makes them more tangible and easier to interpret. Davis et al. (2) remarked, "Sullivan's method contains both period and cohort considerations but predominantly the former. The cohort element arises typically from surveys designed to estimate prevalences of health [or labor force] states whereas the period component is due to the use of standard life tables. [p 1097]"

The major novelty of the multistate methodology lies in the way it reconstructs, from sequential cross-sectional surveys, the relevant elements of the longitudinal stochastic process that generated the working-life table datasets. This was made possible by estimating the marginal probabilities using a weighted least squares procedure under a multistate vector regression model, which in turn led to estimates of cohort expectancies. The goal of Davis et al.'s approach is not to reconstruct the full increment-decrement system, which they recognize cannot be recovered with their methodology, but to estimate unconditional health expectancies, as in the Sullivan method. They use successive cross-sections not to estimate inter-survey period conditions, but to track actual cohorts as they appear in these cross-sections. They then estimate corresponding cohort health expectancies, with the aim of recovering the multi-state system prevalent during the inter-survey period. Thus the Davis et al. approach is very different from the traditional Sullivan technique, which does not yield a cohort measure from cross-sectional data.

Mathers (33) has noted, "The problems with Sullivan's method arises not because it uses prevalence and mortality data averaged over all health states, but because the data it uses are dependent on past conditions in the population [p 190]". Therefore, the Sullivan method for calculating period expectancy does not produce a 'pure' cross-sectional indicator either, that is, one derived from the current prevalence rates and summarizing the experience of a population at a point in time. Consider the period expectancy computed from standard life tables, for example at the age of 25 for Finnish men in 2010. It is a pure cross-sectional indicator only in the sense that it gives the expectation of remaining employment time for persons who experience at each age of their life the risk of moving outside the workforce that was observed for Finnish men of that age in 2010.

There has been some debate in the literature about the magnitude of the bias in the Sullivan method; see Guillot and Yu (34) for more references. Certain authors have concluded on the basis of actual data that the differences between the observed and equilibrium proportions of healthy persons are significant, and that health expectancy estimates based on the Sullivan method should not be used for conclusions on compression or expansion of morbidity (15, 35). On the other hand, a simulation study indicated that Sullivan's method provides acceptable estimates of the period working-life expectancy if the changes in transition rates over a reasonably long term are smooth and fairly regular (36). In particular, the repeated application of the method can provide good estimates of trends in health and working-life expectancy. However, there is no guarantee that the assumption of the Sullivan method holds in real situations. Moreover, age-specific transition rates and conditional health expectancies cannot be estimated with the Sullivan method. But as Davis et al. (2) pointed out, "By definition the Sullivan method as described cannot supply these estimates [of cohort expectancies], except in so far that a period measure is a surrogate for the analogous cohort quantity [p 1099]."

As has been insightfully recognized by Goldstein and Wachter (37), period and cohort expectancies can be interpreted as a time-delayed (lagged) measure of the experience of real cohorts in populations undergoing steady demographic changes such as mortality improvement and aging. They showed the correspondence between period life expectancy at birth and the life expectancy of particular cohorts. Their approach was to look at the relationship between period and cohort life expectancy in terms of two measures: 1. the lag of years by which the equivalent period and cohort life expectancies are observed, and 2. the gap in life expectancy that a cohort gains by experiencing improving, rather than steady, mortality.

For countries with mortality rates close to those of present-day Finland, today's period life expectancy at birth summarizes the expected longevity of people born about 40 to 50 years ago, who are or would now be in the prime of middle age. Current life expectancy at age 65 matches expected survival beyond 65 for people who celebrated their 65th birthday some 15 years ago (cf. 38, 39).

The study presented here (5) found that the marked difference (gap) between the period (Sullivan method) and cohort (Davis et al. method) working-life expectancies at a given moment in time rose and shrank over the last decade (the 2000s). The likely reason for this is the volatility of market forces.

Compelling empirical evidence of this relationship was previously provided by the prospective follow-up study of an actual cohort of active Finnish municipal workers in three successive surveys. The data were analyzed both as aggregate cross-sectional data for estimating marginal probabilities (24) and as individually linked longitudinal discrete-time aggregate data for the estimation of transition probabilities (25) between four work/health states. The index expectancy state was defined as currently having, or continuing to have, excellent or good work ability. A comparison of the working-life expectancies gave the following estimates for a 45-year-old man up to the retirement age of 62.5 years in 1981: 7.3 years (transition probabilities) and 5.5 years (marginal probabilities). The gap favored, by 1.8 years, the working-life expectancy based on transition probabilities, calculated conditional on having been in the index state at the preceding time point.

### Conclusions

To summarize the differences between the applied regression estimates that simulate the cohort expectancies (2) and those obtained by the period table technique (1), we underscore two points. First, the Finnish study (5) used estimates of the multistate probabilities based on a parametric model rather than non-parametric working life table estimates, which do not take into account the decline in mortality over time. Second, the estimated state occupation times pertained to a particular birth (age) cohort rather than to a particular point in a given period of calendar time (year). We conclude that cohort measures are preferable to period measures because they are more relevant to persons now living and to planners of future health services.

### Acknowledgments

Dr. Brett A. Davis of the Australian Institute of Health and Welfare, Canberra, ACT, provided invaluable expert advice in the application of stochastic process analysis to working-life expectancies.

I am very grateful to Dr. Tuula Nurminen for perusing the final draft and for her most valuable suggestions which significantly improved the quality and clarity of the article.

The author is greatly indebted to Mr. Joseph Brady for his skillful English language revision of the original manuscript.

##### Table 1*

* Source: Nurminen (5)

§ Including the age group 65-74 years added 1.1 years for men and 0.8 years for women in 2010 to the working-life expectancy computed for the age interval 15-64 years.

\# In its strictest form, a cohort (working-) life table records the actual experience of a particular group of individuals (the worker cohort) from a specific age to (the final retirement from the labor force or) death. A period (working-) life table considers the experience of a given population during one short period of time, for example, the resident population of Finland in 2010.

##### Figure 1*

* Source: Nurminen (5)

§ The period life table estimates were computed using the Sullivan (1) method and the model-based estimates with the method of Davis et al. (2). The values for the current year 2011 are preliminary estimates.

### Appendix A

Details of modeling and estimation methods

These elements are extracted from the methods description in Nurminen et al. (3). A full explication of the stochastic modeling and inference is given in the thesis of Davis (40).

The interest here is in estimating the marginal probabilities and working-life expectancies that are not conditional on the initial state, but only on the initial age $x$ and gender. For $j = 1,2,3,4$ and $14 < x < 65$, let $\tilde{l}_j(x)$ be the random variable denoting the number of lives in state $j$ at age $x$, and let the vector of the frequencies be $\tilde{l}(x) = \{\tilde{l}_1(x), \tilde{l}_2(x), \tilde{l}_3(x), \tilde{l}_4(x)\}'$. We make the assumption of homogeneity in the sense that individuals in the same cohort independently follow the same probabilistic model.

Then define the expectations $l_j(x) = E[\tilde{l}_j(x)]$, $l(x) = E[\tilde{l}(x)]$, $j = 1,2,3,4$, and assume as in Davis et al. (2):

(a). The expectations $l_j(x) = n p_j(x)$, where $n$ is the number of lives in a hypothetical cohort.

(b). As $n$ tends to infinity, $n^{-1/2}\{\tilde{l}(x) - l(x)\}$ is asymptotically normally distributed with zero mean and covariance matrix of rank 3.

(c). Birth cohorts are stochastically independent and for each age $x$ the random vector $\tilde{l}(x)$, for large $n$, follows approximately a multinomial distribution with parameters $n$ and $p_j(x)$.

With state 4 (dead) as the reference, we form the partial log-odds, that is, the logarithm of the ratio of the probability of being in state $j$ at a given age $x$ to the probability of being in state 4:

$$\xi_j(x) = \log\frac{p_j(x)}{p_4(x)} = \log\frac{l_j(x)}{l_4(x)}, \quad j = 1,2,3, \tag{A.1}$$

which are estimated consistently by $\hat{\xi}_j(x) = \log\{\tilde{l}_j(x)/\tilde{l}_4(x)\}$.

Next let $\xi(x) = \{\xi_1(x), \xi_2(x), \xi_3(x)\}'$ be the vector of the log-ratios and $\hat{\xi}(x)$ be its estimator. It follows (2) that $n^{1/2}\{\hat{\xi}(x) - \xi(x)\}$ is asymptotically normally distributed with zero mean and covariance matrix

$$V(x) = D(x)^{-1} + \frac{1}{p_4(x)}\,\mathbf{1}\mathbf{1}', \tag{A.2}$$

where $D(x) = \mathrm{diag}\{p_1(x), p_2(x), p_3(x)\}$ and $\mathbf{1} = (1,1,1)'$.

Owing to the one-to-one correspondence between the marginal probabilities for all the states of the stochastic process at a given age $x$ and all the marginal log odds, one is able to estimate the $p_j(x)$ through the use of regression estimates for the $\xi_j(x)$.
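The one-to-one correspondence between the four state probabilities and the three log ratios can be illustrated with a short sketch. The function names and the probabilities are hypothetical; the only assumption carried over from the text is that state 4 (dead) serves as the reference state.

```python
import math


def log_ratios(p):
    """Map probabilities (p1, p2, p3, p4) to log ratios against state 4."""
    return [math.log(pj / p[3]) for pj in p[:3]]


def probabilities(xi):
    """Invert the transform: recover (p1, p2, p3, p4) from three log ratios."""
    # The four probabilities must sum to one, which fixes p4
    p4 = 1.0 / (1.0 + sum(math.exp(v) for v in xi))
    return [math.exp(v) * p4 for v in xi] + [p4]


p = [0.70, 0.08, 0.20, 0.02]  # hypothetical state probabilities at one age
xi = log_ratios(p)
p_back = probabilities(xi)    # agrees with p up to rounding error
```

Working on the log-ratio scale means that any fitted regression values map back to probabilities that are positive and sum to one.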

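Exploratory analysis can be used to suggest a parametric form for the log ratios, $\xi(x) = \xi(x;\beta)$, and the estimation of $\beta$ is done by weighted least squares. With the resulting estimate $\hat{\beta}$ of $\beta$ we have the estimates:

$$\hat{p}_4(x) = \Big[1 + \sum_{k=1}^{3}\exp\{\xi_k(x;\hat{\beta})\}\Big]^{-1}, \tag{A.3}$$

$$\hat{p}_j(x) = \hat{p}_4(x)\exp\{\xi_j(x;\hat{\beta})\}, \quad j = 1,2,3.$$

Thence the estimated working-life and related expectancies (for given age $z$) are

$$\hat{e}_j(z) = \int_z^{64}\hat{p}_j(x)\,dx, \quad j = 1,2,3,4, \tag{A.4}$$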

where the maximum age of retirement or exit is assumed to be 64 (or 74) years. These integrals can be evaluated using a discrete approximation but we applied the S-Plus function integ.spline (41), which integrates under a spline function through a set of points.
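The discrete approximation mentioned above can be sketched with the trapezoidal rule. This is not the spline integration that was used in the study; the function name, the age grid and the probabilities are hypothetical and serve only to show the mechanics.

```python
def expectancy(ages, probs):
    """Trapezoidal approximation to the integral of probs over ages."""
    total = 0.0
    for k in range(len(ages) - 1):
        width = ages[k + 1] - ages[k]
        # Area of the trapezoid between two adjacent ages
        total += 0.5 * width * (probs[k] + probs[k + 1])
    return total


ages = [60, 61, 62, 63, 64]
p_employed = [0.60, 0.50, 0.40, 0.30, 0.20]  # hypothetical probabilities
e_60 = expectancy(ages, p_employed)          # expected years employed, ages 60 to 64
```

A spline-based integration would differ from this only in how the curve between the tabulated ages is interpolated.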

If the vector of log ratios is modeled by $\xi(t,x) = \xi(t,x;\beta) = Z(t,x)'\beta$, with $Z(t,x)$ an appropriately chosen design matrix, then the loss function to be minimized with respect to $\beta$ is

$$Q(\beta) = \sum_{t}\sum_{x}\{\hat{\xi}(t,x) - Z(t,x)'\beta\}'\,V^{-1}(t,x)\,\{\hat{\xi}(t,x) - Z(t,x)'\beta\}. \tag{A.5}$$

Note incidentally that the inverse $V^{-1}$ is the covariance matrix of a multinomial distribution with probabilities $p_j$, $j = 1,2,3,4$.
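A weighted least squares fit of this kind can be sketched in a few lines. The sketch below assumes a quadratic loss with multinomial weight matrices, a single age index, and a linear-in-age model for each log ratio. These choices, the function names and the numbers are all hypothetical simplifications and do not reproduce the model fitted in the study; sample-size weighting of the cells is omitted.

```python
import math


def multinomial_weight(p):
    """Weight matrix diag(p) - p p' over states 1..3 (per-unit multinomial covariance)."""
    q = p[:3]
    return [[(q[j] if j == k else 0.0) - q[j] * q[k] for k in range(3)]
            for j in range(3)]


def design(x):
    """Z(x)' as a 3 x 6 matrix: an intercept and an age slope for each state."""
    rows = []
    for j in range(3):
        row = [0.0] * 6
        row[2 * j], row[2 * j + 1] = 1.0, float(x)
        rows.append(row)
    return rows


def solve(a, b):
    """Solve a @ beta = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[piv] = m[piv], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    beta = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][k] * beta[k] for k in range(r + 1, n))
        beta[r] = (m[r][n] - s) / m[r][r]
    return beta


def wls_fit(ages, xi_hat, weights):
    """Minimise the sum over ages of (xi_hat - Z'beta)' W (xi_hat - Z'beta)."""
    a = [[0.0] * 6 for _ in range(6)]
    b = [0.0] * 6
    for x, xi, w in zip(ages, xi_hat, weights):
        zt = design(x)
        for r in range(6):
            for j in range(3):
                for k in range(3):
                    zw = zt[j][r] * w[j][k]
                    b[r] += zw * xi[k]              # accumulates Z W xi_hat
                    for c in range(6):
                        a[r][c] += zw * zt[k][c]    # accumulates Z W Z'
    return solve(a, b)


# Hypothetical noise-free log ratios generated from known coefficients
ages = list(range(20, 30))
beta_true = [1.0, -0.02, -1.0, 0.01, 0.5, 0.03]
xi_hat, weights = [], []
for x in ages:
    xi = [beta_true[2 * j] + beta_true[2 * j + 1] * x for j in range(3)]
    p4 = 1.0 / (1.0 + sum(math.exp(v) for v in xi))
    p = [math.exp(v) * p4 for v in xi] + [p4]
    xi_hat.append(xi)
    weights.append(multinomial_weight(p))
beta_hat = wls_fit(ages, xi_hat, weights)
```

With noise-free input the fit returns the generating coefficients up to rounding error; with survey data the weights would be built from estimated probabilities.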

Owing to dependence along age cohorts, that is, along diagonals of the (t,x) plane with c = t − x constant, the variance-covariance matrix of the estimate of β was calculated by the method of Liang and Zeger (21). The generalized estimating equations derived by Davis et al. (2) can be solved for the regression coefficients, and the standard errors of the estimated probabilities can be obtained either analytically or by means of Monte Carlo simulation.

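Finally, for the estimation of expectancies conditional on having reached an age $z$ greater than 15, the conditional probabilities can be computed as $p_j(x)/\{1 - p_4(z)\}$, $x \ge z$. Hence the expectancy (up to age 64) of state $j$ for a person of initial age $z$ is

$$e_j(z) = \frac{1}{1 - p_4(z)}\int_z^{64} p_j(x)\,dx. \tag{A.6}$$

This is estimated consistently by substituting the $\hat{p}_j(x)$ of equation A.3.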
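As a rough numerical illustration of the conditional expectancy, the sketch below assumes that conditioning on having reached age z amounts to dividing the integrated marginal probabilities by the probability of being alive at z. The trapezoidal rule again stands in for spline integration, and the function name and all numbers are hypothetical.

```python
def conditional_expectancy(ages, p_state, p_dead):
    """Expectancy from ages[0] onward, given survival to ages[0]."""
    area = 0.0
    for k in range(len(ages) - 1):
        # Trapezoidal area under the marginal state probability
        area += 0.5 * (ages[k + 1] - ages[k]) * (p_state[k] + p_state[k + 1])
    # Rescale by the probability of being alive at the initial age
    return area / (1.0 - p_dead[0])


ages = [60, 61, 62, 63, 64]
p_employed = [0.54, 0.45, 0.36, 0.27, 0.18]  # hypothetical marginal probabilities
p_dead = [0.10, 0.11, 0.12, 0.13, 0.14]      # hypothetical probabilities of state 4
e_cond = conditional_expectancy(ages, p_employed, p_dead)
```

The rescaling always raises the expectancy relative to the marginal figure, because persons who died before the initial age no longer contribute zero years.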

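The second-order moments of the probabilities can be estimated using the delta method. That is, the covariance matrix of $\hat{p}(x) = \{\hat{p}_1(x), \ldots, \hat{p}_4(x)\}'$ is obtained from the following expression:

$$\widehat{\mathrm{Cov}}\{\hat{p}(x)\} = \Big\{\frac{\partial p(x)}{\partial \beta'}\Big\}\,\widehat{\mathrm{Cov}}(\hat{\beta})\,\Big\{\frac{\partial p(x)}{\partial \beta'}\Big\}', \tag{A.7}$$

with the derivatives evaluated at $\beta = \hat{\beta}$.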
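The delta method step can be illustrated on the log-ratio scale. The sketch below propagates a covariance matrix of the three log ratios to the four probabilities through the Jacobian of the inverse transform. For simplicity it differentiates with respect to the log ratios rather than the regression coefficients, and it takes the log-ratio covariance from the multinomial assumption with a hypothetical cohort size; the function names and numbers are illustrative only.

```python
def jacobian(p):
    """Derivatives of p_j (j = 1..4) with respect to xi_k (k = 1..3)."""
    rows = [[(p[j] if j == k else 0.0) - p[j] * p[k] for k in range(3)]
            for j in range(3)]
    # State 4 is the reference, so its probability falls as any xi_k rises
    rows.append([-p[3] * p[k] for k in range(3)])
    return rows


def delta_cov(p, cov_xi):
    """First-order covariance of the probabilities: J cov_xi J'."""
    jac = jacobian(p)
    return [[sum(jac[a][j] * cov_xi[j][k] * jac[b][k]
                 for j in range(3) for k in range(3))
             for b in range(4)] for a in range(4)]


p = [0.70, 0.08, 0.20, 0.02]  # hypothetical state probabilities
n = 1000                      # hypothetical cohort size
# Assumed log-ratio covariance under multinomial sampling
cov_xi = [[((1.0 / p[j]) if j == k else 0.0) / n + 1.0 / (p[3] * n)
           for k in range(3)] for j in range(3)]
cov_p = delta_cov(p, cov_xi)
```

Under these assumptions the result coincides with the multinomial covariance of the observed proportions, which offers a quick check on the derivative calculation.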

### Appendix B*

Sampling scheme for the analysis of the multistate working-life table. The table gives the numbers of men who were in state $j$ at age $x$ and year $t$, $l_j(x,t)$, $j = 1,2,3,4$, compiled from the Finnish Labour Force Survey annual files 2000 through 2010. Birth cohort of year 1991 is shown as an example.§

*Source: Nurminen (5)

§ Data vectors are of the form $L(x,t) = \{l_1(x,t), l_2(x,t), l_3(x,t), l_4(x,t)\}'$, where $l_1(x,t)$ = number of individuals employed at the time of the 2000 survey; $l_2(x,t)$ = number of individuals unemployed; $l_3(x,t)$ = number of individuals out of the labor force (students, conscripts, disabled persons, old-age pensioners, etc.); $l_4(x,t)$ = number of deceased persons. Data used in the estimation procedure for all eleven survey years were available for persons born in 1947-1996, that is, 50 age groups of 15 to 64 years in 2000. Since the computation procedure requires at least 5 observations per birth cohort along the diagonal of the $(x,t)$ matrix, the last estimable age group was 60 years. The frequencies of individuals, born in the same year, in the 4 states can be followed down the diagonals of the table, as shown for age 20 (birth cohort of year 1991 in boldface). Note that the total sizes of the cross-sectional samples vary. This is because the persons drawn into the samples were not individually linked. A cohort is defined as a closed population in the sense of being closed for exit. In this scheme, cohort is taken to mean a sample from a population born in the same year with turnover in membership in a given period of time.
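Following a birth cohort down a diagonal of the table can be sketched as a simple selection of the cells for which survey year minus age equals the birth year. The dictionary layout, the function name and the counts below are hypothetical and are not taken from the Labour Force Survey files.

```python
def cohort_diagonal(table, birth_year):
    """Collect the cells (x, t) with t - x equal to birth_year, ordered by t."""
    cells = [(t, x, counts) for (x, t), counts in table.items()
             if t - x == birth_year]
    return sorted(cells)


# Hypothetical counts per cell: (employed, unemployed, out of labor force, dead)
table = {
    (15, 2006): (2, 1, 95, 0),
    (16, 2006): (5, 2, 90, 0),
    (16, 2007): (6, 2, 93, 0),
    (17, 2008): (15, 4, 80, 1),
    (17, 2007): (12, 3, 84, 0),
}
cohort_1991 = cohort_diagonal(table, 1991)
```

Because the samples are not individually linked, the cell totals along a diagonal need not be equal, as in this toy example.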