# Estimating Conditional Working Life Expectancies from Aggregate Cohort Data

M Nurminen, C Heathcote, B Davis

###### Citation

M Nurminen, C Heathcote, B Davis. *Estimating Conditional Working Life Expectancies from Aggregate Cohort Data*. The Internet Journal of Epidemiology. 2004 Volume 1 Number 2.

###### Abstract

In this paper, we estimate transition probabilities between different health states that lead to the estimation of future occupation times in the work ability states conditional on an individual's initial state and age. We analyse data obtained from three longitudinal surveys on active Finnish municipal workers. We found that men permanently leave the work force due to disability or death earlier than women in all age groups, regardless of whether they commenced in a better or worse work ability state. Women tend to retire on old-age or similar pension before men, especially those women with an initially fair or poor capacity for work. The data suggest that the work ability of Finnish aging workers appears to deteriorate prematurely and that individuals leave employment before the statutory retirement age. Thus the work ability of employed persons should be followed already at sufficiently early ages when it is still possible to intervene in the process.

### Introduction

In this paper, we analyse a cohort whose members move between four health states: (1) having excellent or good work ability; (2) having fair or poor work ability; (3) on a disability pension or deceased; and (4) retired on an old-age or similar pension. The purpose of this study is to estimate occupation times in the four states, given initial state, sex, and age. Clearly the paper addresses a problem which is relevant to occupational health policy. The usefulness of this quantification becomes vital considering that in many developing countries populations are aging, and that the large age cohorts will in a few years retire and will require more intensive and more expensive health care. Thus it is important to know what are the average times that people continue working in a given health state. We differentiate nonworking life by defining disability and retirement as separate states. This is especially important in understanding the quality of life among aging persons and the inequities of access to retirement.

The data on work ability analysed in this paper originate from the surveys on aging and work ability carried out by the Finnish Institute of Occupational Health (FIOH) in 1981, 1985 and 1992. For these three sample years it was possible to observe what, in effect, is a fragmentary longitudinal survey giving the numbers of transitions between the four states. On the basis of this limited data set the problem is to estimate the expected times spent in each of these states by workers who were aged 45 to 51 years in 1981.

We used a non-homogeneous discrete-time Markov chain on the above four states to model the data, which consist of longitudinal counts of transitions at irregular time points. We then estimated the one-step transition probabilities of this process by a recently developed method of least squares interpolation. Interpolation is required to estimate the one-step probabilities because the data are available only at widely and irregularly spaced times. Given an individual's initial state of work ability at a specific age and year, we used estimates of the transition probabilities to estimate the expected future occupation times, called expectancies, of the different states and their standard errors. These expectancies are conditional on knowledge of the initial state and as such they are called conditional working life expectancies. Details of the estimation of transition probabilities can be found in Davis et al._{1} and a brief description is given in the Appendix. Similar methods for treating marginal cross-sectional data are described in Davis et al._{2} The most important difference from methods that are better known in demography_{3},_{4} is that we interpolate in the time dimension and age dimension simultaneously. Polynomial interpolation in the time dimension_{5} and in the age dimension (graduation) are well known. But fitting a two-dimensional polynomial with interaction terms to the Lexis surface is, to the best of our knowledge, a novel approach. For a similar approach in the field of mortality studies, see Heathcote and Higgins._{6} A related paper is Millimet et al._{7} Existing methods for analysing multistate (increment-decrement) life-table data have been described comprehensively in the demographic literature_{8},_{9},_{10} and applied to study the association between economic activity and health. Thus, these methods are well known among economists, but less familiar among epidemiologists. For economists, the principal obstacle to labor force participation is often illness, and for epidemiologists, disability is often revealed by the inability to work normally.

The data come from a longitudinal survey of Finnish municipal workers who are required to leave the work force on or before their 63^{rd} birthday. States (1) and (2), describing levels of work ability, are non-absorbing, whereas states (3) and (4) are absorbing in the sense that no return from them to the workforce is possible. The maximum working age constraint means that the comparison of expected occupation times for the absorbing states has direct inferential implications for the propensity of workers to be affected by events which cause their working lives to cease. Consider the subpopulation of all individuals of a specific age and work ability state at a particular point in time. Due to the imposition of a maximum working age, all such individuals will be observed for the same time period which, for each individual, consists of time in active work followed by transition into one of the absorbing states. Hence, comparing expected occupation times for the absorbing states is equivalent to comparing the effects of causes of work life termination. That is, an absorbing state with a long expected occupation time corresponds to a cause of early exit from the work force.

This paper is organized as follows. In the next section, we describe the longitudinal surveys of aging Finnish municipal workers. The following two sections compare the observed and modelled multi-step transition probabilities and give expressions for working life expectancies. The Results section presents estimates of future occupation times and discusses the statistical significance of their differences. The final section closes with remarks on the analysis of aggregate data and the significance of the empirical results for the prevention of disability and premature retirement. The Appendix gives some technical details.

### Data and methods

### The Finnish work ability surveys

The determination of work ability and its level relied on the self-assessment of workers who replied to a questionnaire from the FIOH in 1981, 1985 or 1992. The nonresponders were persons who either did not return the questionnaire or did not answer the questions regarding work ability, but who neither received pension nor were deceased. The nonresponders may include unemployed persons, though unemployment in the municipal sector was rare before the depression started in Finland in 1992. The nonresponse rate was only 5% of the subjects enrolled in 1981, and it was ignored in our analysis, as it was judged to make little difference. Tuomi_{11} and Tuomi et al._{12},_{13} give a detailed description and discussion of the study. An analysis of the cross-sectional distributions of the cohort population in the different work ability states will be reported elsewhere.

For the purposes of this paper, it suffices to note the following points. The cohort was assembled through a nationally representative sample of 6 257 active municipal employees, aged 45 to 58 years in 1981. As the initial study commenced in 1981 when all the cohort members participated actively in the work force, we obtained the cohort data at three time points. At the time of the second survey, the subjects were divided into groups of active workers (including persons who did not respond to a questionnaire but who had not retired), ex-workers (on disability or old-age pension), and deceased persons. In the third survey, when the subjects were 55-69 years old, 23% of them were still actively working, 30% were on disability pension, 41% were retired on old-age pension, and 6% had died. The actual retirement age is about 59 years on average in Finland, despite the common official retirement ages of 63 or 65 years. At present the statutory retirement age for municipal workers varies and is between 63-65 years.

The work ability of a worker is classified as excellent, good, fair or poor depending on a work ability index score; for details on the construction of this index, see Tuomi et al._{14} Individuals may permanently leave the work force due to disability, retirement or death. Disability pensioners are here defined as persons with a medically diagnosed disabling disease and persons aged 60-64 years with a permanently weakened work ability which prevents them from working for wages. Retired persons are ex-workers who have permanently left the work force for reasons other than disability or death; that is, persons on old-age, unemployment, early-retirement or veteran's pension. Since the frequency of death in the observed age range is small, we have combined death and disability into one state. Hence, at any particular age, an individual is classified as belonging to one of the following four mutually exclusive and exhaustive states: having excellent or good work ability (state 1), having fair or poor work ability (state 2), on a disability pension or deceased (state 3), retired on an old-age or similar pension (state 4). States 3 and 4 are assumed to be absorbing (the few transitions from retirement to death were ignored); states 1 and 2 are clearly transient (i.e., non-absorbing).

The data used are the observed numbers of transitions between these states during the periods 1981 to 1985 and 1985 to 1992. These frequencies were available for the male and female birth cohorts aged 45-51 years in 1981, so there are 14 data points for each gender (seven cohorts observed over two periods). Table 1a shows the observed transition frequencies for the female cohorts aged 45 and 46 years in 1981; the corresponding data for men are shown in Table 1b. Note that occupation times in each state are not known, in spite of the longitudinal design.

### Parameter estimates

Let p_{ij} (x_{r},x_{r+1}) be the conditional multi-step probability that an individual is in state j at age x_{r+1} given that he/she was in state i at age x_{r}. Then for time interval (x_{r},x_{r+1}) and transient state i, consider the log partial odds Θ_{ij} (x_{r},x_{r+1}) of transition to state j ≠ i defined by

$$\Theta_{ij}(x_r, x_{r+1}) = \log\left\{ \frac{p_{ij}(x_r, x_{r+1})}{p_{ii}(x_r, x_{r+1})} \right\} \tag{1}$$

The use of log partial odds is convenient as it gives estimates of the transition probabilities through the logistic formulation. Suppose further that the Θ_{ij} (x_{r},x_{r+1}) are parametrised as Θ_{ij} (β,x_{r},x_{r+1}), where β is the parameter vector of the one-step log partial odds Θ_{ij} (x) = log{p_{ij} (x,x + 1)/p_{ii} (x,x + 1)}. Inverting the one-to-one correspondence in equation (1) yields the following logistic formulation of the transition probabilities

$$p_{ij}(\beta; x_r, x_{r+1}) = \frac{\exp\{\Theta_{ij}(\beta, x_r, x_{r+1})\}}{1 + \sum_{k \neq i} \exp\{\Theta_{ik}(\beta, x_r, x_{r+1})\}}, \quad j \neq i, \qquad p_{ii}(\beta; x_r, x_{r+1}) = \frac{1}{1 + \sum_{k \neq i} \exp\{\Theta_{ik}(\beta, x_r, x_{r+1})\}} \tag{2}$$
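As a numerical sketch of this inversion, the Python function below treats the origin state i as the reference category (log partial odds 0), so that the probabilities in a row are the exponentiated log partial odds divided by one plus their sum. The function name `transition_row` and the values in `theta_1` are illustrative assumptions, not estimates from the Finnish data.

```python
import math

def transition_row(theta, i):
    """Turn log partial odds for origin state i into a probability row.

    theta maps each destination j != i to log(p_ij / p_ii); the origin
    state i is the reference category with log partial odds 0.
    """
    denom = 1.0 + sum(math.exp(v) for v in theta.values())
    row = {j: math.exp(v) / denom for j, v in theta.items()}
    row[i] = 1.0 / denom  # probability of remaining in state i
    return row

# Hypothetical log partial odds out of state 1 (illustrative values only).
theta_1 = {2: -1.2, 3: -3.0, 4: -4.5}
row_1 = transition_row(theta_1, i=1)
```

By construction the entries of `row_1` sum to one, and taking `log(row_1[j] / row_1[1])` returns the log partial odds that were supplied.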

Working life expectancies are given by formula (3) below.

In the case of the Finnish work ability data, the regressors for the one-step log partial odds are age x and year t of transition. For each Θ_{ij} (β_{ij} ,(x,t),(x + 1, t + 1)) , it is necessary to determine the form of the most suitable polynomial model. To do this we generated a pseudo-dataset consisting of observed transition frequencies obtained using spline interpolation to a unit time grid in both age and year. In each case, the inspection and exploratory analysis of the plots of the observed partial odds, Θ̃_{ij} (β_{ij} ,(x,t),(x + 1, t + 1)) , for the pseudo data were used to determine the most appropriate model. (Superimposing a tilde on a letter indicates that it is a random variable with the same letter without a tilde being its expectation.) Note that the logistic regression method has the advantage that the form of the model for each of the observed partial odds need not be fixed a priori but can be determined by the pattern of the data. Although there are other possible choices, such as the complementary log-log function, the logistic function is more readily interpretable. The regression models can include indicator variables for factors such as socio-economic status when appropriate data are available.
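To make the role of the regressors concrete, the sketch below evaluates a one-step log partial odds as a polynomial in age and year with an interaction term. The particular form (linear in age and year plus their product), the centring of age at 45 and year at 1981, and the coefficients in `beta_13` are all assumptions chosen for illustration; the fitted forms are those reported in Table 2.

```python
def log_partial_odds(beta, x, t):
    """Evaluate an assumed polynomial form in age x and year t.

    beta = (b0, b1, b2, b3): intercept, age, year, age-by-year interaction.
    """
    b0, b1, b2, b3 = beta
    return b0 + b1 * x + b2 * t + b3 * x * t

# Hypothetical coefficients; x = age - 45 and t = year - 1981 (assumed centring).
beta_13 = (-4.0, 0.15, 0.05, 0.01)
theta_start = log_partial_odds(beta_13, x=0, t=0)  # age 45 in 1981
theta_later = log_partial_odds(beta_13, x=5, t=5)  # age 50 in 1986
```

Following a cohort means that x and t increase together, which is why the pair (x + 1, t + 1) appears in the notation for the one-step log partial odds.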

The parameter estimates and their standard errors, computed in S-Plus using the method outlined in the Appendix, are given in Table 2. A polynomial in age, and interactions with year, improved the quality of the model fit. The observed p̃_{ij} ((x_{r},t_{r}),(x_{r + 1}, t_{r + 1})) and the results of the weighted least squares estimation, the p_{ij} (β*;(x_{r},t_{r}),(x_{r + 1}, t_{r + 1})), are plotted for women in fig. 1, fig. 2 and fig. 3. The corresponding plots for men are displayed in fig. 4, fig. 5 and fig. 6. Note that in these figures the fitted multi-step transition probabilities, the p_{ij} (β*; x_{r},x_{r+1}), are obtained by multiplying together the matrices of the appropriate estimated one-step transition probabilities.

##### Table 2

##### Figure 1

##### Figure 2

##### Figure 3

##### Figure 4

##### Figure 5

##### Figure 6

The figures also show the 95% confidence intervals for the p_{ij} (β*; x_{r},x_{r + 1}) based on standard errors obtained by the delta method.
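The fitted multi-step probabilities shown in the figures come from chaining one-step transitions. A minimal Python sketch of that step is given below for a four-state chain in which states 3 and 4 are absorbing. The helper names and all numerical values in `steps` are hypothetical and serve only to show the mechanics for a four-year interval such as 1981 to 1985.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def multi_step(one_step):
    """Multiply one-step matrices P(x, x+1), P(x+1, x+2), ... in age order."""
    n = len(one_step[0])
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for p in one_step:
        result = mat_mul(result, p)
    return result

def one_step_matrix(p12, p13, p14, p21, p23, p24):
    """Build a 4-state one-step matrix; states 3 and 4 are absorbing."""
    return [
        [1 - p12 - p13 - p14, p12, p13, p14],
        [p21, 1 - p21 - p23 - p24, p23, p24],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]

# Hypothetical one-step probabilities for four consecutive ages.
steps = [
    one_step_matrix(0.10, 0.010, 0.005, 0.05, 0.030, 0.010),
    one_step_matrix(0.10, 0.012, 0.006, 0.05, 0.034, 0.012),
    one_step_matrix(0.11, 0.014, 0.008, 0.05, 0.038, 0.015),
    one_step_matrix(0.11, 0.016, 0.010, 0.04, 0.042, 0.018),
]
p_four_year = multi_step(steps)
```

Because the chain is non-homogeneous, the matrices differ from age to age and the order of multiplication matters.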

### Working life expectancies

Consider a health surveillance program conducted for a population of workers who are known to permanently leave the work force on or before reaching a maximum working age w. For example, Finnish municipal workers can remain in the work force until their 63^{rd} birthday at the latest; that is, w is here taken to equal 62 years. Recall that an individual's reason for removal from the Finnish work force is classified into one of the following two categories: disability and death (state 3), or retirement (state 4).

For an individual initially in state i at age x in year t (both age and year fixed), the expected future occupation time in state j is the area under p_{ij} ((x,t),(y,t + y - x)) as y varies between x and the maximum working age w. A standard approximation to this area (cf. _{15}, Part 2) is

Equation (3) is taken as the conditional working life expectancy of state j given initial state i at age x in year t. Replacing p_{ij} ((x,t),(y,t + y - x)) with its estimate p_{ij} (β*;(x,t),(y,t + y - x)) in (3) gives an estimate ê_{ij} (x,t) of this occupation time.

The results discussed always have the initial year fixed at 1981 (the year of the first survey) so, for ease of notation, we will henceforth suppress the argument t. Under this convention, (3) can be rewritten as follows:

Consider a worker initially aged 45 years; this individual has at most between 17 and 18 years of remaining work life (before reaching his/her 63^{rd} birthday). Under the discrete approximation in (4), for a given initial state, the conditional working life expectancies sum to 17.5. In general, these expectancies sum to (62 - x + ½) years for a worker initially aged x years.
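The stated total can be checked with a short Python sketch. It assumes a discretisation that gives half weight to the starting age x and unit weight to each later age up to w; this is one form consistent with the total of (62 - x + ½) years, and it is used here as an assumption rather than as the exact expression of the approximation. The function `step_matrix` and its numbers are hypothetical.

```python
def step_matrix(age):
    """Hypothetical one-step matrix from `age` to `age + 1` (illustrative only)."""
    d = 0.01 + 0.002 * (age - 45)   # exits to state 3 grow with age
    r = 0.005 + 0.004 * (age - 45)  # exits to state 4 grow with age
    return [
        [0.90 - d - r, 0.10, d, r],
        [0.05, 0.95 - 2 * d - r, 2 * d, r],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]

def expectancies(x, w=62):
    """Return e[i][j]: half weight at age x, unit weight at ages x+1..w."""
    n = 4
    current = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    e = [[0.5 * current[i][j] for j in range(n)] for i in range(n)]
    for age in range(x, w):
        step = step_matrix(age)
        # Advance the multi-step probabilities by one year of age.
        current = [[sum(current[i][k] * step[k][j] for k in range(n))
                    for j in range(n)] for i in range(n)]
        for i in range(n):
            for j in range(n):
                e[i][j] += current[i][j]
    return e

e45 = expectancies(45)
total_state1 = sum(e45[0])  # sum over j of e_1j(45)
total_state2 = sum(e45[1])  # sum over j of e_2j(45)
```

Since each row of multi-step probabilities sums to one, both totals equal 17.5 whatever one-step probabilities are supplied; only the split of those years across the four states depends on the model.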

### Results

Table 3a gives values of ê_{ij} (x), with i = 1,2 and j =1,2,3,4, for the female cohorts aged x = 45,46,...,55 years in 1981 (standard errors are given in parentheses). The same information for the corresponding male cohorts is given in Table 3b.

##### Table 3a

##### Table 3b

The standard errors permit tests of the statistical significance of the difference in estimated occupation time between women and men for a particular initial age in 1981. For example, ê_{11}(45) is 7.73 years for women and 7.28 years for men. Since the standard errors on these estimates are 0.29 and 0.24 years, the standard error of the difference is [(0.29)^{2} + (0.24)^{2}]^{0.5} = 0.38 years. Hence, the 95% confidence interval for this difference is (-0.31, 1.21) years. Therefore, for persons aged 45 years in 1981 initially in state 1, we infer that the gender difference in future occupation time of this state is not statistically significant. In contrast, for age 52 years, the corresponding 95% confidence interval is (0.11, 1.71) years, indicating a significant difference in favour of women. Similarly, for initial ages x = 45,46, ... ,55 years in 1981, two-tailed hypothesis tests (with a significance level of 5%) for equality between the sexes in ê_{ij} (x) were carried out for i = 1,2 and j = 1,2,3,4. The results are summarized in Table 4.
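The arithmetic of this comparison can be retraced in a few lines of Python. The sketch treats the female and male estimates as independent and reproduces the quoted interval by rounding the standard error to 0.38 and using a multiplier of 2; both the rounding and the multiplier (rather than 1.96) are inferred from the quoted numbers and are assumptions here.

```python
import math

def difference_se(se_a, se_b):
    """Standard error of the difference of two independent estimates."""
    return math.sqrt(se_a ** 2 + se_b ** 2)

# Estimates quoted in the text for initial state 1 at age 45.
women, se_women = 7.73, 0.29
men, se_men = 7.28, 0.24

diff = women - men
se_diff = difference_se(se_women, se_men)
se_rounded = round(se_diff, 2)  # 0.38, as quoted in the text

# Approximate 95% interval: difference plus or minus 2 standard errors (assumed).
low = diff - 2 * se_rounded
high = diff + 2 * se_rounded
significant = not (low <= 0.0 <= high)
```

The interval contains zero, so `significant` is `False`, in line with the conclusion drawn for age 45.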

##### Table 4

Occupation times for the absorbing states are of particular interest because, under the assumption of a maximum working age, these estimates provide information on the timing and cause of an individual's permanent removal from the work force. For example, consider the female cohort aged 45 years in 1981. For state 3, the estimated future occupation time is 3.0 years for individuals initially in excellent or good health, and 5.3 years for those initially in fair or poor health. Hence, an individual in state 2 is estimated to be affected by disability or death 2.3 years sooner than an individual in state 1. Table 5a shows values of the differences ê_{13}(x) - ê_{23}(x) and ê_{14}(x) - ê_{24}(x), together with 95% confidence intervals and Student's t-statistics, for the female cohorts aged 45-55 years in 1981. The same information for the corresponding male cohorts is given in Table 5b.

##### Table 5a

##### Table 5b

For all female cohorts aged between 45 and 55 years in 1981, individuals initially in state 2 affected by disability or death are estimated to leave the work force earlier than those similarly affected but with initial state 1. Considering state 4 we observe that, for initial ages 45-51, those commencing in the better work ability state retire before individuals commencing in the worse work ability state, and conversely for ages 52-55 years. Note that the differences in occupation times for state 3 are statistically significant for all initial ages, and for state 4 the differences are significant at all ages except 51 and 52 years.

Estimates from all the corresponding male cohorts indicate that, for those affected by disability or death, individuals initially in state 2 have significantly shorter work lives than individuals initially in state 1. Also, in the case of all the male cohorts studied, persons initially in state 1 retire before individuals initially in state 2 with significant differences occurring at initial ages 48-52 years.

For a given initial work ability state, age and year, we can also compare the expected future work life of individuals leaving the work force into state 3 with individuals leaving into state 4. Table 6a shows values of the differences ê_{13}(x) - ê_{14}(x) and ê_{23}(x) - ê_{24}(x), together with 95% confidence intervals and Student's t-statistics, for the female cohorts aged 45-55 years in 1981. The same information for the corresponding male cohorts is given in Table 6b.

##### Table 6a

##### Table 6b

Consider initial state 1. Women aged 50-55 years in 1981 leaving to state 3 are estimated to have significantly longer work lives than those who leave to state 4. For men this is the case for initial ages 53-55 years, and significant differences in the opposite direction are estimated for ages 45-49 years.

In the case of initial state 2, women leaving to state 3 are estimated to have significantly shorter work lives than those leaving to state 4 for initial ages 45-51 years and significantly longer working lives for initial ages 54 and 55 years. For males, in this comparison, individuals in all cohorts aged 45-55 years in 1981 are estimated to have significantly shorter work lives when leaving to state 3.

### Concluding remarks and summary

### Methodologic issues

We applied the methodology of Davis et al._{1} for aggregate longitudinal data with the objective of describing transitions between health states in terms of a regression model for covariates, while taking into account the correlation among repeated observations for an individual. In this case, the regression coefficients have implications for the population rather than for an individual, and hence the concept of a 'population-averaged' model can be applied._{16} This approach has two distinctive features. The first is that the population average response is the focus, and the second is that subject heterogeneity is not explicitly modeled. This type of modeling is effectively applied in epidemiologic studies such as those conducted on occupational health. Here the focus is the difference in the outcome rate between two groups (e.g., persons with work ability status 1 vs. 2) with different risk factors, rather than the changes in an individual's probability of a certain endpoint over time as a function of covariates.

Often in the case of longitudinal data, for some observations we may not know the exact time of the event at issue but only that it occurred within a certain time interval; such data are known as 'interval-censored' data. We tackled the problem of incomplete time-to-event information by modeling missing observations as a smooth regression surface of the calendar year and participants' age. The adopted estimation and interpolation procedure is similar to the practice of using a two-dimensional projection-pursuit regression surface,_{17} but is superior to the simpler method of using a polynomial interpolation in the time dimension because it is also based on participants' age. The importance of smoothing simultaneously on the age dimension is that, because of the underlying 'censoring' process, the true observations typically are conditional on participants' age.

Use of the log partial odds and logistic parametrisation has the advantage that it facilitates the joint estimation of the transition probabilities. By contrast, Tuomi et al._{13} estimated the probabilities leading to, for example, the state of disability directly, by first excluding persons who had retired or died. Because the exclusions were of different magnitude for women and men at different ages, and because disability is associated with mortality, the obtained estimates of the sex-specific probabilities are not comparable and do not characterize the underlying counting process.

There are alternative approaches to log partial odds regression modelling, which make use of aggregate data. Time series can be constructed from a sequence of cross-sectional surveys. In autoregressive models, the series are generally assumed to be time-dependent. However, it is not obvious that the autoregression methods have offsetting advantages over simpler model-building procedures, which would compensate for being statistically involved. The autoregressive methods comprise complex estimation processes that demand computer-intensive tasks. The quasi-likelihood approaches for 'observation-driven'_{18} and 'parameter-driven'_{19} models for time series are examples.

The multi-state (multiple 'increment-decrement') models used in demography have their own limitations._{8},_{9},_{10} These models are not generally applicable when the number of non-absorbing states in the model is greater than two. Also, the methods do not permit the direct parametric modeling of the age dependence of transition probabilities. Furthermore, formulae for the standard errors of the working life expectancy estimates need development.

### Empirical findings

The Finnish data pertain to a period of time 10 to 20 years ago when the official retirement age from some municipal occupations was as low as 55 years, which is several years lower than today. Although the average retirement age in Finland rose from 1995 to 2000 by two years, from 57 to 59 years, premature retirement has been very frequent in Finland compared to most other countries, and it can entail far-reaching socioeconomic consequences. Mature age workers have the potential to make a continuing social and economic contribution to Finland. The extent of this contribution will depend upon the ability of mature age workers to continue to work, and their interest in doing so. It will also be influenced by the changing nature of work and the attitudes of employers towards the value of mature age workers.

In Finland, it has not been especially difficult to retire on a disability pension. In recent years, about 80% of all the applications have been accepted. Compensated work disability pension is by law always based on a medically diagnosed disability caused by disease, deficiency or disablement. Perceived lowering of work ability is associated with many factors such as increased demands of work, burdening, long absence from work due to unemployment, elongated incapability caused by sickness or accident, poor professional competence, poor personal relations, etc. Not all of these problems are medical, and they should not be examined and treated as such. There are efficient means to prevent disability and to sustain work ability, such as training services, changing work tasks, rehabilitation, occupational health care services, work protection, work supervision measures, commitment of the management, communication with employees, coordination with organized labour, workplace redesign and proactive return-to-work programs. The illness-based concept of disability mainly concentrates on the loss of work ability, but the recently introduced Finnish functional capacity approach via vocational rehabilitation focuses on residual ability instead of emphasizing the incapacity. _{20} In disability policy, this approach would mean not only the integration of disabled persons into labor force but also the improvement of the quality of working life. _{21}

We have been able to quantify the most significant and anticipated result of this study, which is that for all female and male cohorts aged 45 to 55 years in 1981, the persons initially in a state of fair or poor work ability are estimated to leave the work force due to disability or death earlier than those with an excellent or good initial capacity for work. The usefulness of quantifying this finding becomes even more compelling when the so-called large age cohorts (the age groups born after the Second World War in 1945-1949 were 30% larger than the preceding ones), aged 53 to 57 years in 2002, will in a few years reach retirement age and will require more demanding health services and costlier medical care. The situation is accentuated by the increase in life expectancy by 6 years in Finland over the past 30 years. In preparation for this aging demographic progression and in order to alleviate the pressure for growing health care expenditures, the Parliament of Finland passed in 2003 new legislation for the private sector to postpone pensioning off from work. The legislative package includes several measures. First of all, the possibility to move on to early old-age pension will be changed from 60 to 62 years, and the upper limit for work life will be extended from 65 to 68 years. Secondly, the transfer to a disability pension will be made easier, but the threshold will be clearly higher than that of the present early retirement on personal pension. The latter type of retirement is intended for persons whose work ability has diminished, but who are not entitled to disability pension. Personal pension will no longer, as of year 2004, be granted for persons born after 1943. Thirdly, the age limit for part-time pension will be elevated from 56 to 58 years for persons born in 1947 or thereafter (with a lowering of the amount of subsequent old-age pension). Finally, unemployment pension will be discontinued. All these measures, which will come into effect in 2005, are geared towards providing more incentives in the future for people to remain in the labour force over a longer period. Moreover, the National Program for Aging Workers (1998-2002) aimed to tackle the pension problem by developing new means to improve working conditions and to promote the work capacity of aging workers. A proposal for a new law on pensions for the municipal sector was put forth in 2002.

The marked difference in working life expectancies between genders is an intriguing result of this study, yet it is difficult to explain with certainty. Can this gender difference be explained simply in terms of the true health status, or, alternatively, does the Finnish health system intervene differently for men and women? In particular, the question is whether men's claims for early retirement are more easily accepted than women's claims, or whether women are offered more opportunities than men to shift to less demanding working roles (e.g. job-sharing) following a deterioration of health status, or both.

We considered four factors that are likely to bear on the issue of gender difference in working life expectancies. First, women are employed more often than men in branches that have had, and still have, many occupational retirement ages lower than the present statutory retirement age (i.e. 63-65 years). At the time of the study 44% of women and 22% of men had a retirement age less than 63 years. Hence some women did not have time to be on sick leave long enough to be transferred to a disability pension. Second, women tend to report much more often than men complaints that are subjectively experienced as troublesome, but for which the medical finding remains slight. These ailments include, for example, various disorders of the musculoskeletal system such as fibromyalgia. In a majority of these cases, pension fund institutes assess the reduction of work ability to be much less than the self-assessments of the concerned individuals. For women this practice may contribute to the increase of the number of adverse decisions on receiving a pension. In principle, there exist no differences between the genders, but, in reality, the accumulation of such 'slight' ailments among women could, in part, explain the result. Third, men's work in the municipal sector is frequently technical and physically burdensome. In these cases, the rejection of a disability pension is unlikely. An indication of this is the lower than general rejection statistics of Finnish mutual pension insurance companies that cover, for example, construction workers, longshoremen and agricultural workers. Fourth, there do not seem to be differences in Finland between genders in the offering of occupational rehabilitation and putting it into practice. However, for women who are employed in the social and health care fields it may be easier than for men to find lighter work. This is because in these fields workplaces are on the average larger and the possibilities of finding jobs are more plentiful than in the typically male-dominated fields. On the other hand, the numbers of rehabilitated persons are relatively small, and considering that there are fewer men than women working in the municipal sector (in 1981, 72% of the workers were women), this gender imbalance may not be a significant factor.

### Summary

In brief, we restate the major findings and conclusions. Working life expectancy shows the number of years one expects to continue to participate in the work force. This concept takes into account the fact that a person may choose to leave the work force, or may be unable due to health reasons to continue to work during his or her lifetime. Contingencies to be considered include disability, retirement and mortality. We found that men permanently leave the work force due to disability or death earlier than women in all age groups, regardless of whether they commenced in a better or worse work ability state. For example, for 45-year-old men initially with an excellent or good work ability the expected occupation time in the disabled state was 4.1 years, whereas for those with initially a fair or poor capacity for work it was markedly longer, 6.3 years. For women the corresponding expectancies were shorter: 3.0 years and 5.3 years. The usefulness of this quantification becomes vital considering that the Finnish population is aging, and that the large age cohorts will in a few years retire and will require more intensive and costlier health care. On the other hand, women tend to retire on old-age or similar pension before men, especially those women with an initially fair or poor capacity for work at ages 49-55 years. This gender difference in the duration of active work life is accentuated by women's longer life expectancy. The data from this study suggest that the work ability of Finnish aging workers deteriorates prematurely and individuals leave employment before the statutory retirement age. This adverse development can lead to serious socioeconomic consequences for Finnish society. These results stress the importance of following the work ability of employed persons already at sufficiently early ages when it is still possible to intervene in the process. Finally, we hope that the newly developed Markov chain methodology for analysing transitions between health states, as applied in this paper to the Finnish working life survey data, will prove useful in similar epidemiologic studies elsewhere as well.

### Appendix: Some statistical issues

The results of Davis et al.1, adapted to the present application of two absorbing and two non-absorbing states, are briefly described in this appendix. It must be remembered that an important feature of the data is that observations are possible only on three occasions. The assumptions stated below concern the evolution of a cohort of workers in a longitudinal system whose state space consists of the four states of work ability. These assumptions will be taken to hold in all further discussion on modeling and estimation of the transition probabilities.

Assumption 1. At the initial age of observation x_{0}, the l(x_{0}) workers are distributed between transient states 1 and 2 with known initial frequencies l_{1}(x_{0}) and l_{2}(x_{0}).

Assumption 2. As age increases from x_{0} by integer increments, the l(x_{0}) individuals evolve independently. That is, at any age x ≥ x_{0}, the transitions of any two individuals are mutually stochastically independent.

Assumption 3. For all l(x_{0}) individuals, the same discrete-time nonhomogeneous Markov chain X(x) governs transitions between the states of the longitudinal system.
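The three assumptions can be made concrete with a small simulation. The following is a minimal sketch, not the authors' procedure: the function names and every number in `example_matrix` are hypothetical and chosen only so that each row sums to one and states 3 and 4 are absorbing.

```python
import random

STATES = (1, 2, 3, 4)  # states 1 and 2 are transient; 3 and 4 are absorbing


def count_states(states):
    """Tally how many individuals occupy each of the four states."""
    return {s: states.count(s) for s in STATES}


def example_matrix(x):
    """Hypothetical one-step matrix for age x -> x + 1 (illustrative numbers only)."""
    d = 0.01 * (x - 45)  # assumed: exits become more likely with age
    return [
        [0.85 - 2 * d, 0.10, 0.03 + d, 0.02 + d],
        [0.10, 0.75 - 2 * d, 0.10 + d, 0.05 + d],
        [0.0, 0.0, 1.0, 0.0],  # absorbing: disability pension or death
        [0.0, 0.0, 0.0, 1.0],  # absorbing: old-age or similar pension
    ]


def simulate_cohort(l1, l2, matrix_at, ages, seed=0):
    """Return the state counts at the initial age and after each one-year step."""
    rng = random.Random(seed)
    # Assumption 1: known initial frequencies in the transient states 1 and 2.
    states = [1] * l1 + [2] * l2
    history = [count_states(states)]
    for x in ages:
        # Assumption 3: one age-dependent matrix governs every individual.
        P = matrix_at(x)
        # Assumption 2: each individual's move is drawn independently.
        states = [rng.choices(STATES, weights=P[s - 1])[0] for s in states]
        history.append(count_states(states))
    return history


history = simulate_cohort(600, 400, example_matrix, range(45, 51))
```

Because the matrix changes with `x`, the simulated chain is nonhomogeneous in age, and the counts in states 3 and 4 can only grow from one step to the next.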

Consider the situation in which, for a sequence of ages {x_{1}, x_{2}, ..., x_{n}} with x_{0} ≤ x_{r} < x_{r+1}, the transition frequencies between the four states of X(x) from x_{r} to x_{r+1} are observable. Let l̃_{ij}(x_{r}, x_{r+1}) be the random variable denoting the number of individuals in state i at age x_{r} and in state j at age x_{r+1}. Further let p_{ij}(x_{r}, x_{r+1}) be the conditional probability that an individual is in state j at age x_{r+1} given that they were in state i at age x_{r}. Random variables will always carry a superscript tilde, as in l̃_{i}(x_{r}), the random variable denoting the number of individuals in state i at age x_{r}. The corresponding symbol without the tilde will denote an expectation or sometimes a limit in probability. Note that on occasion the transition probability will depend on both age x and year t of transition, in which case the transition probability will be denoted p_{ij}((x_{r}, t_{r}), (x_{r+1}, t_{r+1})).

Note that

From assumptions 2 and 3 it follows that for x_{r+1} > x_{r} ≥ x_{0}, given l̃_{i}(x_{r}), the conditional distribution of (l̃_{i1}(x_{r}, x_{r+1}), l̃_{i2}(x_{r}, x_{r+1}), l̃_{i3}(x_{r}, x_{r+1}), l̃_{i4}(x_{r}, x_{r+1})) is multinomial with parameters l̃_{i}(x_{r}) and (p_{i1}(x_{r}, x_{r+1}), p_{i2}(x_{r}, x_{r+1}), p_{i3}(x_{r}, x_{r+1}), p_{i4}(x_{r}, x_{r+1})). Then

For time interval (x_{r}, x_{r+1}) and transient state i, as in (1) consider the log partial odds Θ_{ij} (x_{r}, x_{r+1}) of transition to state j ≠ i defined by

The Θ_{ij}(x_{r}, x_{r+1}) parameters are naturally estimated by the observed log partial odds Θ̃_{ij}(x_{r}, x_{r+1}) = log{l̃_{ij}(x_{r}, x_{r+1}) / l̃_{ii}(x_{r}, x_{r+1})}.
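The estimator above is a direct computation on one row of transition counts. As a minimal sketch, the function below takes the four counts out of a transient state i and returns the three observed log partial odds; the function name and the example counts are hypothetical and are not data from the surveys.

```python
import math


def observed_log_partial_odds(counts, i):
    """Observed log partial odds for transient state i (1 or 2).

    counts[j - 1] is the number of individuals observed in state i at the
    start of the interval and in state j at its end, for j = 1..4.
    Returns {j: log(l_ij / l_ii)} for the three states j != i.
    """
    if any(c <= 0 for c in counts):
        raise ValueError("log odds are undefined when a count is zero")
    l_ii = counts[i - 1]
    return {j: math.log(counts[j - 1] / l_ii) for j in (1, 2, 3, 4) if j != i}


# Hypothetical counts out of state 1 over one observation interval.
example_counts = [700, 200, 60, 40]
theta_hat = observed_log_partial_odds(example_counts, i=1)
```

All three values are negative here because staying in state 1 is the most frequent outcome; a value of zero would mean that a move to state j was observed exactly as often as staying in state i.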

The assumptions that the l(x_{0}) individuals evolve independently and that transitions in work ability are governed by a Markov process imply, by Theorem 1(ii) of Davis et al.1, that for the non-absorbing states i = 1, 2, as l(x_{0}) tends to infinity the vector

is asymptotically normally distributed with mean

and covariance matrix

where is the 3x1 matrix with each of its entries equal to one.

For example, the asymptotic covariance matrix of Θ̃(-1, x_{r}, x_{r+1}) - Θ(-1, x_{r}, x_{r+1}) is

For x_{r} < x_{r+1} and x_{s} < x_{s+1}, by Theorem 1(ii) of Davis et al.1, x_{r} ≠ x_{s} or i_{1} ≠ i_{2} implies Θ̃(-i_{1}, x_{r}, x_{r+1}) and Θ̃(-i_{2}, x_{s}, x_{s+1}) are asymptotically independent as l(x_{0}) tends to infinity.
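For intuition about the covariance structure of the observed log odds, the sketch below uses the textbook delta-method approximation for a single multinomial row: the covariance between the log odds for states j and k is approximately 1/l_{ij} (when j = k) plus a common term 1/l_{ii}. This is an assumption about the general shape only; the exact expressions and scaling in Davis et al. may differ. The function name and counts are hypothetical.

```python
def log_odds_covariance(counts, i):
    """Plug-in delta-method covariance of the observed log partial odds.

    counts[j - 1] is the observed number of i -> j transitions, j = 1..4.
    Returns a 3 x 3 matrix ordered by the states j != i, with entries
    (1 / l_ij if j == k else 0) + 1 / l_ii.
    """
    if any(c <= 0 for c in counts):
        raise ValueError("all counts must be positive")
    others = [j for j in (1, 2, 3, 4) if j != i]
    l_ii = counts[i - 1]
    return [
        [(1.0 / counts[j - 1] if j == k else 0.0) + 1.0 / l_ii for k in others]
        for j in others
    ]


# Hypothetical counts out of state 1.
cov = log_odds_covariance([700, 200, 60, 40], i=1)
```

The common term 1/l_{ii} appears in every entry, which is the kind of contribution a matrix of ones produces, and the rarest transitions have the largest variances.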

If the one-step log odds Θ(-i,x,x+1) are parametrised as Z(-i,x)β_{i} then the two vectors β_{1} and β_{2} can be estimated by minimising

Because X(x) is a Markov process there is a recurrence between the transition probabilities, for example,

and hence a recurrence between the log odds. This fact can be used to establish recurrences for the derivative and Hessian of L(β_{1}, β_{2}) and so to obtain estimates of β_{1}, β_{2}, the parameters of the one-step probabilities. The technicalities are complicated but are given in Davis et al.1
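The link between one-step log odds and multi-step transition probabilities can be sketched as follows. Inverting the log partial odds gives p_{ii} = 1 / (1 + Σ_{j≠i} exp Θ_{ij}) and p_{ij} = p_{ii} exp Θ_{ij}, and the Markov property lets multi-step matrices be built as products of one-step matrices. This is a minimal illustration with hypothetical log odds, not the estimation algorithm of Davis et al.

```python
import math


def row_from_log_odds(theta, i):
    """Recover [p_i1, ..., p_i4] from {j: Theta_ij} given for the states j != i."""
    p_ii = 1.0 / (1.0 + sum(math.exp(t) for t in theta.values()))
    return [p_ii if j == i else math.exp(theta[j]) * p_ii for j in (1, 2, 3, 4)]


def one_step_matrix(theta1, theta2):
    """4 x 4 one-step matrix; states 3 and 4 are absorbing."""
    return [
        row_from_log_odds(theta1, 1),
        row_from_log_odds(theta2, 2),
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]


def matmul(A, B):
    """Product of two 4 x 4 matrices stored as nested lists."""
    return [[sum(A[r][k] * B[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]


def multi_step_matrix(one_step_matrices):
    """Multiply successive one-step matrices in age order."""
    P = [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]
    for M in one_step_matrices:
        P = matmul(P, M)
    return P


# Hypothetical log odds for two successive ages (not estimates from the paper).
P_a = one_step_matrix({2: -2.0, 3: -3.0, 4: -3.5}, {1: -2.0, 3: -2.0, 4: -3.0})
P_b = one_step_matrix({2: -1.8, 3: -2.7, 4: -3.0}, {1: -2.2, 3: -1.8, 4: -2.5})
P_two_step = multi_step_matrix([P_a, P_b])
```

Using a different matrix at each age keeps the chain nonhomogeneous, and the two-step probability of absorption from a transient state is never smaller than the one-step probability.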

In the case of the Finnish work ability data, for each gender, there are seven birth cohorts aged 45 – 51 years in 1981 whose transitions are observed for the periods 1981 to 1985 and 1985 to 1992. We assume that the same parameters apply to all 14 data points and so the summation in equation (12) is taken over all available points. Also for this dataset the parametrisation Θ(x_{r},x_{r+1}) = Θ(β, x_{r}, x_{r+1}) involves both age and year of transition as significant explanatory variables. That is, our Markov model is nonhomogeneous with respect to both age and time. However, for ease of notation we have suppressed the arguments indicating dependence on time.
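The count of 14 data points per gender follows from seven birth cohorts observed over two intervals. The sketch below enumerates the (age, year) pairs at the start and end of each interval, assuming that each cohort ages by exactly the number of calendar years between surveys; the function name is hypothetical.

```python
def observation_intervals(ages_in_1981=range(45, 52), survey_years=(1981, 1985, 1992)):
    """List ((x_r, t_r), (x_{r+1}, t_{r+1})) for every cohort and survey interval."""
    first_year = survey_years[0]
    intervals = []
    for age0 in ages_in_1981:
        for t0, t1 in zip(survey_years, survey_years[1:]):
            start = (age0 + t0 - first_year, t0)
            end = (age0 + t1 - first_year, t1)
            intervals.append((start, end))
    return intervals


intervals = observation_intervals()
```

Each entry corresponds to one multinomial observation per transient state, indexed by both age and year as in the notation p_{ij}((x_{r}, t_{r}), (x_{r+1}, t_{r+1})).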

### Acknowledgments

We thank Kaija Tuomi and Jorma Seitsamo for making available the aggregate data from the Finnish work ability surveys as well as Tuula Nurminen for valuable comments and Terttu Kaustia for the English language revision.