Bias-adjusted exposure odds ratio for misclassified data

Tze-San  Lee

Keywords

case-control study, exposure misclassification, odds ratio, sensitivity, specificity

Citation

T Lee. Bias-adjusted exposure odds ratio for misclassified data. The Internet Journal of Epidemiology. 2008 Volume 6 Number 2.

Abstract

If a dichotomous exposure variable is misclassified in a case-control study, a bias-adjusted exposure odds ratio with its asymptotic variance is presented to account for the misclassification bias. A simple, yet powerful, method is given to calculate the true sensitivity and specificity based only on the data available in the main study, regardless of whether a validation sample is available. Two practical examples, one without and one with validation data, illustrate how to calculate first the true sensitivity and specificity for cases and controls and then the bias-adjusted exposure odds ratio with its 95% confidence interval.

Abbreviations

BAOR Bias-adjusted [exposure] odds ratio
CI Confidence interval
COR Crude [exposure] odds ratio

Introduction

In the realm of epidemiology the problem of misclassification has been thoroughly studied. In practical applications, exposure misclassification mainly occurs when proxy respondents are used in the survey interview to classify the subject’s exposure status. For example, in a study identifying possible etiologic factors for Alzheimer’s disease, information was uniformly obtained only from close family members, usually a spouse, because of the patient’s mental impairment (₃₁, ₃₂).

Historically, this problem was first studied by Bross (₃); related issues were later investigated by others (₁, ₄, ₅, ₁₀, ₁₂, ₁₃, ₁₄, ₁₅, ₁₆, ₁₇, ₁₈, ₁₉, ₂₀, ₂₂, ₂₃, ₂₄, ₂₆, ₂₇, ₂₉, ₃₀, ₃₃). Epidemiologic examples of the effect of misclassification bias have also been widely studied; see, for example, ₆, ₇, ₈, ₂₁, ₃₅, ₃₇, ₃₉, ₄₀, ₄₁.

So far, all proposed methods for correcting the misclassification bias either require a second validation sample to estimate the sensitivity and specificity of the classification procedure or conduct a conventional/probabilistic sensitivity analysis. No methods available in the literature are able to calculate the true sensitivity and specificity. The aim of this paper is to present a method to calculate the true sensitivity and specificity from the data in the main study only, regardless of whether a validation sample is available.

Background

Consider a case-control study in which there is no disease misclassification, but misclassification has occurred in determining the subject’s exposure status. First, three random variables, E, E^* and D, are defined as follows:

E = 1 if a subject is truly exposed, 0 otherwise, E^* = 1 if a subject is classified as exposed, 0 otherwise. D = 1 if a subject belongs to the case group, 0 otherwise.

Note that E^* is a surrogate classification variable for the exposure variable E and D is a disease variable. Let p₀ and p₁ denote, respectively, the true proportions of subjects in the control and the case population, who are exposed to a certain risk factor under study. The probability distributions for cases and controls are given, respectively, by the first and second column of Table 1.

Table 1: The true distributions of the cell probabilities for the cases and controls

Figure 1

As a measure of the relative risk in case-control studies, the [exposure] odds ratio of exposed versus unexposed is given by (₉)

Figure 2

where 0 < p₀ < 1 (q₀ = 1 − p₀) and 0 < p₁ < 1 (q₁ = 1 − p₁) are defined in Table 1.

Suppose that n₀ controls and n₁ cases are sampled with the positive count frequencies n_ij, i, j = 0, 1. By the method of moments, we obtain from Table 2

Figure 3

and

Figure 4

where p̂₀ of equation 2 and p̂₁ of equation 3 are the traditional sample estimates for the prevalence among controls and cases, respectively, and are unbiased estimators for the true p₀ and p₁ in Table 1, provided that there is no misclassification on the exposure variable E.

Table 2: The observed cell counts for a case-control study.

Figure 5

However, p̂_i , i = 0, 1, of equations 2-3 are no longer unbiased estimators for p_i of Table 1 whenever a surrogate variable E^* of the exposure variable E for the study subjects is misclassified (₁₃). Indeed, once the exposure misclassification has occurred, it is easily shown that

Figure 6

where φ_i and ψ_i, i = 0, 1, called bias parameters (₁₇), denote sensitivity and specificity probabilities for controls and cases, respectively, and are defined by (₉)

Figure 7

and

Figure 8

Moreover, if n_i · p̂_i, i = 0, 1, are assumed to follow binomial distributions with means n_i · [ p_i · (φ_i + ψ_i − 1) + 1 − ψ_i ], the variances of the p̂_i, i = 0, 1, are given by

Figure 9

From equation 4, it is easily seen that p̂_i ’s, i = 0, 1 are no longer unbiased estimators of p_i unless there is no exposure misclassification for both cases and controls, that is, φ_i = ψ_i = 1, i = 0, 1. As a result, the bias unavoidably appears in the crude [exposure] odds ratio given by

Figure 10

since it does not account for the misclassification bias. This motivates epidemiologists and statisticians to search for the corrected [exposure] odds ratio which is able to account for the misclassification bias (₆). In this paper, an estimator, called bias-adjusted [exposure] odds ratio, is proposed which is able to account for the misclassification bias in the estimation of the true R of equation 1.
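The effect described above can be illustrated with a minimal Python sketch. It assumes the standard definitions (sensitivity is the probability that a truly exposed subject is classified as exposed, specificity the probability that a truly unexposed subject is classified as unexposed), so that the expected classified-as-exposed proportion is φ·p + (1 − ψ)·(1 − p). The function names and the prevalences 0.4 and 0.2 are hypothetical; the cell counts are those quoted in the text for Examples 1 and 2.

```python
def expected_observed_prevalence(p, sens, spec):
    """Expected proportion classified as exposed: true positives plus false positives."""
    return sens * p + (1.0 - spec) * (1.0 - p)


def odds_ratio(p1, p0):
    """Exposure odds ratio p1*q0 / (p0*q1) for case (p1) and control (p0) proportions."""
    return (p1 * (1.0 - p0)) / (p0 * (1.0 - p1))


def crude_odds_ratio(exp_cases, unexp_cases, exp_controls, unexp_controls):
    """Crude odds ratio computed from the observed 2x2 cell counts."""
    p1_hat = exp_cases / (exp_cases + unexp_cases)
    p0_hat = exp_controls / (exp_controls + unexp_controls)
    return odds_ratio(p1_hat, p0_hat)


# Hypothetical true prevalences with non-differential misclassification
# (sensitivity 0.8, specificity 0.9 for both cases and controls).
true_or = odds_ratio(0.4, 0.2)
observed_or = odds_ratio(
    expected_observed_prevalence(0.4, 0.8, 0.9),
    expected_observed_prevalence(0.2, 0.8, 0.9),
)

# Crude odds ratios from the counts quoted in Examples 1 and 2.
cor_example1 = crude_odds_ratio(37, 2, 27, 25)
cor_example2 = crude_odds_ratio(122, 442, 101, 479)

print(f"true OR = {true_or:.2f}, expected observed OR = {observed_or:.2f}")
print(f"COR (Example 1) = {cor_example1:.1f}, COR (Example 2) = {cor_example2:.2f}")
```

In this hypothetical setting the odds ratio computed from the misclassified proportions lies between 1 and the true value, which is the kind of bias the crude estimator cannot correct.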

Method

In epidemiology, an exposure misclassification is said to be non-differential if sensitivity and specificity are the same for cases and controls, that is, classification rates are independent of the disease; otherwise, the exposure misclassification is called differential. Because non-differential misclassification is a special case of differential misclassification, I only consider differential misclassification in my derivation.

By using equations 2-4 with an approximation, E(p̂_i) ≈ p̂_i , it is easily shown that for i = 0, 1,

Figure 11

and

Figure 12

are unbiased estimators, respectively, for p_i and q_i, provided that both φ_i and ψ_i, i = 0, 1, are known, where Δ_i is given by

Figure 13

Clearly, Δ_i of equation 11 must not equal zero; otherwise, equations 9-10 are undefined.

Now, p^*_i of equation 9 (or q^*_i of equation 10) is said to be admissible if it lies strictly between 0 and 1. In addition, φ_i and ψ_i, i = 0, 1, are said to be feasible if φ₁, ψ₁, φ₀, and ψ₀ satisfy the following constraints:
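The correction step can be sketched as follows. Because equations 9-11 appear only as images here, the forms used below are assumptions: Δ = φ + ψ − 1, p* = (p̂ + ψ − 1)/Δ and q* = (φ − p̂)/Δ, which is the usual back-correction obtained by inverting the expected classified proportion. The function name and the numeric values are hypothetical.

```python
def corrected_proportions(p_hat, sens, spec):
    """Back-correct an observed exposure proportion for known sensitivity/specificity.

    Assumed forms: delta = sens + spec - 1, p* = (p_hat + spec - 1) / delta,
    q* = (sens - p_hat) / delta.
    """
    delta = sens + spec - 1.0
    if delta == 0.0:
        raise ValueError("sens + spec = 1: the corrected proportions are undefined")
    p_star = (p_hat + spec - 1.0) / delta
    q_star = (sens - p_hat) / delta
    return p_star, q_star


# Round trip on hypothetical values: a true prevalence of 0.2 observed through
# a classifier with sensitivity 0.8 and specificity 0.9.
p_true, sens, spec = 0.2, 0.8, 0.9
p_observed = sens * p_true + (1.0 - spec) * (1.0 - p_true)
p_star, q_star = corrected_proportions(p_observed, sens, spec)
print(p_observed, p_star, q_star)
```

Under these assumed forms p* + q* = 1 by construction, and both lie strictly between 0 and 1 exactly when 1 − ψ < p̂ < φ with φ + ψ > 1.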

Figure 14

Figure 15

and

Figure 16

It can be easily shown that p^*_i of equation 9 (or q^*_i of equation 10) is admissible if φ_i and ψ_i, i = 0, 1, are feasible. Note that equations 12-14 are merely one set of feasibility constraints for φ_i and ψ_i, i = 0, 1, under which equations 9-10 are admissible estimators of the unknown p_i and q_i. By simply reversing the direction of the inequalities in equations 12-14, we obtain another set of feasibility constraints. Mathematically, these two sets of feasibility constraints are equivalent because they are mirror images of one another with respect to the straight line φ_i + ψ_i = 1 in the two-dimensional space of ordered pairs (φ_i, ψ_i) (₃₈). However, equation 14 is preferable because it has a practical implication, namely, that a good classification procedure should perform better than random (₁₆). In addition, the variance of equation 9 is readily given by

Figure 17

where Var(p̂_i) is given by equation 7.

By replacing the true unknown parameters p_i and q_i, i = 0, 1, in equation 1 with equations 9-10, the bias-adjusted [exposure] odds ratio (BAOR) R^* is defined by

Figure 18

where φ_i and ψ_i, i = 0, 1, are feasible. The BAOR of equation 16 is said to be admissible if it is a positive real number. Note that because of the feasibility constraints of equations 12-14, equation 16 is always a positive real number once the true sensitivity and specificity are given. Clearly, equation 16 is admissible if p^*_i of equation 9 (or q^*_i of equation 10) are admissible.
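As a worked form of this definition, and assuming the usual back-corrected proportions p^*_i = (p̂_i + ψ_i − 1)/Δ_i and q^*_i = (φ_i − p̂_i)/Δ_i with Δ_i = φ_i + ψ_i − 1 (an assumption, since equations 9-11 are shown only as images), the factors Δ_i cancel within each odds and the BAOR can be written directly in terms of the observed proportions:

```latex
R^{*} \;=\; \frac{p^{*}_{1}\, q^{*}_{0}}{p^{*}_{0}\, q^{*}_{1}}
      \;=\; \frac{(\hat{p}_{1} + \psi_{1} - 1)\,(\varphi_{0} - \hat{p}_{0})}
                 {(\hat{p}_{0} + \psi_{0} - 1)\,(\varphi_{1} - \hat{p}_{1})}
```

Written this way, each of the four factors is positive whenever 1 − ψ_i < p̂_i < φ_i for i = 0, 1, which is one way to see why feasibility guarantees a positive odds ratio.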

By using the delta method, the asymptotic variance of ln(R^* ) is given by (₁₁)

Figure 19

where p_i and q_i, i = 0, 1, are given by Table 1, φ_i and ψ_i, i = 0, 1, are feasible, and Var(p̂_i) is given by equation 15. A detailed derivation of equation 17 is given in the appendix. The approximation on the right side of equation 17 is adequate so long as n₀ and n₁, the sample sizes for controls and cases, are sufficiently large. When equation 17 is used in practical applications, the unknown parameters p_i and q_i, i = 0, 1, are replaced, respectively, by p^*_i of equation 9 and q^*_i of equation 10, and the true classification rates are calculated from the observed data in the main study under the assumption that the correctly classified table is known, as shown in the next section.

According to the asymptotic theory of large sample distribution (₂), the sampling distribution of the following test statistic

Figure 20

can be shown to follow a standard normal distribution, where s.e.(ln(R^*)) denotes the standard error of ln(R^*), that is, the square root of equation 17. To account for misclassification errors in identifying any risk factor, equation 18 is used to test whether the adjusted odds ratio of equation 16 is significant. In addition, equation 18 can be used to find the 100 × (1 – α)% confidence interval (0 < α < 1) for R, which is given by

Figure 21

where z_{1−α/2} is the 100 × (1 − α/2)th percentile of the standard normal distribution.
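The test and interval on the log scale can be sketched as below. The function takes the standard error as an input and does not implement equation 17. For a checkable illustration it is applied to the crude odds ratio of Example 2 with Woolf's standard error (the square root of the sum of reciprocal cell counts), a technique swapped in here purely for illustration; the function name is hypothetical.

```python
import math
from statistics import NormalDist


def wald_test_and_ci(odds_ratio, se_log_or, alpha=0.05):
    """z statistic, two-sided p-value and 100*(1 - alpha)% CI on the log-odds-ratio scale."""
    log_or = math.log(odds_ratio)
    z = log_or / se_log_or
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    lower = math.exp(log_or - z_crit * se_log_or)
    upper = math.exp(log_or + z_crit * se_log_or)
    return z, p_value, (lower, upper)


# Crude odds ratio of Example 2 (122/442 cases, 101/479 controls) with Woolf's
# standard error; this is NOT the bias-adjusted variance of equation 17.
a, b, c, d = 122, 442, 101, 479
cor = (a * d) / (b * c)
se_woolf = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
z, p_value, ci = wald_test_and_ci(cor, se_woolf)
print(f"COR = {cor:.2f}, 95% CI {ci[0]:.2f}-{ci[1]:.2f}, p = {p_value:.2f}")
```

With these inputs the sketch gives 1.31 with a 95% CI of 0.98 to 1.76 and p = 0.07, the values quoted for the crude odds ratio in Example 2.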

To use equations 16-19 we need to know the true sensitivity and specificity for cases and controls. I will show below, using two practical examples, how to calculate the true sensitivity and specificity from the data of the main study. Basically, we need to know what the truly classified table is. This information is contained in the observed data of the main study. Even though we do not know exactly what the truly classified table is, reasoning backwards suggests that it must be one of the tables obtained by reclassifying subjects in the observed table of the main study. Hence, we can obtain a candidate for the truly classified table by assuming hypothetically that the observed table is (either under- or over-) misclassified from it by 1 subject, or 2 subjects, and so on, in the exposed category. Once we obtain the [hypothetically] true table, we are able to calculate the sensitivity (or specificity) from the observed table and this true table according to the following formula:

Figure 22
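A minimal sketch of this calculation follows. The form of the formula is inferred from the worked arithmetic in Example 1, φ₁ = 1 − |38 − 37|/(38 + 37), and is applied to the exposed counts for sensitivity and to the unexposed counts for specificity; that reading, and the function name, are assumptions.

```python
def classification_rate(true_count, observed_count):
    """1 - |true - observed| / (true + observed) for one exposure category.

    Applied to the exposed category it is read as sensitivity, and to the
    unexposed category as specificity (assumed reading of equation 20).
    """
    return 1.0 - abs(true_count - observed_count) / (true_count + observed_count)


# Cases in Example 1: 37 of 39 observed as exposed ("No signs"); the hypothetical
# true table has 38 exposed, i.e. one subject under-misclassified.
sens_cases = classification_rate(true_count=38, observed_count=37)
spec_cases = classification_rate(true_count=39 - 38, observed_count=39 - 37)
print(round(sens_cases, 4), round(spec_cases, 4))
```

Because the marginal total is fixed, moving one subject between categories changes both counts at once, so sensitivity and specificity are always obtained as a pair.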

Results

In our first example validation data is not available, while a validation sample is available in the second example.

Example 1 A case-control design was used in a study of deaths caused by landslides that occurred in the State of Chuuk, Federated States of Micronesia. The U.S. Geological Survey reported 265 landslides of various sizes; 12 of these landslides caused 43 deaths and injured more than one hundred people. A case was defined as a person who died as a result of the landslides. Proxies were identified by the surviving villagers to provide information for decedents or for persons in the control group who were too young to answer. A control group of 52 survivors was interviewed regarding their experience during the landslides, while only 40 proxies were interviewed regarding the circumstances of the death of their relatives or neighbors, because the study team was unable to find proxies for three of the victims (₃₄). For illustrative purposes we take only one table from their study, regarding whether a person saw natural warning signs (Table 3a). Here the exposure variable E is defined by the event that a person did not see any natural warning signs. By inspection of Table 3a, 95% (= 37/39) of cases did not see any natural warning signs, whereas only 52% (= 27/52) of controls did not. Indeed, the crude odds ratio (COR) is 17.1 with p < 5.3×10^-6 by Fisher’s exact test (₃₆). Hence, based on this COR value, a tentative inference is drawn that not seeing natural warning signs is a significant risk factor for death caused by landslides.

However, since the data were collected through an interview survey of proxies or survivors, misclassification might have occurred. Suppose that Table 3a is misclassified. To account for the misclassification bias, we have to use equation 16 to calculate the bias-adjusted exposure odds ratio. In this study no validation data were collected at all. Nevertheless, I will show first how to calculate the true sensitivity and specificity for cases. Evidently, this depends on our knowledge of what the truly classified table is. Even though we do not know exactly what the truly classified table is, we are confident that it must be one of the 37 tables obtained by under-misclassifying 1 subject or over-misclassifying 1, 2, …, or 36 subjects in the category of “No signs”, as shown in the first two columns and continued in the fifth and sixth columns of Table 3b(i). Since I do not know which one of these 37 possible scenarios is the truly classified table, I simply assume in turn that each of them is the correctly classified table and calculate the corresponding values of sensitivity and specificity. By using equation 20, the sensitivity and specificity pairs for cases and controls are given, respectively, in Tables 3b(i-ii). Taking the first entry in column 3 of Table 3b(i) as an example, we obtain φ₁ = 1 − |38 − 37|/(38 + 37) = 1 − 0.0133 = 0.9867. However, after checking whether the feasibility constraints (equations 12-14) were satisfied, only the first four pairs of sensitivity and specificity were found to be feasible (highlighted in Table 3b(i)). Similarly, although there were altogether 50 possible truly classified tables for controls, only 34 pairs of sensitivity and specificity were found to be feasible for controls (Table 3b(ii)).

To see in what direction the misclassification might bias the BAOR relative to the null-hypothesis value, I used all four feasible pairs of sensitivity and specificity for cases and only ten pairs for controls, those under- or over-misclassified by up to five subjects in the same category of “No signs”, to compute 40 (= 4×10) BAORs. The 40 BAORs with their 95% confidence intervals (CI) are given in Tables 3c(i-iv). By inspection of the 95% CIs in Tables 3c(i-ii), the BAORs were overall significant and further away from the null value (R = 1) than the COR if just one subject was over-misclassified in the category of “No signs” for cases and up to five persons were either under- or over-misclassified for controls (Table 3c(ii)), while they were significant yet a little closer to the null value than the COR if just one subject was under-misclassified in the category of “No signs” for cases and up to five persons were either under- or over-misclassified for controls (Table 3c(i)). However, Tables 3c(iii-iv) painted a totally different picture: the BAOR was overall further away from the null value, yet not significant, when more than one person was over-misclassified in the category of “No signs” for cases and up to five persons were under- or over-misclassified in the same category for controls.
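The enumeration described above can be sketched in Python. Since equations 12-14 and 20 are shown only as images, the sketch assumes the rate 1 − |true − observed|/(true + observed) for each category and the feasibility constraints φ > p̂, ψ > q̂ and φ + ψ > 1; the function names are hypothetical. Under these assumptions the counts of candidate and feasible tables agree with those stated in the text.

```python
def classification_rate(true_count, observed_count):
    """Assumed form of equation 20 for one exposure category."""
    return 1.0 - abs(true_count - observed_count) / (true_count + observed_count)


def enumerate_tables(obs_exposed, obs_unexposed):
    """Return (candidates, feasible) for every hypothetical true table with fixed total.

    Each entry is (shift, sens, spec); shift > 0 means the observed table
    over-counts the exposed category by `shift` subjects, shift < 0 under-counts.
    """
    n = obs_exposed + obs_unexposed
    p_hat, q_hat = obs_exposed / n, obs_unexposed / n
    candidates, feasible = [], []
    for true_exposed in range(1, n):  # keep both true cells positive
        if true_exposed == obs_exposed:
            continue  # this is the observed table itself
        sens = classification_rate(true_exposed, obs_exposed)
        spec = classification_rate(n - true_exposed, obs_unexposed)
        entry = (obs_exposed - true_exposed, sens, spec)
        candidates.append(entry)
        # Assumed feasibility constraints (equations 12-14).
        if sens > p_hat and spec > q_hat and sens + spec > 1.0:
            feasible.append(entry)
    return candidates, feasible


cases_all, cases_feasible = enumerate_tables(37, 2)         # cases, Table 3a
controls_all, controls_feasible = enumerate_tables(27, 25)  # controls, Table 3a
print(len(cases_all), len(cases_feasible))        # candidate / feasible tables, cases
print(len(controls_all), len(controls_feasible))  # candidate / feasible tables, controls
print(sorted(shift for shift, _, _ in cases_feasible))
```

For cases, the feasible shifts under these assumptions are one subject under-misclassified and one, two, or three subjects over-misclassified.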
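The grid of 40 point estimates can be sketched as follows, again under assumed forms: the classification rate 1 − |true − observed|/(true + observed) and the adjusted odds (p̂ + ψ − 1)/(φ − p̂). Only point estimates are computed; the confidence intervals of Tables 3c(i-iv) require equation 17 and are not reproduced. All names are hypothetical.

```python
def classification_rate(true_count, observed_count):
    """Assumed form of equation 20 for one exposure category."""
    return 1.0 - abs(true_count - observed_count) / (true_count + observed_count)


def adjusted_odds(obs_exposed, obs_unexposed, shift):
    """Bias-adjusted odds p*/q* when the observed exposed count is off by `shift`.

    shift > 0: exposed category over-counted; shift < 0: under-counted.
    """
    p_hat = obs_exposed / (obs_exposed + obs_unexposed)
    sens = classification_rate(obs_exposed - shift, obs_exposed)
    spec = classification_rate(obs_unexposed + shift, obs_unexposed)
    return (p_hat + spec - 1.0) / (sens - p_hat)


CASES, CONTROLS = (37, 2), (27, 25)                    # counts quoted from Table 3a
case_shifts = [-1, 1, 2, 3]                            # the four feasible case scenarios
control_shifts = [s for s in range(-5, 6) if s != 0]   # up to five subjects either way

cor = (37 * 25) / (2 * 27)
baor = {
    (cs, ks): adjusted_odds(*CASES, cs) / adjusted_odds(*CONTROLS, ks)
    for cs in case_shifts
    for ks in control_shifts
}

for cs in case_shifts:
    values = [baor[(cs, ks)] for ks in control_shifts]
    print(f"case shift {cs:+d}: BAOR from {min(values):.1f} to {max(values):.1f} (COR {cor:.1f})")
```

In this sketch every BAOR with one case over-misclassified exceeds the COR, and every BAOR with one case under-misclassified falls below it, in line with the direction of bias described above.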

Table 3(a): The survey data whether a person saw natural warning signs for cases and controls.

Figure 23

Table 3b(i): All possible pairs of sensitivity and specificity for cases.

Figure 24

Table 3b(ii): All feasible pairs of sensitivity and specificity for controls.

Figure 25

^a The “+” (or “-”) sign inside the parenthesis denotes the number of persons to be over- (or under-) misclassified.

Table 3c: Bias-adjusted exposure odds ratios (95% CI) for all 40 feasible pairs of sensitivity and specificity for cases and controls when:

Figure 26

Example 2 The data used here are taken from a case-control study on sudden infant death syndrome (SIDS) (₁₅). Among women for whom only interview data were examined, 122 out of 564 case mothers and 101 out of 580 control mothers reported antibiotic use during pregnancy. By using equation 8, the crude odds ratio is 1.31 with a 95% CI of 0.98–1.76 (p-value = 0.07), which is not statistically significant. In this study a second, external validation sample based on the medical record (a gold standard) was available. From the data of this validation sample, the estimated sensitivity and specificity for cases and controls were (φ̂₁, ψ̂₁) = (0.6304, 0.8667) and (φ̂₀, ψ̂₀) = (0.5676, 0.9333), respectively. By treating these estimated values as if they were the true sensitivity and specificity probabilities for the study population given by Table 4a, we are able to calculate the corresponding corrected cell counts (rounded to the largest integer) for Table 4a, which should be: n^*₁₁ = 56, n^*₁₀ = 40, n^*₀₁ = 338, and n^*₀₀ = 419. Since n^*₁₁ + n^*₀₁ = 394 and n^*₁₀ + n^*₀₀ = 459 are not the same as the marginal totals in the table of the main study, we rescale the counts proportionally so that the new marginal totals 394 and 459 match those of the main-study table (564 and 580, respectively). We thus obtain: n^**₁₁ = 80, n^**₁₀ = 51, n^**₀₁ = 484, and n^**₀₀ = 529. It seems rational to assume that this new table is the truly classified table for the main study. Then we are able to calculate the true sensitivity and specificity for this SIDS study, which are (φ₁, ψ₁) = (0.7921, 0.9546) and (φ₀, ψ₀) = (0.6711, 0.9504). By substituting the above values into equation 16, the BAOR is 1.18 with a 95% CI of 0.90–1.56 (p-value = 0.23), which is not statistically significant. The BAOR value of 1.18 is closer to the null value than the COR (1.31). Hence, there seems to be no association between the use of antibiotics during pregnancy and SIDS.
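The arithmetic of this example can be retraced with a short sketch. It assumes rounding to the nearest integer in the proportional rescaling, the classification rate 1 − |true − observed|/(true + observed), and the adjusted odds (p̂ + ψ − 1)/(φ − p̂); these forms are assumptions, and the function names are hypothetical. The confidence interval is not recomputed, because it depends on equation 17.

```python
def classification_rate(true_count, observed_count):
    """Assumed form of equation 20 for one exposure category."""
    return 1.0 - abs(true_count - observed_count) / (true_count + observed_count)


def adjusted_odds(p_hat, sens, spec):
    """Bias-adjusted odds p*/q*; the common factor sens + spec - 1 cancels."""
    return (p_hat + spec - 1.0) / (sens - p_hat)


# Step 1: rescale the corrected counts so the column totals match the main study.
true_case_use = round(56 * 564 / 394)
true_case_nonuse = round(338 * 564 / 394)
true_ctrl_use = round(40 * 580 / 459)
true_ctrl_nonuse = round(419 * 580 / 459)

# Step 2: sensitivity and specificity from the observed and rescaled tables.
obs_case_use, obs_case_nonuse = 122, 564 - 122
obs_ctrl_use, obs_ctrl_nonuse = 101, 580 - 101
sens1 = classification_rate(true_case_use, obs_case_use)
spec1 = classification_rate(true_case_nonuse, obs_case_nonuse)
sens0 = classification_rate(true_ctrl_use, obs_ctrl_use)
spec0 = classification_rate(true_ctrl_nonuse, obs_ctrl_nonuse)

# Step 3: bias-adjusted odds ratio.
baor = adjusted_odds(obs_case_use / 564, sens1, spec1) / adjusted_odds(
    obs_ctrl_use / 580, sens0, spec0
)

print(true_case_use, true_ctrl_use, true_case_nonuse, true_ctrl_nonuse)
print(round(sens1, 4), round(spec1, 4), round(sens0, 4), round(spec0, 4))
print(round(baor, 2))
```

Under these assumptions the sketch returns the rescaled counts 80, 51, 484 and 529, the pairs (0.7921, 0.9546) and (0.6711, 0.9504), and a BAOR of 1.18, matching the values quoted in the example.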

However, suppose that the validation sample is reliable. Then, according to the above calculation, 42 and 50 women are over-misclassified in the category of antibiotic use during pregnancy for cases and controls, respectively. These numbers of misclassified subjects seem quite large. I therefore conducted a sensitivity analysis by assuming arbitrary (under- or over-) misclassified numbers ranging from 4 to 40 subjects in the category of antibiotic use. First, I calculated the sensitivity and specificity for cases and controls, respectively (Table 4b). Next, I calculated the BAORs by using the 14 pairs of sensitivity and specificity given in Table 4b. As Table 4c shows, all 14 BAORs were significant except the last two, which used the last two pairs of sensitivity and specificity in Table 4b. These last two pairs correspond to 34 and 40 women being over-misclassified in the category of “use” for both cases and controls. This implies that the BAORs become significant and biased away from the null value if fewer than 34 women are over-misclassified in the category of “use” for both cases and controls in the main study. If this amount of misclassification is more reasonable, then the inference drawn from using the validation data is clearly misleading.

Table 4a: The data of SIDS study of the exposure variable of interview response between cases and controls.

Figure 27

Table 4b: Pairs of the [true] sensitivity and specificity for cases and controls.

Figure 28

^a The “+” (or “-”) sign inside the parenthesis denotes the number of persons to be over- (or under-) misclassified.

Table 4c: Bias-adjusted exposure odds ratios with its p-value and 95% confidence interval.

Figure 29

Discussion

Several observations are worth discussing:

A unique strength of this paper is that a simple, yet powerful, method is presented to calculate the true sensitivity and specificity based only on the data available in the main study, regardless of whether validation data are available. Hence, this new method can free researchers in epidemiology from the no-validation-data scare (Example 1). Of course, we do not know exactly in Example 1 what the truly classified table is. Nevertheless, we can get a general sense of the effect of misclassification on the estimation of the true exposure odds ratio. When validation data are available, our method can not only identify the truly classified table, but also help to assess the reliability of the validation data, as I did in Example 2.

For both cases and controls, the sensitivity and specificity have to be calculated in pairs. This results from the fact that the marginal totals are required to be fixed in case-control studies. As a consequence, we cannot arbitrarily assign values to sensitivity and specificity separately, as is done in the probabilistic/traditional sensitivity analysis (₁₀, ₁₆) or in the simulation study (₅).

The significance of the BAOR depends not only on its sheer magnitude, but also on its standard error (Tables 3c(iii-iv)). Note that equation 17 is a nonlinear function of the true sensitivity and specificity parameters; hence its value can become very large.

Even when just one subject is misclassified, the direction of the misclassification effect can be very different for the under- and over-misclassified scenarios (Example 1).

In both of the above examples I could have conducted an exhaustive sensitivity analysis, but chose not to, because my intention was only to illustrate the method.

Appendix

Let γ_i denote the estimation errors between n_i · p̂_i and n_i · E(p̂_i), namely,

Figure 30

where p_i’s are defined in Table 1, p̂_i , ψ_i, and Δ_i are given, respectively, by equations 2-3, 5-6, and 11. By using equations 2-3, equation A.1 can be also expressed in terms of q̂_i as follows:

[Equation A.2]

It is easily shown that the first two moments of γ_i are given, respectively, by

[Equation A.3]

and

[Equation A.4]

Let ε_i be the estimation errors in estimating the “logit” transformation of the odds, defined by (₁₁)

[Equation A.5]

where the p^*_i are defined by equation 9 and the q^*_i by equation 10.

By using equations 9-10 with equations 2-3 and A.1-A.2, and applying a Taylor series expansion of the natural logarithmic function to equation A.5, we obtain (₂₅)

[Equation A.6]

By dropping all terms in γ_i of power two or higher, we have from equation A.6

[Equation A.7]

Equation 17 follows immediately from taking the variance of equation A.7 with the use of equations A.3-A.4 and independence property of γ_i, i = 0, 1.

Acknowledgements

This research was motivated by the author’s participation in the project of the Chuuk Landslide Study. The author would like to thank Dr. J. Malilay for her invitation to take part in that study.

Part of the results in this paper was presented by the author at the 2004 ASA Joint Statistical Meetings held in Toronto, Canada, and subsequently published in the ASA Proceedings (₂₈).

Correspondence to

Tze-San Lee, Centers for Disease Control and Prevention, Mail Stop F-58, 4771 Buford Highway, Chamblee, GA 30341-3717, USA Phone: +1-770-488-3729; Fax: +1-770-488-1540 E-mail: tjl3@cdc.gov

References

1. Barron BA. The effects of misclassification on the estimation of relative
risk. Biometrics 1977; 33:414-418.
2. Bickel PJ, Doksum KA. Mathematical Statistics. Oakland, California:
Holden-Day, Inc., 1977.
3. Bross I. Misclassification in 2 × 2 tables. Biometrics 1954; 10:478-486.
4. Chen TT. A review of methods for misclassified categorical data in
epidemiology. Stat Med 1989; 8:1095-1106.
5. Chu H, Wang Z, Cole SR, Greenland S. Sensitivity analysis of misclassification:
A graphical and a Bayesian approach. Ann Epidemiol 2006; 16:834-841.
6. Copeland KT, Checkoway H, McMichael AJ, Holbrook RH. Bias due to
misclassification in the estimation of relative risk. Am J Epidemiol 1977;
105:488-495.
7. Espeland MA, Hui SL. A general approach to analyzing epidemiologic data that
contain misclassification errors. Biometrics 1987; 43:1001-1012.
8. Flegal KM, Brownie C, Haas JD. The effect of exposure misclassification on estimates
of relative risk. Am J Epidemiol 1986; 123: 736-751.
9. Fleiss J, Levin B, Paik MC. Statistical Methods for Rates and Proportions, 3rd edition.
New York: Wiley, 2003.
10. Fox MP, Lash TL, Greenland S. A method to automate probabilistic sensitivity
analyses of misclassified binary variables. Int J Epidemiol 2005; 34:1370-1377.
11. Gart JJ, Zweifel JR. On the bias of various estimators of the logit and its variance
with application to quantal bioassay. Biometrika 1967; 54:181-187.
12. Geng Z, Asano C. Bayesian estimation methods for categorical data with
misclassification. Comm Stat A 1989; 18:2935-2954.
13. Goldberg JD. The effects of misclassification on the bias in the difference
between two proportions and the relative odds in the fourfold table. J Am Stat Assoc
1975; 70:561-567.
14. Greenland S. The effect of misclassification in the presence of covariates. Am J
Epidemiol 1980; 112:564-569.
15. __________. Variance estimation for epidemiologic effect estimates under
misclassification. Stat Med 1988; 7:745-757.
16. __________. Basic methods for sensitivity analysis of biases. Int J Epidemiol
1996; 25:1107-1116.
17. __________. Multiple bias modeling for analysis of observational data (with
discussion). J Royal Stat Soc A 2005; 168:267-308.
18. __________. Maximum likelihood and closed-form estimators of epidemiologic
measures under misclassification. J Stat Plan Infer 2008; 138:528-538.
19. Greenland S, Kleinbaum DG. Correcting for misclassification in two-way tables and
matched-pair studies. Int J Epidemiol 1983; 12:93-97.
20. Greenland S, Robins JM. Confounding and misclassification. Am J Epidemiol
1985; 122:495-506.
21. Gullen WH, Bearman JE, Johnson EA. Effects of misclassification in epidemiologic
studies. Public Health Rep 1968; 83:914-918.
22. Gustafson P. Measurement Error and Misclassification in Statistics and
Epidemiology: Impacts and Bayesian Adjustments. Boca Raton, FL: Chapman &
Hall/CRC, 2004.
23. Gustafson P, Greenland S. Curious phenomena in Bayesian adjustment for exposure
misclassification. Stat Med 2006;25:87-103.
24. Gustafson P, Le ND, Vallee M. Case-control analysis with partial knowledge of
exposure misclassification probabilities. Biometrics 2001; 57:598-609.
25. Haldane JBS. The estimation and significance of the logarithm of a ratio of
frequencies. Ann Human Genetics 1956; 20:309-311.
26. Jurek A, Maldonado GM, Greenland S. How far from nondifferential exposure or
disease does misclassification have to be to bias measures of association away from
the null? Int J Epidemiol 2008; 37:382-385.
27. Kleinbaum DG, Kupper LL, Morgenstern H. Epidemiologic Research: Principles and
Quantitative Methods. Belmont, CA: Lifetime Learning, 1982.
28. Lee T-S. Adjusting the odds ratio for the misclassification bias in case-control
studies. The Proceedings of the American Statistical Association, Biometrics
Section [CD-ROM], 2004:383-387.
29. Lyles RH. A note on estimating crude odds ratios in case-control studies with
differentially misclassified exposure. Biometrics 2002; 58:1034-1037.
30. Morrissey MJ, Spiegelman D. Matrix methods for estimating odds ratios with
misclassified exposure data: extensions and comparisons. Biometrics 1999;
55:338-344.
31. Nelson LM, Longstreth WT, Koepsell TD, van Belle G. Proxy respondents in
epidemiologic research. Epidemiol Rev 1990; 12:71-86.
32. Rocca WA, Fratiglioni L, Bracco L, Pedone D, Groppi C, Schoenberg BS. The use of
surrogate respondents to obtain questionnaire data in case-control studies of
neurological diseases. J Chron Dis 1986; 39:907-912.
33. Rothman KJ, Greenland S. Modern Epidemiology, 2nd ed. Philadelphia, PA.:
Lippincott Williams & Wilkins, 1998.
34. Sanchez C, Lee T-S, Young S, Batts D, Benjamin J, Malilay J. Risk factors for
mortality during 2002 landslides in the State of Chuuk, Federated States of
Micronesia. J Disasters (to appear).
35. Sosenko JM, Gardner LB. Attribute frequency and misclassification bias. J Chron Dis
1987; 40:203-207.
36. Stokes ME, Davis C, Koch GG. Categorical Data Analysis Using The SAS System.
Cary, North Carolina: SAS Institute Inc, 2000.
37. Wacholder S, Armstrong B, Hartge P. Validation studies using an alloyed gold
standard. Am J Epidemiol 1993; 137:1251-1258.
38. Blettner M, Wahrendorf J. What does an observed relative risk convey about possible
misclassification? Method Information Med 1984; 23:37-40.
39. Walker AM, Velema JP, Robins JM. Analysis of case-control data derived in part
from proxy respondents. Am J Epidemiol 1988; 127:905-914.
40. Walter SD, Irwig LM. Estimation of test error rates, disease prevalence, and relative
risk from misclassified data: A review. J Clin Epidemiol 1988; 41:923-937.
41. Willett W. An overview of issues related to the correction of non-differential
exposure measurement error in epidemiologic studies. Stat Med 1989; 8:1031-1040.
