Substantial Interobserver Variation Among Morphopathological Diagnosis Of Canine Mammary Gland Tumours From Veterinary And Human Pathologists
P Chu, A Liao, K Yeh, C Liu
canine, comparative pathology, interobserver agreement, kappa, mammary gland tumour
P Chu, A Liao, K Yeh, C Liu. Substantial Interobserver Variation Among Morphopathological Diagnosis Of Canine Mammary Gland Tumours From Veterinary And Human Pathologists. The Internet Journal of Pathology. 2010 Volume 13 Number 1.
Because the morphopathological diagnosis of tumours as benign or malignant plays a key role in therapeutic decisions and prognosis, reproducibility and coherence among pathologists are important. To determine the level of interobserver variation between veterinary and human pathologists in the diagnosis of canine mammary gland tumours (MGTs), one hundred and thirty-six selected canine MGTs with histological slides were randomly divided into two groups: 68 cases were evaluated independently by three veterinary pathologists and one human pathologist; the other 68 cases were reviewed blindly by one veterinary pathologist and three human pathologists. All participating pathologists designated each canine MGT as benign or malignant. Kappa (κ) statistics were calculated to evaluate the level of agreement. The three human pathologists involved in this study showed moderate to good levels of agreement among their diagnoses (κ=0.68-0.81). However, the three veterinary pathologists who participated showed less than chance to slight levels of agreement (κ=-0.06 to 0.25). In this study, human pathologists tended to diagnose canine MGTs as benign lesions, while veterinary pathologists tended to diagnose them as malignant lesions. We conclude that substantial interobserver variation exists among pathologists and will greatly influence research results; therefore, internationally accepted pathological diagnostic definitions with high accuracy and reproducibility among pathologists are required to assist researchers worldwide in characterizing the tumour biology, natural history, and comparison of treatment modalities of canine MGTs.
Pathologists, in either veterinary or human medicine, play a key role in the diagnosis of tumours and tumour-related lesions. Pathological diagnosis remains the “gold standard” for cancer diagnosis in daily practice. Mammary gland tumours are among the most commonly encountered neoplasms in female dogs 10 and are known for their complex pathological features, which result in the misdiagnosis of malignant tumours as benign in about 10% of cases 13.
A few previous studies have evaluated prognostic factors for canine MGT using univariate and multivariate analysis, but they failed to reach consistent conclusions 4,5,11,19,22,28,29. It has been stressed that prognostic studies of canine MGT were performed on groups of dogs and their tumours and that, therefore, the prognostic significance of tumour histology cannot be consistently applied to individual cases. Compared with human epidemiological studies, the number of dogs involved in these prognostic studies is considerably smaller 3, which may be an important reason for the inconsistent conclusions; however, another major possible reason is interobserver variation in pathological diagnosis among pathologists.
Although the diagnosis of canine MGTs poses certain difficulties for veterinary pathologists, to the best of the authors’ knowledge no English-language article discussing interobserver variation in the diagnosis of canine MGTs could be found in PubMed 12 (http://www.ncbi.nlm.nih.gov/pubmed/). In contrast, there have been dozens of studies concerning interobserver variation in pathological reports of human breast lesions 2,21,24.
The histological classification of canine MGTs has been the subject of much debate, and various points of view have been proposed 3,6,9,15. The World Health Organization/Armed Forces Institute of Pathology (WHO/AFIP) classification for canine MGTs, published in 1999, is currently the most widely used system worldwide 14.
Through years of effort, diagnostic inconsistency among pathologists regarding human breast tumours has steadily decreased. In contrast, there is less consensus about canine MGTs, and the possibly low reproducibility and coherence of pathological diagnosis of canine MGTs may hamper further tumour-related studies.
In clinical practice, pathological examination relies on a certain degree of subjective interpretation by observers, which results in interobserver variation. In this study, interobserver variation was assessed by calculating the Kappa (κ) value, a widely used measure of agreement 8,27.
The primary aim of this study was to assess interobserver variation, reproducibility, and coherence among veterinary pathologists. Human pathologists were also included for comparative purposes.
Materials and methods
Source of slides of canine MGTs
One hundred and thirty-six slides with an original diagnosis of benign or malignant canine MGT, obtained between 2001 and 2008, were randomly selected from the pathology archives of the School of Veterinary Medicine, National Taiwan University, Taiwan. All the tumours had been surgically resected and provided adequate tissue for evaluation of detailed pathological changes. Each slide was cut at a thickness of 4 μm from a formalin-fixed, paraffin-embedded tissue block and was routinely stained with hematoxylin-eosin.
Six pathologists, including three veterinary pathologists (DVM1, DVM2, and DVM3) and three human pathologists (MD1, MD2, and MD3), examined the randomly selected canine MGTs. To be eligible to participate in this study, all pathologists had to be actively practicing general surgical pathology and to have regularly evaluated mammary tissue. All six participating pathologists were involved in daily diagnostic practice, and their basic professional profiles are summarized in Table 1. Two raters, a senior veterinary pathologist (DVM1) and a human pathologist (MD2), participated in the entire review process and evaluated all 136 slides. The diagnosis from DVM1 was used as the reference for comparison with the other pathologists.
Of the 136 slides, 68 were reviewed by group 1, which comprised three veterinary pathologists (DVM1, DVM2, and DVM3) and one human pathologist (MD2), and the other 68 were examined by group 2, which comprised three human pathologists (MD1, MD2, and MD3) and one veterinary pathologist (DVM1). All participating pathologists received identical slides without any clinical history or immunohistochemistry. All of them knew that their diagnoses were going to be compared with those of others, but they reviewed the slides independently, without discussion with each other during the study.
An ordinal scale of two categories, benign and malignant, was used by all the pathologists. Veterinary pathologists were further requested to diagnose tumours according to the WHO/AFIP histological classification of mammary gland tumours of the dog 14. Human pathologists made their interpretations according to the WHO tumour classification system for human breast cancers published in 2003 7. All the pathologists reported these slides only as benign or malignant mammary lesions, without further classification. Any reference books could be consulted during slide review, and there were no restrictions on the time or place for completing the work.
Kappa (κ) statistics were applied to evaluate agreement between pathologists. The Kappa statistic measures the level of agreement adjusted for the agreement expected to occur by chance alone. The Kappa value ranges from -1 to 1 27. Kappa values of less than 0.4 indicate slight to fair agreement, values of 0.4 to 0.8 represent moderate to good agreement, and values of more than 0.8 represent almost perfect agreement.
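As an illustration of the chance-corrected agreement described above, the following is a minimal sketch in Python. It assumes a pairwise (Cohen's) Kappa between two raters, which the text does not specify; the function names `cohen_kappa` and `interpret_kappa` and the ten example ratings are hypothetical and are not data from this study. The interpretation bands follow the thresholds stated above, with negative values labelled "less than chance".

```python
import math
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters who labelled the same slides."""
    if len(ratings_a) != len(ratings_b) or len(ratings_a) == 0:
        raise ValueError("both raters must rate the same non-empty set of slides")
    n = len(ratings_a)
    # Observed agreement: fraction of slides given the same label by both raters.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    labels = set(count_a) | set(count_b)
    expected = sum(count_a[lab] * count_b[lab] for lab in labels) / (n * n)
    if expected == 1.0:
        # Both raters used one and the same label throughout; kappa is undefined.
        return math.nan
    return (observed - expected) / (1.0 - expected)


def interpret_kappa(kappa):
    """Map a kappa value to the agreement bands used in the text."""
    if kappa < 0:
        return "less than chance"
    if kappa < 0.4:
        return "slight to fair"
    if kappa <= 0.8:
        return "moderate to good"
    return "almost perfect"


# Hypothetical ratings of ten slides (M = malignant, B = benign).
rater_1 = ["M", "M", "M", "M", "M", "M", "B", "B", "B", "B"]
rater_2 = ["M", "M", "M", "M", "B", "B", "B", "B", "B", "B"]
kappa = cohen_kappa(rater_1, rater_2)
print(round(kappa, 2), interpret_kappa(kappa))
```

In this hypothetical example the raters agree on 8 of 10 slides (observed agreement 0.80), while their marginal frequencies give a chance agreement of 0.48, so the Kappa value is (0.80 - 0.48) / (1 - 0.48), about 0.62.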
Kappa values among the participating pathologists (group 1 and group 2) are shown in Table 2 and Table 3. Moderate to good agreement among the three human pathologists was observed (Kappa=0.69 to 0.81) (shown in Table 2). Less than chance to slight agreement among the three veterinary pathologists was noted (Kappa=-0.06 to 0.25) (shown in Table 3). Although moderate to good agreement was achieved among the three human pathologists, only fair to moderate agreement was noted (Kappa=0.29 to 0.46) between the three human pathologists and the senior veterinary pathologist. In the group 2 study, less than chance to slight agreement among the three veterinary pathologists was seen, but moderate agreement was achieved between the senior veterinary pathologist and the human pathologist.
Analysis of the raw diagnostic data showed that human pathologists tended to diagnose canine MGTs as benign tumours, whereas the young veterinary pathologists tended to diagnose canine MGTs as malignant tumours. In the 68 cases, the three human pathologists considered 16, 16, and 23 cases diagnosed as malignant tumours by the senior veterinary pathologist to be benign lesions; conversely, only 2, 2, and 2 cases diagnosed as benign tumours by the senior veterinary pathologist were considered malignant lesions by the three human pathologists (shown in Table 4). In the group 2 study of the other 68 cases, the two veterinary pathologists considered 24 and 20 cases diagnosed as benign tumours by the senior veterinary pathologist to be malignant; conversely, only 5 cases diagnosed as benign tumours by the professor of veterinary pathology were considered malignant by the human pathologist (shown in Table 5).
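The directional tallies reported above can be reproduced from paired diagnoses with a simple count. The sketch below is illustrative only: the function name `discordance_counts`, the label strings, and the eight example diagnoses are hypothetical and do not come from the study data in Tables 4 and 5.

```python
def discordance_counts(reference, other, benign="benign", malignant="malignant"):
    """Count disagreements with a reference rater, split by direction."""
    if len(reference) != len(other):
        raise ValueError("both raters must rate the same slides")
    pairs = list(zip(reference, other))
    return {
        # Reference rater said malignant, the other rater said benign.
        "malignant_to_benign": sum(r == malignant and o == benign for r, o in pairs),
        # Reference rater said benign, the other rater said malignant.
        "benign_to_malignant": sum(r == benign and o == malignant for r, o in pairs),
    }


# Hypothetical diagnoses for eight slides; the reference plays the role of DVM1.
reference_rater = ["malignant"] * 5 + ["benign"] * 3
other_rater = ["benign", "benign", "malignant", "malignant", "malignant",
               "benign", "benign", "malignant"]
print(discordance_counts(reference_rater, other_rater))
```

A marked imbalance between the two counts, as in the figures reported above, suggests a systematic difference in diagnostic threshold between raters rather than random disagreement.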
Kappa statistics have been applied to assess interobserver variation in many fields of clinical medicine. Although many interobserver variation studies have been conducted in human pathology to make pathological diagnosis more consistent, to our knowledge no similar study related to tumour diagnosis has been performed in veterinary pathology.
Correct and reliable histological diagnosis of canine MGTs has a great impact not only on clinical practice, as an indicator for subsequent therapy, but also on further epidemiological and molecular tumour-related studies; reliability in distinguishing benign from malignant tumours in histological specimens is therefore especially important.
Our present study shows that veterinary pathologists disagree considerably in the diagnosis of canine MGTs. The three human pathologists reached moderate to good agreement with one another; however, when their diagnoses were compared with those of the senior veterinary pathologist, only fair to moderate agreement was achieved. The Kappa values among the three veterinary pathologists were much lower than those among the three human pathologists. Human pathologists were consistently more prone than veterinary pathologists to diagnose canine MGTs as benign tumours.
Several explanations may account for the interobserver variation we observed. The first and most important is the complexity of the histological architecture of canine MGTs. Human pathologists adhere to the rule that malignant tumours, such as invasive carcinoma or carcinoma in situ, should show total loss or markedly decreased expression of the myoepithelial cells surrounding the tumour nodules. For veterinary pathologists, nuclear atypia, increased mitotic activity, and an infiltrative border are important factors that govern their final judgment. However, nuclear atypia is sometimes the result of a reactive process rather than malignant change. Increased mitotic activity is also not diagnostic of malignancy. Furthermore, infiltrative nests of cells are occasionally hard to differentiate from entrapped mammary glands. The most reliable criterion for diagnosing malignancy in human breast tumours is the loss of myoepithelial cells 1,20,23. Immunohistochemical stains for myoepithelial cells, such as p63, are routinely performed in difficult cases in daily diagnostic practice in human pathology 16,25; therefore, it is not surprising that the three human pathologists involved in the present study looked for the presence of myoepithelial cells and were inclined to view more canine mammary gland tumours as benign.
Second, the loose definitions, overly brief descriptions, and small number of black-and-white photographs in the WHO/AFIP histological classification of canine MGTs make it difficult for junior diagnostic veterinary pathologists to reach unified diagnoses of canine MGTs. In contrast, the WHO tumour classification for the human breast offers human pathologists stricter definitions, detailed descriptions, and nearly two hundred colour illustrations for reference. Junior residents in human pathology can therefore quickly become familiar with the diagnostic terminology used in breast pathology and achieve substantial agreement with senior pathologists.
It is generally accepted, from studies of the variability of pathological diagnosis in different human organs 17,18,26, that some disagreement is inevitable between individual pathologists, and even for one pathologist at different times, in the diagnosis of neoplastic lesions. However, if interobserver interpretation of whether a canine MGT is benign or malignant cannot achieve almost perfect agreement among pathologists, subsequent tumour-related treatment and research will be seriously hampered.
A fundamental problem with our study, as with other similar studies of interobserver agreement in pathological diagnosis, was the lack of a “gold standard diagnosis” against which individual diagnoses could be compared. The majority diagnosis, or the diagnosis of a senior pathologist, is not necessarily more correct than that of any other individual. We suggest that larger-scale studies collecting thorough clinical and pathological data with the aid of ancillary tools such as immunohistochemistry and molecular pathology, together with panel meetings of experts in canine MGTs from around the world, are needed to increase the level of diagnostic agreement among pathologists and to facilitate future research on canine MGTs.
The authors gratefully acknowledge the enthusiastic support of all the pathologists involved in the study. We thank Drs. Chia-Ling Liu and Hui-Ting Hsu, who participated in reading slides, and Miss Ya-Fen Liang for help with computing.
This work was supported by grants 98-CCH-IRP-48 and 98-CCH-IRP-49 from Changhua Christian Hospital (Taiwan).