O Baker, S Abdul-Kareem
O Baker, S Abdul-Kareem. Soft Computing In Medicine: Nasopharangeal Carcinoma Prognosis. The Internet Journal of Medical Informatics. 2004 Volume 2 Number 1.
Over the last twenty years, soft computing has developed rapidly as a discipline and as a method for diagnosis and prognosis in medical informatics. This is a review of soft computing in the cancer subdomain of nasopharyngeal carcinoma (NPC). An overview of the major categories of soft computing techniques is presented, and instances of these in NPC research are discussed. An insight into the plausible makeup of NPC data for current and future systems is presented. The review concludes by highlighting ways of employing unexploited areas of soft computing in NPC. The Abdul-Kareem group is the only one to have applied artificial neural networks and genetic algorithms to the NPC prognosis domain. Its research is also considered a first step towards establishing a medical informatics research group, and an attempt to increase the participation of Malaysians in medical informatics by investigating the role of artificial intelligence techniques as a cancer prognosticator.
Cancer is a term for diseases in which abnormal cells divide uncontrollably, invade nearby tissues, and spread to other parts of the body via the bloodstream or lymphatic system (AJCC, 1998). Various types of cancer exist, including cancers of the lung, skin, breast, ovary and cervix, and nasopharyngeal carcinoma (NPC). In this study we are concerned with NPC because it is one of the most common cancers in Malaysia, with incidence rates among Chinese males and females of 17.3 and 7.3 per 100,000, respectively, in the state of Selangor. The rates among Malay males and females were 2.5 and 0.3, and the rate among Indian males was 1.1, while the incidence rate among Indian females is unknown (Prasad, 1992).
Nasopharyngeal cancer is a disease in which malignant (cancer) cells form in the tissues of the nasopharynx, the upper part of the pharynx (throat) behind the nose.
Soft computing consists principally of genetic algorithms, artificial neural networks, and fuzzy logic.
Artificial Neural Network (ANN)
An artificial neural network (ANN) is an information processing system that tries to simulate biological neural networks. ANNs are distributed, adaptive, generally nonlinear learning machines built from many different processing elements (PEs). Each PE receives connections from other PEs and/or itself. The interconnectivity defines the topology. The signals flowing on the connections are scaled by adjustable parameters called weights.
Neural networks are typically arranged in layers. Each layer in a layered network is an array of processing elements, or neurons. Information flows through each element in an input-output manner (Mizumoto and Shi, 1997).
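The layered arrangement described above can be illustrated with a minimal sketch of a forward pass, in which each processing element scales its incoming signals by weights, sums them, and applies a nonlinearity. The function names, the bias terms and the choice of tanh are assumptions made for illustration; they are not taken from the networks discussed in this review.

```python
import math

def layer_forward(inputs, weights, biases):
    """Compute the outputs of one layer of processing elements (PEs).

    Each row of `weights` holds the connection weights into one PE.
    The tanh nonlinearity and the bias terms are illustrative assumptions.
    """
    outputs = []
    for row, bias in zip(weights, biases):
        total = sum(w * x for w, x in zip(row, inputs)) + bias
        outputs.append(math.tanh(total))
    return outputs

def network_forward(inputs, layers):
    """Pass a signal through a list of (weights, biases) layers in order."""
    signal = inputs
    for weights, biases in layers:
        signal = layer_forward(signal, weights, biases)
    return signal

# A toy network: 2 inputs -> 3 hidden PEs -> 1 output PE (arbitrary weights).
toy_layers = [
    ([[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]], [0.0, 0.1, -0.1]),
    ([[0.7, -0.5, 0.2]], [0.05]),
]
prediction = network_forward([1.0, 2.0], toy_layers)
```

Training, for example by back propagation, would adjust the weights; this sketch shows only how information flows through the layers.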
Genetic Algorithm (GA)
Genetic algorithms are global optimization algorithms based on the mechanics of natural selection and natural genetics. They employ a structured yet randomized parallel multipoint search strategy that is biased toward reinforcing search points of "high fitness". Genetic algorithms are similar to simulated annealing in that they employ random (probabilistic) search strategies. However, one of the apparent distinguishing features of genetic algorithms is their effective implementation of parallel multipoint search (Yuan and Suarga, 1995).
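The multipoint, fitness-biased search described above can be sketched with a very small genetic algorithm. Everything in this example is an illustrative assumption: the bit-string encoding, the toy fitness function (counting ones), tournament selection, one-point crossover, and all parameter values. It is not the system used in the NPC research discussed in this review.

```python
import random

def one_max(bits):
    """Toy fitness function (an assumption for illustration): count of ones."""
    return sum(bits)

def run_ga(fitness, n_bits=20, pop_size=30, generations=40,
           mutation_rate=0.02, seed=0):
    rng = random.Random(seed)
    # The population is the set of search points explored in parallel.
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    initial_best = max(fitness(ind) for ind in population)
    for _ in range(generations):
        # Elitism: carry the fittest individual over unchanged.
        elite = max(population, key=fitness)
        next_gen = [elite[:]]
        while len(next_gen) < pop_size:
            # Tournament selection biases the search toward high fitness.
            parent1 = max(rng.sample(population, 3), key=fitness)
            parent2 = max(rng.sample(population, 3), key=fitness)
            # One-point crossover followed by bit-flip mutation.
            cut = rng.randint(1, n_bits - 1)
            child = parent1[:cut] + parent2[cut:]
            child = [1 - b if rng.random() < mutation_rate else b
                     for b in child]
            next_gen.append(child)
        population = next_gen
    return max(population, key=fitness), initial_best

best, initial_best = run_ga(one_max)
```

Because the fittest individual is copied into each new generation, the best fitness found never decreases from one generation to the next.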
Fuzzy Logic (FL)
Fuzzy logic provides an approach to approximate reasoning in which the rules of inference are approximate rather than exact. It is useful for manipulating information that is incomplete, imprecise, or unreliable. Also called fuzzy set theory, fuzzy logic extends the simple Boolean operators and can express implication.
An FL system has a series of rules, each comprising an antecedent and a consequent combined with if-then semantics. An antecedent is a conjunction of input variables, each expressed as an evaluatable degree of membership in a fuzzy set (membership function). A consequent is a single output variable expressed as an evaluatable degree of membership in some fuzzy set. An FL system operates in four stages (with various methods establishing a specific system): fuzzification, inference (Kosko, 1992; Wang and Mendel, 1992), composition/aggregation (often combined with the defuzzification process), and defuzzification (Odusanya et al, 2000).
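The stages named above can be sketched with a toy rule base. This is a minimal illustration under stated assumptions: two generic inputs on a 0 to 1 scale, two fuzzy sets ("low" and "high") with linear membership functions, the minimum operator for conjunction, singleton consequents, and weighted-average defuzzification. None of these choices comes from the cited systems, and the rule outputs are arbitrary.

```python
def low(x):
    """Degree to which x (nominally in [0, 1]) belongs to the 'low' set."""
    return min(1.0, max(0.0, 1.0 - x))

def high(x):
    """Degree to which x (nominally in [0, 1]) belongs to the 'high' set."""
    return min(1.0, max(0.0, x))

# Each rule: IF x1 is <set1> AND x2 is <set2> THEN output is <singleton>.
# The singleton output values are arbitrary illustrative choices.
RULES = [
    ((low, low), 0.0),
    ((low, high), 0.5),
    ((high, low), 0.5),
    ((high, high), 1.0),
]

def infer(x1, x2):
    fired = []
    for (set1, set2), output in RULES:
        # Fuzzification: evaluate the membership of each input.
        # Inference: the conjunction (AND) is taken as the minimum.
        strength = min(set1(x1), set2(x2))
        fired.append((strength, output))
    # Composition/aggregation and defuzzification, combined here as a
    # weighted average of the singleton consequents.
    total = sum(strength for strength, _ in fired)
    if total == 0.0:
        return None  # no rule fired
    return sum(strength * output for strength, output in fired) / total

crisp = infer(0.5, 0.5)
```

With both inputs at 0.5, every rule fires with strength 0.5 and the weighted average of the consequents is 0.5.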
NPC And Soft Computing Research
There is a considerable literature on the use of soft computing in cancer prognosis, especially in international publications (Odusanya et al, 2000, 2001, 2002). Unfortunately, there are very few reports on the application of soft computing to NPC. Most of the studies are exploratory studies on neural networks and their application to medical informatics. There has been no reported instance of genetic algorithms or fuzzy logic in NPC prognosis research.
Recently, experiments applying ANNs to the prediction of survival outcome have been gaining popularity. Research on survival analysis using neural networks is mostly in the domains of breast cancer, ovarian cancer, bladder cancer and prostate cancer; only the research of Abdul-Kareem has investigated the use of neural network techniques for NPC prognosis (Sameem et al, 1999, 2000a, 2000b, 2001a, 2001b).
Abdul-Kareem's group created and trained several back propagation neural networks using the same number of neurons, layers and epochs and the same goal, but various training algorithms. Initially, several training algorithms were tried on the NPC data model in order to determine the ideal training algorithm, as is described in . Subsequently, the resilient back propagation training algorithm was chosen on the basis of its performance. The results obtained using the different algorithms are shown in Table 1.
A comparison was also made between the performance of the back propagation neural network and the recurrent neural network for the prediction of the survival of nasopharyngeal carcinoma. The results are shown in Table 2.
The main advantage of neural network technology is that the internal representation and distribution of the data need not be known. Although neural networks have not been tested extensively for modeling survival data, they are considered a good alternative for predicting the survival of individual patients, and they offer no obstacle to handling censored data (Prasad and Rampal, 1992).
GA and FL
Soft computing research into NPC has been embarked on recently and is likely to continue to attract attention as more successes are reported. Currently, Baker and Abdul-Kareem are developing a new system based on genetic algorithms for determining the prognosis of NPC.
For this purpose two models are created, namely:
A genetic algorithm that evolves algebraic rule-based classifiers.
A genetic algorithm with a hybrid function strategy that evolves algebraic rule-based classifiers.
The system in this instance would involve a decision as to the survival or non-survival of a patient within a time range of twenty years. When a patient is deemed to survive, survival is specified for at least twenty years. When a patient is deemed not to survive, the expected time of death is given to the nearest year, up to twenty years. To achieve this one-year interval-based classifier, a series of sub-classifiers is generated, each predicting a specific time range, and these are then chained together to operate in unison and resolve the prognostic outcome for a given patient efficiently (Baker and Abdul Kareem, 2004).
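One way to picture the chaining of sub-classifiers is the sketch below. It is a hypothetical illustration only: the source does not specify how the sub-classifiers are combined, so the first-match rule, the function names, and the toy `toy_risk` field are all assumptions.

```python
def chain_prognosis(patient, sub_classifiers):
    """Resolve a prognosis from a chain of yearly sub-classifiers.

    sub_classifiers[k - 1] is assumed to return True when it predicts
    death within year k. The first sub-classifier that fires decides the
    outcome (a hypothetical combination rule, not taken from the source).
    """
    for year, predicts_death in enumerate(sub_classifiers, start=1):
        if predicts_death(patient):
            return ("not survive", year)
    # No sub-classifier fired: survival for at least the full time range.
    return ("survive", len(sub_classifiers))

def make_toy_classifier(year):
    # Toy stand-in for an evolved rule: fires once risk * year reaches 1.
    return lambda patient: patient["toy_risk"] * year >= 1.0

classifiers = [make_toy_classifier(year) for year in range(1, 21)]
outcome_a = chain_prognosis({"toy_risk": 0.5}, classifiers)
outcome_b = chain_prognosis({"toy_risk": 0.01}, classifiers)
```

In the described system, each toy classifier would be replaced by a rule-based classifier evolved by the genetic algorithm for its own time range.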
For the numeric NPC cancer data, a series of rules that have the following form are proposed:
$C = C_{11} + C_{12} + C_{13} + C_{14} + C_{21} + C_{22} + C_{23} + C_{24}$ (1)
where $C$ is the prognostic value that is used to determine if the patient will live or die according to algorithm 1, and $C_{mn}$ (for example $C_{24}$) is indexed by the rule number $m$ and the chromosome number $n$.
$C = K_1 F_1(X_1) + K_2 F_2(X_2) + \dots + K_n F_n(X_n)$ (2)
where $X_n$ ($n$ = 1 to 22) represents one of the given prognostic factors; $F_j$ ($j$ = 1 to 4) represents one of the following functions: $F(x) = x$, $F(x) = x^{0.02}$, $F(x) = \arctan(x)$, $F(x) = x^{0.45}$; and $K_j$ ($j$ = 1 to 4) are coefficients to be determined.
A threshold value of 1.5 is imposed for absolute numeric values evaluated for each classifier. The classifier value C is then compared with a threshold, as in Figure 1.
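A rule of the form in equation (2) can be evaluated as in the sketch below. Several points are assumptions: the function terms printed as x0.02 and x0.45 are read as powers of x, the comparison with the threshold is taken to be strict, and the helper names are hypothetical. The mapping from the comparison to a survival outcome is defined by Figure 1 and is not reproduced here.

```python
import math

# The four candidate functions, reading x0.02 and x0.45 as powers of x
# (an assumption). The power functions need non-negative inputs.
FUNCTIONS = [
    lambda x: x,
    lambda x: x ** 0.02,
    math.atan,
    lambda x: x ** 0.45,
]

def classifier_value(factors, coefficients, function_ids):
    """Weighted sum of transformed prognostic factors, as in equation (2).

    function_ids[i] selects which of the four functions is applied to
    factors[i]; coefficients[i] is the corresponding K coefficient.
    """
    return sum(k * FUNCTIONS[f](x)
               for k, f, x in zip(coefficients, function_ids, factors))

def exceeds_threshold(value, threshold=1.5):
    """Compare the absolute classifier value with the threshold of 1.5.

    Whether the comparison is strict is an assumption.
    """
    return abs(value) > threshold

# Toy values only; real factors and coefficients come from the NPC data
# and from the genetic algorithm respectively.
value = classifier_value([1.0, 1.0], [2.0, 3.0], [0, 1])
```

In the genetic algorithm, the coefficients and the choice of function for each factor would be the quantities encoded in a chromosome and evolved.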
Finally, hybrid functions (namely fminsearch, patternsearch and fminunc) were examined with the genetic algorithm system, and comparisons were made between the different methods. It is worth noting that there has been no reported instance of the successful use of FL in the NPC domain.
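A hybrid function refines the best point found by the genetic algorithm with a local optimizer. The sketch below uses a simple compass (coordinate) search, one basic form of pattern search, written in plain Python. It is a stand-in for illustration and is not the MATLAB patternsearch routine named above; the toy objective and the starting point, which plays the role of the genetic algorithm's best candidate, are assumptions.

```python
def compass_search(objective, start, step=0.5, tolerance=1e-6,
                   max_iterations=10000):
    """Minimise `objective` by polling +/- step along each coordinate.

    When no poll point improves on the current point, the step is halved;
    the search stops once the step falls below `tolerance`.
    """
    point = list(start)
    best = objective(point)
    for _ in range(max_iterations):
        if step < tolerance:
            break
        improved = False
        for i in range(len(point)):
            for delta in (step, -step):
                candidate = point[:]
                candidate[i] += delta
                value = objective(candidate)
                if value < best:
                    point, best = candidate, value
                    improved = True
        if not improved:
            step /= 2.0
    return point, best

def toy_objective(p):
    # Toy objective with its minimum at (1, -2); an illustrative assumption.
    return (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2

ga_candidate = [0.0, 0.0]  # stands in for the best individual from the GA
refined_point, refined_value = compass_search(toy_objective, ga_candidate)
```

The design intent of such a hybrid is that the genetic algorithm locates a promising region globally, and the local search then converges within it.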
NPC Data Structure
The information to be analysed in NPC could be either images or variables, parameters or markers.
Abdul-Kareem has identified key markers in the search for the ideal soft computing technique for NPC survival, developing a list of twenty-two variables that are considered important to NPC prognosis; these are described in Table 3.
The choice of the above variables is supported by a number of specialists and by the experimental results obtained by Abdul-Kareem (Abdul-Kareem et al, 2000a).
While soft computing has been widely employed in breast cancer and ovarian cancer, there are very few applications in the area of nasopharyngeal carcinoma, most especially of the soft computing fields of GA and FL systems.
Medical prognosis is a prediction of the future course and outcome of a disease and an indication of the likelihood of recovery from that disease. Research on survival analysis using neural networks is mostly in the domains of breast cancer, ovarian cancer, bladder cancer and prostate cancer; only the research of Abdul-Kareem and Baker has investigated the use of neural network techniques and genetic algorithms in NPC prognosis. This shows that, while a number of medical informatics researchers are keen to investigate the potential application of neural networks to various types of cancer, previous studies have largely omitted the use of genetic algorithms in NPC prognosis.