
Original Article

Prioritization of Malaria endemic zones in Arunachal Pradesh: A novel application of self organizing maps (SOM)

U Muty, N Arora

Citation

U Muty, N Arora. Prioritization of Malaria endemic zones in Arunachal Pradesh: A novel application of self organizing maps (SOM). The Internet Journal of Tropical Medicine. 2006 Volume 4 Number 1.

Abstract

Malaria continues to pose a serious threat to public health in the North-Eastern states of India. Arunachal Pradesh is highly endemic for malaria, predominantly with Plasmodium falciparum infections. Despite continuous efforts by the government, a desirable level of control has not been achieved. The present study describes the application of self organizing maps (Kohonen maps), a data mining tool, to the prioritization of malaria endemic zones in this region. Sixty PHCs (Public Health Centers) were randomly selected from Arunachal Pradesh, and 6 malariometric parameters reflecting the intensity of malaria transmission in this region were considered, viz. Annual Blood Examination Rate (ABER), Annual Parasite Incidence (API), Slide Positivity Rate (SPR), Annual Falciparum Incidence (AFI), Slide Falciparum Rate (SFR) and Plasmodium falciparum percentage (Pf%). The self organizing map yielded 9 clusters based on neighborhood distance, which group the zones by the intensity of malaria transmission. Such maps would make it possible to target control measures at high-risk areas and greatly increase the cost efficiency of malaria control programmes.

 

Introduction

Malaria, the third leading cause of death attributable to an infectious disease worldwide, has plagued mankind for countless generations. Malaria remains a public health problem in 90 countries (1) and causes more than 300 million acute illnesses and at least one million deaths annually (2). The annual malaria burden in India is estimated at nearly 2 to 2.5 million cases. The North-Eastern region of India lies in the Indo-Chinese hill zone of Macdonald's classification of stable malaria (3) and contributes nearly 9% of total malaria cases in India (4). In this region, efficient malaria transmission is maintained during most months of the year; it curtails potential economic growth and is thus a major impediment to the overall development and progress of these areas.

Despite several anti-malaria programmes being implemented under the National Vector Borne Diseases Control Programme, this region has seen little tangible progress in alleviating the burden of malaria (5, 6). Apparently, there are definite inadequacies that have continued to dampen the spirit of public health specialists, even during the halcyon days of malaria eradication.

On closer scrutiny, it is evident that, beyond the financial and technical constraints common to all states of India, operational difficulties are hampering effective malaria control in the North-Eastern region (7). Many of these areas remain inaccessible owing to floods and poor road communication. The major reasons for perennial and persistent malaria transmission are the predominance of Plasmodium falciparum (8, 9), difficult terrain (10), congenial eco-climatic conditions (8), lack of proper implementation of control operations and ineffective communication between health researchers and policy makers. The problem of drug resistance (11, 12, 13), exophilic and exophagic vector behaviour and the high efficiency of vectors (11, 14, 15) further aggravate the situation. Owing to these factors, malaria continues to present health services in the North-Eastern region with an immensely difficult and complex challenge.

The highly focal nature of malaria requires interventions to be targeted at specific regions, and malaria control interventions must be preceded by the identification and prioritization of the most vulnerable areas. Hence, there is an imperative need to exploit advanced Information Technology tools that can prioritize the endemic zones and free the region from the manacles of this disease. Information technology has been successfully exploited in different spheres of the control of vector-borne diseases. Computer applications in database management and data mining have paved the way for control of malaria (16) and filariasis (17, 18) and have proved to be a valuable decision-making tool in vector identification (19). Computer simulation models of vector-borne diseases have been used to develop early warning systems for epidemics (19). Various expert systems employing Artificial Intelligence (AI) have also been developed for insect pest management and for forecasting outbreaks of vectors.

SOMs (originally developed by Teuvo Kohonen) are a special architecture of Artificial Neural Networks (ANN) that cluster high-dimensional data vectors according to a similarity measure. SOMs have been successfully applied to the classification of DNA sequences based on codon usage (20, 21), nucleotide frequencies (22) and virtual potentials (23), to protein sequence analysis (24, 25) and to the clustering of microarray data (26). SOMs have also been successfully exploited in various epidemiological studies (27, 28, 29). In view of their important role in the easy visualization of complex epidemiological data, Self Organizing Maps (SOM) are customized and used in the present study to prioritize the malaria endemic zones for disease management in Arunachal Pradesh.

Materials and Methods

Study site: Arunachal Pradesh is the largest state by area in the North-East region of India and shares a long international border with Bhutan, China and Myanmar. The state is situated between latitudes 26°30' N and 29°30' N and longitudes 91°30' E and 97°30' E. Its climate is dominated by the Himalayan system and altitudinal variations. The climate is hot and humid at the lower altitudes and in the valleys covered by swampy dense forest, particularly in the eastern section, while it becomes exceedingly cold at the higher altitudes. Average temperature ranges from 15°C to 21°C during the winter months and from 22°C to 30°C during the monsoon. Forested terrain and perennial streams are congenial for rapid multiplication and longevity of malaria vectors. The population of the state is estimated to be 1,091,117 according to the 2001 census. The population consists mainly of 20 scheduled tribes and numerous sub-tribes. Agriculture is the primary driver of the economy, and nearly 80% of the population is engaged in it. The traditional method of agriculture is Jhumming, a kind of shifting cultivation. The main crops are rice, maize, millet, wheat and mustard.

Methods

Data Collection: Raw data were collected from the Directorate of Health, Govt. of Arunachal Pradesh, and consist of epidemiological aspects of malaria cases encountered in 2005 in 60 randomly selected Public Health Centers belonging to 12 districts of Arunachal Pradesh. Standard malariometric parameters (ABER, API, SPR, SFR, AFI) were calculated from the raw malaria incidence data for use in this study (Table 1).
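The article does not spell out how the indices were computed from the raw counts. The sketch below uses the conventional definitions of these indices (rates per 100 or per 1000 population or slides); the function name, argument names and the sample counts are hypothetical and are not taken from Table 1.

```python
def malariometric_indices(population, slides_examined, positives, pf_positives):
    """Return the six indices for one health center and one year.

    Conventional definitions are assumed here (not stated in the article):
    ABER and API/AFI are relative to population, SPR/SFR to slides examined,
    and Pf% to the total number of positive slides.
    """
    if population <= 0:
        raise ValueError("population must be positive")
    # Guard against division by zero when no slide was examined or none was positive.
    spr = 100.0 * positives / slides_examined if slides_examined else 0.0
    sfr = 100.0 * pf_positives / slides_examined if slides_examined else 0.0
    pf_pct = 100.0 * pf_positives / positives if positives else 0.0
    return {
        "ABER": 100.0 * slides_examined / population,  # slides examined per 100 population
        "API": 1000.0 * positives / population,        # positive cases per 1000 population
        "SPR": spr,                                    # % of examined slides that are positive
        "SFR": sfr,                                    # % of examined slides positive for P. falciparum
        "AFI": 1000.0 * pf_positives / population,     # P. falciparum cases per 1000 population
        "Pf%": pf_pct,                                 # share of P. falciparum among positives
    }


# Hypothetical counts for a single health center.
example = malariometric_indices(
    population=10000, slides_examined=1500, positives=120, pf_positives=90
)
```

Read this way, a low ABER means that SPR and API rest on few slides, which is why surveillance coverage matters when the clusters are interpreted.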

Table 1: Malariometric indices of Public Health Centers in Arunachal Pradesh

Data Analysis: Data mining – Self Organizing Maps

In a SOM, neurons compete with each other for the right to represent the input data (30, 31). As a result, data in the multidimensional attribute space can be abstracted to a much smaller number of latent dimensions organized on the basis of a predefined geometry in a space of lower dimensionality, usually a regular two-dimensional array of neurons. In this way the lattice is placed in the input space and spans the distribution of the inputs, so that the structures embedded in the input data can be revealed. Using a SOM network, it is possible to obtain a map of the input space in which closeness between units or clusters on the map represents closeness of the input data.

Each processing unit in the SOM lattice is associated with a weight vector of the same dimension as the input data. Using the weights of each processing unit as a set of coordinates, the lattice can be positioned in the input space. During the learning stage the weights of the units change their position and “move” towards the input points. This “movement” becomes progressively slower, and at the end of the learning stage the network is “frozen” in the input space. After the learning stage each input can be associated with the nearest network unit. When the map is visualized, the inputs can be associated with each cell on the map. One or more cells that clearly contain similar objects can be considered a cluster on the map. These clusters are generated during the learning phase without any other information; it is not necessary to supply cluster prototypes or examples to the network.

SOMs cluster the data in a manner similar to cluster analysis, but have the additional benefit of ordering the clusters and enabling the visualization of large numbers of clusters. The clusters are arranged in a low-dimensional topology, usually a grid structure, that preserves the neighborhood relations in the high-dimensional data (32, 33). This technique is particularly useful for the analysis of large datasets where similarity matching plays a very important role (34). The characteristic that distinguishes the SOM from other classification algorithms is that not only are similar inputs associated with the same cell, but neighboring cells also contain similar inputs. This property, together with the easy visualization, makes the SOM a useful tool for the visualization and clustering of large data sets.
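The competitive step and the “movement” of the weights described above can be illustrated with a single update of a small lattice. This is a minimal sketch, assuming a Euclidean distance for the competition and a Gaussian neighborhood function on the lattice; the article does not state which distance or neighborhood function was used, and the function names are illustrative.

```python
import numpy as np


def best_matching_unit(weights, x):
    """Return (row, col) of the unit whose weight vector is closest to x."""
    # weights has shape (rows, cols, dim); x has shape (dim,)
    dists = np.linalg.norm(weights - x, axis=2)
    return np.unravel_index(np.argmin(dists), dists.shape)


def som_update(weights, x, lr, sigma):
    """Move the winning unit and its lattice neighbors towards x (in place)."""
    rows, cols, _ = weights.shape
    br, bc = best_matching_unit(weights, x)
    # Squared lattice distance of every unit to the winner.
    rr, cc = np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij")
    grid_dist2 = (rr - br) ** 2 + (cc - bc) ** 2
    # Assumed Gaussian neighborhood: 1 at the winner, smaller further away.
    h = np.exp(-grid_dist2 / (2.0 * sigma ** 2))
    weights += lr * h[:, :, None] * (x - weights)
    return int(br), int(bc)


# Tiny illustration: a 2x2 lattice of 3-dimensional weights, all starting at zero.
demo_weights = np.zeros((2, 2, 3))
winner = som_update(demo_weights, np.ones(3), lr=0.5, sigma=1.0)
```

Because neighbors of the winner are pulled in the same direction, only more weakly, adjacent cells end up representing similar inputs, which is the ordering property the text refers to.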

Parameters identified for SOM: Annual Blood Examination Rate (ABER), Annual Parasite Incidence (API), Slide Positivity Rate (SPR), Slide Falciparum Rate (SFR), Annual Falciparum Incidence (AFI) and Plasmodium falciparum percentage (Pf%) were considered for this study. All the factors were given equal importance.

Data Normalization: The summarized data are normalized linearly such that the minimum value in each category is 0 and the maximum is 1. This ensures that all the parameters are given equal importance when clustering is done.
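The linear scaling described here corresponds to column-wise min-max normalization. The sketch below assumes the data are held as a matrix with one row per health center and one column per parameter; the handling of a constant column is an added assumption, since the article does not discuss that case.

```python
import numpy as np


def min_max_normalize(data):
    """Scale each column linearly so that its minimum is 0 and its maximum is 1."""
    data = np.asarray(data, dtype=float)
    col_min = data.min(axis=0)
    col_range = data.max(axis=0) - col_min
    # Assumption: a constant column (zero range) is mapped to 0 rather than dividing by zero.
    safe_range = np.where(col_range == 0, 1.0, col_range)
    return (data - col_min) / safe_range


# Hypothetical values: three centers, two parameters (the second is constant).
normalized = min_max_normalize([[10, 2], [20, 2], [30, 2]])
```

Without this step, a parameter measured on a larger numeric scale would dominate the distance computation and therefore the clustering.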

Figure 2

Results

Clustering the normalized data with the SOM yielded 9 clusters on a 3x3 grid (shown in Figure 1). Unsupervised learning was done on the fly on the data, using a learning constant of 0.01 for 10,000 iterations, after which the data points were distributed among the clusters based on neighborhood distance.
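A training run with the reported settings (3x3 grid, learning constant 0.01, 10,000 iterations) might look like the sketch below. Only those three settings come from the article; the random initialization, the Gaussian neighborhood, its shrinking radius and the synthetic 60 x 6 input matrix are assumptions made to keep the example self-contained, and the study data are not reproduced here.

```python
import numpy as np


def train_som(data, rows=3, cols=3, lr=0.01, iterations=10_000, seed=0):
    """Train a small SOM with a constant learning rate and return its weights."""
    rng = np.random.default_rng(seed)
    n, dim = data.shape
    weights = rng.random((rows, cols, dim))  # assumed random initialization in [0, 1)
    rr, cc = np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij")
    sigma_start, sigma_end = 1.5, 0.5  # assumed neighborhood radius schedule
    for t in range(iterations):
        x = data[rng.integers(n)]  # present one randomly chosen input
        dists = np.linalg.norm(weights - x, axis=2)
        br, bc = np.unravel_index(np.argmin(dists), dists.shape)
        sigma = sigma_start + (sigma_end - sigma_start) * t / iterations
        h = np.exp(-((rr - br) ** 2 + (cc - bc) ** 2) / (2.0 * sigma ** 2))
        weights += lr * h[:, :, None] * (x - weights)
    return weights


def assign_clusters(data, weights):
    """Label each input with the 1-based (row, col) of its nearest unit."""
    labels = []
    for x in data:
        dists = np.linalg.norm(weights - x, axis=2)
        r, c = np.unravel_index(np.argmin(dists), dists.shape)
        labels.append((int(r) + 1, int(c) + 1))
    return labels


# Synthetic stand-in for 60 health centers x 6 normalized parameters.
synthetic = np.random.default_rng(42).random((60, 6))
som_weights = train_som(synthetic)
cluster_labels = assign_clusters(synthetic, som_weights)
```

The 1-based labels mirror the Cluster (row, column) notation used in the Discussion; with real data, the weight vector of each cell summarizes the typical index profile of the centers assigned to it.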

Figure 1: SOM clusters showing different endemicity levels of Public Health Centers.

Legends: ABER= Annual Blood Examination Rate, API= Annual Parasite Incidence, SPR= Slide Positivity Rate, SFR= Slide Falciparum Rate, AFI= Annual Falciparum incidence

Discussion

Cluster (1, 1), Cluster (1, 2) and Cluster (1, 3): The PHCs in these clusters show very low ABER and thus present a poor picture: because of the lack of appropriate surveillance, the disease burden may be underestimated. There is an immediate need to investigate the causes hindering proper surveys and to strengthen the health care infrastructure. Despite the moderate SPR observed, some regions such as Jairampur, Bordumsa and Bhalukpong show very high Pf% and thus require immediate attention to stop this deadly parasite from spreading to other regions.

Figure 4

Figure 5

Cluster (2, 1): Though the PHCs in this cluster show low to moderate incidence of malaria, a very high Pf% calls for a priority focus on drug administration.

Figure 6

Cluster (2, 2): Only one PHC, Khimiyong, fell in this cluster; it indicates moderate incidence of malaria. This region requires both drug administration and vector control measures at relatively lower priority.

Figure 7

Cluster (2, 3): Low to moderate API and SPR were observed in the 10 PHCs in this cluster, and the almost complete absence of falciparum infections indicates that drug administration and vector control operations can be carried out with lower priority.

Figure 8

Cluster (3, 1): The PHCs in this cluster show alarmingly high malariometric indices. Seppa shows a very high level of all malariometric parameters, which reflects high endemicity. The PHCs in this cluster show a predominance of P. falciparum, and thus drugs particularly targeting drug-resistant falciparum malaria should be administered in this region. In spite of the efforts and various control operations undertaken by NMEP, malaria is still deeply entrenched in these regions. These regions warrant further investigation and more focused efforts on active surveillance, require both drug administration and vector control measures at relatively high priority, and need a radical overhaul in the way the disease is tackled.

Figure 9

Cluster (3, 2): The 5 PHCs in this cluster show a moderate level of all malariometric indices and clearly reflect a lower falciparum trend, as indicated by SFR, AFI and Pf%. Only Sille shows very high API, and thus needs malaria control interventions on priority.

Figure 10

Figure 12

Cluster (3, 3): This cluster includes the regions where, owing to the efficient efforts of the government machinery, ABER is considerably high; this helps in getting a clear picture of the status of malaria in these regions and truly reflects the level of surveillance. These regions demonstrate a very high Annual Parasite Incidence (API) and moderate to high SPR, but very low or almost negligible falciparum incidence, as reflected by SFR and AFI. Three PHCs, namely Telam, Rani and Nari, demonstrate very high API. This shows that, despite highly satisfactory surveillance leading to high ABER, malaria was still refractory to intervention measures; hence, there is a need to rethink the control strategy.


Conclusions

The application of data mining and artificial intelligence in epidemiology is still in its infancy. Although there is ample evidence of artificial intelligence being used as an aid in the data analysis of various epidemiological studies, medical entomologists have not yet tapped its potential in vector control except for data acquisition and storage. Information Technology in vector control operations has been extended to the construction of databases on different aspects of vector-borne diseases and to various forecasting systems based on computer simulation models (35). Application of Artificial Intelligence in combating vector-borne diseases can give a completely new dimension to existing control programs. Artificial Neural Networks such as Kohonen maps have a natural propensity to learn: they learn how to solve problems from data, as opposed to solving problems based on an explicit problem specification (36). Self Organizing Maps (SOM) are regarded as a highly effective and sophisticated tool for visualizing high-dimensional complex data with inherent relationships between the various features comprising the data. They have been successfully exploited in Medical and Health Informatics in fields as varied as medical image processing (37), disease diagnosis (38), gene prediction (39), gene sequence analysis (40), expression analysis (41, 42), structural recognition of protein families (43), drug design (44) and drug utilization (45). In the recent past, SOMs have been employed for data exploration in major public health diseases such as diabetes (27) and glaucoma (46). In this paper, we have shown the use of Self Organizing Maps as a valuable tool in the prioritization of malaria endemic zones, which will assist in decision making on the location and deployment of health care services and in the prioritization of intervention strategies.
In areas like Arunachal Pradesh, which suffer from perennial malaria transmission and where difficult terrain and geographical features are big hurdles in carrying out effective and timely vector control operations, SOM can be very effective in bridging the gap between policy makers and health workers. Recognizing consistent foci of cases would permit control efforts to be directed at specific geographic areas, reducing costs and increasing effectiveness. In a country like ours where resources are scarce, reliable methods for the stratification of zones on the basis of the prevalence or transmission intensity of malaria are urgently required. Such clustering and data visualization tools are essential for assessing the severity of the problem, and hence the resources needed to eliminate malaria. This approach will serve as a yardstick for assessing the progress of control and will indicate which geographic areas should be prioritized, so that a large amount of manpower and resources can be saved. Because of its underlying simplicity in data visualization, SOM can prove to be a powerful weapon in the arsenal in the fight against this dreaded disease. This strategy will play a crucial role in bridging research and control, and it is quite likely that, besides reducing the malaria burden, the entire public health system will benefit from such a strategy if it is adopted and extrapolated to other regions across the world for other vector-borne diseases.

Acknowledgements

Authors are grateful to the Director, IICT, Hyderabad for his continuous support and encouragement. Neelima Arora thanks CSIR for Senior Research Fellowship.

References

1. World Health Organization (WHO), Expert Committee on Malaria. WHO Expert Committee on Malaria, Twentieth Report. Geneva, WHO. (1998)
2. World Health Organization (WHO). The World Health Report, 1999: WHO, Geneva. (1999)
3. MacDonald G .The epidemiology and control of malaria. Oxford University Press, London, 1957
4. Shiv Lal, Sonal GS, Phukan PK. Status of Malaria in India. Indian Academy of Clinical Medicine 2000; 5(1):19-23
5. Sen PK. Resurgence of malaria in eastern and north-eastern region of India: a critical appraisal. Indian J Public Health 1994; 38(4):155-8
6. Mohapatra PK, Prakash A, Bhattacharyya DR, et al. Malaria situation in northeastern region of India. ICMR Bull 1998;28: 21-30
7. Mohapatra PK, Narain K, Prakash A, et al. Risk factors of malaria in the fringes of an evergreen monsoon forest of Arunachal Pradesh. Natl Med J India 2001; 14(3):139-142
8. Dev V, Hira CR, Rajkhowa MK. Malaria-attributable morbidity in Assam, north-eastern India. Ann Trop Med Parasitol 2001; 95(8):789-96
9. Mohapatra PK, Prakash A, Taison K., et al. Evaluation of chloroquine (CQ) and sulphadoxine/ pyrimethamine (SP) therapy in uncomplicated falciparum malaria in Indo-Myanmar border areas. Tropical Medicine and International Health 2005; 10(5): 478-483
10. Yadava RL, Sharma RS. Malaria problem and its control in north eastern states of India. J.Commun Dis. 1995; 27(4):262-6.
11. Kondrashin AV, Rooney W and Singh N. Dynamics of P. falciparum ratio - An indication of malaria resistance or a result of control measures? Indian J Malariol 1987; 24: 89-94
12. Satyanarayana S, Sharma SK, Cheeleng PK et al. Chloroquine resistant P. falciparum malaria in Arunachal Pradesh. Indian J.Malariol 1991 ;28(2):137-40.
13. Mohapatra PK, Namchoom NS, Prakash A, et al. Therapeutic efficacy of anti-malarials in Plasmodium falciparum malaria in an Indo-Myanmar border area of Arunachal Pradesh. Indian J Med Res 2003 ; 118:71-6.
14. Sharma VP. Re-emergence of malaria in India. Indian J.Med.Res.1996; 103:26-45
15. Dev V, Bhattacharyya PC, Talukdar R. Transmission of Malaria and its Control in The Northeastern Region of India JAPI 2003; 51:1073-1076
16. Upadhyayula Suryanaryana Murty, Mutheneni Srinivasa Rao, Neelima Arora and Amirapu Radha Krishna. Database management system for the control of malaria in Arunachal Pradesh, India. Bioinformation 2006; 1(6): 194-196
17. Kumar DVRS, Sriram K, Madhusudhan Rao K, et al. Management of filariasis using prediction rules derived from data mining. Bioinformation 2005; 1 (1): 8-11.
18. Murty USN, Kumar DVRS, Sriram K, et al. A Web based relational database management system for filariasis control. Bioinformation 2005; 1(1):19-20
19. Murty USN, Kumar DVRS, Srinivasa Rao M, et al. Rapid identification of female Culex mosquito species using Expert System in the South East Asian region. Bioinformation 2005; 1(2):40-41
20. Kanaya,S. et al. (2001) Analysis of codon usage diversity of bacterial genes with a self-organizing map (SOM): characterization of horizontally transferred genes with emphasis on the E.coli O157 genome. Gene, 276, 89-99
21. Supek,F. and Vlahovicek,K. (2004) Inca: synonymous codon usage analysis and clustering by means of self-organizing map. Bioinformatics, 20, 2329-2330.
22. Abe, T. et al. (2003) Informatics for unveiling hidden genome signatures. Genome Res., 13, 693-702.
23. Aires-de-Sousa,J. and Aires-de-Sousa,L. (2003) Representation of DNA sequences with virtual potentials and their processing by (seqrep) Kohonen self-organizing maps. Bioinformatics, 19, 30-36.
24. Ferran,E.A. and Ferrara,P. (1992) Clustering proteins into families using artificial neural networks. Comput. Appl. Biosci., 8, 39-44.
25. Ferran,E.A. et al. (1994) Self-organized neural maps of human protein sequences. Protein Sci., 3, 507-521.
26. Toronen,P. et al. (1999) Analysis of gene expression data using self-organizing maps. FEBS Lett., 451, 142-146
27. Veli-Pekka Valkonen, Mikko Kolehmainen, Hanna-Maaria Lakka, et al. Insulin resistance syndrome revisited: application of self-organizing maps. International Journal of Epidemiology 2002;31:864-871
28. Moshou D, Cheddad A, Van Hirtum A, et al. Neural Recognition System For Swine cough. Mathematics and Computers in Simulation 2001; 56(4):475-48.
29. Tambouratzis G, Papakonstantinou S, Stamatelopoulos N. Analyzing the 24-hour blood pressure and heart-rate variability with self-organizing feature maps. International Journal of Intelligent Systems 2002; 17(1):63-76.
30. Oja, E. and Kaski, S., (editors). 1999. Kohonen Maps (Amsterdam: Elsevier Science).
31. Kohonen, T., 2001. Self-Organizing Maps. 3rd edition (Berlin, Heidelberg: Springer Press).
32. Kohonen, T., 1982. Self-Organized Formation of Topologically Correct Feature Maps. Biological Cybernetics, 43:59-69.
33. Nurnberger, A., and Detyniecki, M., 2002. Visualizing changes in data collections using growing self-organizing maps. In Proceedings of the 2002 International Joint Conference on Neural Networks (IJCNN 2002), 2:1912-1917
34. Cuadros-Vargas E., Romero, R., and Obermayer, K., 2003. Speeding up algorithms of SOM Family for Large and High Dimensional Databases. In Yamakawa T., editor, Proceedings of the WSOM, 167-172.
35. Moshe B Hoshen, Andrew P Morse. A weather-driven model of malaria transmission. Malaria Journal 2004, 3:32
36. J. Lampinen & E. Oja. Clustering Properties of Hierarchical Self-Organizing Maps. Journal of Mathematical Imaging and Vision 1992; 2:261-272.
37. Braccini G, Edenbrandt L, Lagerholm M, et al. Self-organizing maps and Hermite functions for classification of ECG complexes. In Computers in Cardiology 1997, pages 425-8. IEEE, New York, NY, USA
38. Juhola M, Laurikkala J, Viikki K, et al. Classification of patients on the basis of otoneurological data by using kohonen networks. Acta Otolaryngologica 2001, 50-52.
39. Shaun Mahony, James O McInerney, Terry J Smith, and Aaron Golden. Gene prediction using the Self-Organizing Map: automatic generation of multiple gene models. BMC Bioinformatics. 2004; 5: 23.
40. Dollhopf S L, Hashsham S A, Tiedje J M. Interpreting 16s rDNA t-RFLP data: Application of self-organizing maps and principal component analysis to describe community dynamics and convergence. Microbial Ecology 2001; 42(4):495-505.
41. Nikkila J, Toronen P, Kaski S, et al. Analysis and Visualization of gene expression data using self-organizing maps. Neural Netw 2002; 15(8-9):953-66.
42. Kari Torkkola, Robert Mike Gardner, Tamma Kaysser-Kranich and Calvin Ma. Self-organizing maps in mining gene expression data. Information Sciences 2001, Volume 139(1-2), 79-96
43. Andrade, M. A., Casari, G., Sander,et al. Classification of protein families and detection of the determinant residues with an improved self organizing map. Biol. Cyb 1997;76:441--50.
44. Anzali S, Gasteiger J, Holzgrabe U, et al. The use of self organizing neural networks in drug design. Perspectives in Drug Discovery and Design 1998; 9-11:273--299.
45. Hsin-Chuan Chou, Ching-Hsue Cheng, Jing-Rong Chang. Extracting drug utilization knowledge using self-organizing map and rough set theory. Expert Systems with Applications: An International Journal .2007, Vol 33(2), 499-508
46. Sanjun Yan, Syed Sibte Raza Abidi, Paul Habib Artes. Analyzing Sub-Classifications of Glaucoma via SOM Based Clustering of Optic Nerve Images.Connecting Medical Informatics and Bio-Informatics R. Engelbrecht et al. (Eds.) ENMI, 2005; 483-488.

Author Information

U.S.N. Muty, Ph.D.
Scientist E1, Deputy Director, Head, Biology Division, Indian Institute of Chemical Technology

Neelima Arora, M.Sc.
Senior Research Fellow, Biology Division, Indian Institute of Chemical technology
