Open Access Article

Machine Learning: An Effective Technique for Health Data Classification

Nishant Behar, Manish Shrivastava

Section: Research Paper, Product Type: Journal Paper
Volume-07, Issue-03, Page no. 41-47, Feb-2019

Online published on Feb 15, 2019

Copyright © Nishant Behar, Manish Shrivastava. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to Cite this Paper

IEEE Style Citation: Nishant Behar, Manish Shrivastava, “Machine Learning: An Effective Technique for Health Data Classification,” International Journal of Computer Sciences and Engineering, Vol.07, Issue.03, pp.41-47, 2019.

MLA Style Citation: Nishant Behar, Manish Shrivastava. "Machine Learning: An Effective Technique for Health Data Classification." International Journal of Computer Sciences and Engineering 07.03 (2019): 41-47.

APA Style Citation: Nishant Behar, Manish Shrivastava (2019). Machine Learning: An Effective Technique for Health Data Classification. International Journal of Computer Sciences and Engineering, 07(03), 41-47.

BibTex Style Citation:
@article{Behar_2019,
author = {Nishant Behar and Manish Shrivastava},
title = {Machine Learning: An Effective Technique for Health Data Classification},
journal = {International Journal of Computer Sciences and Engineering},
issue_date = {2 2019},
volume = {07},
number = {03},
month = {2},
year = {2019},
issn = {2347-2693},
pages = {41-47},
url = {https://www.ijcseonline.org/full_spl_paper_view.php?paper_id=676},
publisher = {IJCSE, Indore, INDIA},
}

RIS Style Citation:
TY - JOUR
UR - https://www.ijcseonline.org/full_spl_paper_view.php?paper_id=676
TI - Machine Learning: An Effective Technique for Health Data Classification
T2 - International Journal of Computer Sciences and Engineering
AU - Nishant Behar
AU - Manish Shrivastava
PY - 2019
DA - 2019/02/15
PB - IJCSE, Indore, INDIA
SP - 41
EP - 47
IS - 03
VL - 07
SN - 2347-2693
ER -

           

Abstract

Health is precious to every life, yet many diseases are classed as dangerous or critical because of their mortality rates. Such diseases can be cured, or at least prevented, if they are identified in their early stages. For the proper diagnosis of these diseases, data mining techniques based on machine learning methods (k-NN, Naïve Bayes, decision trees, Support Vector Machines) play a very significant role. This paper surveys the techniques applied to the classification of common diseases, the reported accuracy of these methods, the datasets used, and the pros and cons of each method. It concludes with open challenges and opportunities for further research in the health care sector.
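As an illustration of one of the methods named above, the following is a minimal sketch of k-NN classification with an accuracy measure. It is not taken from the paper: the function names, the two features (glucose and BMI), and the toy records are hypothetical and chosen only to show the idea.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training records."""
    # Euclidean distance from the query to every training record, nearest first
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    # Count the labels of the k closest records and return the most frequent one
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


def accuracy(train_X, train_y, test_X, test_y, k=3):
    """Fraction of test records whose predicted label matches the true label."""
    correct = sum(
        knn_predict(train_X, train_y, x, k) == y for x, y in zip(test_X, test_y)
    )
    return correct / len(test_y)


# Hypothetical toy records: (fasting glucose in mmol/L, BMI). Illustrative only.
train_X = [(5.0, 22.0), (5.4, 24.0), (5.2, 23.0),
           (8.9, 31.0), (9.4, 33.0), (8.5, 30.0)]
train_y = ["healthy", "healthy", "healthy",
           "diabetic", "diabetic", "diabetic"]
test_X = [(5.1, 23.5), (9.0, 32.0)]
test_y = ["healthy", "diabetic"]

if __name__ == "__main__":
    print(accuracy(train_X, train_y, test_X, test_y, k=3))
```

On real health data the features would normally be scaled before computing distances, and accuracy would be estimated on held-out records or by cross-validation rather than on a handful of examples.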

Keywords / Index Terms

Machine Learning, Data, Classification
