Open Access Article

A Review of Clustering Methods forming Non-Convex clusters with, Missing and Noisy Data

Sushant Bhargav1, Mahesh Pawar2

Section: Review Paper, Product Type: Journal Paper
Volume-4, Issue-3, Page no. 39-44, Mar-2016

Online published on Mar 30, 2016

Copyright © Sushant Bhargav, Mahesh Pawar. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


How to Cite this Paper

IEEE Style Citation: Sushant Bhargav, Mahesh Pawar, “A Review of Clustering Methods forming Non-Convex clusters with, Missing and Noisy Data,” International Journal of Computer Sciences and Engineering, Vol.4, Issue.3, pp.39-44, 2016.

MLA Style Citation: Sushant Bhargav, Mahesh Pawar. "A Review of Clustering Methods forming Non-Convex clusters with, Missing and Noisy Data." International Journal of Computer Sciences and Engineering 4.3 (2016): 39-44.

APA Style Citation: Sushant Bhargav, Mahesh Pawar (2016). A Review of Clustering Methods forming Non-Convex clusters with, Missing and Noisy Data. International Journal of Computer Sciences and Engineering, 4(3), 39-44.

BibTex Style Citation:
@article{Bhargav_2016,
author = {Sushant Bhargav and Mahesh Pawar},
title = {A Review of Clustering Methods forming Non-Convex clusters with, Missing and Noisy Data},
journal = {International Journal of Computer Sciences and Engineering},
issue_date = {3 2016},
volume = {4},
Issue = {3},
month = {3},
year = {2016},
issn = {2347-2693},
pages = {39-44},
url = {https://www.ijcseonline.org/full_paper_view.php?paper_id=824},
publisher = {IJCSE, Indore, INDIA},
}

RIS Style Citation:
TY - JOUR
UR - https://www.ijcseonline.org/full_paper_view.php?paper_id=824
TI - A Review of Clustering Methods forming Non-Convex clusters with, Missing and Noisy Data
T2 - International Journal of Computer Sciences and Engineering
AU - Sushant Bhargav
AU - Mahesh Pawar
PY - 2016
DA - 2016/03/30
PB - IJCSE, Indore, INDIA
SP - 39
EP - 44
IS - 3
VL - 4
SN - 2347-2693
ER -


Abstract

Clustering is among the foremost problems in machine learning. Big Data sets, being versatile, multi-sourced, and multivariate, may contain noise and missing values, and may form clusters of arbitrary shape. Because of this unpredictable nature, a clustering method should be able to handle missing values and noise, and should be able to form arbitrarily shaped clusters. Partition-based clustering methods do not form non-convex clusters. Hierarchical clustering methods can form arbitrarily shaped clusters, but they are not suitable for large data sets because of their time and computational complexity. Density-based and grid-based methods do not solve the issue of missing values. Combining different clustering methods could overcome the limitations each has with respect to a dataset's geometrical and spatial properties, such as missing data, non-convex shapes, and noise.
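The contrast between partition-based and density-based methods on non-convex, noisy data can be illustrated with a small sketch. The example below is not taken from the paper: the data set (two concentric rings plus one isolated outlier), the parameter values `eps=1.0` and `min_pts=3`, and the helper names `ring`, `kmeans`, and `dbscan` are all assumptions chosen for illustration, and both algorithms are simplified, standard-library-only versions.

```python
import math

def ring(radius, n):
    """Return n evenly spaced points on a circle of the given radius."""
    return [(radius * math.cos(2 * math.pi * i / n),
             radius * math.sin(2 * math.pi * i / n)) for i in range(n)]

def kmeans(points, centroids, iters=50):
    """Plain Lloyd iterations from the given start; one label per point."""
    centroids = list(centroids)  # do not mutate the caller's list
    labels = []
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(len(centroids)),
                      key=lambda c: math.dist(p, centroids[c])) for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(len(centroids)):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = (sum(x for x, _ in members) / len(members),
                                sum(y for _, y in members) / len(members))
    return labels

def dbscan(points, eps, min_pts):
    """Minimal O(n^2) DBSCAN; label -1 marks noise."""
    n = len(points)
    # Neighborhoods include the point itself.
    neighbors = [[j for j in range(n) if math.dist(points[i], points[j]) <= eps]
                 for i in range(n)]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_pts:
            labels[i] = -1  # provisionally noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        frontier = list(neighbors[i])
        while frontier:
            j = frontier.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_pts:  # only core points expand
                frontier.extend(neighbors[j])
    return labels

# Hypothetical data: two concentric rings and one isolated outlier.
inner, outer = ring(1.0, 40), ring(5.0, 40)
points = inner + outer + [(20.0, 20.0)]

db_labels = dbscan(points, eps=1.0, min_pts=3)
km_labels = kmeans(inner + outer, centroids=[inner[0], outer[0]])

print("DBSCAN labels:", sorted(set(db_labels)))
print("k-means labels on outer ring:", sorted(set(km_labels[40:])))
```

Under these assumptions, the density-based sketch assigns each ring to its own cluster and marks the outlier as noise, whereas the two k-means clusters are separated by a straight boundary and therefore cannot isolate the inner ring from the outer one. Neither sketch addresses missing values.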

Keywords / Index Terms

Clustering, convex, non-convex, missing values, Big Data, noisy data, data mining, density based
