Open Access Article

Clustering approach based on Efficient Coverage with Minimum Weight for Document Data

D.S. Rajput, R.S. Thakur, G.S. Thakur

Section: Research Paper, Product Type: Journal Paper
Volume 1, Issue 1, Page no. 6-13, Sep 2013

Online published on Sep 30, 2013

Copyright © D.S. Rajput, R.S. Thakur, G.S. Thakur. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to Cite this Paper

IEEE Style Citation: D.S. Rajput, R.S. Thakur, G.S. Thakur, “Clustering approach based on Efficient Coverage with Minimum Weight for Document Data,” International Journal of Computer Sciences and Engineering, Vol.1, Issue.1, pp.6-13, 2013.

MLA Style Citation: D.S. Rajput, R.S. Thakur, G.S. Thakur. "Clustering approach based on Efficient Coverage with Minimum Weight for Document Data." International Journal of Computer Sciences and Engineering 1.1 (2013): 6-13.

APA Style Citation: D.S. Rajput, R.S. Thakur, G.S. Thakur (2013). Clustering approach based on Efficient Coverage with Minimum Weight for Document Data. International Journal of Computer Sciences and Engineering, 1(1), 6-13.

BibTex Style Citation:
@article{Rajput_2013,
author = {D.S. Rajput and R.S. Thakur and G.S. Thakur},
title = {Clustering approach based on Efficient Coverage with Minimum Weight for Document Data},
journal = {International Journal of Computer Sciences and Engineering},
issue_date = {9 2013},
volume = {1},
number = {1},
month = {9},
year = {2013},
issn = {2347-2693},
pages = {6-13},
url = {https://www.ijcseonline.org/full_paper_view.php?paper_id=8},
publisher = {IJCSE, Indore, INDIA},
}

RIS Style Citation:
TY - JOUR
UR - https://www.ijcseonline.org/full_paper_view.php?paper_id=8
TI - Clustering approach based on Efficient Coverage with Minimum Weight for Document Data
T2 - International Journal of Computer Sciences and Engineering
AU - Rajput, D.S.
AU - Thakur, R.S.
AU - Thakur, G.S.
PY - 2013
DA - 2013/09/30
PB - IJCSE, Indore, INDIA
SP - 6
EP - 13
IS - 1
VL - 1
SN - 2347-2693
ER -

Abstract

A huge amount of useful data is now available on the web as shared information that anyone can access and use. The availability of document data of different types and nature has led to the task of clustering in large datasets. Clustering is one of the most important techniques used for the classification of large datasets and is widely applicable in many areas. High-quality, fast document clustering algorithms play a significant role in successfully navigating, summarizing, and organizing this information. Recent studies have shown that partitional clustering algorithms are suitable for large datasets. The k-means algorithm [9, 10] is commonly used as a partitional clustering algorithm because it is easy to implement and efficient in terms of execution time. Its major problems are its sensitivity to the selection of the initial partition and its convergence to local optima. In this study we refine the useful information in a document dataset using a minimum spanning tree for document clustering. Good-quality clusters were generated on several document datasets, and the results obtained indicate an effective improvement in performance.
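The abstract does not describe the algorithm itself, so the following is only a minimal sketch of generic minimum-spanning-tree clustering in the spirit of Zahn [5], not the authors' method. It assumes that documents are already represented as term-count vectors, that cosine distance is the edge weight, and that the number of clusters `k` is given. All function names are hypothetical.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity; an all-zero vector is treated as maximally distant."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def minimum_spanning_tree(vectors):
    """Prim's algorithm on the complete graph; returns (weight, u, v) edges."""
    n = len(vectors)
    if n == 0:
        return []
    in_tree = [False] * n
    best = [math.inf] * n      # cheapest known edge weight into the tree
    parent = [-1] * n          # tree endpoint of that cheapest edge
    best[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the cheapest vertex that is not yet in the tree.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] != -1:
            edges.append((best[u], parent[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = cosine_distance(vectors[u], vectors[v])
                if d < best[v]:
                    best[v] = d
                    parent[v] = u
    return edges


def mst_clusters(vectors, k):
    """Cut the k-1 heaviest MST edges and return one cluster label per vector."""
    edges = sorted(minimum_spanning_tree(vectors))
    kept = edges[: max(len(edges) - (k - 1), 0)]
    root = list(range(len(vectors)))

    def find(i):
        # Union-find lookup with path halving.
        while root[i] != i:
            root[i] = root[root[i]]
            i = root[i]
        return i

    for _, u, v in kept:
        root[find(u)] = find(v)
    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(len(vectors))]


# Hypothetical term-count vectors: two documents per topic.
docs = [[2, 1, 0, 0], [3, 1, 0, 0], [0, 0, 1, 2], [0, 0, 1, 3]]
print(mst_clusters(docs, 2))  # the two topic pairs land in separate clusters
```

Unlike k-means, this procedure has no random initial partition, so repeated runs on the same input give the same clusters. The quadratic pairwise-distance cost makes this sketch suitable only for small collections.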

Keywords / Index Terms

Minimum Spanning Tree, Document Clustering, World Wide Web, K-Means Algorithm

References

[1] A. Vathy-Fogarassy, A. Kiss, and J. Abonyi, “Hybrid Minimal Spanning Tree and Mixture of Gaussians Based Clustering Algorithms”, Proceedings, IEEE International Conference on Tools with Artificial Intelligence, pp. 73-81, 2006.
[2] Andreas C. Muller, S. Nowozin, Christoph H. Lampert, “Information theoretic clustering using minimum spanning tree”, Pattern Recognition, pp. 205-215, 2012.
[3] Bhaskar Adepu, K.K. Bejjanki, “A Novel Approach for Minimum Spanning Tree based Clustering Algorithm”.
[4] B. Eswara Reddy, K. Rajendra Prasad, “Reducing runtime values in minimum spanning tree based clustering by visual access tendency”, International Journal of Data Mining & Knowledge Management Process (IJDKP), Vol. 2, No. 3, pp. 11-22, May 2012.
[5] C. Zahn, “Graph-theoretical methods for detecting and describing gestalt clusters”, IEEE Transactions on Computers, C-20, pp. 68-86, 1971.
[6] Chang, J., Luo, J., Huang, J.Z., Feng, S., Fan, J., “Minimum spanning tree based classification model for massive data with MapReduce implementation”, in: Fan, W., Hsu, W., Webb, G.I., Liu, B., Zhang, C., Gunopulos, D., Wu, X. (eds.) ICDM Workshops, IEEE Computer Society, pp. 129-137, 2010.
[7] Congnan Luo, Yanjun Li, Soon M. Chung, “Text document clustering based on neighbours”, Data & Knowledge Engineering, Volume 68, Issue 11, pp. 1271-1288, November 2009.
[8] D.S. Rajput, R.S. Thakur, G.S. Thakur, “Rule Generation from Textual Data by using Graph Based Approach”, International Journal of Computer Application (IJCA) 0975-8887, New York, USA, ISBN: 978-93-80865-11-8, Vol. 31, No. 9, pp. 36-43, October 2011.
[9] D.S. Rajput, R.S. Thakur, G.S. Thakur, Neeraj Sahu, “Analysis of Social Networking Sites Using K-Mean Clustering Algorithm”, International Journal of Computer & Communication Technology (IJCCT), ISSN (ONLINE): 2231-0371, ISSN (PRINT): 0975-7449, Vol. 3, Iss. 3, pp. 88-92, 2012.
[10] Han J. and Kamber M., “Data Mining: Concepts and Techniques”, Morgan Kaufmann Publishers, pp. 335-389, 2000.
[11] Jiaxiang Lin, Dongyi Ye, Chongcheng Chen, Miaoxian Gao, “Minimum Spanning Tree Based Spatial Outlier Mining and Its Applications”, Third International Conference, RSKT 2008, Chengdu, China, May 17-19, pp. 508-515, 2008.
[12] J. Zhang and N. Wang, “Detecting outlying subspaces for high-dimensional data: the new task, Algorithms and Performance”, Knowledge and Information Systems, 10(3), pp. 333-555, 2006.
[13] Lijuan Zhou, Linshuang Wang, Xuebin Ge, Qian Shi, “A clustering-Based KNN improved algorithm CLKNN for text classification”, Informatics in Control, Automation and Robotics (CAR), 2nd International Asia Conference on, Vol. 3, pp. 212-215, 2010.
[14] M. Laszlo and S. Mukherjee, “Minimum Spanning Tree Partitioning Algorithm for Microaggregation”, IEEE Transactions on Knowledge and Data Engineering, Vol. 17, No. 7, pp. 902-911, July 2005.
[15] O. Grygorash, Y. Zhou, Z. Jorgensen, “Minimum spanning tree based clustering algorithm”, in Proceedings of the 18th International Conference on Tools with Artificial Intelligence, pp. 73-81, 2006.
[16] Piotr Juszczak, David M.J. Tax, Elżbieta Pękalska, Robert P.W. Duin, “Minimum spanning tree based one-class classifier”, Advances in Machine Learning and Computational Intelligence, Volume 72, Issues 7-9, pp. 1859-1869, March 2009.
[17] P. Sampurnima, J. Srinivas & Harikrishna, “Performance of Improved Minimum Spanning Tree Based on Clustering Technique”, Global Journal of Computer Science and Technology Software & Data Engineering, ISSN: 0975-4172, Volume 12, Issue 13, pp. 16-22, 2012.
[18] A. Vathy-Fogarassy, A. Kiss, J. Abonyi, “Hybrid Minimal Spanning tree based clustering and mixture of Gaussians based clustering algorithm”, Foundations of Information and Knowledge Systems, Springer, pp. 313-330, 2006.
[19] William B. March, Parikshit Ram, Alexander G. Gray, “Fast Euclidean minimum spanning tree: algorithm, analysis, and applications”, in Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Washington, DC, USA, July 25-28, 2010.
[20] Y. Xu, V. Olman and D. Xu, “Minimum spanning trees for gene expression data clustering”, Genome Informatics, 12, pp. 24-33, 2001.