Enhanced K-Means Clustering through Density-Based Inter-Centroid Distance Optimization
Snehal K Joshi
- Dept. of Computer, Dolat-Usha Institute of Applied Sciences, Affiliated to Veer Narmad South Gujarat University, Valsad, India.
Section: Research Paper, Product Type: Journal Paper
Volume-13, Issue-4, Page no. 68-77, Apr-2025
CrossRef-DOI: https://doi.org/10.26438/ijcse/v13i4.6877
Online published on Apr 30, 2025
Copyright © Snehal K Joshi. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
How to Cite this Paper
IEEE Citation
IEEE Style Citation: Snehal K Joshi, “Enhanced K-Means Clustering through Density-Based Inter-Centroid Distance Optimization,” International Journal of Computer Sciences and Engineering, Vol.13, Issue.4, pp.68-77, 2025.
MLA Citation
MLA Style Citation: Snehal K Joshi "Enhanced K-Means Clustering through Density-Based Inter-Centroid Distance Optimization." International Journal of Computer Sciences and Engineering 13.4 (2025): 68-77.
APA Citation
APA Style Citation: Snehal K Joshi, (2025). Enhanced K-Means Clustering through Density-Based Inter-Centroid Distance Optimization. International Journal of Computer Sciences and Engineering, 13(4), 68-77.
BibTex Citation
BibTex Style Citation:
@article{Joshi_2025,
author = {Snehal K Joshi},
title = {Enhanced K-Means Clustering through Density-Based Inter-Centroid Distance Optimization},
journal = {International Journal of Computer Sciences and Engineering},
issue_date = {4 2025},
volume = {13},
number = {4},
month = {4},
year = {2025},
issn = {2347-2693},
pages = {68-77},
url = {https://www.ijcseonline.org/full_paper_view.php?paper_id=5793},
doi = {https://doi.org/10.26438/ijcse/v13i4.6877},
publisher = {IJCSE, Indore, INDIA},
}
RIS Citation
RIS Style Citation:
TY - JOUR
DO - https://doi.org/10.26438/ijcse/v13i4.6877
UR - https://www.ijcseonline.org/full_paper_view.php?paper_id=5793
TI - Enhanced K-Means Clustering through Density-Based Inter-Centroid Distance Optimization
T2 - International Journal of Computer Sciences and Engineering
AU - Snehal K Joshi
PY - 2025
DA - 2025/04/30
PB - IJCSE, Indore, INDIA
SP - 68-77
IS - 4
VL - 13
SN - 2347-2693
ER -
Abstract
K-Means is a popular unsupervised machine learning algorithm for clustering tasks, particularly on data without predefined labels, where it is widely used to divide records into meaningful groups. This research presents an improved version of the traditional K-Means algorithm and applies it to a labeled dataset to evaluate its effectiveness in segmentation. The study compares the modified version with standard K-Means in terms of accuracy, efficiency, and computational demand, using a dataset of more than 3,000 records. The standard approach starts with K=2, randomly selecting initial centroids and refining them through iterations until the results stabilize; this is repeated for each value of K up to K=9. The revised method instead follows a top-down approach: rather than selecting centroids at random, it uses a density-based technique to place initial centroids in densely populated regions of the data. Clusters are formed around these regions and refined iteratively, and after each convergence the clusters are further divided, up to K=9. The results show that the new approach speeds up convergence, reducing iterations by over 20%, lowers computational cost, and improves overall clustering accuracy and efficiency.
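The sketch below illustrates the core idea described in the abstract: seeding K-Means with centroids drawn from dense regions of the data instead of choosing them at random, then comparing iteration counts for K = 2 to 9. It is a minimal illustration under assumed details (Euclidean distance, a fixed neighbourhood radius `eps` for the density estimate, synthetic Gaussian data), not the paper's implementation; in particular, the paper's top-down splitting of converged clusters is not reproduced here.

```python
# Minimal sketch (not the paper's exact method): density-based centroid
# seeding for K-Means versus random seeding, compared over K = 2..9.
import numpy as np

def density_seeds(X, k, eps=0.5):
    """Pick k initial centroids from dense regions of X (assumed radius eps)."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)   # pairwise squared distances
    density = (d2 <= eps ** 2).sum(axis=1)                # neighbours within radius eps
    order = np.argsort(-density)                          # point indices, densest first
    chosen = [order[0]]
    for idx in order[1:]:
        if len(chosen) == k:
            break
        # keep seeds at least eps apart so they fall in different dense regions
        if all(np.linalg.norm(X[idx] - X[j]) > eps for j in chosen):
            chosen.append(idx)
    # if the separation rule left fewer than k seeds, fill with the next densest points
    for idx in order:
        if len(chosen) == k:
            break
        if idx not in chosen:
            chosen.append(idx)
    return X[np.array(chosen)]

def lloyd(X, centroids, max_iter=100, tol=1e-4):
    """Plain Lloyd iterations from the given initial centroids."""
    n_iter = 0
    for n_iter in range(1, max_iter + 1):
        # assign each point to its nearest centroid
        labels = np.argmin(((X[:, None, :] - centroids[None, :, :]) ** 2).sum(-1), axis=1)
        # recompute centroids; keep the old one if a cluster goes empty
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(len(centroids))
        ])
        shift = np.linalg.norm(new_centroids - centroids)
        centroids = new_centroids
        if shift < tol:
            break
    return centroids, labels, n_iter

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # three synthetic Gaussian blobs standing in for the paper's 3,000+ record dataset
    X = np.vstack([rng.normal(loc, 0.4, size=(300, 2))
                   for loc in ([0.0, 0.0], [4.0, 4.0], [0.0, 5.0])])
    for k in range(2, 10):   # K = 2 .. 9, as in the study
        random_init = X[rng.choice(len(X), size=k, replace=False)]
        _, _, it_rand = lloyd(X, random_init)
        _, _, it_dens = lloyd(X, density_seeds(X, k, eps=0.8))
        print(f"K={k}: random init -> {it_rand} iterations, density init -> {it_dens} iterations")
```

On such well separated blobs the density-seeded runs typically converge in fewer Lloyd iterations than the randomly seeded ones, which is the kind of effect the paper quantifies (an iteration reduction of over 20% on its dataset).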
Key-Words / Index Term
Clustering, K-Means Algorithm, Density-Based Centroid Selection, Top-Down Approach, Segmentation Efficiency