Horizontal Aggregations In SQL To Generate Data Sets For Data Mining Analysis in An Optimized Manner

R.S. Nyaykhor, N.T. Deotale

Section: Research Paper, Product Type: Journal Paper
Volume-2, Issue-3, Page no. 31-35, Mar-2014

Online published on Mar 30, 2014

Copyright © R.S. Nyaykhor, N.T. Deotale. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to Cite this Paper

IEEE Style Citation: R.S. Nyaykhor, N.T. Deotale, “Horizontal Aggregations In SQL To Generate Data Sets For Data Mining Analysis in An Optimized Manner,” International Journal of Computer Sciences and Engineering, Vol.2, Issue.3, pp.31-35, 2014.

MLA Style Citation: R.S. Nyaykhor and N.T. Deotale. "Horizontal Aggregations In SQL To Generate Data Sets For Data Mining Analysis in An Optimized Manner." International Journal of Computer Sciences and Engineering 2.3 (2014): 31-35.

APA Style Citation: R.S. Nyaykhor, N.T. Deotale (2014). Horizontal Aggregations In SQL To Generate Data Sets For Data Mining Analysis in An Optimized Manner. International Journal of Computer Sciences and Engineering, 2(3), 31-35.

BibTex Style Citation:
@article{Nyaykhor_2014,
author = {R.S. Nyaykhor and N.T. Deotale},
title = {Horizontal Aggregations In SQL To Generate Data Sets For Data Mining Analysis in An Optimized Manner},
journal = {International Journal of Computer Sciences and Engineering},
issue_date = {3 2014},
volume = {2},
number = {3},
month = {3},
year = {2014},
issn = {2347-2693},
pages = {31--35},
url = {https://www.ijcseonline.org/full_paper_view.php?paper_id=63},
publisher = {IJCSE, Indore, INDIA},
}

RIS Style Citation:
TY  - JOUR
UR  - https://www.ijcseonline.org/full_paper_view.php?paper_id=63
TI  - Horizontal Aggregations In SQL To Generate Data Sets For Data Mining Analysis in An Optimized Manner
T2  - International Journal of Computer Sciences and Engineering
AU  - Nyaykhor, R.S.
AU  - Deotale, N.T.
PY  - 2014
DA  - 2014/03/30
PB  - IJCSE, Indore, INDIA
SP  - 31
EP  - 35
IS  - 3
VL  - 2
SN  - 2347-2693
ER  -

Abstract

Data mining has broad utility in real-world applications. Data sets for data mining are prepared from regular transactional databases. However, preparing these data sets manually is time-consuming and tedious, as it involves aggregations, subqueries and joins. Moreover, traditional SQL (Structured Query Language) aggregations such as MAX and MIN generate single-row output, which is not useful for generating data sets. It is therefore essential to build horizontal aggregations that generate data sets in a horizontal layout. These data sets can then be used for data mining in real-world applications. This paper focuses on building user-defined horizontal aggregations such as PIVOT, SPJ (SELECT PROJECT JOIN) and CASE, whose underlying logic uses SQL queries.
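
The CASE and SPJ strategies named in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the `sales` table, its columns and its rows are hypothetical, and SQLite (through Python's standard `sqlite3` module) is assumed only so the queries can be run as-is. The PIVOT strategy is left out because SQLite has no PIVOT operator.

```python
import sqlite3

# Hypothetical transactional table: one row per sale (assumed schema and data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (store_id INTEGER, dweek TEXT, amount REAL);
    INSERT INTO sales VALUES
        (1, 'Mon', 10.0), (1, 'Mon', 5.0), (1, 'Tue', 7.0),
        (2, 'Mon', 3.0),  (2, 'Tue', 4.0), (2, 'Tue', 6.0),
        (3, 'Tue', 8.0);
""")

# Standard (vertical) aggregation: one row per (store_id, dweek) group.
vertical = conn.execute("""
    SELECT store_id, dweek, SUM(amount)
    FROM sales
    GROUP BY store_id, dweek
    ORDER BY store_id, dweek
""").fetchall()

# CASE strategy: one row per store, one column per dweek value.
# A CASE without ELSE yields NULL, so a missing combination stays NULL.
case_rows = conn.execute("""
    SELECT store_id,
           SUM(CASE WHEN dweek = 'Mon' THEN amount END) AS sum_mon,
           SUM(CASE WHEN dweek = 'Tue' THEN amount END) AS sum_tue
    FROM sales
    GROUP BY store_id
    ORDER BY store_id
""").fetchall()

# SPJ strategy: aggregate each dweek value separately, then left-join
# the partial results onto the list of distinct stores.
spj_rows = conn.execute("""
    SELECT k.store_id, m.total AS sum_mon, t.total AS sum_tue
    FROM (SELECT DISTINCT store_id FROM sales) AS k
    LEFT JOIN (SELECT store_id, SUM(amount) AS total FROM sales
               WHERE dweek = 'Mon' GROUP BY store_id) AS m
           ON m.store_id = k.store_id
    LEFT JOIN (SELECT store_id, SUM(amount) AS total FROM sales
               WHERE dweek = 'Tue' GROUP BY store_id) AS t
           ON t.store_id = k.store_id
    ORDER BY k.store_id
""").fetchall()

print(case_rows)  # [(1, 15.0, 7.0), (2, 3.0, 10.0), (3, None, 8.0)]
```

Both queries return the same horizontal layout, with one row per store that can be fed to a data mining tool directly. In this sketch the CASE form scans the table once, whereas the SPJ form builds one sub-aggregate per output column.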

Keywords / Index Terms

Data Mining, Horizontal Aggregations, PIVOT, CASE, SQL, Data Sets
