Open Access Article

Comparative Analysis of Big Data Technologies

C. Jasmine, A. Abinaya

Section: Research Paper, Product Type: Journal Paper
Volume 7, Issue 8, Page no. 49-57, Aug 2019

CrossRef-DOI: https://doi.org/10.26438/ijcse/v7i8.4957

Online published on Aug 31, 2019

Copyright © C. Jasmine, A. Abinaya. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to Cite this Paper

IEEE Style Citation: C. Jasmine, A. Abinaya, “Comparative Analysis of Big Data Technologies,” International Journal of Computer Sciences and Engineering, Vol.7, Issue.8, pp.49-57, 2019.

MLA Style Citation: C. Jasmine, A. Abinaya "Comparative Analysis of Big Data Technologies." International Journal of Computer Sciences and Engineering 7.8 (2019): 49-57.

APA Style Citation: C. Jasmine, A. Abinaya, (2019). Comparative Analysis of Big Data Technologies. International Journal of Computer Sciences and Engineering, 7(8), 49-57.

BibTex Style Citation:
@article{Jasmine_2019,
author = {C. Jasmine and A. Abinaya},
title = {Comparative Analysis of Big Data Technologies},
journal = {International Journal of Computer Sciences and Engineering},
issue_date = {8 2019},
volume = {7},
number = {8},
month = {8},
year = {2019},
issn = {2347-2693},
pages = {49--57},
url = {https://www.ijcseonline.org/full_paper_view.php?paper_id=4788},
doi = {10.26438/ijcse/v7i8.4957},
publisher = {IJCSE, Indore, INDIA}
}

RIS Style Citation:
TY  - JOUR
DO  - 10.26438/ijcse/v7i8.4957
UR  - https://www.ijcseonline.org/full_paper_view.php?paper_id=4788
TI  - Comparative Analysis of Big Data Technologies
T2  - International Journal of Computer Sciences and Engineering
AU  - Jasmine, C.
AU  - Abinaya, A.
PY  - 2019
DA  - 2019/08/31
PB  - IJCSE, Indore, INDIA
SP  - 49
EP  - 57
IS  - 8
VL  - 7
SN  - 2347-2693
ER  - 

Abstract

Recent technological advances and falling storage prices have led to the accumulation of huge amounts of data, known as Big Data. This data, belonging to different applications and timelines, is difficult for organisations to process. To address this difficulty, Doug Cutting and Mike Cafarella came up with a framework called Hadoop. Hadoop was released as an open-source Apache project in 2006, and its ecosystem went on to include Pig, Hive and many more products. Spark was then developed by Matei Zaharia in 2009 and open-sourced in 2010. Meanwhile, many organisations came up with their own platforms to deal with Big Data. Thus, sprouting from Google's MapReduce paper, these tools have grown into a wide array of technologies. This project focusses on comparing three widely used big data technologies: Pig, Hive and R. Similar problem statements are executed on all three platforms, and performance is judged by query execution time.

Keywords / Index Terms

Hadoop, HDFS, Big Data, Pig, Hive, R

References

[1] Sherif Sakr, Amal Elgammal, Towards a Comprehensive Data Analytics Framework for Smart Healthcare Services, Intl Journal of Big Data Research (Elsevier), Pg:44-58, Vol. 4, (2016)
[2] Zhijiang Chen, Guobin Xu, Vivek Mahalingam, Linqiang Ge, James Nguyen, Wei Yu, Chao Lu, A Cloud Computing Based Network Monitoring and Threat Detection System for Critical Infrastructures, Pg:44-58, Vol. 4, (2016)
[3] Xiaolong Jin, Benjamin W. Wah, Xueqi Cheng, Yuanzhuo Wang, Significance and Challenges of Big Data Research, Pg:59-64, Vol. 2, (2015)
[4] Mohammad Naimur Rahman, Amir Esmailpour, Junhui Zhao, Machine Learning Generation Forecasting System, Pg:9-15, Vol. 5, (2015)
[5] Yaxiao Liu, Henan Wang, Guoliang Li, Junyang Gao, Huiqi Hu, Wen-Syan Li, ELAN: An Efficient Location-Aware Analytics System, Pg:16-21, Vol. 5, (2016)
[6] Surajit Chaudhuri, What Next? A Half-Dozen Data Management Research Goals for Big Data and the Cloud
[7] Daniel Nunan, Maria Di Domenico, Market research & the ethics of big data
[8] Jeffrey Dean, Sanjay Ghemawat, MapReduce: Simplified Data Processing on Large Clusters.
[9] Hsinchun Chen, Roger H. L. Chiang, Veda C. Storey, Business Intelligence and Analytics: From Big Data to Big Impact
[10] Kyong-Ha Lee, Yoon-Joon Lee, Hyunsik Choi, Yon Dohn Chung, Bongki Moon, Parallel Data Processing with MapReduce: A Survey, SIGMOD Record, December 2011 (Vol. 40, No. 4)
[11] Abdelrahman Elsayed, Osama Ismail, and Mohamed E. El-Sharkawi, MapReduce: State-of-the-Art and Research Directions.
[12] Xindong Wu, Xingquan Zhu, Gong-Qing Wu and Wei Ding, Data Mining with Big Data (IEEE), Vol. 26, No. 1, January 2014
[13] Michele De Gennaro, Elena Paffumi, Giorgio Martini, Big Data for Supporting Low-Carbon Road Transport Policies in Europe: Applications, Challenges and Opportunities, Intl Journal of Big Data Research (Elsevier), 2 June 2016
[14] Cui Yu, Josef Boyd, FB+-tree for Big Data Management, Intl Big Data Research (Elsevier), Pg: 25-36, Vol. 4, June 2016
[15] Diamantoulakis, P.D., Kapinas, V.M., Karagiannidis, G.K., Big Data Analytics for Dynamic Energy Management in Smart Grids, Intl Big Data Research (Elsevier), Pg: 94-101, Vol. 2, September 01 2015
[16] Paakkonen, P., Pakkala, D., Reference Architecture and Classification of Technologies, Products and Services for Big Data Systems, Intl Big Data Research (Elsevier), Pg: 166-186, Vol. 2, December 01 2015
[17] Bong-Hwa Hong and Hae-Jong Joo, A Study on The Monitoring Model for Traffic Analysis and Application of Big Data, Intl Research on Big Data (Elsevier), Pg: 30-35, Vol. 43, 2013
[18] Mikin K. Dagli and Brijesh B. Mehta, Big Data and Hadoop: Review, Intl research on Big Data (Elsevier), Pg.-192-196, Vol.2, February 2014
[19] Thusoo, Ashish, Joydeep Sen Sarma, Namit Jain, Zheng Shao, Prasad Chakka, Ning Zhang, Suresh Antony, Hao Liu, and Raghotham Murthy. "Hive - a petabyte scale data warehouse using Hadoop." In Data Engineering (ICDE), 2010 IEEE 26th International Conference on, pp. 996-1005. IEEE, 2010
[20] Olston, Christopher, Benjamin Reed, Utkarsh Srivastava, Ravi Kumar, and Andrew Tomkins. "Pig latin: a not-so-foreign language for data processing." In Proceedings of the 2008 ACM SIGMOD international conference on Management of data, pp. 1099-1110. ACM, 2008
[21] Gates, Alan F., Olga Natkovich, Shubham Chopra, Pradeep Kamath, Shravan M. Narayanamurthy, Christopher Olston, Benjamin Reed, Santhosh Srinivasan, and Utkarsh Srivastava. "Building a high-level dataflow system on top of Map-Reduce: the Pig experience." Proceedings of the VLDB Endowment 2, no. 2 (2009): 1414-1425