Open Access Article

The Real Time Big Data Processing Framework: Advantages and Limitations

Vairaprakash Gurusamy (1), S. Kannan (2), K. Nandhini (3)

  1. School of IT, Madurai Kamaraj University, Madurai, India.
  2. School of IT, Madurai Kamaraj University, Madurai, India.
  3. Technical Support Engineer, Concentrix India Pvt Ltd, Chennai, India.

Correspondence should be addressed to: skannanmku@gmail.com.

Section: Review Paper, Product Type: Journal Paper
Volume-5, Issue-12, Page no. 305-312, Dec-2017

CrossRef-DOI: https://doi.org/10.26438/ijcse/v5i12.305312

Online published on Dec 31, 2017

Copyright © Vairaprakash Gurusamy, S. Kannan, K. Nandhini. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to Cite this Paper

IEEE Style Citation: Vairaprakash Gurusamy, S. Kannan, K. Nandhini, “The Real Time Big Data Processing Framework: Advantages and Limitations,” International Journal of Computer Sciences and Engineering, Vol.5, Issue.12, pp.305-312, 2017.

MLA Style Citation: Vairaprakash Gurusamy, S. Kannan, K. Nandhini. "The Real Time Big Data Processing Framework: Advantages and Limitations." International Journal of Computer Sciences and Engineering 5.12 (2017): 305-312.

APA Style Citation: Vairaprakash Gurusamy, S. Kannan, K. Nandhini (2017). The Real Time Big Data Processing Framework: Advantages and Limitations. International Journal of Computer Sciences and Engineering, 5(12), 305-312.

BibTex Style Citation:
@article{Gurusamy_2017,
author = {Vairaprakash Gurusamy and S. Kannan and K. Nandhini},
title = {The Real Time Big Data Processing Framework: Advantages and Limitations},
journal = {International Journal of Computer Sciences and Engineering},
issue_date = {12 2017},
volume = {5},
number = {12},
month = {12},
year = {2017},
issn = {2347-2693},
pages = {305-312},
url = {https://www.ijcseonline.org/full_paper_view.php?paper_id=1621},
doi = {10.26438/ijcse/v5i12.305312},
publisher = {IJCSE, Indore, INDIA}
}

RIS Style Citation:
TY  - JOUR
DO  - 10.26438/ijcse/v5i12.305312
UR  - https://www.ijcseonline.org/full_paper_view.php?paper_id=1621
TI  - The Real Time Big Data Processing Framework: Advantages and Limitations
T2  - International Journal of Computer Sciences and Engineering
AU  - Gurusamy, Vairaprakash
AU  - Kannan, S.
AU  - Nandhini, K.
PY  - 2017
DA  - 2017/12/31
PB  - IJCSE, Indore, INDIA
SP  - 305
EP  - 312
IS  - 12
VL  - 5
SN  - 2347-2693
ER  -


Abstract

Big data is a blanket term for the non-traditional strategies and technologies needed to collect, organize, process, and gather insights from large datasets. While the problem of working with data that exceeds the computing power or storage of a single computer is not new, the pervasiveness, scale, and value of this type of computing have greatly expanded in recent years. In this paper, we examine one of the essential components of a big data system: processing frameworks. Processing frameworks compute over the data in the system, either by reading it from non-volatile storage or by processing it as it is ingested into the system. Computing over data is the process of extracting information and insight from large quantities of individual data points.
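
The distinction drawn above, between computing over data read from storage and computing over data as it is ingested, can be illustrated with a minimal single-machine sketch. It is not taken from the paper and does not use Hadoop, Spark, Storm, Flink, or Samza; the function names `batch_word_count` and `stream_word_count` and the sample records are hypothetical, chosen only to contrast the two styles.

```python
from collections import Counter
from typing import Iterable, Iterator, List


def batch_word_count(stored_lines: List[str]) -> Counter:
    """Batch style: the full dataset already sits in storage,
    so one pass produces a single final result."""
    counts: Counter = Counter()
    for line in stored_lines:
        counts.update(line.lower().split())
    return counts


def stream_word_count(incoming: Iterable[str]) -> Iterator[Counter]:
    """Stream style: records arrive one at a time; running state is
    updated per record and a snapshot is emitted after each one."""
    counts: Counter = Counter()
    for line in incoming:
        counts.update(line.lower().split())
        yield Counter(counts)  # copy, so earlier snapshots stay unchanged


# Hypothetical sample records (assumption, for illustration only).
records = ["spark storm flink", "storm samza", "flink storm"]

batch_result = batch_word_count(records)
stream_snapshots = list(stream_word_count(iter(records)))
```

In this sketch the batch function returns one answer after seeing everything, while the streaming generator exposes an intermediate result after every record; once the input is exhausted, the last snapshot matches the batch result.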

Keywords / Index Terms

Big Data, Hadoop, HDFS, Spark, Storm, Flink, Samza
